Merge pull request #168 from chainer/muupan-patch-1
Add PPO to README as an implemented algorithm
muupan committed Nov 13, 2017
2 parents 04e938e + 36110ce commit 6027634
Showing 1 changed file with 1 addition and 0 deletions.

README.md
@@ -52,6 +52,7 @@ The following algorithms have been implemented in ChainerRL:
- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))
- PGT (Policy Gradient Theorem)
- PCL (Path Consistency Learning)
- PPO (Proximal Policy Optimization)

Q-function based algorithms such as DQN can utilize a Normalized Advantage Function (NAF) to tackle continuous-action problems, in addition to DQN-like discrete-output networks.
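
As a sketch of that point, the snippet below runs DQN with a NAF-style quadratic Q-function on a continuous-action task, following the pattern of chainerrl's train_dqn_gym example. The class name FCQuadraticStateQ and the exact keyword arguments reflect the library around this commit and should be treated as assumptions, not part of this change.

```python
import chainer
import gym
import chainerrl
from chainerrl import explorers, q_functions, replay_buffer

env = gym.make('Pendulum-v0')
obs_size = env.observation_space.low.size
action_size = env.action_space.low.size

# NAF parameterizes Q(s, a) = V(s) - (a - mu(s))^T P(s) (a - mu(s)),
# so the greedy action argmax_a Q(s, a) = mu(s) has a closed form
# even though the action space is continuous.
q_func = q_functions.FCQuadraticStateQ(
    obs_size, action_size,
    n_hidden_channels=100, n_hidden_layers=2,
    action_space=env.action_space)

opt = chainer.optimizers.Adam(eps=1e-2)
opt.setup(q_func)

# Plain DQN agent; the NAF structure lives entirely in the Q-function,
# with additive Gaussian noise for exploration in the continuous space.
agent = chainerrl.agents.DQN(
    q_func, opt,
    replay_buffer.ReplayBuffer(capacity=10 ** 5),
    gamma=0.99,
    explorer=explorers.AdditiveGaussian(scale=0.1),
    replay_start_size=1000)

obs = env.reset()
action = agent.act(obs)  # greedy continuous action, i.e. mu(obs)
```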

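Since this commit's sole change is listing PPO as implemented, a minimal usage sketch may help. It assumes chainerrl.agents.PPO consumes an A3C-style model exposing pi_and_v, as in the library's PPO example of the time; the model class below and the exact keyword arguments are illustrative assumptions, not code from this commit.

```python
import chainer
import chainer.functions as F
import chainer.links as L
import gym
import chainerrl
from chainerrl.agents import a3c


class PolicyValueModel(chainer.Chain, a3c.A3CModel):
    """Shared-body model exposing pi_and_v, the interface PPO expects."""

    def __init__(self, obs_size, n_actions, n_hidden=64):
        super().__init__()
        with self.init_scope():
            self.l1 = L.Linear(obs_size, n_hidden)
            self.pi = chainerrl.policies.SoftmaxPolicy(
                model=L.Linear(n_hidden, n_actions))
            self.v = L.Linear(n_hidden, 1)

    def pi_and_v(self, obs):
        h = F.tanh(self.l1(obs))
        return self.pi(h), self.v(h)


env = gym.make('CartPole-v0')
model = PolicyValueModel(env.observation_space.low.size, env.action_space.n)
opt = chainer.optimizers.Adam(alpha=3e-4)
opt.setup(model)

agent = chainerrl.agents.PPO(
    model, opt,
    gamma=0.99, lambd=0.95,   # discount factor and GAE lambda
    clip_eps=0.2,             # epsilon of the clipped surrogate objective
    update_interval=2048, minibatch_size=64, epochs=10)
```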
