From 36110ce4a3af199a55967afbb028d4932435f014 Mon Sep 17 00:00:00 2001
From: Yasuhiro Fujita
Date: Mon, 13 Nov 2017 19:35:18 +0900
Subject: [PATCH] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 212875ef2..0eecaf873 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,7 @@ Following algorithms have been implemented in ChainerRL:
 - DDPG (Deep Deterministic Poilcy Gradients) (including SVG(0))
 - PGT (Policy Gradient Theorem)
 - PCL (Path Consistency Learning)
+- PPO (Proximal Policy Optimization)
 
 Q-function based algorithms such as DQN can utilize a Normalized Advantage Function (NAF) to tackle continuous-action problems as well as DQN-like discrete output networks.
 