Benchmarking for PPO and TRPO #61

Open
miriaford opened this issue Jul 21, 2017 · 5 comments

Comments

@miriaford

Thanks to the OpenAI team for the latest release!

Are there any benchmark results (like Atari scores) for PPO and TRPO? DQN has a report here: https://github.com/openai/baselines-results. It's super useful. Thanks again!

@Twinko56X

I did not see any in the repo, but as a general indication, the PPO paper has benchmarks on page 11: https://openai-public.s3-us-west-2.amazonaws.com/blog/2017-07/ppo/ppo-arxiv.pdf#page=11

@miriaford
Author

@Twinko56X thanks for the link! It's actually on arxiv now: https://arxiv.org/pdf/1707.06347.pdf

I wonder if this repo is the same code used to produce those plots.

@ViktorM

ViktorM commented Jul 23, 2017

The DQN baselines results (https://github.com/openai/baselines-results) look great; I had missed them. It would be nice to have a similar IPython notebook at some point for PPO vs. TRPO vs. DDPG vs. IPG on continuous control problems, and PPO vs. DQN on Atari.

@joschu
Contributor

joschu commented Aug 28, 2017

I'll add an IPython notebook with the Atari and MuJoCo benchmarks soon.

pzhokhov pushed a commit that referenced this issue Aug 30, 2018
@doviettung96

Hi @joschu,
I am currently trying to replicate the PPO paper's result on RoboschoolHumanoidFlagrunHarder-v1. Did you use the PPO algorithm from these OpenAI baselines? I have modified it to use an adaptive learning rate based on the KL divergence. The other hyperparameters are set as in the paper, except that the logstd of the action distribution is fixed at zeros (not LinearAnneal(-0.7, -1.6)). I use (512, 256, 128) policy and value networks with relu activations. However, I cannot get the mean episode reward above 2000. Any suggestions? Thanks.
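For reference, here is a minimal sketch of one common adaptive-KL learning-rate rule of the kind described above. This is not the baselines implementation; the target KL, scaling factor, and bounds are illustrative assumptions, and the helpers in the commented-out loop (`collect_rollouts`, `mean_kl_old_new`, `run_ppo_epochs`) are hypothetical names.

```python
# Sketch only: adapt the learning rate from the measured KL divergence
# between the old and new policy after each update.
def adapt_learning_rate(lr, kl, kl_target=0.01, factor=1.5,
                        lr_min=1e-6, lr_max=1e-2):
    """Shrink the step size when the policy moved well past the target KL,
    grow it when the policy barely moved, otherwise leave it unchanged."""
    if kl > 2.0 * kl_target:      # policy changed too much -> take smaller steps
        lr = max(lr / factor, lr_min)
    elif kl < 0.5 * kl_target:    # policy barely changed -> take larger steps
        lr = min(lr * factor, lr_max)
    return lr


# Illustrative use inside a training loop (names below are assumptions):
# for update in range(num_updates):
#     batch = collect_rollouts(env, policy)
#     kl = mean_kl_old_new(batch)        # KL between pre- and post-update policy
#     lr = adapt_learning_rate(lr, kl)
#     run_ppo_epochs(batch, lr)
```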

huiwenn pushed a commit to huiwenn/baselines that referenced this issue Mar 20, 2019
kkonen pushed a commit to kkonen/baselines-1 that referenced this issue Sep 26, 2019