Run trpo_swimmer in stub mode #65

zhuojw10 · 2016-12-09T15:12:00Z

`python example/trpo_swimmer.py` works well. With the default settings, it reaches an average reward of 55.72 after 40 iterations.

When I run trpo_swimmer.py in `stub` mode (I simply add `stub(globals())` at the beginning and replace `algo.train()` with `run_experiment_lite(...)`, following ddpg_cartpole and ddpg_cartpole_stub), it still works. However, with the same default settings, it reaches an average reward of only 49.59. I tried different random seeds, and the difference remained.

I'm wondering why this difference exists.
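A rough way to check whether a gap like this is larger than seed-to-seed noise is to collect the final average reward for several seeds in each mode and compare the difference in means against the spread. The sketch below is only an illustration: the function names are hypothetical, it is not part of rllab, and the "gap in standard deviations" measure is a crude heuristic rather than a proper statistical test.

```python
from statistics import mean, stdev


def summarize(per_seed_rewards):
    """Return (mean, sample standard deviation) for one reward value per seed."""
    if len(per_seed_rewards) < 2:
        raise ValueError("need results from at least two seeds")
    return mean(per_seed_rewards), stdev(per_seed_rewards)


def gap_in_stds(direct, stubbed):
    """Difference of the two means, in units of the larger per-mode spread.

    `direct` and `stubbed` are lists of final average rewards, one per seed,
    for the plain script and the stub-mode script respectively.
    """
    m_direct, s_direct = summarize(direct)
    m_stub, s_stub = summarize(stubbed)
    spread = max(s_direct, s_stub)
    if spread == 0:
        # No seed-to-seed variation at all: any nonzero gap is "infinite".
        return float("inf") if m_direct != m_stub else 0.0
    return (m_direct - m_stub) / spread
```

A value well above 1 would suggest the two modes really behave differently, while a value near or below 1 would suggest the gap could be seed noise.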

dementrock · 2016-12-13T18:35:59Z

Try setting the scale_reward option in DDPG to 0.1 or 0.01.
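To illustrate what such a factor does, here is a minimal sketch, assuming `scale_reward` acts as a plain multiplier on each reward before it is used for training. The helper names and the sample rewards are made up for illustration and are not rllab code.

```python
def scale_rewards(rewards, scale_reward):
    """Multiply every raw reward by a constant factor."""
    return [scale_reward * r for r in rewards]


def discounted_return(rewards, discount=0.99):
    """Sum of discount**t * r_t over one trajectory."""
    total = 0.0
    for r in reversed(rewards):
        total = r + discount * total
    return total


raw = [1.0, 2.0, 3.0]  # made-up rewards, for illustration only
for scale in (1.0, 0.1, 0.01):
    # The discounted return shrinks by the same factor as the rewards.
    print(scale, discounted_return(scale_rewards(raw, scale)))
```

Because the return is linear in the rewards, scaling by 0.1 or 0.01 shrinks the value targets by the same factor without changing which actions are better.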

zhuojw10 · 2016-12-16T08:18:06Z

@dementrock Thanks


jonashen pushed a commit to jonashen/rllab that referenced this issue May 29, 2018

Upgrade Theano to 1.0.1 (rll#65)

3ddb23d

Upgrades Theano to 1.0.1
