Run trpo_swimmer in stub mode #65

Open
zhuojw10 opened this issue Dec 9, 2016 · 2 comments

zhuojw10 commented Dec 9, 2016

''python example/trpo_swimmer.py'' works well. With the default settings, it reaches an average reward of 55.72 after 40 iterations.

When I run trpo_swimmer.py in ''stub'' mode (I simply add ''stub(globals())'' at the beginning and replace ''algo.train()'' with ''run_experiment_lite(...)'', following ddpg_cartpole and ddpg_cartpole_stub), it still works. However, with the same default settings, it only reaches an average reward of 49.59. I tried different random seeds and the difference remained.

Why does this difference exist?
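
For readers unfamiliar with stub mode: as far as I understand it, ''stub(globals())'' causes constructor calls to be recorded rather than executed, and ''run_experiment_lite(...)'' later rebuilds and runs them, so object construction and seeding happen at a different point than in the direct script. The toy sketch below only illustrates that deferred-call idea. It is not rllab's code, and every name in it (StubCall, stub_class, run_stubbed, Policy, Algo) is hypothetical.

```python
# Toy illustration of deferred ("stubbed") calls. NOT rllab's implementation.
# All names below are hypothetical and exist only for this sketch.


class StubCall:
    """Records a constructor call instead of executing it."""

    def __init__(self, cls, args, kwargs):
        self.cls = cls
        self.args = args
        self.kwargs = kwargs


def stub_class(cls):
    """Return a factory that records constructor arguments rather than building cls."""
    def record(*args, **kwargs):
        return StubCall(cls, args, kwargs)
    return record


def run_stubbed(call):
    """Recursively build real objects from a recorded call tree."""
    args = [run_stubbed(a) if isinstance(a, StubCall) else a for a in call.args]
    kwargs = {
        k: run_stubbed(v) if isinstance(v, StubCall) else v
        for k, v in call.kwargs.items()
    }
    return call.cls(*args, **kwargs)


class Policy:
    def __init__(self, hidden_sizes):
        self.hidden_sizes = hidden_sizes


class Algo:
    def __init__(self, policy, n_itr):
        self.policy = policy
        self.n_itr = n_itr


LazyPolicy = stub_class(Policy)
LazyAlgo = stub_class(Algo)

# Nothing is constructed here; only the call tree is recorded.
recorded = LazyAlgo(policy=LazyPolicy(hidden_sizes=(32, 32)), n_itr=40)

# The real objects are created only when the recorded tree is replayed.
algo = run_stubbed(recorded)
```

In this sketch, anything that depends on when objects are created (for example, random initialization relative to a seed call) would be affected by the deferral, which is why the order of seeding and construction is worth checking when comparing the two runs.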

@dementrock
Member

Try setting the scale_reward option in DDPG to 0.1 or 0.01.
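
As a rough illustration of what a reward-scaling option like the one mentioned above does, assuming it simply multiplies each raw reward by a constant before the learner sees it: the helper ''scale_rewards'' and the sample values are hypothetical and are not taken from rllab.

```python
def scale_rewards(rewards, scale_reward=0.1):
    """Multiply each raw reward by a constant factor (assumed behaviour of the option)."""
    return [r * scale_reward for r in rewards]


# Hypothetical raw rewards, used only to show the effect of the factor.
raw = [10.0, -5.0, 2.5]
scaled = scale_rewards(raw, scale_reward=0.1)
```

Scaling changes the magnitude of the learning signal but not the ordering of rewards, so the relative quality of trajectories is preserved.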

@zhuojw10
Author

@dementrock Thanks

jonashen pushed a commit to jonashen/rllab that referenced this issue May 29, 2018
Upgrades Theano to 1.0.1