Can't get results for LunarLander #4

Closed
TrentBrick opened this issue Jul 13, 2020 · 6 comments

Comments

@TrentBrick

Hi, thanks for sharing your code and implementation.

However, when I run your notebook with LunarLander-v2, it doesn't seem to learn anything, even after 1,000 epochs:

image

Can you share the hyperparameters needed to reproduce the LunarLander results?

Thank you.

@BY571
Owner

BY571 commented Jul 14, 2020

Hello @TrentBrick!
Sure, here are the hyperparameters:
"horizon_scale" : 0.01,
"return_scale" : 0.025,
"replay_size" : 500,
"n_warm_up_episodes" : 10,
"n_updates_per_iter" : 100,
"n_episodes_per_iter" : 20,
"last_few" : 75,
"batch_size" : 768,
"layer_size" : 128,
"learning_rate" : 1e-3
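
For anyone trying these out, here is a minimal sketch of the values above collected in a Python dict. The `make_command` helper is hypothetical and not taken from the notebook: it assumes `return_scale` and `horizon_scale` are multiplied into the desired return and desired horizon before the command is fed to the network, which is a common convention in Upside-Down RL code but may not match this repo exactly.

```python
# Hyperparameters exactly as posted in this comment.
HPARAMS = {
    "horizon_scale": 0.01,
    "return_scale": 0.025,
    "replay_size": 500,
    "n_warm_up_episodes": 10,
    "n_updates_per_iter": 100,
    "n_episodes_per_iter": 20,
    "last_few": 75,
    "batch_size": 768,
    "layer_size": 128,
    "learning_rate": 1e-3,
}


def make_command(desired_return, desired_horizon, hp=HPARAMS):
    """Hypothetical helper: scale the (return, horizon) command.

    Assumption: the two *_scale values are plain multipliers applied
    to the command before it is passed to the behavior network.
    """
    return [
        desired_return * hp["return_scale"],
        desired_horizon * hp["horizon_scale"],
    ]


# Example: ask for a return of 200 within 300 steps.
command = make_command(200.0, 300.0)
print(command)
```

With these scales, a desired return of 200 and a horizon of 300 both land in the low single digits, which keeps the command inputs in a range similar to the state features.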

@TrentBrick
Author

Thanks for getting back to me and sharing these parameters. I just tried them, but they don't seem to work for me either.

image

@BY571
Owner

BY571 commented Jul 17, 2020

Oh, I just noticed that the notebook still has an older network architecture. Try adding two linear layers; it should work then.
Thanks for noticing, I'll update the code!
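
The comment does not say where the two layers go, so the following is only a rough sketch of one possible reading, assuming PyTorch: a behavior network for LunarLander-v2 (8-dimensional state, 4 discrete actions) with two extra hidden linear layers of width `layer_size`. The class name `BehaviorNet`, the concatenation of state and command, and the placement of the extra layers are all assumptions, not the repo's actual architecture.

```python
import torch
import torch.nn as nn


class BehaviorNet(nn.Module):
    """Hypothetical behavior function: (state, command) -> action logits."""

    def __init__(self, state_dim=8, command_dim=2, n_actions=4, layer_size=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + command_dim, layer_size),
            nn.ReLU(),
            # Two additional hidden layers, as suggested in the comment.
            # Their position and width are assumptions.
            nn.Linear(layer_size, layer_size),
            nn.ReLU(),
            nn.Linear(layer_size, layer_size),
            nn.ReLU(),
            nn.Linear(layer_size, n_actions),
        )

    def forward(self, state, command):
        # Concatenate state and (scaled) command along the feature axis.
        return self.net(torch.cat([state, command], dim=-1))


model = BehaviorNet()
# Dummy batch of 5 states and 5 (return, horizon) commands.
logits = model(torch.zeros(5, 8), torch.zeros(5, 2))
print(logits.shape)
```

If the notebook combines state and command differently (for example with separate embeddings), the extra layers would go after that step instead.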

@TrentBrick
Author

Just a heads-up: I've moved on to using other repos for this, as I spent too long trying to run yours. You should double-check that everything works now, though, for future people who come across this repo.

@tangzk

tangzk commented Nov 22, 2022

> Oh, I just noticed that the notebook still has an older network architecture. Try adding two linear layers; it should work then. Thanks for noticing, I'll update the code!

@BY571 Have you pushed the updated code? I got results similar to TrentBrick's in the LunarLander-v2 environment.

@TrentBrick
Author

@tangzk if you want to use my repo for this... https://github.com/TrentBrick/RewardConditionedUDRL
