Can't reproduce riverraid's results #39

Closed
luizapozzobon opened this issue Jun 4, 2022 · 2 comments

Comments

@luizapozzobon

Hello, danijar! First of all, thanks for your work :)

I've been trying out dreamerv2 this past week and tried to reproduce the riverraid results. However, I was unsuccessful: the agent only reaches about 5k reward after almost 1e6 train steps. This is the latest result I got. If needed, I can attach TensorBoard graphs later this week.

train_return 5190 / train_length 982 / train_total_steps 9.5e5 / train_total_episodes 1220 / train_loaded_steps 9.5e5 / train_loaded_episodes 1220

I made a small modification to the original code so that it runs on multiple GPUs (tf.distribute.MirroredStrategy). Then I trained the agent to play Pong, and the return plot was similar to the one you posted in #8, so I figured the change was OK. Also, in the riverraid output above, half of the run used precision=16 and half used precision=32, since a few other issues, especially #30, mentioned that precision 32 helped. I did not do a full run with precision=32, though.

Do you have any tips on what could be going wrong or what I could do to debug it?

Thanks so much!

@danijar
Owner

danijar commented Jun 4, 2022

What training curves are you getting? It's easy to make mistakes with tf.distribute. For debugging, I recommend running a few seeds on a single GPU with the original code from this repository.
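A minimal sketch of what "a few seeds on a single GPU" could look like in practice. It only builds the command lines and environment for each run and does not launch anything. The `--logdir`, `--configs`, and `--task` flags follow the usage shown in the repository README; the `--seed` flag and the task name `atari_riverraid` are assumptions that should be checked against `configs.yaml` in your checkout. `build_seed_runs` is a hypothetical helper, not part of dreamerv2.

```python
def build_seed_runs(task, seeds, logdir_root, gpu="0"):
    """Return one (argv, env) pair per seed for an unmodified train.py.

    Nothing is executed here; the caller decides how to launch each run.
    """
    runs = []
    for seed in seeds:
        # Separate log directory per seed so the curves can be compared.
        logdir = f"{logdir_root}/{task}/dreamerv2/{seed}"
        argv = [
            "python3", "dreamerv2/train.py",
            "--logdir", logdir,
            "--configs", "atari",
            "--task", task,
            # Assumption: the config exposes a `seed` key settable by flag.
            "--seed", str(seed),
        ]
        # Expose exactly one GPU so no multi-device code path is involved.
        env = {"CUDA_VISIBLE_DEVICES": gpu}
        runs.append((argv, env))
    return runs
```

Each pair can then be passed to `subprocess.run(argv, env={**os.environ, **env})`, one run at a time, and the resulting return curves compared against those from the modified multi-GPU code.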

@luizapozzobon
Author

Yeah, I'm pretty sure the problem is in my tf.distribute implementation. I ran the original code and it reached higher scores much sooner than my modified version did. Thanks for the insight!!
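The thread does not say what the actual bug was. As an illustration of one well-documented tf.distribute pitfall, which may or may not apply here: gradients are summed across replicas, so a loss averaged over each replica's local batch yields a gradient that is larger than the single-device one by a factor of the replica count. The TensorFlow guides recommend dividing by the global batch size instead (for example with `tf.nn.compute_average_loss`). The sketch below reproduces the arithmetic in plain Python on a toy linear model; all function names and numbers are illustrative and not taken from dreamerv2.

```python
def per_example_grads(w, xs, ys):
    # Toy loss 0.5 * (w*x - y)**2 per example, so dloss/dw = (w*x - y) * x.
    return [(w * x - y) * x for x, y in zip(xs, ys)]

def single_device_grad(w, xs, ys):
    # Reference: mean gradient over the whole batch on one device.
    grads = per_example_grads(w, xs, ys)
    return sum(grads) / len(grads)

def summed_replica_grad(w, xs, ys, num_replicas, divide_by_global_batch):
    # Each replica handles an equal shard; per-replica results are summed,
    # mimicking a sum all-reduce across devices.
    shard = len(xs) // num_replicas
    total = 0.0
    for r in range(num_replicas):
        lo, hi = r * shard, (r + 1) * shard
        grads = per_example_grads(w, xs[lo:hi], ys[lo:hi])
        denom = len(xs) if divide_by_global_batch else len(grads)
        total += sum(grads) / denom
    return total

xs, ys, w = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], 1.0
reference = single_device_grad(w, xs, ys)
local_mean = summed_replica_grad(w, xs, ys, 2, divide_by_global_batch=False)
global_mean = summed_replica_grad(w, xs, ys, 2, divide_by_global_batch=True)
```

Here `global_mean` matches `reference`, while `local_mean` is twice as large with two replicas, which acts like an unintended increase in the learning rate.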
