Hello, danijar! First of all, thanks for your work :)
I've been trying out dreamerv2 this past week and tried to reproduce the riverraid results. However, I was unsuccessful: the agent only reaches about 5k reward after almost 1e6 train steps. This is the latest result I got. If you need, I can attach tensorboard graphs later this week.
I made a small modification to the original code so it runs on multiple GPUs (tf.distribute.MirroredStrategy). Then I trained the agent to play Pong, and the return plot was similar to the one you posted in #8, so I figured it was fine. Also, in the riverraid output attached above, half of the run used precision=16 and half used precision=32, since a few other issues mentioned that precision 32 helped, especially #30. I did not do a full run with precision=32, though.
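For reference, here is a minimal sketch of the kind of MirroredStrategy wrapper I mean. This is not the actual modification I made to dreamerv2; the toy model and loss below are placeholders, and only the strategy/scope/`strategy.run` pattern is the point:

```python
import tensorflow as tf

# Create the strategy first; it discovers available GPUs (falls back to CPU).
strategy = tf.distribute.MirroredStrategy()

# Variables (model, optimizer) must be created inside the strategy scope
# so they are mirrored across replicas.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(x, y):
    def step_fn(x, y):
        # Per-replica forward/backward pass on that replica's data shard.
        with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(model(x) - y))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss
    per_replica_loss = strategy.run(step_fn, args=(x, y))
    # Average the per-replica losses into a single scalar for logging.
    return strategy.reduce(
        tf.distribute.ReduceOp.MEAN, per_replica_loss, axis=None)
```

One easy mistake with this pattern is computing loss averages or gradient scaling per replica instead of globally, which silently changes the effective learning rate, so that's one place I plan to double-check.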
Do you have any tips on what could be going wrong or what I could do to debug it?
Thanks so much!
What training curves are you getting? It's easy to make mistakes with tf.distribute. For debugging, I recommend running a few seeds on a single GPU with the original code from the repository here.
Yeah, I'm pretty sure it's some problem with my tf.distribute implementation. I ran the original code and it reached higher scores than I was getting, much sooner. Thanks for the insight!