Dreamer for Atari #33

Closed
michaelzhiluo opened this issue Aug 2, 2020 · 4 comments

Comments

@michaelzhiluo

In short, here's the bug when I ran atari_breakout:

  File "dreamer.py", line 463, in <module>
    main(parser.parse_args())
  File "dreamer.py", line 443, in main
    functools.partial(agent, training=False), test_envs, episodes=1)
  File "/home/mluo/dreamer/tools.py", line 124, in simulate
    obs, _, done = zip(*[p()[:3] for p in promises])
  File "/home/mluo/dreamer/tools.py", line 124, in <listcomp>
    obs, _, done = zip(*[p()[:3] for p in promises])
  File "/home/mluo/dreamer/wrappers.py", line 350, in step
    obs, reward, done, info = self._env.step(action)
  File "/home/mluo/dreamer/wrappers.py", line 162, in step
    obs, reward, done, info = self._env.step(action)
  File "/home/mluo/dreamer/wrappers.py", line 211, in step
    obs, reward, done, info = self._env.step(action)
  File "/home/mluo/dreamer/wrappers.py", line 320, in step
    raise ValueError(f'Invalid one-hot action:\n{action}')
ValueError: Invalid one-hot action:
[ 0.999  -0.9995  0.9995  0.9995] 

I was wondering what changes are needed to get Atari to work in your much cleaner Dreamer codebase, and what hyperparameter changes would be needed to match the results reported in the paper.
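
For readers hitting the same traceback: the rejected vector has several entries near ±1, which looks more like a squashed continuous action than a one-hot vector. The sketch below only illustrates the kind of check that raises this error; `onehot_to_index` and its `atol` tolerance are hypothetical names, not the repository's actual wrapper code.

```python
def onehot_to_index(action, atol=1e-6):
    """Hypothetical helper: return the index of a one-hot action, or raise."""
    values = [float(a) for a in action]
    # Pick the largest entry as the candidate discrete action.
    index = max(range(len(values)), key=values.__getitem__)
    # Every entry must be 1 at the chosen index and 0 elsewhere.
    for i, value in enumerate(values):
        target = 1.0 if i == index else 0.0
        if abs(value - target) > atol:
            raise ValueError(f'Invalid one-hot action:\n{values}')
    return index


# A proper one-hot vector passes the check.
valid = onehot_to_index([0.0, 0.0, 1.0, 0.0])

# The vector from the traceback is rejected.
try:
    onehot_to_index([0.999, -0.9995, 0.9995, 0.9995])
    rejected = False
except ValueError:
    rejected = True
```

Under this reading, the agent needs to emit discrete (one-hot) actions before the Atari wrapper will accept them.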

@IcarusWizard

Same problem as #29.

@michaelzhiluo
Author

Ty! Does Atari learn well (replicate results) with the hyperparameters in dreamer.py?

@IcarusWizard

Sorry, I didn't fully run the Atari experiment, since I don't have enough resources to run it 😟 (by my calculation, it needs roughly 1 TB of RAM and weeks of training in my environment).
If you have enough resources and want to replicate the results, I suggest trying the parameters in Appendix A of the paper. My setting is --expl epsilon_greedy --horizon 10 --kl_scale 0.1 --action_dist onehot --expl_amount 0.4 --expl_min 0.1 --expl_decay 100000 --pcont 1 --time_limit 1000000. Here time_limit is set large enough to prevent rollouts from stopping early in the Atari environment.
You may also need to change the hidden size of the network as mentioned by Danijar in #7.
Good Luck!
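
As a rough illustration of what the exploration flags above could mean, here is a minimal sketch of epsilon-greedy action selection with a decaying epsilon. It assumes `expl_decay` acts as a half-life in steps and `expl_min` as a floor; the thread only gives the flag values, so the schedule and the helper names `exploration_amount` and `epsilon_greedy` are assumptions, not the repository's code.

```python
import random


def exploration_amount(step, expl_amount=0.4, expl_min=0.1, expl_decay=100000):
    """Assumed schedule: halve epsilon every `expl_decay` steps, floored at `expl_min`."""
    amount = expl_amount
    if expl_decay:
        amount *= 0.5 ** (step / expl_decay)
    return max(expl_min, amount)


def epsilon_greedy(greedy_index, num_actions, epsilon, rng):
    """Return a one-hot action: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        index = rng.randrange(num_actions)
    else:
        index = greedy_index
    return [1.0 if i == index else 0.0 for i in range(num_actions)]


rng = random.Random(0)
eps_start = exploration_amount(0)           # initial epsilon
eps_half = exploration_amount(100000)       # after one assumed half-life
eps_floor = exploration_amount(10_000_000)  # long after decay, clipped to the floor
action = epsilon_greedy(greedy_index=2, num_actions=4, epsilon=eps_floor, rng=rng)
```

Whatever the exact schedule, the selected action stays one-hot, which matches the --action_dist onehot setting.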

@xlnwel

xlnwel commented Jan 19, 2021

DreamerV2 for Atari games is out. Check this repo: https://github.com/danijar/dreamerv2
