
Running exps with Dreamer-V3 #147

Open
belerico opened this issue Dec 4, 2023 · 3 comments

Comments


belerico commented Dec 4, 2023

Hi guys, first of all, what an awesome video you've made on YouTube!
I'm one of the maintainers of sheeprl, and I'm here just to tell you that we're running experiments with Dreamer-V3 on the standard env.
Right now I have modified your env code inside sheeprl, and in the future we want to try out v2 as well.
This is what I'm getting right now in terms of rewards:

[image: reward curve from the Dreamer-V3 run]

This is the configuration I'm using:

headless: True
save_final_state: True
early_stop: False
action_freq: 24
max_steps: 20480
print_rewards: True
save_video: False
fast_video: True
debug: False
sim_frame_dist: 2_000_000.0
use_screen_explore: True
reward_scale: 4
extra_buttons: False
explore_weight: 3 # 2.5
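
For anyone curious, here is a minimal sketch (not actual sheeprl code) of loading a YAML block like the one above as env kwargs with OmegaConf, which sheeprl's Hydra-style configs build on; the file path and the make_env helper are hypothetical placeholders:

    # Minimal sketch, assuming the config block above is saved as a YAML file.
    # The path and the make_env helper are hypothetical placeholders.
    from omegaconf import OmegaConf

    cfg = OmegaConf.load("pokemon_red_env.yaml")
    env_kwargs = OmegaConf.to_container(cfg, resolve=True)
    # env = make_env(**env_kwargs)  # hypothetical env constructor taking these kwargs
    print(env_kwargs)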

I don't know if those are good results, but I wanted to share them.
If you wanna try out something with SheepRL, let us know 🐑.
Thank you again!

xinpw8 commented Dec 4, 2023

Wow! That's pretty cool, although return/reward is really only a tertiary measure of how well the agent does. Also, I haven't been running the baseline version for quite a while. Would you be able to give the Pufferlib version a whirl so we can compare results? Specifically, we are interested in how far the agent can get through the game. This is visualized nicely on a weird coldmap (wandb.ai/jsuarez) or heatmap (wandb.ai/xinpw8). Check our current runs' Overviews for the run parameters. Clone https://github.com/PufferAI/Pufferlib (current branch is 0.5) and https://github.com/PufferAI/pokegym (current branch is main), or grab the Dockerized version, PufferTank.

PufferTank is a one-stop shop for RL tools/frameworks; Pufferlib is contained therein, and it has some really nice features. Anyway, change hyperparameters in config.py, run with command-line arguments (python demo.py --train --track --env pokemon_red --vectorization multiprocessing), and/or change the default run parameters in demo.py, found in the pufferlib folder; the steps are sketched below. Environment changes can be made in pokegym/pokegym/environment.py. You'll need the kanto_map_dsv.png map file too, and of course the pokemon_red.gb ROM, both of which go in the pufferlib folder. I'll add the map here after work, or hop in the Discord channel for that.
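
Condensed, the setup above looks roughly like this (a sketch using only the repos, branches, files, and command named in the comment):

    # Clone the two repos on the branches named above.
    git clone -b 0.5 https://github.com/PufferAI/Pufferlib
    git clone https://github.com/PufferAI/pokegym
    # Drop kanto_map_dsv.png and pokemon_red.gb into the pufferlib folder,
    # tweak hyperparameters in config.py (or defaults in demo.py), then:
    python demo.py --train --track --env pokemon_red --vectorization multiprocessing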

Iron-Bound commented

You'd have to let it run for 11M–20M steps before you can really tell how it's doing; see the experiments here:
https://wandb.ai/iron-bound/pufferlib/runs/sjwhhk4r?workspace=user-iron-bound
