Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes actor_loss shape for SAC continuous #383

Merged
merged 1 commit into from
Oct 4, 2023

Conversation

dosssman
Copy link
Collaborator

@dosssman dosssman commented May 8, 2023

Description

Address issues pointed out in #379

Types of changes

  • Bug fix

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the tests accordingly (if applicable).
  • I have updated the documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers.

If you need to run benchmark experiments for a performance-impacting changes:

  • I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
  • I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture-video.
  • I have performed RLops with python -m openrlbenchmark.rlops.
    • For new feature or bug fix:
      • I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
    • For new algorithm:
      • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
    • I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

@vercel
Copy link

vercel bot commented May 8, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
cleanrl ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 8, 2023 3:35am

@dosssman
Copy link
Collaborator Author

dosssman commented May 8, 2023

Didn't manage to get the rlops working yet, so regression report was done manually:

https://api.wandb.ai/links/openrlbenchmark/on2vqz6u

pseudo-rnd-thoughts added a commit to pseudo-rnd-thoughts/cleanrl that referenced this pull request May 8, 2023
@timoklein
Copy link
Collaborator

https://api.wandb.ai/links/openrlbenchmark/on2vqz6u

So the version that has the bug fixed is actually worse? That's odd.

@dosssman
Copy link
Collaborator Author

dosssman commented May 11, 2023

Happened a few times before. Probably due to stochasticity that occurs during the sampling process, or due to the difference en environment / hardware.
Might add more runs to ascertain it, if you feel like it is necessary.
Performance regression is only on Walker2d it seems, the rest has very close performance to the rl-pilot baseline.

@timoklein
Copy link
Collaborator

Might add more runs to ascertain it, if you feel like it is necessary.

I can run a couple of experiments if you like but not before May 18th. But to me, it looks OK.

Probably due to stochasticity

I agree. We'd probably need 50+ runs to properly verify anything anyway, that's a little excessive :D

@dosssman
Copy link
Collaborator Author

All good on my side too.

@dosssman dosssman requested a review from vwxyzjn May 11, 2023 07:14
@timoklein timoklein mentioned this pull request Aug 23, 2023
18 tasks
@dosssman
Copy link
Collaborator Author

dosssman commented Oct 4, 2023

Fixes #379

@dosssman dosssman merged commit 0fceeef into vwxyzjn:master Oct 4, 2023
sdpkjc added a commit that referenced this pull request Oct 15, 2023
* Update sac_atari.py and sac_continuous_action.py to gymnasium's api

* Add testing

* #383

* move test file

* fix final_info bug

* clean up mujoco tests

* update ci

* fix tests scripts

* Comment out test-mujoco-envs-mac

* fix final_observation

* test_pybullet.py

---------

Co-authored-by: Adam Zhao <pazyx728@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants