Fixes actor_loss shape for SAC continuous #383

dosssman · 2023-05-08T03:35:18Z

Description

Address issues pointed out in #379
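
For readers without #379 at hand, here is a minimal sketch of how a loss of this form can silently pick up the wrong shape. It assumes (this is not confirmed anywhere in this thread) that the mismatch is between a column-shaped log-probability of shape (batch, 1) and a flattened Q-value of shape (batch,). NumPy stands in for the tensor library, and the names `log_pi`, `min_qf_pi`, and `alpha` are illustrative only.

```python
import numpy as np

batch = 4
rng = np.random.default_rng(0)

# Hypothetical stand-ins, not the actual tensors from the script:
log_pi = rng.normal(size=(batch, 1))    # per-sample log-probabilities, kept as a column
min_qf_pi = rng.normal(size=(batch,))   # per-sample Q-values, flattened
alpha = 0.2                             # illustrative entropy coefficient

# (batch, 1) minus (batch,) broadcasts to (batch, batch): every log-prob
# is paired with every Q-value instead of only its own.
mismatched = alpha * log_pi - min_qf_pi

# Keeping both operands as columns yields one loss term per sample.
matched = alpha * log_pi - min_qf_pi.reshape(-1, 1)

print(mismatched.shape)
print(matched.shape)
```

In this toy case the plain mean comes out the same either way, because averaging a_i - b_j over all pairs equals mean(a) - mean(b). The unintended (batch, batch) intermediate is still wasteful and can matter for reductions other than a plain mean.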

Types of changes

Bug fix

Checklist:

I've read the CONTRIBUTION guide (required).
I have ensured pre-commit run --all-files passes (required).
~~I have updated the tests accordingly (if applicable).~~
I have updated the documentation and previewed the changes via mkdocs serve.
- I have explained note-worthy implementation details.
- I have explained the logged metrics.
- I have added links to the original paper and related papers.

If you need to run benchmark experiments for a performance-impacting change:

I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team.
I have used the benchmark utility to submit the tracked experiments to the openrlbenchmark/cleanrl W&B project, optionally with --capture-video.
I have performed RLops with python -m openrlbenchmark.rlops.
- For new feature or bug fix:
  - I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
- For new algorithm:
  - I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
- I have added the learning curves generated by the python -m openrlbenchmark.rlops utility to the documentation.
- I have added links to the tracked experiments in W&B, generated by python -m openrlbenchmark.rlops ....your_args... --report, to the documentation.

dosssman · 2023-05-08T03:52:48Z

Didn't manage to get the rlops working yet, so regression report was done manually:

https://api.wandb.ai/links/openrlbenchmark/on2vqz6u

timoklein · 2023-05-10T12:57:37Z

> https://api.wandb.ai/links/openrlbenchmark/on2vqz6u

So the version that has the bug fixed is actually worse? That's odd.

dosssman · 2023-05-11T00:18:01Z

Happened a few times before. Probably due to stochasticity that occurs during the sampling process, or due to the difference in environment / hardware.
Might add more runs to ascertain it, if you feel like it is necessary.
Performance regression is only on Walker2d it seems; the rest have very close performance to the rl-pilot baseline.

timoklein · 2023-05-11T06:46:57Z

> Might add more runs to ascertain it, if you feel like it is necessary.

I can run a couple of experiments if you like but not before May 18th. But to me, it looks OK.

> Probably due to stochasticity

I agree. We'd probably need 50+ runs to properly verify anything anyway, that's a little excessive :D
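
To illustrate why a handful of seeds rarely settles a small gap, here is a rough sketch using only the standard library. The return distribution (Gaussian, mean 4000, standard deviation 800) is made up for illustration and is not taken from the linked report. The point is only that the standard error of the mean shrinks as 1/sqrt(n).

```python
import random
import statistics

def standard_error(samples):
    """Standard error of the mean: sample std divided by sqrt(n)."""
    return statistics.stdev(samples) / len(samples) ** 0.5

def simulate_runs(n, mean=4000.0, std=800.0):
    """Draw n hypothetical final returns; the numbers are invented."""
    return [random.gauss(mean, std) for _ in range(n)]

random.seed(0)
for n in (3, 10, 50):
    runs = simulate_runs(n)
    # With few runs the uncertainty on the mean is large, so two
    # identical algorithms can easily appear to differ.
    print(n, round(statistics.mean(runs)), round(standard_error(runs)))
```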

dosssman · 2023-05-11T07:14:37Z

All good on my side too.

dosssman · 2023-10-04T06:15:45Z

Fixes #379

* Update sac_atari.py and sac_continuous_action.py to gymnasium's api
* Add testing
* #383
* move test file
* fix final_info bug
* clean up mujoco tests
* update ci
* fix tests scripts
* Comment out test-mujoco-envs-mac
* fix final_observation
* test_pybullet.py

---------

Co-authored-by: Adam Zhao <pazyx728@gmail.com>

Fixed incorrect actor_loss shape for SAC continuous, addresses issue v…

5aaf9c5

…wxyzjn#379

vercel bot deployed to Preview May 8, 2023 03:35 View deployment

pseudo-rnd-thoughts mentioned this pull request May 8, 2023

Bug in actor loss for sac_continuous_action.py #379

Closed

pseudo-rnd-thoughts added a commit to pseudo-rnd-thoughts/cleanrl that referenced this pull request May 8, 2023

https://github.com/vwxyzjn/cleanrl/pull/383/

022485c

dosssman requested a review from vwxyzjn May 11, 2023 07:14

timoklein mentioned this pull request Aug 23, 2023

use alpha not log alpha in autotune #414

Merged

18 tasks

dosssman merged commit 0fceeef into vwxyzjn:master Oct 4, 2023
