Update TD3/DDPG/DQN defaults for consistency #1785

Merged (3 commits) on Jan 12, 2024

araffin (Member) commented Dec 13, 2023

Description

closes #1769
closes #1562

EDIT: WIP report is here: https://wandb.ai/openrlbenchmark/sbx/reports/SBX-TD3-RL-Zoo-v2-3-0a0-vs-SB3-TD3-RL-Zoo-2-2-1---Vmlldzo2MjUyNTQx

Note: I didn't change the default network architecture because it would break all pre-trained models where net_arch was not specified properly.
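For readers who want runs that do not depend on library defaults, the idea behind the note above can be sketched as follows. This is only an illustrative sketch, not part of the PR: the helper name `pinned_td3_kwargs` is hypothetical, and the layer sizes `[400, 300]` are an assumed example rather than a statement of what any given version uses. Only `lr=1e-3` comes from this thread.

```python
import copy
from typing import Any

# Hypothetical pinned values: written out in full so that a later change
# to library defaults cannot silently alter a run or a saved model.
_PINNED: dict[str, Any] = {
    "learning_rate": 1e-3,  # value discussed in this thread
    "policy_kwargs": {"net_arch": [400, 300]},  # assumed example sizes
}


def pinned_td3_kwargs(**overrides: Any) -> dict[str, Any]:
    """Return explicit keyword arguments, with optional per-run overrides."""
    # Deep copy so callers cannot mutate the shared pinned values.
    kwargs = copy.deepcopy(_PINNED)
    kwargs.update(overrides)
    return kwargs


# Intended use (requires stable_baselines3 and an environment `env`):
#   from stable_baselines3 import TD3
#   model = TD3("MlpPolicy", env, **pinned_td3_kwargs())
```

Spelling out `net_arch` and `learning_rate` at construction time means the configuration is recorded with the experiment rather than inherited from whichever version happens to be installed.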

TODO:

  • A benchmark still needs to be run for TD3/DDPG to check the impact of the change, but initial results with SBX suggest that no big change is expected
  • Update the RL Zoo default hyperparams (independent of this PR)

Motivation and Context

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have opened an associated PR on the SB3-Contrib repository (if necessary)
  • I have opened an associated PR on the RL-Zoo3 repository (if necessary)
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

araffin (Member, Author) commented Dec 13, 2023

Small update: it seems that we should keep lr=1e-3. I will continue with more quantitative experiments in January (also comparing different network sizes). Have nice holidays =)

See report https://wandb.ai/openrlbenchmark/sbx/reports/SBX-TD3-RL-Zoo-v2-3-0a0-vs-SB3-TD3-RL-Zoo-2-2-1---Vmlldzo2MjUyNTQx

cc @Kallinteris-Andreas

Kallinteris-Andreas (Contributor)

I would like to also see SAC tested with lr=1e-3

@araffin added the "Maintainers on vacation" label Dec 14, 2023
@araffin removed the "Maintainers on vacation" label Jan 11, 2024
araffin (Member, Author) commented Jan 11, 2024

> I would like to also see SAC tested with lr=1e-3

Here you go (3 seeds only for now). There seems to be a small improvement for HalfCheetah (HC) and Ant, but more seeds are needed for confirmation.
https://wandb.ai/openrlbenchmark/sbx/reports/SBX-SAC-influence-of-learning-rate--Vmlldzo2NDg1MjUz

Influence of the neural network size for TD3:
https://wandb.ai/openrlbenchmark/sbx/reports/SBX-TD3-Influence-of-policy-net--Vmlldzo2NDg1Mzk3

Not much to say, small impact only.

@araffin marked this pull request as ready for review January 11, 2024 16:00
@araffin merged commit a9273f9 into master Jan 12, 2024
4 checks passed
@araffin deleted the feat/update-td3-defaults branch January 12, 2024 15:05
@araffin mentioned this pull request Mar 27, 2024
14 tasks
friedeggs pushed a commit to friedeggs/stable-baselines3 that referenced this pull request Jul 22, 2024
* Update TD3/DDPG/DQN defaults for consistency

* Update changelog