Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to use float64 actions for off policy algorithms #1572

Merged

Conversation

tobirohrer
Copy link
Contributor

@tobirohrer tobirohrer commented Jun 22, 2023

Description

Using float64 actions in continuous off-policy algorithms like TD3 and DDPG caused an error.
The core of the issue is how ReplayBuffer and RolloutBuffer are storing actions. The RolloutBuffer for On-Policy algorithms always casts the actions to float32. The ReplayBuffer on the other hand doesn´t. This MR fixes this issue for now by casting actions to float32 in the ReplayBuffer as well.

Motivation and Context

closes #1145

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have opened an associated PR on the SB3-Contrib repository (if necessary)
  • I have opened an associated PR on the RL-Zoo3 repository (if necessary)
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
    Disclaimer: The test test_vec_envs.py::test_render[SubprocVecEnv] fails on my local machine. But it fails on the master branch too.
  • I have checked that the documentation builds using make doc (required)

@tobirohrer tobirohrer marked this pull request as draft June 29, 2023 07:59
tests/test_spaces.py Outdated Show resolved Hide resolved
@tobirohrer tobirohrer marked this pull request as ready for review June 29, 2023 13:37
@araffin araffin changed the title Bugfix for #1145 float64 actions for continuous off policy algorithms Fix to use float64 actions for off policy algorithms Jul 4, 2023
@araffin araffin added the Maintainers on vacation Maintainers are on vacation so they can recharge their batteries, we will be back soon ;) label Jul 14, 2023
@araffin araffin removed the Maintainers on vacation Maintainers are on vacation so they can recharge their batteries, we will be back soon ;) label Jul 24, 2023
Copy link
Member

@araffin araffin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks =)
(and sorry for the delay)

@araffin araffin merged commit ba77dd7 into DLR-RM:master Jul 24, 2023
4 checks passed
@araffin araffin mentioned this pull request Oct 8, 2023
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] SAC crashes when float64 action is used
2 participants