Fix replay buffer compatibility with mujoco envs #113

Merged: vwxyzjn merged 8 commits into master from fix-mujoco-compatibility on Feb 18, 2022

@vwxyzjn (Owner) commented Feb 18, 2022

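The current DDPG, SAC, and TD3 files are not compatible with MuJoCo envs: as the run below shows, training crashes with a Float/Double dtype mismatch on the first update after `--learning-starts`. This PR fixes it.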

(cleanrl-ghSZGHE3-py3.9) ➜  cleanrl git:(fix-mujoco-compatibility) ✗ python -i ddpg_continuous_action.py --gym-id Hopper-v2 --learning-starts 100
/home/costa/.cache/pypoetry/virtualenvs/cleanrl-ghSZGHE3-py3.9/lib/python3.9/site-packages/gym/envs/registration.py:479: UserWarning: WARN: The environment Hopper-v2 is out of date. You should consider upgrading to version `v3` with the environment ID `Hopper-v3`.
  logger.warn(
global_step=23, episode_reward=8.566108703613281
global_step=41, episode_reward=7.716689109802246
global_step=63, episode_reward=17.882747650146484
global_step=73, episode_reward=6.347293853759766
global_step=86, episode_reward=10.202958106994629
global_step=99, episode_reward=7.710036277770996
Traceback (most recent call last):
  File "/home/costa/Documents/go/src/github.com/cleanrl/cleanrl/ddpg_continuous_action.py", line 200, in <module>
    next_state_actions = (target_actor.forward(data.next_observations)).clamp(
  File "/home/costa/Documents/go/src/github.com/cleanrl/cleanrl/ddpg_continuous_action.py", line 107, in forward
    x = F.relu(self.fc1(x))
  File "/home/costa/.cache/pypoetry/virtualenvs/cleanrl-ghSZGHE3-py3.9/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/costa/.cache/pypoetry/virtualenvs/cleanrl-ghSZGHE3-py3.9/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/costa/.cache/pypoetry/virtualenvs/cleanrl-ghSZGHE3-py3.9/lib/python3.9/site-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: expected scalar type Float but found Double
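
The traceback above points to a dtype mismatch: the actor's linear layer holds float32 weights while the sampled `data.next_observations` arrive as float64. A minimal sketch of that failure and one possible remedy (casting the batch to float32) follows. It is not the diff from this PR. The hidden size of 256, the batch size, and the helper `forward_error` are illustrative assumptions, and the exact error text varies across PyTorch versions.

```python
import numpy as np
import torch
import torch.nn as nn

# Hopper observations are 11-dimensional; batch and hidden sizes are assumptions.
obs_dim, batch_size, hidden = 11, 4, 256

# nn.Linear parameters are float32 by default.
fc1 = nn.Linear(obs_dim, hidden)

# Stand-in for a sampled batch that kept the environment's float64 dtype.
next_observations = torch.from_numpy(
    np.zeros((batch_size, obs_dim), dtype=np.float64)
)


def forward_error(layer, x):
    """Hypothetical helper: return the error message if the forward pass raises."""
    try:
        layer(x)
    except RuntimeError as err:
        return str(err)
    return None


# float64 input against float32 weights raises a RuntimeError.
error_before = forward_error(fc1, next_observations)

# Casting to float32 makes the input match the layer's parameters.
error_after = forward_error(fc1, next_observations.float())
output = fc1(next_observations.float())
```

Casting can happen when transitions are stored or when they are sampled. Doing it at storage time also halves the memory used by the buffer's observation arrays.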

@vwxyzjn vwxyzjn merged commit 24c96af into master Feb 18, 2022
@vwxyzjn vwxyzjn deleted the fix-mujoco-compatibility branch February 18, 2022 21:18