Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] DQN torch version. #7597

Merged
merged 153 commits into from Apr 6, 2020
Merged

[RLlib] DQN torch version. #7597

merged 153 commits into from Apr 6, 2020

Conversation

sven1977
Copy link
Contributor

@sven1977 sven1977 commented Mar 13, 2020

NOTE: Please merge PR #7814 (param-noise completion) and PR #7852 (framework_iterator) first.

Implement a DQN version for PyTorch. This includes:

  • SimpleQ and rainbow DQN (double, dueling, prioritized-replay, n_step, param-noise, noisy layers) with the exception of the distributional-Q head (will be a follow-up/cleanup PR).
  • Learning regression tests for CartPole on SimpleQ torch+tf, DQN torch+tf, and DQN+param-noise tf+torch have been added/updated.
  • All affected regression tests (loss funcs, compilation, param-noise, exploration) have been updated to include torch.
  • Documentation has been updated to show PyTorch symbol on DQN and APEX.

Addresses and solves issue #4371

Closes #4371

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23144/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23162/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23198/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23215/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23229/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23237/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/23235/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24250/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24251/
Test PASSed.

@ericl
Copy link
Contributor

ericl commented Apr 5, 2020

LGTM but some regression tests are failing in travis

@sven1977
Copy link
Contributor Author

sven1977 commented Apr 5, 2020

Yeah, it's the fake multi-GPU PPO one. It's broken already in master :( I moved it into test_ppo to not have to use tune.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24268/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24271/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24272/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24273/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24274/
Test PASSed.

…h_dqn

� Conflicts:
�	.travis.yml
�	rllib/BUILD
�	rllib/agents/dqn/tests/test_dqn.py
�	rllib/tests/run_regression_tests.py
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24293/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24296/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24299/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24300/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24302/
Test PASSed.

@sven1977 sven1977 added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Apr 6, 2020
@sven1977
Copy link
Contributor Author

sven1977 commented Apr 6, 2020

@ericl RLlib tests are ok now. Please merge. Thanks!

@ericl ericl merged commit 22ccc43 into ray-project:master Apr 6, 2020
@sven1977 sven1977 deleted the torch_dqn branch August 21, 2020 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[rllib] [Feature Request] PyTorch version of DQN style algorithms
3 participants