Add CleanRL examples: PPO solve Pong in 5 mins #48

Merged
merged 7 commits into from
Jan 9, 2022
Conversation

Collaborator

@vwxyzjn vwxyzjn commented Jan 9, 2022

Kudos to this repo! This PR adds a CleanRL example. Interestingly, after increasing num_envs to 32, I was able to solve Pong in 10 mins :D


See the tracked experiment in costa-huang/cleanRL/runs/3rx432mj

See also vwxyzjn/cleanrl#100

@vwxyzjn
Collaborator Author

vwxyzjn commented Jan 9, 2022

Hey @Trinkle23897, do you mind if I override to ignore the linting errors?

@Trinkle23897
Collaborator

Hmm...I can fix that

@vwxyzjn
Collaborator Author

vwxyzjn commented Jan 9, 2022

Ran a hyperparameter sweep (sweeps/nfrd091p) overnight; now I can solve Pong in ~5 mins, according to runs/opk2dmta, with these hyperparameters:

--clip-coef=0.2 --num-envs=16 --num-minibatches=8 --num-steps=128 --update-epochs=3
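
For readers unfamiliar with these flags, here is a minimal sketch of what they imply for the size of each PPO update. It assumes the common PPO convention that one rollout collects `num_envs * num_steps` transitions, which are then split into `num_minibatches` minibatches and reused for `update_epochs` passes. The parser and the helper names `parse_ppo_flags` and `derived_sizes` are hypothetical, written only for this illustration; they are not taken from the example script in this PR.

```python
import argparse


def parse_ppo_flags(argv):
    """Hypothetical minimal parser covering only the five flags quoted above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--clip-coef", type=float, required=True)
    parser.add_argument("--num-envs", type=int, required=True)
    parser.add_argument("--num-minibatches", type=int, required=True)
    parser.add_argument("--num-steps", type=int, required=True)
    parser.add_argument("--update-epochs", type=int, required=True)
    return parser.parse_args(argv)


def derived_sizes(args):
    """Derive rollout and minibatch sizes under the assumed PPO convention."""
    # Transitions collected per rollout across all parallel environments.
    batch_size = args.num_envs * args.num_steps
    # Transitions used for each gradient step.
    minibatch_size = batch_size // args.num_minibatches
    # Gradient steps taken on each rollout before collecting new data.
    updates_per_rollout = args.update_epochs * args.num_minibatches
    return batch_size, minibatch_size, updates_per_rollout


FLAGS = "--clip-coef=0.2 --num-envs=16 --num-minibatches=8 --num-steps=128 --update-epochs=3"
args = parse_ppo_flags(FLAGS.split())
batch_size, minibatch_size, updates_per_rollout = derived_sizes(args)
print(batch_size, minibatch_size, updates_per_rollout)  # 2048 256 24
```

Under that assumption, each rollout gathers 2048 transitions and is consumed in 24 gradient steps of 256 transitions each.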


@vwxyzjn vwxyzjn changed the title Add CleanRL examples: PPO solve Pong in 10 mins Add CleanRL examples: PPO solve Pong in 5 mins Jan 9, 2022
@Trinkle23897
Collaborator

It works on my side!

Trinkle23897 previously approved these changes Jan 9, 2022
@vwxyzjn
Collaborator Author

vwxyzjn commented Jan 9, 2022

LGTM. Feel free to merge.

@Trinkle23897 Trinkle23897 merged commit 5abf286 into sail-sg:master Jan 9, 2022
@vwxyzjn vwxyzjn deleted the cleanrl branch January 9, 2022 14:54
2 participants