Conversation

@wrzadkow
Contributor

@wrzadkow wrzadkow commented Sep 17, 2020

Reinforcement learning example using the Proximal Policy Optimization (PPO) algorithm, prepared in close collaboration with @jheek and @lespeholt.

The implementation learns to play Atari games provided as OpenAI Gym environments. Tests on BeamRider, Breakout, Pong, Qbert, Seaquest, and SpaceInvaders show that the training performance reported in the original PPO paper is reproduced. Training runs at ~1000 FPS on a VM with one V100 GPU. Unit tests and documentation are provided.
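For readers unfamiliar with PPO, here is a minimal sketch of the clipped surrogate objective that the algorithm is built around. It is not taken from this PR: the function name `ppo_clipped_objective`, its parameters, and the default `clip_param=0.2` are illustrative assumptions, and it uses plain Python rather than JAX so that it runs without extra dependencies.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    clip_param: float = 0.2,  # assumed value for illustration, not from the PR
) -> float:
    """Mean clipped surrogate objective over a batch (to be maximized)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the data.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_param, 1 + clip_param].
        clipped_ratio = min(max(ratio, 1.0 - clip_param), 1.0 + clip_param)
        # Taking the minimum gives a pessimistic bound on the improvement,
        # which discourages large policy updates.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. A training loop would typically minimize the negative of this value, together with value-function and entropy terms that are omitted here.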

@google-cla google-cla bot added the cla: yes label Sep 17, 2020
@wrzadkow wrzadkow force-pushed the rl-example-ppo branch 4 times, most recently from d411337 to 1e89a8b September 24, 2020 20:22
@wrzadkow wrzadkow marked this pull request as ready for review September 29, 2020 10:15
copybara-service bot pushed a commit that referenced this pull request Oct 1, 2020
PiperOrigin-RevId: 334765232
@copybara-service copybara-service bot merged commit 45937af into google:master Oct 1, 2020
@wrzadkow wrzadkow mentioned this pull request Oct 2, 2020

6 participants