Deprecation Notice

This repo is deprecated. Please use our new repo at https://github.com/vwxyzjn/ppo-implementation-details and see the improved ICLR 2022 blog post on PPO at https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/

PPO-Implementation-Deep-Dive

This repo contains the source code for the PPO Implementation Deep Dive tutorial series.

Proximal Policy Optimization Implementation Deep Dive | 11 Core Implementation Details (youtu.be/MEt6rrxH8W4)
Proximal Policy Optimization Implementation Deep Dive | 9 Atari-specific Details (youtu.be/05RMTj-2K_Y)
Proximal Policy Optimization Implementation Deep Dive | 8 Details for Continuous Actions (youtu.be/BvZvx7ENZBw)

You can find out where these implementation details come from by visiting my blog post, which links each detail to the original implementation through GitHub permanent links.

If you like this repo, consider also checking out CleanRL, my RL library based on single-file implementations.

Get started

Prerequisites:

Python 3.8+
Poetry
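
The two prerequisites above can be checked before installing. The following is a minimal sketch, not part of this repo: the function name `check_prerequisites` and its injectable parameters are hypothetical, and it only assumes that Poetry is exposed as a `poetry` executable on PATH.

```python
import shutil
import sys

# Minimum interpreter version stated in the prerequisites above.
MIN_PYTHON = (3, 8)


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of problems; an empty list means both prerequisites are met.

    `version_info` and `which` are injectable only so the check can be
    exercised without changing the real environment (hypothetical design).
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"Python 3.8+ is required, found {found}")
    # Assumption: Poetry is installed as an executable named `poetry`.
    if which("poetry") is None:
        problems.append("Poetry was not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and PATH; printing each returned string gives a quick readiness report.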

Install dependencies:

poetry install

Train agents:

poetry run python ppo.py

Train agents with experiment tracking:

poetry run python ppo.py --track --capture-video

Atari

Install dependencies:

poetry install -E atari

Train agents:

poetry run python ppo_atari.py

Train agents with experiment tracking:

poetry run python ppo_atari.py --track --capture-video

Pybullet

Install dependencies:

poetry install -E pybullet

Train agents:

poetry run python ppo_continuous_action.py

Train agents with experiment tracking:

poetry run python ppo_continuous_action.py --track --capture-video

MuJoCo

Note: this installation method only works on Linux.

Install dependencies:

poetry install -E mujoco
poetry run python -c "import mujoco_py"

Train agents:

poetry run python ppo_continuous_action.py --gym-id Hopper-v2

Train agents with experiment tracking:

poetry run python ppo_continuous_action.py --gym-id Hopper-v2 --track --capture-video
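
The commands in the sections above follow one pattern: an optional Poetry extra, a script, and optional tracking flags. As a rough sketch, that pattern could be captured in a small helper. The table keys (`base`, `atari`, `pybullet`, `mujoco`) and the function names are hypothetical and not part of this repo; the scripts, extras, and flags are copied from the commands listed above.

```python
# Each entry mirrors one section of this README. The key names are
# hypothetical labels; "base" stands for the "Get started" section.
RUNS = {
    "base": {"extra": None, "script": "ppo.py", "args": []},
    "atari": {"extra": "atari", "script": "ppo_atari.py", "args": []},
    "pybullet": {"extra": "pybullet", "script": "ppo_continuous_action.py", "args": []},
    "mujoco": {
        "extra": "mujoco",
        "script": "ppo_continuous_action.py",
        "args": ["--gym-id", "Hopper-v2"],
    },
}


def install_command(name):
    """Build the `poetry install` command for one section as an argument list."""
    command = ["poetry", "install"]
    extra = RUNS[name]["extra"]
    if extra is not None:
        command += ["-E", extra]
    return command


def train_command(name, track=False):
    """Build the training command; `track=True` appends the tracking flags."""
    run = RUNS[name]
    command = ["poetry", "run", "python", run["script"], *run["args"]]
    if track:
        command += ["--track", "--capture-video"]
    return command
```

The returned lists can be passed to `subprocess.run` from the repo root, or joined with spaces to reproduce the command lines shown above.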

