Pendulum with PPO
=================

In this notebook we solve the Pendulum environment using :doc:`PPO </examples/stubs/ppo>`. We'll use a simple multi-layer perceptron as our function approximator for the policy and the value function.
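To make the ingredients concrete, here is a minimal sketch of the same setup (PPO on Pendulum with MLP approximators), written against stable-baselines3 rather than the library this notebook uses; its ``"MlpPolicy"`` wires up a multi-layer perceptron for both the actor and the critic.

.. code-block:: python

    # Hedged sketch using stable-baselines3, not this notebook's own library:
    # PPO with MLP function approximators on the Pendulum environment.
    import gymnasium as gym
    from stable_baselines3 import PPO

    env = gym.make("Pendulum-v1")

    # "MlpPolicy" = multi-layer perceptron for both the policy and the value function.
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=200_000)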

This notebook periodically generates GIFs, so that we can inspect how the training is progressing.
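As a rough sketch of how such GIFs could be produced (assuming a gymnasium environment created with ``render_mode="rgb_array"`` and the third-party ``imageio`` package; the ``policy_fn`` callable is a hypothetical stand-in for the trained policy):

.. code-block:: python

    import gymnasium as gym
    import imageio

    def record_gif(policy_fn, filepath, max_steps=200):
        """Roll out one episode with policy_fn and save the frames as a GIF."""
        env = gym.make("Pendulum-v1", render_mode="rgb_array")
        frames = []
        obs, _ = env.reset()
        for _ in range(max_steps):
            frames.append(env.render())  # one RGB frame per timestep
            action = policy_fn(obs)
            obs, _, terminated, truncated, _ = env.step(action)
            if terminated or truncated:
                break
        env.close()
        imageio.mimsave(filepath, frames)  # stitch frames into an animated GIF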

After a few hundred episodes, this is what you can expect:

[animated GIF: successfully swinging up the pendulum]


ppo.py
------

Open in Google Colab