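I am using Proximal Policy Optimization (PPO) to solve the CartPole environment. PPO is a policy-gradient reinforcement learning algorithm that keeps each policy update close to the previous policy, approximating a trust region (typically through a clipped surrogate objective). It follows an actor-critic design: the actor chooses actions and the critic estimates state values.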
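As a rough illustration of how PPO commonly keeps updates inside a trust-region-like range, here is a minimal sketch of the clipped surrogate objective in plain Python. It is not taken from main.py: the function name `clipped_surrogate`, its arguments, and the default `clip_eps=0.2` are assumptions made for this example, and a real implementation would normally work on PyTorch tensors rather than lists.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration only. Inputs are equal-length lists:
    log-probabilities of the taken actions under the new and old policies,
    and the advantage estimate for each step.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # The new policy doubles the probability of an action with positive
    # advantage; clipping limits how much credit the update receives.
    print(clipped_surrogate([math.log(2.0)], [0.0], [1.0]))
```

Taking the minimum means the objective stops rewarding the policy once the ratio leaves the clipping range in the direction the advantage favors, while still penalizing changes that make things worse.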
To run this program, run main.py in a Python environment. Make sure the following libraries are installed:
- NumPy
- PyTorch
- OpenAI Gym
- Matplotlib
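
Before running main.py, it may help to confirm that the libraries above are importable. The sketch below is not part of this project: the helper `missing_libraries` is hypothetical, and the import names (`numpy`, `torch`, `gym`, `matplotlib`) are the usual ones for these packages but are assumed here rather than taken from main.py.

```python
import importlib.util

# Display name -> assumed top-level import name for each listed library.
REQUIRED = {
    "NumPy": "numpy",
    "PyTorch": "torch",
    "OpenAI Gym": "gym",
    "Matplotlib": "matplotlib",
}


def missing_libraries(required=REQUIRED):
    """Return the display names of libraries that cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries: " + ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```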