
Implementation of Proximal Policy Optimization (PPO)


Abhipanda4/PPO-PyTorch


This is a PyTorch implementation of Proximal Policy Optimization as described in this paper.

The implementation in this repo was used as a reference for this one.
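
For readers new to the method, the clipped surrogate objective at the core of PPO can be sketched roughly as below. This is a minimal, framework-free illustration and not the code in this repository: the helper name `clipped_surrogate`, its arguments, and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration only; not taken from this repo.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# The actor loss is the negative of the objective, so that minimizing
# the loss maximizes the objective.
actor_loss = -clipped_surrogate(
    new_log_probs=[math.log(0.9), math.log(0.2)],
    old_log_probs=[math.log(0.6), math.log(0.4)],
    advantages=[1.0, -1.0],
)
```

In a PyTorch setting the same computation would typically be written on tensors with `torch.exp`, `torch.clamp`, and `torch.min`, so that gradients flow through the new log-probabilities.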

To run a demo, clone the repo and use the command: `python simulate.py`

The training plots are shown below:

reward plot

actor loss plot

critic loss plot
