Implementation of the Proximal Policy Optimisation (PPO) algorithm using PyTorch

Manaro-Alpha/PPO_PyTorch

PPO

About

This repo contains an optimised implementation of PPO that uses techniques such as Generalised Advantage Estimation (GAE) and entropy regularisation, in an attempt to match the performance of Stable-Baselines3's PPO.
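
For readers unfamiliar with Generalised Advantage Estimation, the following is a minimal sketch of the standard backward recursion, written in plain Python so it runs without any dependencies. It is not taken from this repo's code: the function name compute_gae, its argument names, and the default values gamma=0.99 and lam=0.95 are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def compute_gae(
    rewards: Sequence[float],
    values: Sequence[float],
    dones: Sequence[bool],
    last_value: float,
    gamma: float = 0.99,
    lam: float = 0.95,
) -> List[float]:
    """Return GAE advantages for one rollout.

    Hypothetical helper for illustration; not the repo's implementation.
    values[t] is the critic's estimate for the state at step t, dones[t] is
    True if the episode ended after step t, and last_value is the critic's
    estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step can reuse the running sum.
    for t in reversed(range(len(rewards))):
        # Do not bootstrap or propagate advantages across episode boundaries.
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors with decay gamma * lam.
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


if __name__ == "__main__":
    adv = compute_gae(
        rewards=[1.0, 1.0, 1.0],
        values=[0.0, 0.0, 0.0],
        dones=[False, False, True],
        last_value=0.0,
    )
    print(adv)
```

Setting lam=0 reduces each advantage to the one-step TD error, while lam=1 gives the full discounted return minus the value baseline; values in between trade bias against variance.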

Usage

  • To train the agent, run train.py
  • Run tensorboard --logdir runs to visualise the data in your browser
  • To test the trained policy, run test.py

Results

[Figure: PPO Continuous, LunarLander-v2]
[Figure: PPO Continuous, BipedalWalker-v3]
