Implementation of the Trust Region Policy Optimization and Proximal Policy Optimization algorithms, applied to the task of teaching a robot to walk.
- OpenAI Gym : A toolkit for developing and comparing reinforcement learning algorithms
- PyBullet Gym : PyBullet Robotics Environments fully compatible with Gym toolkit (uses the Bullet physics engine)
- PyTorch : Open source machine learning library based on the Torch library
- NumPy : Fundamental package for scientific computing with Python
- matplotlib : Plotting library for the Python programming language and its numerical mathematics extension NumPy
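Before running anything, it can help to confirm that the libraries listed above are importable. The following is a minimal sketch using only the Python standard library. The import names (`gym`, `pybulletgym`, `torch`, `numpy`, `matplotlib`) are assumptions based on how these packages are commonly imported, and `missing_dependencies` is a hypothetical helper that is not part of this repository.

```python
from importlib.util import find_spec

# Project name -> assumed top-level import name (not stated in this README).
REQUIRED_MODULES = {
    "OpenAI Gym": "gym",
    "PyBullet Gym": "pybulletgym",
    "PyTorch": "torch",
    "NumPy": "numpy",
    "matplotlib": "matplotlib",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the project names whose import module cannot be found."""
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

The check only looks up each module without importing it, so it stays fast even for heavy libraries such as PyTorch.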
Trust Region Policy Optimization (TRPO) - implemented by Vasilije Pantić
Proximal Policy Optimization (PPO) - implemented by Nikola Zubić
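The two algorithms differ mainly in how they keep a policy update small: TRPO constrains the KL divergence between the old and new policies, while PPO clips the probability ratio inside the surrogate objective. As an illustration only, here is a minimal pure-Python sketch of the PPO clipped surrogate objective. It is not the code in this repository; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. When the ratio leaves the clipping range in the direction that would increase the objective, the clipped term takes over and removes the incentive to move further.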
- For TRPO: run trpo_main.py at root/code/trpo/
- For PPO: run ppo_main.py at root/code/ppo/

Then enter the absolute file path to the trained model. Trained models are available at root/code/trained_models/.
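Since the scripts expect an absolute path, a small helper can build one from the trained models directory. This is a sketch under stated assumptions: `absolute_model_path` is a hypothetical helper, `example_model.pt` is a made-up file name, and `root` stands for wherever the repository is checked out.

```python
from pathlib import Path


def absolute_model_path(model_file, models_dir="root/code/trained_models"):
    """Return the absolute path of a model file inside the models directory."""
    # resolve() makes the path absolute relative to the current working directory;
    # it does not require the file to exist.
    return str((Path(models_dir) / model_file).resolve())


if __name__ == "__main__":
    # "example_model.pt" is a placeholder; use a real file from trained_models/.
    print(absolute_model_path("example_model.pt"))
```

The printed path can then be entered when running trpo_main.py or ppo_main.py.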
| Algorithm | Training time [h] |
|---|---|
| TRPO | 24, 96 |
| PPO | 6.5, 48 |
Copyright (c) 2021 Nikola Zubić, Vasilije Pantić