A parallel-agent training implementation of Proximal Policy Optimization (PPO) with the clipped objective.
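For readers unfamiliar with the clipped objective, the following is a minimal, framework-free sketch of how it is commonly computed. It is not taken from `parallel_PPO.py`: the function name `clipped_surrogate`, its arguments, and the default `eps_clip=0.2` are assumptions chosen for illustration, and the repository's PyTorch code may differ in detail.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps_clip=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    All names here are hypothetical; inputs are plain lists of floats:
    log-probabilities of the taken actions under the new and old policy,
    and the advantage estimates for those actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - eps_clip, 1 + eps_clip].
        clipped = max(1.0 - eps_clip, min(ratio, 1.0 + eps_clip))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above `1 + eps_clip` earns no extra credit, so the update has no incentive to move the policy far from the old one. With a negative advantage the unclipped term is kept, so a harmful increase in an action's probability is still penalized in full.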
- To test a pre-trained network, run `test.py`.
- To train a new network, run `parallel_PPO.py`.
- All hyperparameters are set in the main function of the training file.
CartPole-v1 | LunarLander-v2 |
---|---|
Trained and tested on:

- Python 3.6
- PyTorch 1.3
- NumPy 1.15.3
- gym 0.10.8
- Pillow 5.3.0

- Implement convolutional network based training

To export the environment to the file `environment.yml`, run `conda env export | grep -v "^prefix: " > environment.yml`.

To create the conda environment used for training, run `conda env create -f environment.yml`.