This is an implementation of the proximal policy optimization (PPO) algorithm using the C++ API of PyTorch. It uses a simple TestEnvironment
to test the algorithm. Below is a small visualization of the environment in which the algorithm is tested.
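For readers new to PPO, the following is a minimal sketch of the clipped surrogate objective that gives the algorithm its name. It is written in plain standard C++ rather than with libtorch tensors and is not taken from this repository's source. The function names clipped_surrogate and ppo_policy_loss, as well as the default clip range of 0.2, are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Clipped surrogate term for a single sample:
//   min(r * A, clip(r, 1 - eps, 1 + eps) * A)
// where r is the probability ratio pi_new(a|s) / pi_old(a|s)
// and A is the advantage estimate. (Hypothetical helper, not from this repo.)
double clipped_surrogate(double ratio, double advantage, double clip_eps) {
    const double clipped_ratio =
        std::max(1.0 - clip_eps, std::min(ratio, 1.0 + clip_eps));
    return std::min(ratio * advantage, clipped_ratio * advantage);
}

// Policy loss over a batch: the negated mean of the clipped surrogate terms,
// so that minimizing the loss maximizes the objective.
// The default clip_eps of 0.2 is an assumed, commonly used value.
double ppo_policy_loss(const std::vector<double>& ratios,
                       const std::vector<double>& advantages,
                       double clip_eps = 0.2) {
    assert(ratios.size() == advantages.size() && !ratios.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < ratios.size(); ++i) {
        sum += clipped_surrogate(ratios[i], advantages[i], clip_eps);
    }
    return -sum / static_cast<double>(ratios.size());
}
```

With a positive advantage, a ratio above 1 + eps earns no additional objective, and with a negative advantage the same holds for a ratio below 1 - eps. This removes the incentive to move the new policy far from the old one within a single update.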
You first need to install PyTorch. For a clean installation from Anaconda, check out this short tutorial, or this tutorial to install only the binaries.
To build the project, run
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/absolute/path/to/libtorch ..
make
Run the executable with
cd build
./train_ppo
To plot the results, run
cd ..
python plot.py --online_view --csv_file data/data.csv --epochs 1 10
It should produce something like the plot shown below.
Once trained, the algorithm can also be run in test mode. To do so, run
cd build
./test_ppo
To plot the results, run
cd ..
python plot.py --online_view --csv_file data/data_test.csv --epochs 1
The results are saved to data/data.csv and can be visualized by running python plot.py. Run
python plot.py --help
for help.