Implementation of Proximal Policy Optimization (PPO) using TensorFlow
CartPole-v0 from OpenAI Gym
state space: continuous
action space: discrete
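For orientation, the clipped surrogate objective at the core of PPO can be sketched in plain Python as below. This is a minimal illustration only, not the code in this repository: the function name ppo_clip_objective, its arguments, and the default clip_eps=0.2 are assumptions chosen for the example, and the real training code presumably expresses the same idea with TensorFlow ops.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    Hypothetical helper for illustration; inputs are per-sample lists:
    log-probabilities of the taken actions under the new and old policies,
    and the advantage estimates for those actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a discrete action space such as CartPole's, the log-probabilities would come from the policy's categorical output over the available actions. Taking the minimum of the two terms removes the incentive to move the new policy far from the old one in a single update.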
Python 3.6
TensorFlow v1.4
OpenAI Gym
python main.py
python test_policy.py
tensorboard --logdir=log
MIT License