This is the implementation code for the Convex Optimization course project "Reformulation and Analysis of Trust Region Policy Optimization", together with its application to optimizing an industrial operation using a discrete-event simulator. The final report can be found here (relative link).
To run the code:
python DynaFork_Online_TRPO.py
- Python 2.7
- TensorFlow 1.12.0
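
Since the code targets the specific versions listed above, it may help to check the local environment before running the script. The following is a minimal sketch and not part of the repository: the helper name `environment_problems` and the constants are hypothetical, and it assumes an exact match on TensorFlow 1.12.0 and Python 2.7 is wanted.

```python
from __future__ import print_function
import sys

# Versions taken from the requirements list above.
REQUIRED_PYTHON = (2, 7)
REQUIRED_TF = "1.12.0"


def environment_problems(python_version, tf_version):
    """Return a list of mismatch messages (empty if the environment matches).

    Hypothetical helper for illustration only.
    python_version: a tuple-like such as sys.version_info or (2, 7, 15).
    tf_version: a version string such as tensorflow.__version__.
    """
    problems = []
    major_minor = tuple(python_version[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append("Python: found %d.%d, expected 2.7" % major_minor)
    if tf_version != REQUIRED_TF:
        problems.append(
            "TensorFlow: found %s, expected %s" % (tf_version, REQUIRED_TF)
        )
    return problems


if __name__ == "__main__":
    try:
        import tensorflow as tf
        found_tf = tf.__version__
    except ImportError:
        found_tf = "not installed"
    messages = environment_problems(sys.version_info, found_tf)
    for message in messages or ["Environment matches the requirements."]:
        print(message)
```

Run as a standalone file, this prints one line per mismatch, or a single confirmation line if both versions match.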