Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment (IJCAI 2021)
Requirements:
pytorch 1.4.0
python 3.6.7
mujoco_py 1.50.1.34
mjpro150
openmpi 4.0.3
reduce_var/PPO.py: The algorithm.
reduce_var/mlp.py: Model architectures.
plotforv1.py, gaeplot.py: Plotting scripts.
mpitraingae.py, mpitraingae_th.py: Main training scripts.
How to reproduce the experiment results:
- mkdir result_setgae
- mkdir modes_setgae
- mpiexec -n 27 python mpitraingae.py
- mpiexec -n 27 python mpitraingae_th.py
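The steps above can also be scripted. The following is only a sketch: the helper names (prepare_dirs, build_commands, run_all) are hypothetical and not part of this repository, and it assumes the training scripts are launched from the directory that holds result_setgae and modes_setgae. The process count of 27 is taken from the commands above.

```python
import os
import subprocess

# Directory and script names are copied from the steps above.
OUTPUT_DIRS = ("result_setgae", "modes_setgae")
TRAIN_SCRIPTS = ("mpitraingae.py", "mpitraingae_th.py")


def prepare_dirs(base="."):
    """Create the output directories; do nothing if they already exist."""
    paths = [os.path.join(base, name) for name in OUTPUT_DIRS]
    for path in paths:
        os.makedirs(path, exist_ok=True)
    return paths


def build_commands(n_procs=27):
    """Return one mpiexec argument list per training script."""
    return [["mpiexec", "-n", str(n_procs), "python", script]
            for script in TRAIN_SCRIPTS]


def run_all(n_procs=27, base=".", dry_run=True):
    """Create the directories, then print (and optionally run) each command."""
    prepare_dirs(base)
    commands = build_commands(n_procs)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first run that exits with an error.
            subprocess.run(cmd, check=True, cwd=base)
    return commands
```

Calling run_all() only prints the commands; run_all(dry_run=False) launches both training runs one after the other.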
Then plot the results saved in result_setgae: use gaeplot.py to compare values of lambda for GAE, and plotforv1.py to plot the learning curves.
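For readers unfamiliar with the lambda being compared, here is a minimal sketch of the standard GAE recursion for a single trajectory. It is not the code in reduce_var/PPO.py; the function name, argument layout, and default values are assumptions made for illustration, and episode-termination flags are left out for brevity.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Standard GAE for one trajectory.

    rewards: list of T rewards.
    values:  list of T + 1 value estimates; the last entry is the
             bootstrap value of the state after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


rewards = [1, 1, 1]
values = [0.0, 0.0, 0.0, 0.0]
print(gae_advantages(rewards, values, gamma=1.0, lam=1.0))  # [3.0, 2.0, 1.0]
print(gae_advantages(rewards, values, gamma=1.0, lam=0.0))  # [1.0, 1.0, 1.0]
```

With lam=0 the estimate reduces to the one-step TD error (lower variance, more bias); with lam=1 it becomes the full discounted return minus the value baseline (higher variance, less bias).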
For your own tasks, I recommend tuning parameters such as h_dim and l_dim. In addition, the estimation of mutual information is an important part of the algorithm. In this paper, we set the scale of both loss functions