This is the implementation code for the paper "Generalized Maximum Entropy Reinforcement Learning via Reward Shaping".
conda create -n env_name python=3.6
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
pip install gym
pip install mujoco_py==2.0.2.8
conda install pandas
pip install seaborn
Make sure to add the MuJoCo license file (mjkey.txt) to the ~/.mujoco folder.
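Before launching a run, it can help to confirm that the license file is where mujoco_py will look for it. The following is a minimal sketch, not part of this repository: the helper name `find_mujoco_license` is hypothetical, and the default file name `mjkey.txt` is an assumption based on the mujoco_py 2.0 convention.

```python
from pathlib import Path


def find_mujoco_license(mujoco_dir=None, filename="mjkey.txt"):
    """Return the path to the license file if it exists, else None.

    mujoco_dir defaults to ~/.mujoco; filename defaults to mjkey.txt
    (assumed here, following the mujoco_py 2.0 convention).
    """
    base = Path(mujoco_dir) if mujoco_dir is not None else Path.home() / ".mujoco"
    candidate = base / filename
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    found = find_mujoco_license()
    if found is None:
        print("No license file found in ~/.mujoco; copy it there before running main.py")
    else:
        print(f"Found license file: {found}")
```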
To run the code, for example:
python main.py --env-name Humanoid-v2 --alpha 0.1