Official PyTorch implementation of Deep Deterministic Dynamics Gradients. Code is based heavily on the Twin Delayed DDPG implementation. For research purpose only. Support and/or new releases may be limited.
Clone the repo:
git clone https://github.com/uber-research/D3G.git && cd D3G
We use Python 3.6.2. Requirements for D3G can be installed by running:
pip install -r D3G/requirements.txt
The stochastic action results can be reproduced by running:
cd toy_problems/stochastic_actions && python model_gridworld.py --stochasticity {rand}
The windy cliffworld results can be reproduced by running:
cd toy_problems/windy_cliffworld && python model_gridworld.py
The redundant action results can be reproduced by running:
cd toy_problems/redundant_actions && python model_gridworld.py
The shuffled action results can be reproduced by running:
cd toy_problems/shuffled_actions && python model_gridworld.py && python model_gridworld.py --shuffled
The paper results can be reproduced by running:
cd D3G && ./mujoco_experiments/run_D3G_experiments.sh
The model predictions for Reacher-v2 and InvertedPendulum-v2 can be visualized. To see this, run:
cd D3G && python main.py --policy D3G --env Reacher-v2 --visualize
The paper results can be reproduced by running:
cd D3G && ./lfo_experiments/run_D3G_experiments
The model predictions for Reacher-v2 and InvertedPendulum-v2 can be visualized here too. To see this, run:
cd D3G && python learn_from_observation.py --policy D3G --env Reacher-v2 --visualize