Using short-horizon nonlinear dynamics for on-policy simulation to improve value estimation. See the paper for algorithmic details.
Here are setup-specific requirements that you really, really have to do yourself:
- MuJoCo 1.50, with the appropriate key available (see MuJoCo downloads). On Linux, you can run `~/install-mujoco.sh` with the key `mjkey.txt` in the CWD.
- Python 3.5 (`scripts/` assume this is the `python` and `pip` in `PATH`).
- If you get any error messages relating to `glfw3`, then reinstall everything in a shell where the following environment variables are set (and, for good measure, set them in the shell where you're launching experiments too):
export LD_LIBRARY_PATH=~/.mujoco/mjpro150/bin
export LIBRARY_PATH=~/.mujoco/mjpro150/bin
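Before reinstalling, it can help to confirm whether the current shell already exports these variables. The following is a minimal sketch, not part of the repo: the helper name `missing_mujoco_paths` is hypothetical, and the MuJoCo location is assumed to be the one in the export lines above.

```python
import os

# Assumed install location, copied from the export lines above.
MUJOCO_BIN = "~/.mujoco/mjpro150/bin"


def missing_mujoco_paths(environ, mujoco_bin=MUJOCO_BIN):
    """Return which of LD_LIBRARY_PATH and LIBRARY_PATH do not list
    the MuJoCo bin directory (hypothetical helper, not in the repo)."""
    wanted = os.path.normpath(os.path.expanduser(mujoco_bin))
    missing = []
    for var in ("LD_LIBRARY_PATH", "LIBRARY_PATH"):
        # Split the variable into its entries and normalize each one.
        entries = [
            os.path.normpath(os.path.expanduser(p))
            for p in environ.get(var, "").split(os.pathsep)
            if p
        ]
        if wanted not in entries:
            missing.append(var)
    return missing


if __name__ == "__main__":
    for var in missing_mujoco_paths(os.environ):
        print("{} does not include {}".format(var, MUJOCO_BIN))
```

Running the script in the shell you launch experiments from prints one line per variable that still needs to be set, and nothing if both are in place.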
Other system dependencies:
- System dependencies for `gym` (see the gym README)
- System dependencies for `gym2` (see the gym2 README)
Example installation:
# GPU version
# ... you install system packages here
conda create -y -n gpu-py3.5 python=3.5
source activate gpu-py3.5
pip install -r <( sed 's/tensorflow/tensorflow-gpu/' requirements.txt )
# CPU version
# ... you install system packages here
conda create -y -n cpu-py3.5 python=3.5
source activate cpu-py3.5
pip install -r requirements.txt
# Lazy version (defaults to CPU)
conda create -y -n cpu-py3.5 python=3.5
./scripts/install-mujoco.sh
./scripts/ubuntu-install.sh
# Lazy version (GPU)
conda create -y -n gpu-py3.5 python=3.5
./scripts/install-mujoco.sh
sed -i 's/tensorflow/tensorflow-gpu/' requirements.txt
./scripts/ubuntu-install.sh
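The GPU variants above rewrite `requirements.txt` with `sed 's/tensorflow/tensorflow-gpu/'`. That pattern replaces the first occurrence on every line, so running the in-place (`sed -i`) form twice would produce `tensorflow-gpu-gpu`. The sketch below shows one way to make the rewrite idempotent. The function name `use_gpu_tensorflow` is hypothetical, and the requirement lines in the usage note are made up for illustration; they are not the contents of this repo's `requirements.txt`.

```python
import re

# Matches a requirement whose package name is exactly "tensorflow",
# optionally followed by a version specifier. It does not match
# "tensorflow-gpu" or other packages that merely start with "tensorflow".
_TF_LINE = re.compile(r"^tensorflow(?=\s*(?:[=<>!~;\[]|$))")


def use_gpu_tensorflow(requirements_text):
    """Swap the tensorflow requirement for tensorflow-gpu, line by line.

    Hypothetical helper: applying it twice gives the same result as
    applying it once.
    """
    lines = requirements_text.splitlines()
    return "\n".join(_TF_LINE.sub("tensorflow-gpu", line) for line in lines)
```

For example, `use_gpu_tensorflow("tensorflow==1.8.0\ngym")` turns the first line into `tensorflow-gpu==1.8.0` and leaves `gym` alone; feeding the result back in changes nothing.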
All scripts are available in `scripts/`, and should be run from the repo root.
script | purpose |
---|---|
`lint.sh` | invokes pylint with the appropriate flags for this repo |
`ubuntu-install.sh` | installs all deps except MuJoCo/python on Ubuntu 14.04 or Ubuntu 16.04 |
`install-mujoco.sh` | adds MuJoCo 1.50 on Linux, assuming a key |
`tests.sh` | runs tests |
`fake-display.sh` | creates a dummy X11 display (to render on a server) |
`launch-ray-aws.sh` | launches an AWS ray cluster at the current branch |
`teardown-ray-aws.sh` | tears down a cluster |
To run experiments locally, use `main_ray.py`. Note that the resource requirements specified here are only used for scheduling trials. The actual processes are free to create as many threads as they want, so you can oversubscribe the machines by setting `tf_parallelism` within the YAML config to a value larger than the number of guaranteed CPUs. To multiplex the GPUs locally, set `self_host` to the total number of virtual GPUs.
python mve/main_ray.py --experiment_name hc0 --config experiments/hc0.yaml --ncpus 1 --ngpus 0 --self_host 1
Experiments can also be run in a distributed manner by connecting to a live ray cluster: change `--self_host` to `--port RAY_REDIS_PORT` in the command above. Multiple experiments can be run by the same driver. See `python mve/main_ray.py --help` for details.
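The difference between the local and the cluster invocation can be sketched as a small command builder. This is an illustration only: `main_ray_command` is a hypothetical helper, it uses just the flags shown above, and it assumes that `--self_host` and `--port` are alternatives, as the text suggests.

```python
def main_ray_command(experiment_name, config, ncpus, ngpus,
                     self_host=None, port=None):
    """Build the argument list for mve/main_ray.py (hypothetical helper).

    Pass self_host to run locally, or port to connect to a live ray cluster.
    """
    if (self_host is None) == (port is None):
        raise ValueError("pass exactly one of self_host or port")
    cmd = [
        "python", "mve/main_ray.py",
        "--experiment_name", experiment_name,
        "--config", config,
        "--ncpus", str(ncpus),
        "--ngpus", str(ngpus),
    ]
    if self_host is not None:
        # Local mode: self_host is the total number of virtual GPUs.
        cmd += ["--self_host", str(self_host)]
    else:
        # Cluster mode: port is the Redis port of the running ray cluster.
        cmd += ["--port", str(port)]
    return cmd
```

Calling `main_ray_command("hc0", "experiments/hc0.yaml", 1, 0, self_host=1)` reproduces the local command shown earlier; the resulting list can be handed to `subprocess.check_call` from the repo root.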
Multiple components of this code run in parallel.
- Evaluation on environments is parallelized by multiple processes or threads (which one depends on the environment), and uses `--env_parallelism` workers. I have found that peak performance is reached when there is some batching; i.e., the number of workers is less than half the number of environments used for evaluation (default 8).
- If running on CPUs, TensorFlow internal thread pool parallelism (for each of intra-operation and inter-operation thread pools) is set by the value of `--tf_parallelism` (default `nproc`).
- If running on GPUs, then the GPUs TensorFlow uses are automatically set to those specified by the environment variable `CUDA_VISIBLE_DEVICES`, which if left empty uses the first GPU available on the machine. There is currently no support for actually using multiple GPUs in a single experiment.
- `OMP_NUM_THREADS` is overridden; there's no need to set it.
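The first two points can be made concrete with a short sketch. Both function names are hypothetical and this is not the repo's implementation. It assumes that `--tf_parallelism` feeds the `intra_op_parallelism_threads` and `inter_op_parallelism_threads` fields of TensorFlow 1.x's `tf.ConfigProto`, and it approximates `nproc` with `os.cpu_count()`.

```python
import os


def suggested_env_workers(num_envs):
    """Largest worker count strictly below half of num_envs, at least 1.

    Encodes the batching rule of thumb from the list above.
    """
    return max(1, (num_envs - 1) // 2)


def tf_thread_pool_settings(tf_parallelism=None):
    """Keyword arguments one could pass to tf.ConfigProto (TensorFlow 1.x).

    The same value is used for both thread pools; the default approximates
    `nproc` with os.cpu_count().
    """
    n = tf_parallelism or os.cpu_count() or 1
    return {
        "intra_op_parallelism_threads": n,
        "inter_op_parallelism_threads": n,
    }
```

With 8 evaluation environments, `suggested_env_workers(8)` gives 3 workers, which keeps the worker count below half the number of environments so that each worker handles a batch.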
Just use the manual encryption instructions here. `.travis.yml` is already configured to securely decrypt `mjkey.txt.gpg`.
- Create a new `FullyObservable<env>` under `mve/envs`. See `FullyObservableHalfCheetah.py` for a link to the OpenAI Gym commit that contains the source code you should adapt.
- Expose the new environment by importing it in `mve/envs/__init__.py`.
- Make it accessible through `mve/env_info.py`'s `_env_class` function.
Make sure to test that your environment works and amend the tests in `scripts/tests.sh` to include a check that it runs correctly.
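The registration and smoke-test steps might look roughly like the sketch below. Everything in it is hypothetical: `FullyObservableExample` is a dummy stand-in for a real `FullyObservable<env>` class, the dictionary lookup is only one plausible shape for `_env_class`, and `smoke_test` is an illustrative check, not the one in `scripts/tests.sh`.

```python
class FullyObservableExample:
    """Dummy stand-in for a FullyObservable<env> class (hypothetical)."""

    def reset(self):
        return [0.0, 0.0]

    def step(self, action):
        # Returns (observation, reward, done, info), as Gym environments do.
        return [0.0, 0.0], 0.0, True, {}


# Hypothetical name-to-class table; the real _env_class in
# mve/env_info.py may be organized differently.
_ENV_CLASSES = {"example": FullyObservableExample}


def env_class(name):
    """Look up an environment class by name."""
    try:
        return _ENV_CLASSES[name]
    except KeyError:
        raise ValueError("unknown environment: {!r}".format(name))


def smoke_test(name, steps=3):
    """Construct the environment and step it a few times."""
    env = env_class(name)()
    obs = env.reset()
    for _ in range(steps):
        obs, _reward, done, _info = env.step(None)
        if done:
            obs = env.reset()
    return obs
```

A check of this kind only confirms that the class can be constructed, reset, and stepped; it says nothing about whether the adapted dynamics match the original Gym environment.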