Updated instructions for 2D environments.
This library should run on Linux, Mac, or Windows.
# TODO: Create and activate virtual env
# Download the repo as well as the submodules
# Make sure to add ssh key to github
git clone git@github.com:belinghy/SteppingStone.git --recurse-submodules
# switch to walker2d branch, master branch is not updated yet
# 2d env in master branch is broken
cd SteppingStone
git checkout walker2d
# make sure mocca_envs is also updated
# should be on walker2d branch as well
cd .environments; git checkout walker2d; cd ..
# install required libraries
pip install -r requirements
To start a new training experiment (named walker_experiment or crab_experiment in the examples below):
# Walker2D, see playground/train.py for arguments
./scripts/local_run_playground_train.sh walker_experiment \
env='mocca_envs:Walker2DCustomEnv-v0'
# Crab2D
./scripts/local_run_playground_train.sh crab_experiment \
env='mocca_envs:Crab2DCustomEnv-v0'
The bash script will create a new experiment directory inside the runs directory that contains the following files:
- pid: the process ID of the task running the training algorithm
- progress.csv: a CSV file containing data about the training progress
- slurm.out: the output of the process. You can use tail -f to view the contents
- configs.json: a JSON file containing all the hyper-parameter values used in this run
- run.json: extra information about the run, including the host information and the git commit ID (only works if GitPython is installed)
- models: a directory containing the saved models
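To inspect a finished or running experiment without plotting, the files listed above can be read directly. The following is a minimal sketch, not part of the repository: the helper names read_progress and summarize_run are hypothetical, and it assumes that progress.csv starts with a header row and that configs.json holds a single JSON object.

```python
import csv
import json
from pathlib import Path


def read_progress(lines):
    """Parse progress.csv content into a list of dicts with float values.

    `lines` is any iterable of text lines, e.g. an open file object.
    Assumption: the first line is a header row. Empty or non-numeric
    cells are skipped rather than raising an error.
    """
    rows = []
    for raw in csv.DictReader(lines):
        row = {}
        for key, value in raw.items():
            try:
                row[key] = float(value)
            except (TypeError, ValueError):
                continue  # skip empty or non-numeric cells
        rows.append(row)
    return rows


def summarize_run(run_dir):
    """Return (configs, last_progress_row) for one experiment directory.

    Assumption: the directory contains configs.json and progress.csv,
    as described in the list above.
    """
    run_dir = Path(run_dir)
    with open(run_dir / "configs.json") as f:
        configs = json.load(f)
    with open(run_dir / "progress.csv", newline="") as f:
        rows = read_progress(f)
    return configs, (rows[-1] if rows else None)
```

Calling summarize_run on an experiment directory inside runs would return the saved hyper-parameters together with the most recent row of training statistics, which is handy for a quick check before opening a plot.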
If you use Compute Canada, we also have scripts like cedar_run_playground_train.sh to create a batch job. These scripts use the same argument structure but also allow you to run the same task with multiple replicates using the num_replicates variable.
On Windows, it is possible to run the bash script on a version of WSL that supports GPU. Unfortunately, when we last tested it, CUDA in WSL was still very slow.
The recommended way to run on Windows is using conda and PyCharm.
The downside is that the logs and models will be saved in various places in the playground folder.
# Walker2D
python playground/train.py with env='mocca_envs:Walker2DCustomEnv-v0'
# Crab2D
python playground/train.py with env='mocca_envs:Crab2DCustomEnv-v0'
The enjoy.py script can be used to run pretrained policies and render the results. Hit r in the PyBullet window to reset.
# Run Walker2D controller
python playground/enjoy.py --env mocca_envs:Walker2DCustomEnv-v0 \
--net <experiment_path>/models/mocca_envs:Walker2DCustomEnv-v0_latest.pt
# See help for saving replay as video
# Needs either ffmpeg or moviepy (pip)
python playground/enjoy.py -h
The plot_from_csv.py script can be helpful for plotting the learning curves:
python -m playground.plot_from_csv --load_paths runs/*/ \
--columns mean_rew max_rew --smooth 2
# group results based on the name
python -m playground.plot_from_csv --load_paths runs/*/ \
--columns mean_rew max_rew --name_regex ".*__([^_\/])*" --group 1
- The load_paths argument specifies which directories the script should look in.
- It opens the progress.csv file and plots the columns as the y-axis and uses the row for the x-axis (defaults to total_num_steps).
- You can also provide a name_regex to make the figure legends simpler and more readable, e.g. --name_regex 'walker-(.*)\/'.
- group can be used to aggregate the results of multiple runs of the same experiment into one. name_regex is used to specify the groups.
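The interaction between name_regex and group can be easier to see in isolation. The sketch below is an illustration of the idea only, not the code in plot_from_csv.py: the function group_runs and the example paths are hypothetical, and it assumes the regex is searched anywhere in the path and that the group value is the index of the capture group used as the label.

```python
import re
from collections import defaultdict


def group_runs(paths, name_regex, group=1):
    """Bucket run paths by the text captured by `name_regex`.

    Assumption: `group` is the index of the capture group used as the
    label. Paths that do not match keep their full path as the label.
    """
    pattern = re.compile(name_regex)
    groups = defaultdict(list)
    for path in paths:
        match = pattern.search(path)
        label = match.group(group) if match else path
        groups[label].append(path)
    return dict(groups)


# Hypothetical run directories, for illustration only.
paths = [
    "runs/walker-seed1/",
    "runs/walker-seed2/",
    "runs/crab-seed1/",
]

# Runs whose captured label is identical end up in the same bucket.
for label, members in group_runs(paths, r"(walker|crab)-.*\/").items():
    print(label, members)
```

With a pattern such as 'walker-(.*)\/' the captured text is the part after the prefix, so each run gets a shorter legend label; with a pattern that captures the shared part of the name, several runs collapse into one group that can then be aggregated.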