This repo contains code accompanying the manuscript: Zhi Wang, Chunlin Chen, and Daoyi Dong, "A Dirichlet Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning", IEEE Transactions on Cybernetics, DOI: 10.1109/TCYB.2022.3170485, 2022. It includes code for running the lifelong learning tasks in the 2D navigation, Reacher, and Hopper domains.
This code requires the following:
- Python 3.5+
- PyTorch 0.4+
- gym
- MuJoCo license
- For the 2D navigation domains, data is generated by `envs/navigation.py`.
- For the Hopper/HalfCheetah/Ant MuJoCo domains, the modified MuJoCo environments are in `envs/mujoco/*`.
- For example, to run the code in the 2D navigation domain, run the bash script `navi_v1.sh`. See also the usage instructions in the Python scripts `main_sllrl.py` and `main_baselines.py`.
- Once the results have been written to the `output/*/*.npy` files, plot them using `plot_results.py`. For example, the result for `navi_v1.sh` is:

performance comparison | clustering visualization |
---|---|
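
Before plotting, it can help to check what the `output/*/*.npy` files actually contain. The following is a minimal sketch, not part of this repo: the helper names `load_results` and `summarize` are hypothetical, and since this README does not specify the array layout, the sketch only loads each file and reports its shape and dtype. `plot_results.py` remains the intended way to produce the figures.

```python
import glob
import os

import numpy as np


def load_results(pattern="output/*/*.npy"):
    """Load every .npy file matching `pattern` into a dict keyed by file path.

    Hypothetical helper: the default pattern mirrors the output location
    named in this README; the contents of each array are not assumed.
    """
    results = {}
    for path in sorted(glob.glob(pattern)):
        results[path] = np.load(path)
    return results


def summarize(results):
    """Return one human-readable line per loaded array (path, shape, dtype)."""
    lines = []
    for path, arr in results.items():
        # str.format is used instead of f-strings to stay compatible with Python 3.5.
        lines.append("{}: shape={}, dtype={}".format(path, arr.shape, arr.dtype))
    return lines


if __name__ == "__main__":
    for line in summarize(load_results()):
        print(line)
```

Run from the repository root after an experiment script such as `navi_v1.sh` has finished, this prints one line per result file; if nothing is printed, no files matched the pattern.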
To ask questions or report issues, please open an issue on the issue tracker or email zhiwang@nju.edu.cn.