Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout (GCMR)

This is a PyTorch implementation of our paper: Guided Cooperation in Hierarchical Reinforcement Learning via Model-based Rollout.

Our code is based on the official implementation of HIGL (NeurIPS 2021).

By integrating the proposed GCMR with ACLG, a disentangled variant of HIGL (see the "ACLG" or "ACLG_Complex_Tasks" branch of our other repository, ACLG_GCMR, for details), we achieved state-of-the-art performance.

Branch Tree

This code builds on our other repository, ACLG_GCMR, which keeps a well-organized structure by developing its code branch by branch. This repository was derived as follows:

flowchart TD
    S[HaoranWang-TJ/ACLG_GCMR/tree/ACLG_GCMR_Complex_Tasks] --> |A copy| A[ACLG_GCMR_Complex_Tasks]
    A[ACLG_GCMR_Complex_Tasks] -->|Minor code refactoring| B[main]

Installation

conda create -n aclg_gcmr python=3.7
conda activate aclg_gcmr
./install_all.sh

Also, running the MuJoCo experiments with MuJoCo200 requires a license (see here).

Install MuJoCo

MuJoCo210

Download the MuJoCo version 2.1 binaries for Linux or OSX.
Extract the downloaded mujoco210 directory into ~/.mujoco/mujoco210.

mkdir -p ~/.mujoco
tar -zxvf mujoco210-linux-x86_64.tar.gz -C ~/.mujoco/

If you want to use a nonstandard location for the package, set the environment variable MUJOCO_PY_MUJOCO_PATH.

vim ~/.bashrc
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
source ~/.bashrc
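For the nonstandard-location case mentioned above, the following is a minimal sketch rather than a prescribed setup. It assumes MuJoCo was extracted to /opt/mujoco210, a hypothetical path chosen only for illustration, and points both MUJOCO_PY_MUJOCO_PATH and LD_LIBRARY_PATH at it.

```shell
# Hypothetical install directory (an assumption); replace it with the
# directory where mujoco210 was actually extracted.
MUJOCO_DIR="/opt/mujoco210"

# Tell mujoco-py where the MuJoCo package lives.
export MUJOCO_PY_MUJOCO_PATH="$MUJOCO_DIR"

# Append the bin/ directory to the library search path, without adding
# a stray leading colon when LD_LIBRARY_PATH is empty or unset.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$MUJOCO_DIR/bin"
```

These lines can be placed in ~/.bashrc in place of the default ~/.mujoco/mujoco210/bin entry.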

MuJoCo200

Download the MuJoCo version 2.0 binaries for Linux or OSX.
Extract the downloaded mujoco200 directory into ~/.mujoco/mujoco200.

vim ~/.bashrc
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco200/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia
source ~/.bashrc

Key license

Also, to run the MuJoCo experiments using MuJoCo200, a license is required (see here).

e.g., cp mjkey.txt ~/.mujoco/mjkey.txt
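Before launching training, it can help to confirm that the files from the steps above are in place. The sketch below assumes the default ~/.mujoco layout described in this README; the function name check_mujoco is hypothetical and not part of the repository.

```shell
# Hypothetical helper: verify the MuJoCo layout under a given root
# (defaults to ~/.mujoco). Returns 0 when the layout looks complete.
check_mujoco() {
    root="${1:-$HOME/.mujoco}"
    rc=0
    # At least one of the two MuJoCo versions should be extracted.
    if [ ! -d "$root/mujoco210" ] && [ ! -d "$root/mujoco200" ]; then
        echo "missing: $root/mujoco210 or $root/mujoco200"
        rc=1
    fi
    # MuJoCo200 additionally needs the license key next to it.
    if [ -d "$root/mujoco200" ] && [ ! -f "$root/mjkey.txt" ]; then
        echo "missing: $root/mjkey.txt (required for mujoco200)"
        rc=1
    fi
    return "$rc"
}

check_mujoco || echo "MuJoCo setup looks incomplete"
```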

Usage

Training & Evaluation

Point Maze

./scripts/aclg_gcmr_point_maze.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_point_maze.sh sparse 5e5 0 2
./scripts/aclg_gcmr_point_maze.sh dense 5e5 0 2

Ant Maze (U-shape)

./scripts/aclg_gcmr_ant_maze_u.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_u.sh sparse 7e5 0 2
./scripts/aclg_gcmr_ant_maze_u.sh dense 7e5 0 2

Ant Maze (W-shape)

./scripts/aclg_gcmr_ant_maze_w.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_w.sh sparse 6e5 0 2
./scripts/aclg_gcmr_ant_maze_w.sh dense 6e5 0 2

Reacher & Pusher

./scripts/aclg_gcmr_fetch.sh ${env} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_fetch.sh Reacher3D-v0 5e5 0 2
./scripts/aclg_gcmr_fetch.sh Pusher-v0 5e5 0 2

FetchPickAndPlace & FetchPush

./scripts/aclg_gcmr_openai_fetch.sh ${env} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_openai_fetch.sh FetchPickAndPlace-v1 10e5 0 2
./scripts/aclg_gcmr_openai_fetch.sh FetchPush-v1 5e5 0 2

Stochastic Ant Maze (U-shape)

./scripts/aclg_gcmr_ant_maze_u_stoch.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_u_stoch.sh sparse 7e5 0 2
./scripts/aclg_gcmr_ant_maze_u_stoch.sh dense 7e5 0 2

Large Ant Maze (U-shape)

./scripts/aclg_gcmr_ant_maze_u_large.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_u_large.sh sparse 12e5 0 2
./scripts/aclg_gcmr_ant_maze_u_large.sh dense 12e5 0 2

Ant Maze Bottleneck

./scripts/aclg_gcmr_ant_maze_bottleneck.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_bottleneck.sh sparse 7e5 0 2
./scripts/aclg_gcmr_ant_maze_bottleneck.sh dense 7e5 0 2

Ant Maze Complex

./scripts/aclg_gcmr_ant_maze_complex.sh ${reward_shaping} ${timesteps} ${gpu} ${seed}
./scripts/aclg_gcmr_ant_maze_complex.sh sparse 30e5 0 2
./scripts/aclg_gcmr_ant_maze_complex.sh dense 30e5 0 2
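All of the scripts above take the same four positional arguments, so repeating a run over several seeds can be wrapped in a small loop. The following is a sketch under that assumption; build_cmd, run_seeds, and the DRY_RUN switch are hypothetical helpers, not part of the repository. By default it only prints the commands it would run.

```shell
# Build the command line for one run, mirroring the argument order shown
# above: <script> <reward_shaping or env> <timesteps> <gpu> <seed>.
build_cmd() {
    echo "./scripts/$1.sh $2 $3 $4 $5"
}

# DRY_RUN=1 (the default here) prints commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Usage: run_seeds <script> <mode> <timesteps> <gpu> <seed> [<seed> ...]
run_seeds() {
    script="$1"; mode="$2"; timesteps="$3"; gpu="$4"
    shift 4
    for seed in "$@"; do
        cmd="$(build_cmd "$script" "$mode" "$timesteps" "$gpu" "$seed")"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            # Intentionally unquoted so the string splits into arguments.
            $cmd
        fi
    done
}

# Example: sparse-reward Ant Maze (U-shape) on GPU 0 with seeds 0, 1 and 2.
run_seeds aclg_gcmr_ant_maze_u sparse 7e5 0 0 1 2
```

Running the seeds sequentially keeps a single GPU from being oversubscribed; launching them in parallel on different GPUs would need a different wrapper.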
