CUP: Critic-Guided Policy Reuse

This repository is the official implementation of CUP: Critic-Guided Policy Reuse, which was accepted at NeurIPS 2022. Please open an issue if you run into any problems!

Introduction

Setup

Install dependencies:

git init
git add .
git commit -m init
pip install -r requirements/dev.txt
cd ./src
git clone git@github.com:NagisaZj/metaworld-cup.git
git clone git@github.com:NagisaZj/mtenv.git
cd ./mtenv
pip install -e .
cd ../metaworld-cup
pip install -e .
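
Before moving on, it can help to confirm that both clones landed where the editable installs expect them. The following is a minimal sketch, not part of this repository: `check_checkout` is a hypothetical helper that only looks for a `setup.py` or `pyproject.toml` in each directory, which is what `pip install -e .` needs to find. It does not verify that the packages import correctly.

```shell
# Hypothetical helper (not part of this repository): report whether each
# directory looks like an installable Python project checkout.
check_checkout() {
    rc=0
    for dir in "$@"; do
        if [ -f "$dir/setup.py" ] || [ -f "$dir/pyproject.toml" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            rc=1
        fi
    done
    return "$rc"
}
```

From the repository root, `check_checkout src/mtenv src/metaworld-cup` prints one `ok:` or `missing:` line per directory and returns a non-zero status if any of them is missing.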

Usage

CUP:

CAUTION: Remember to replace setup.load_dir, setup.load_dir_2, and setup.load_dir_3 with the absolute paths to the corresponding directories on your machine.
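
Since the three load directories must be absolute, one way to avoid typos is to resolve them from a relative path and fail early if a directory is missing. This is a sketch under assumptions: `abs_model_dir` is a hypothetical helper, not something the repository provides, and it checks only that the directory exists, not that it contains a loadable model.

```shell
# Hypothetical helper (not part of this repository): print the absolute path
# of an existing directory, or fail with a message on stderr.
abs_model_dir() {
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    # CDPATH is cleared so that cd prints nothing of its own.
    (CDPATH= cd -- "$1" && pwd)
}
```

For example, `setup.load_dir="$(abs_model_dir ./source_policies/<run_id>/model)"`, where `<run_id>` is a placeholder for one of your own source-policy runs.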

CUDA_VISIBLE_DEVICES=7 OPENBLAS_NUM_THREADS=4 PYTHONPATH=. python3 -u main.py \
setup=metaworld \
env=metaworld-push-back \
env.task_idx=-1 \
env.fix_goal=0 \
agent=state_sac \
experiment.num_eval_episodes=1 \
experiment.num_train_steps=1000000 \
setup.seed=1695 \
experiment.eval_freq=5000 \
replay_buffer.batch_size=1280 \
agent.multitask.num_envs=1 \
agent.multitask.should_use_disentangled_alpha=True \
agent.encoder.type_to_select=identity \
agent.multitask.should_use_multi_head_policy=False \
agent.multitask.should_use_disjoint_policy=False \
agent.multitask.should_use_task_encoder=True \
agent.multitask.actor_cfg.should_condition_model_on_task_info=False \
agent.multitask.actor_cfg.should_condition_encoder_on_task_info=True \
agent.multitask.actor_cfg.should_concatenate_task_info_with_encoder=True \
setup.relabel_num_tasks=1 \
setup.relabel_range=10 \
setup.load_dir=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_2=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_3=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_253/model \
setup.load=1 \
setup.load_log_std_bounds=[-20,2]

SAC baseline: set agent.multitask.use_expert to 0 in the corresponding config file (config/agent/state_sac.yaml), or pass it as a command-line override, for example:

CUDA_VISIBLE_DEVICES=7 OPENBLAS_NUM_THREADS=4 PYTHONPATH=. python3 -u main.py \
setup=metaworld \
env=metaworld-push-back \
env.task_idx=-1 \
env.fix_goal=0 \
agent=state_sac \
experiment.num_eval_episodes=1 \
experiment.num_train_steps=1000000 \
setup.seed=1695 \
experiment.eval_freq=5000 \
replay_buffer.batch_size=1280 \
agent.multitask.num_envs=1 \
agent.multitask.should_use_disentangled_alpha=True \
agent.encoder.type_to_select=identity \
agent.multitask.should_use_multi_head_policy=False \
agent.multitask.should_use_disjoint_policy=False \
agent.multitask.should_use_task_encoder=True \
agent.multitask.actor_cfg.should_condition_model_on_task_info=False \
agent.multitask.actor_cfg.should_condition_encoder_on_task_info=True \
agent.multitask.actor_cfg.should_concatenate_task_info_with_encoder=True \
setup.relabel_num_tasks=1 \
setup.relabel_range=10 \
setup.load_dir=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_2=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_43_2/model \
setup.load_dir_3=/data3/zj/CUP/source_policies/90f2497ff4cee27c0d30fbc66e6ba205f94808ba4ea16e057df58e73_issue_None_seed_253/model \
setup.load=1 \
setup.load_log_std_bounds=[-20,2] \
agent.multitask.use_expert=0
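
The two commands above differ only in the final `agent.multitask.use_expert=0` override, so a launcher script can share one argument list and switch modes with a single value. The helper below is a hypothetical sketch, not part of this repository; the mode names `cup` and `sac` are invented for illustration.

```shell
# Hypothetical helper (not part of this repository): print the extra override
# that distinguishes the two commands above.
mode_override() {
    case "$1" in
        cup) ;;                                      # CUP command: nothing extra
        sac) echo "agent.multitask.use_expert=0" ;;  # SAC baseline
        *)   echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

Appending `$(mode_override sac)` to the shared arguments reproduces the baseline command. Leave the substitution unquoted so that the empty result for `cup` adds no argument.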

Other available environments can be seen in ./config/env.
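
To see which values can be passed as `env=`, the directory can be listed directly. The sketch below assumes each environment is defined by one `.yaml` file in `./config/env` whose base name is the value to pass, by analogy with `agent=state_sac` and `config/agent/state_sac.yaml`. `list_envs` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper (not part of this repository): print one candidate
# value for `env=` per line, assuming one .yaml file per environment.
list_envs() {
    env_dir="${1:-./config/env}"
    for file in "$env_dir"/*.yaml; do
        [ -e "$file" ] || continue   # skip the unexpanded glob if nothing matches
        basename "$file" .yaml
    done
}
```

Run `list_envs` from the repository root, or pass another directory as the first argument.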
