PECAN

(A picture of PECAN created by DALL·E)

This codebase is implementation of the AAMAS 2023 paper [PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination](https://arxiv.org/abs/2301.06387).

This codebase is based on human-aware-rl and MEP.

We provide an easy-to-use human-AI coordination framework for the Overcooked environment. Please check this repo for further details.

Instruction for usage

1. Install

Install relavant modules and the human-aware-rl package by

   cd human-aware-rl/
   ./install.sh

2. Stage 1&2 (Train the population and context encoder. Optional)

Follow the instructions of the first step of MEP to train an maximum-entropy population.

Train the context encoder in here via standard supervised learning with data collected in the stage 2 training of MEP.

For easy usage, some pre-trained models are provided in this repo so you can skip this step.

3. Train the ego agent

Train the ego agent with the following comman. Note that you have to specificy paths of all 15 agents in the population in the LOAD_FOLDER_LST variable, seperated by :.

cd human-aware-rl/human-aware-rl
export PBT_DATA_DIR=pbt_data_dir_2/ && python pbt/pbt_model_pool.py with fixed_mdp layout_name="simple"\
EX_NAME="pbt_simple" TOTAL_STEPS_PER_AGENT=1.1e7 TRAINING_ITERATIONS=$training_iteration ALPHA_DECAY_HORIZON=$alpha_horizon ALPHA_FINAL=$alpha_final REW_SHAPING_HORIZON=$reward_shaping\
LR=$lr POPULATION_SIZE=15 SEEDS="[$seeds]" GPU_ID=0 NUM_SELECTION_GAMES=$number_sel_games VF_COEF=$vf_coef MINIBATCHES=$minibatches\
TIMESTAMP_DIR=False ENTROPY_POOL=$entropy_pool PRIORITIZED_SAMPLING=True ALPHA=$alpha METRIC=1.0\
LOAD_FOLDER_LST="path1/:path2/:path3/..."

An example with the pre-trained models:

export PBT_DATA_DIR=pbt_data_dir_2/ && python pbt/pbt_model_pool.py with fixed_mdp layout_name="simple"\
EX_NAME="pbt_simple" TOTAL_STEPS_PER_AGENT=1.1e7 TRAING_ITERATIONS=100 ALPHA_DECAY_HORIZON=70 \
ALPHA_FINAL=0.3 REW_SHAPING_HORIZON=5e6 LR=8e-4 POPULATION_SIZE=15 SEEDS="[1111,5015,7015,8015]" GPU_ID=0\
NUM_SELECTION_GAMES=6 VF_COEF=0.5 MINIBATCHES=10 TIMESTAMP_DIR=False ENTROPY_POOL=0.0\
PRIORITIZED_SAMPLING=True ALPHA=3.0 METRIC=1.0\ LOAD_FOLDER_LST="pbt_data_dir/pbt_simple/seed_9015/agent0/pbt_iter1/:pbt_data_dir/pbt_simple/seed_9015/agent1/pbt_iter1/:pbt_data_dir/pbt_simple/seed_9015/agent2/pbt_iter1/:pbt_data_dir/pbt_simple/seed_9015/agent3/pbt_iter1/:pbt_data_dir/pbt_simple/seed_9015/agent4/pbt_iter1/:pbt_data_dir/pbt_simple/seed_9015/agent0/pbt_iter152/:pbt_data_dir/pbt_simple/seed_9015/agent1/pbt_iter152/:pbt_data_dir/pbt_simple/seed_9015/agent2/pbt_iter152/:pbt_data_dir/pbt_simple/seed_9015/agent3/pbt_iter152/:pbt_data_dir/pbt_simple/seed_9015/agent4/pbt_iter152/:pbt_data_dir/pbt_simple/seed_9015/agent0/pbt_iter305/:pbt_data_dir/pbt_simple/seed_9015/agent1/pbt_iter305/:pbt_data_dir/pbt_simple/seed_9015/agent2/pbt_iter305/:pbt_data_dir/pbt_simple/seed_9015/agent3/pbt_iter305/:pbt_data_dir/pbt_simple/seed_9015/agent4/pbt_iter305/"

Citation

Please cite

@article{lou2023pecan,
 title={PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination},
 author={Lou, Xingzhou and Guo, Jiaxian and Zhang, Junge and Wang, Jun and Huang, Kaiqi and Du, Yali},
 journal={arXiv preprint arXiv:2301.06387},
 year={2023}
}

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
human_aware_rl		human_aware_rl
.DS_Store		.DS_Store
README.md		README.md
dalle.png		dalle.png
pecan_simple.gif		pecan_simple.gif
pecan_uni.gif		pecan_uni.gif

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PECAN

Instruction for usage

1. Install

2. Stage 1&2 (Train the population and context encoder. Optional)

3. Train the ego agent

Citation

About

Releases

Packages

Contributors 2

Languages

LxzGordon/PECAN

Folders and files

Latest commit

History

Repository files navigation

PECAN

Instruction for usage

1. Install

2. Stage 1&2 (Train the population and context encoder. Optional)

3. Train the ego agent

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages