
Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL (ICLR 2023)

Website | Poster | OpenReview

Authors: Baiting Zhu, Meihua Dang, Aditya Grover

Setup

git clone https://github.com/baitingzbt/PEDA.git
cd PEDA
conda env create -f environment.yml
conda activate peda_env
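
As an optional sanity check after setup, you can verify that the environment resolves and sees your GPUs. This is only a sketch and assumes environment.yml installs PyTorch; swap in whichever dependency you want to check.

# optional sanity check (assumes environment.yml installs PyTorch)
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"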

Data Download

The Google Drive folder below contains all dataset variants used in the paper's experiments, including the ablation studies. To regenerate any variant yourself, see the "Generate Your Own Data" section below.

pip install gdown
gdown --folder https://drive.google.com/drive/folders/1wfd6BwAu-hNLC9uvsI1WPEOmPpLQVT9k?usp=sharing --output data

The "data" folder should be under "PEDA" e.g.: PEDA/data/env/data_name.pkl

Training

First, double-check the CUDA devices and data paths configured in the shell script below. Then run the uniform experiments for all environments:

sh all_env_uniform.sh
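
If you would rather not edit the script, you can also restrict which GPUs are used from the command line. This is only a sketch and assumes the script and the underlying training code honor the standard CUDA_VISIBLE_DEVICES variable.

# example: restrict the run to GPU 0 (assumes CUDA_VISIBLE_DEVICES is honored)
CUDA_VISIBLE_DEVICES=0 sh all_env_uniform.sh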

Alternatively, here is an example for a single experiment:

python experiment.py --dir experiment_runs/uniform --env MO-HalfCheetah-v2 --data_mode _formal --concat_state_pref 1 --concat_rtg_pref 0 --concat_act_pref 0 --mo_rtg True --seed 1 --dataset expert_uniform --model_type rvs --num_steps_per_iter 200000 --max_iters 2
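
To repeat the same configuration across random seeds, the command above can be wrapped in a small loop. The sketch below copies every flag verbatim from that example; only the seed loop is new, and the seed values are illustrative.

# sketch: sweep seeds for one configuration (flags copied from the example above)
for SEED in 1 2 3; do
    python experiment.py --dir experiment_runs/uniform --env MO-HalfCheetah-v2 \
        --data_mode _formal --concat_state_pref 1 --concat_rtg_pref 0 --concat_act_pref 0 \
        --mo_rtg True --seed $SEED --dataset expert_uniform --model_type rvs \
        --num_steps_per_iter 200000 --max_iters 2
done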

Generate Your Own Data (WIP)

Due to storage limits, we cannot easily open-source all data variants; please use the data-collection code in this repository instead. First, download the checkpoints from https://drive.google.com/file/d/19kEqdNG-ttwxmZ__30gop_KRvPf4NSjL/view. Unzip the archive, rename the folder to Precomputed_Result, and move it under data_generation.

# DOWNLOAD, UNZIP, RENAME, MOVE
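# A hedged sketch of the manual steps above. gdown was installed in the Data
# Download step; the archive name and unzipped folder name below are
# assumptions, so adjust them to match the actual download and to whatever
# folder name collect_all.sh expects.
gdown 19kEqdNG-ttwxmZ__30gop_KRvPf4NSjL --output precomputed_ckpts.zip
unzip precomputed_ckpts.zip
mv <unzipped_folder> data_generation/Precomputed_Result   # replace <unzipped_folder> with the real folder name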

# USE AFTER MANUAL SETUP
cd data_generation
sh collect_all.sh

Note 1: We use randomly initialized environments, which differs from the behavioral-policy paper. This helps diversify the collected trajectories.

Note 2: Model checkpoints are stored under PEDA/data_generation/Precomputed_Results. All were kindly provided by the authors of the behavioral-policy paper, except for Hopper-v3, which we trained ourselves.

Citation

If you use this repo, please cite:

@inproceedings{
    zhu2023paretoefficient,
    title     = {Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL},
    author    = {Baiting Zhu and Meihua Dang and Aditya Grover},
    booktitle = {International Conference on Learning Representations},
    year      = {2023},
    url       = {https://openreview.net/forum?id=Ki4ocDm364}
}
