
Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL (ICLR 2023)

Website | Poster | OpenReview

Authors: Baiting Zhu, Meihua Dang, Aditya Grover

Setup

git clone https://github.com/baitingzbt/PEDA.git
cd PEDA
conda env create -f environment.yml
conda activate peda_env
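
As an optional sanity check after setup, you can verify that the environment resolves and sees your GPUs. This is only a sketch and assumes environment.yml installs PyTorch; swap in whichever dependency you want to check.

# optional sanity check (assumes environment.yml installs PyTorch)
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"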

Data Download

The Google Drive folder below contains all dataset variants used in the paper's experiments, including the ablation studies. To regenerate any variant yourself, see the "Generate Your Own Data" section below.

pip install gdown
gdown --folder https://drive.google.com/drive/folders/1wfd6BwAu-hNLC9uvsI1WPEOmPpLQVT9k?usp=sharing --output data

The "data" folder should be under "PEDA" e.g.: PEDA/data/env/data_name.pkl

Training

First, double-check the CUDA devices and data paths configured in the shell script below. Then run the uniform experiments for all environments:

sh all_env_uniform.sh
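
If you would rather not edit the script, you can also restrict which GPUs are used from the command line. This is only a sketch and assumes the script and the underlying training code honor the standard CUDA_VISIBLE_DEVICES variable.

# example: restrict the run to GPU 0 (assumes CUDA_VISIBLE_DEVICES is honored)
CUDA_VISIBLE_DEVICES=0 sh all_env_uniform.sh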

Alternatively, here is an example for a single experiment:

python experiment.py --dir experiment_runs/uniform --env MO-HalfCheetah-v2 --data_mode _formal --concat_state_pref 1 --concat_rtg_pref 0 --concat_act_pref 0 --mo_rtg True --seed 1 --dataset expert_uniform --model_type rvs --num_steps_per_iter 200000 --max_iters 2
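
To repeat the same configuration across random seeds, the command above can be wrapped in a small loop. The sketch below copies every flag verbatim from that example; only the seed loop is new, and the seed values are illustrative.

# sketch: sweep seeds for one configuration (flags copied from the example above)
for SEED in 1 2 3; do
    python experiment.py --dir experiment_runs/uniform --env MO-HalfCheetah-v2 \
        --data_mode _formal --concat_state_pref 1 --concat_rtg_pref 0 --concat_act_pref 0 \
        --mo_rtg True --seed $SEED --dataset expert_uniform --model_type rvs \
        --num_steps_per_iter 200000 --max_iters 2
done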

Generate Your Own Data (WIP)

Due to storage limits, we cannot easily open-source all data variants; please use the data-collection code in this repository instead. First, download the checkpoints from https://drive.google.com/file/d/19kEqdNG-ttwxmZ__30gop_KRvPf4NSjL/view. Unzip the archive, rename the folder to Precomputed_Result, and move it under data_generation.

# DOWNLOAD, UNZIP, RENAME, MOVE
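# A hedged sketch of the manual steps above. gdown was installed in the Data
# Download step; the archive name and unzipped folder name below are
# assumptions, so adjust them to match the actual download and to whatever
# folder name collect_all.sh expects.
gdown 19kEqdNG-ttwxmZ__30gop_KRvPf4NSjL --output precomputed_ckpts.zip
unzip precomputed_ckpts.zip
mv <unzipped_folder> data_generation/Precomputed_Result   # replace <unzipped_folder> with the real folder name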

# USE AFTER MANUAL SETUP
cd data_generation
sh collect_all.sh

Note 1: We use randomly initialized environments, which differs from the behavioral-policy paper. This helps diversify the collected trajectories.

Note 2: Model checkpoints are stored under PEDA/data_generation/Precomputed_Results. All were kindly provided by the authors of the behavioral-policy paper, except for Hopper-v3, which we trained ourselves.

Citation

If you use this repo, please cite:

@inproceedings{
    zhu2023paretoefficient,
    title     = {Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL},
    author    = {Baiting Zhu and Meihua Dang and Aditya Grover},
    booktitle = {International Conference on Learning Representations},
    year      = {2023},
    url       = {https://openreview.net/forum?id=Ki4ocDm364}
}
