EDGE-OF-REACH is the official implementation of RAVL ("Reach-Aware Value Learning") from the paper:

*The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning*; Anya Sims, Cong Lu, Yee Whye Teh, 2024 [[arXiv](https://arxiv.org/abs/2402.12527)]
It includes:
- offline dynamics model training,
- offline model-based agent training using RAVL.
The RAVL implementation has Weights & Biases integration and is heavily inspired
by CORL for model-free offline RL - check them out too!
Setup | Running experiments | Citation
To start, clone the repository and install requirements with:
```bash
# Clone repository
git clone https://github.com/anyasims/edge-of-reach.git && cd edge-of-reach
# Install requirements in virtual environment "ravl"
python3 -m venv ravl
source ravl/bin/activate
pip install -r requirements.txt
```
Main requirements:
- `pytorch`
- `gym` (MuJoCo RL environments*)
- `d4rl` (offline RL datasets)
- `wandb` (logging)
The code was tested with Python 3.8. *If you don't have MuJoCo installed, follow the instructions here: https://github.com/openai/mujoco-py#install-mujoco.
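As a quick sanity check that MuJoCo and D4RL are set up correctly, you can try loading one of the datasets used in the examples below (a minimal sketch, not part of the repo itself):

```python
# Sanity check: load a D4RL dataset used in the examples below.
import gym
import d4rl  # importing d4rl registers the offline-RL environments with gym

env = gym.make("halfcheetah-medium-v2")
dataset = d4rl.qlearning_dataset(env)  # dict of numpy arrays

print(dataset["observations"].shape)  # (N, obs_dim)
print(dataset["actions"].shape)       # (N, act_dim)
print(dataset["rewards"].shape)       # (N,)
```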
Training (offline model-based RL) includes:
- first training a dynamics model, and then
- training an agent (RAVL) in the dynamics model.
Example:
```bash
python3 train_dynamics_model.py \
    --env_name halfcheetah-medium-v2 \
    --seed 0 \
    --save_path <folder_for_saving_trained_dynamics_models>
```
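For intuition, the dynamics model here follows the standard recipe for offline model-based RL: an ensemble of networks trained by maximum likelihood to predict the next-state delta and reward. Below is a minimal sketch of one ensemble member and its loss; the class and function names are ours for illustration, not the repo's API:

```python
# Illustrative sketch (not the repo's code) of a probabilistic dynamics model.
import torch
import torch.nn as nn

class GaussianDynamics(nn.Module):
    """One ensemble member: predicts mean and log-variance of
    (state delta, reward) given (state, action)."""
    def __init__(self, obs_dim, act_dim, hidden=200):
        super().__init__()
        out_dim = obs_dim + 1  # state delta + reward
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 2 * out_dim),  # mean and log-variance
        )

    def forward(self, obs, act):
        mean, logvar = self.net(torch.cat([obs, act], dim=-1)).chunk(2, dim=-1)
        return mean, logvar.clamp(-10.0, 0.5)

def nll_loss(model, obs, act, next_obs, rew):
    """Gaussian negative log-likelihood on the offline dataset."""
    target = torch.cat([next_obs - obs, rew.unsqueeze(-1)], dim=-1)
    mean, logvar = model(obs, act)
    inv_var = torch.exp(-logvar)
    return (((mean - target) ** 2) * inv_var + logvar).mean()
```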
The main hyperparameters are: Q-ensemble size `num_critics`, rollout length `steps_k`, ratio of original to synthetic data `dataset_ratio`, and coefficient for the EDAC regularizer `eta`.
```bash
python3 train_ravl_agent.py \
    --env_name halfcheetah-medium-v2 \
    --num_critics 10 \
    --steps_k 5 \
    --dataset_ratio 0.05 \
    --eta 1.0 \
    --seed 0 \
    --load_model_dir <path_to_trained_dynamics_model>
```
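To make the roles of these flags concrete, here is a minimal sketch of how they interact during agent training: `steps_k`-step rollouts are branched from dataset states, each update batch mixes real and synthetic data at `dataset_ratio`, and the critic target is pessimistic over the `num_critics`-member Q-ensemble (`eta` weights the EDAC diversity regularizer, omitted here). The buffer, model, and policy interfaces below are assumptions for illustration, not the repo's actual APIs:

```python
# Illustrative sketch (not the repo's code) of the agent-training loop pieces.
import torch

def model_rollouts(dynamics, policy, start_obs, steps_k):
    """Branch short rollouts of length steps_k from dataset states."""
    obs, transitions = start_obs, []
    for _ in range(steps_k):
        act = policy(obs)
        next_obs, rew = dynamics.sample(obs, act)  # assumed model API
        transitions.append((obs, act, rew, next_obs))
        obs = next_obs
    return transitions

def mixed_batch(real_buffer, model_buffer, batch_size, dataset_ratio):
    """dataset_ratio = fraction of each update batch drawn from real data."""
    n_real = int(batch_size * dataset_ratio)
    real = real_buffer.sample(n_real)                # assumed buffer API
    synth = model_buffer.sample(batch_size - n_real)
    return {k: torch.cat([real[k], synth[k]]) for k in real}

def critic_target(critics, next_obs, next_act, rew, done, gamma=0.99):
    """Pessimistic TD target: minimum over the num_critics Q-ensemble."""
    qs = torch.stack([q(next_obs, next_act) for q in critics])  # (E, B)
    return rew + gamma * (1.0 - done) * qs.min(dim=0).values
```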
If you use this implementation in your work, please cite us as follows:
```bibtex
@misc{sims2024edgeofreach,
      title={The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning},
      author={Anya Sims and Cong Lu and Yee Whye Teh},
      year={2024},
      eprint={2402.12527},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```