This repository provides the official implementation of our ICLR 2025 paper, Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning.
To set up the required environment, run:
```shell
conda env create -f environment.yml
```

The following commands install Meta-World, adapted from Farama-Foundation/Metaworld:

```shell
git clone https://github.com/Farama-Foundation/Metaworld.git
cd Metaworld
git checkout 83ac03c
pip install .
cd .. && rm -rf Metaworld
```

To install the required TinyLlama dependencies, run the following commands (adapted from TinyLlama’s PRETRAIN.md):
```shell
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
git checkout 320fb59
python setup.py install
cd csrc/rotary && pip install .
cd ../layer_norm && pip install .
cd ../xentropy && pip install .
cd ../../.. && rm -rf flash-attention
```

The following commands demonstrate the basic usage of the code in GridWorld environments.
To collect training data, run:
```shell
python collect_data.py -ac [algorithm config] -ec [environment config] -t [trajectory directory]
```

To train the model, run:
```shell
python train.py -ac [algorithm config] -ec [environment config] -mc [model config] -t [trajectory directory] -l [log directory]
```

To evaluate a trained model, run:
```shell
python evaluate.py -c [checkpoint directory] -k [beam size]
```

If you find this work useful, please cite our paper:
```bibtex
@inproceedings{son2025distilling,
  author    = {Jaehyeon Son and Soochan Lee and Gunhee Kim},
  title     = {Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
}
```
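For concreteness, the three GridWorld steps above might be chained as follows. Every config path and directory name here is an illustrative placeholder, not an actual file in this repository; substitute the configs you intend to use.

```shell
# Hypothetical end-to-end GridWorld run; all paths below are placeholders.
# 1. Collect training trajectories with the chosen RL algorithm and environment.
python collect_data.py -ac configs/algorithm.yaml -ec configs/env.yaml -t trajectories/

# 2. Train the model on the collected trajectories.
python train.py -ac configs/algorithm.yaml -ec configs/env.yaml \
    -mc configs/model.yaml -t trajectories/ -l logs/

# 3. Evaluate a saved checkpoint (here with an assumed beam size of 5).
python evaluate.py -c logs/checkpoint/ -k 5
```

The same flags apply in other environments; only the environment config passed to `-ec` changes.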