
Distillation for In-Context Planning (DICP)

This repository provides the official implementation of our ICLR 2025 paper, Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning.

Requirements

To set up the required environment, run:

conda env create -f environment.yml
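
If the environment created from environment.yml is named dicp (an assumption; check the name field at the top of the file), activate it before running the commands below:

conda activate dicp  # replace dicp with the actual environment name if it differs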

Meta-World Installation

The following commands install Meta-World, adapted from Farama-Foundation/Metaworld:

git clone https://github.com/Farama-Foundation/Metaworld.git
cd Metaworld
git checkout 83ac03c
pip install .
cd .. && rm -rf Metaworld
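
As a quick sanity check that the pinned version installed correctly, the following import should succeed (a minimal check, not part of the original instructions):

python -c "import metaworld; print('Meta-World import OK')"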

TinyLlama Dependencies

To install the required TinyLlama dependencies, run the following commands (adapted from TinyLlama’s PRETRAIN.md):

git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
git checkout 320fb59
python setup.py install
cd csrc/rotary && pip install .
cd ../layer_norm && pip install .
cd ../xentropy && pip install .
cd ../../.. && rm -rf flash-attention
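
To verify the build, you can try importing the package and the fused kernels (a rough check; rotary_emb, dropout_layer_norm, and xentropy_cuda_lib are the extension modules these csrc packages typically build, so adjust the names if your build differs):

python -c "import flash_attn, rotary_emb, dropout_layer_norm, xentropy_cuda_lib; print(flash_attn.__version__)"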

Usage

The following commands demonstrate the basic usage of the code in GridWorld environments.

Data Collection

To collect training data, run:

python collect_data.py -ac [algorithm config] -ec [environment config] -t [trajectory directory]
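
For example (a hypothetical invocation; the config and trajectory paths below are illustrative, so check the repository's config files for the actual names):

# Hypothetical paths for illustration only
python collect_data.py -ac configs/algorithm/dqn.yaml -ec configs/environment/gridworld.yaml -t trajectories/gridworld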

Training

To train the model, run:

python train.py -ac [algorithm config] -ec [environment config] -mc [model config] -t [trajectory directory] -l [log directory]
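
For example (again with hypothetical paths; substitute the repository's actual config files):

# Hypothetical paths for illustration only
python train.py -ac configs/algorithm/dqn.yaml -ec configs/environment/gridworld.yaml -mc configs/model/tinyllama.yaml -t trajectories/gridworld -l logs/gridworld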

Evaluation

To evaluate a trained model, run:

python evaluate.py -c [checkpoint directory] -k [beam size]
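
Here the -k flag presumably sets the breadth of the beam search used during in-context planning. For example (the checkpoint path below is hypothetical):

# Hypothetical checkpoint path for illustration only
python evaluate.py -c logs/gridworld/checkpoint -k 5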

Citation

If you find this work useful, please cite our paper:

@inproceedings{son2025distilling,
  author    = {Jaehyeon Son and Soochan Lee and Gunhee Kim},
  title     = {Distilling Reinforcement Learning Algorithms for In-Context Model-Based Planning},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
}
