This repository contains the implementation code of **Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs**, published at ICML 2023. Please visit our project page for more information.
We re-formulate solving a reinforcement learning task as synthesizing a task-solving program that can be executed to interact with the environment and maximize the return. We first learn a program embedding space that continuously parameterizes a diverse set of programs sampled from a program dataset. Then, we train a meta-policy, whose action space is the learned program embedding space, to produce a series of programs (i.e., predict a series of actions) to yield a composed task-solving program.
The experimental results in the Karel domain show that our proposed framework outperforms baseline approaches. The ablation studies confirm the limitations of LEAPS and justify our design choices.
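The two-stage idea above (a decoder that maps a learned embedding to a program, and a meta-policy that acts in that embedding space) can be sketched as follows. This is a minimal illustrative sketch, not the repository's actual API: the class names `ProgramDecoder`, `MetaPolicy`, and the toy environment step are all hypothetical placeholders.

```python
# Hypothetical sketch of the HPRL loop. ProgramDecoder, MetaPolicy,
# and toy_env_step are illustrative stand-ins, not the repo's real API.
import random

class ProgramDecoder:
    """Stage 1 (placeholder): maps a latent embedding to a program.
    A real decoder would emit Karel DSL tokens learned by the VAE."""
    def decode(self, z):
        return f"PROGRAM({z[0]:.2f}, {z[1]:.2f})"

class MetaPolicy:
    """Stage 2 (placeholder): its action space is the embedding space.
    A trained policy would condition on the current observation."""
    def act(self, observation):
        return [random.uniform(-1, 1), random.uniform(-1, 1)]

def rollout(env_step, decoder, policy, horizon=3):
    """Compose `horizon` programs, executing each one in the environment."""
    composed, total_return, obs = [], 0.0, None
    for _ in range(horizon):
        z = policy.act(obs)              # action = a program embedding
        program = decoder.decode(z)      # embedding -> executable program
        obs, reward = env_step(program)  # run the program, collect reward
        composed.append(program)
        total_return += reward
    return composed, total_return

# Toy environment: every executed program yields a reward of +1.
def toy_env_step(program):
    return {"state": program}, 1.0

programs, ret = rollout(toy_env_step, ProgramDecoder(), MetaPolicy())
print(len(programs), ret)  # 3 1.0-reward programs composed -> return 3.0
```

The composed task-solving program is the concatenation of the predicted sub-programs, which is what allows the framework to represent behaviors longer than any single program in the dataset.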
- The implementation code can be found in this directory
- Python 3.6
- PyTorch 1.4.0
- Install `virtualenv`, create a virtual environment, activate it, and install the requirements in `requirements.txt`:
```shell
pip3 install --upgrade virtualenv
virtualenv hprl
source hprl/bin/activate
pip3 install -r requirements.txt
```
- Download the dataset from here
- Unzip the file
```shell
# Stage 1: learn the program embedding space (VAE)
bash run_vae_option_L30.sh
# Stage 2: train the meta-policy with PPO in the learned embedding space
bash run_meta_policy_new_vae_ppo_64dim.sh
```
```bibtex
@inproceedings{liu2023hierarchical,
  title={Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs},
  author={Guan-Ting Liu and En-Pei Hu and Pu-Jen Cheng and Hung-Yi Lee and Shao-Hua Sun},
  booktitle={International Conference on Machine Learning},
  year={2023}
}
```
Guan-Ting Liu, En-Pei Hu, Pu-Jen Cheng, Hung-Yi Lee, Shao-Hua Sun