Zeyi Liu, Zhenjia Xu, Shuran Song
Columbia University, New York, NY, United States
Conference on Robot Learning 2022
Project Page | Video | arXiv
We provide a conda YAML file that contains all the Python dependencies.
conda env create -f environment.yml
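After the environment is created, activate it before running the commands below; {env_name} is a placeholder for the environment name defined in environment.yml.
conda activate {env_name}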
Download the pre-trained interaction and reasoning models from Google Drive, and place them under interact/pre-trained/ and reason/pre-trained/, respectively.
Download the interaction data for training the reasoning model and place the data directory under interact/. The full dataset (~86 GB), which contains all RGB images of the interaction sequences, can be downloaded from here. We also provide a smaller version of the dataset on Google Drive that omits the images but includes pre-extracted image features.
You can also generate your own interaction data by running the interaction module.
The full data directory is organized as follows:
.
└── data/
    ├── train/
    │   ├── interact-data/          # 10,000 scenes
    │   │   ├── 1/                  # each scene contains 30 frames
    │   │   │   ├── fig_0.png
    │   │   │   ├── ...
    │   │   │   ├── fig_29.png
    │   │   │   └── data.h5
    │   │   ├── ...
    │   │   ├── 10,000/
    │   │   └── stats.h5
    │   ├── plan-binary/            # 50 1-to-1 goal-conditioned tasks
    │   │   ├── 1/
    │   │   │   ├── fig_0.png
    │   │   │   ├── ...
    │   │   │   └── scene_state.pickle
    │   │   ├── ...
    │   │   ├── 50/
    │   │   └── stats.h5
    │   └── plan-multi/             # 50 1-to-many goal-conditioned tasks
    │       ├── ...
    │       ├── 50/
    │       └── stats.h5
    ├── valid/
    │   ├── interact-data/          # 2,000 scenes
    │   │   ├── ...
    │   │   ├── 2,000/
    │   │   └── stats.h5
    │   ├── plan-binary/
    │   └── plan-multi/
    └── unseen/
        ├── interact-data/          # 2,000 scenes
        ├── plan-binary/
        └── plan-multi/
data.h5 contains the image features, object states, actions, object positions, object bounding boxes, object types, and inter-object relations for each scene. stats.h5 aggregates the information for each interaction sequence into a global file for faster loading.
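As a quick sanity check, here is a minimal sketch for inspecting one scene's data.h5 with h5py; the scene path follows the layout above, and no key names are assumed, the sketch simply lists whatever is stored.

import h5py

# Inspect one scene's data file (the path follows the directory layout above,
# with the data directory placed under interact/).
with h5py.File("interact/data/train/interact-data/1/data.h5", "r") as f:
    # Print the name and, where available, the shape of every stored item.
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))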
interact/assets/objects/ contains the URDF models for all objects. The Door, Lamp, and Switch categories are selected from the PartNet-Mobility Dataset from SAPIEN; the Toy category is borrowed from UMPNet. The interact/assets/objects/data.json file defines the train and test instances. Each object also comes with an object_meta_info.json file that contains basic object information: category, instance id, scale, movable link, bounding box, cause/effect properties, etc. If you want to add new objects from the PartNet-Mobility Dataset, refer to interact/data_process.py for how to process the data.
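As an illustration, the sketch below walks interact/assets/objects/ and prints the metadata keys of every object instance; the exact nesting of the asset folders is an assumption, so the sketch searches recursively.

import json
from pathlib import Path

# Walk the asset directory and print the metadata keys of every object
# instance; the folder nesting is assumed, hence the recursive search.
for meta_path in sorted(Path("interact/assets/objects").rglob("object_meta_info.json")):
    with open(meta_path) as f:
        meta = json.load(f)
    # Fields such as category, instance id, scale, movable link, bounding
    # box, and cause/effect properties are described above.
    print(meta_path.parent, list(meta.keys()))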
To train the interaction module, run the following command
cd interact
python train.py --exp {exp_name}
You can access the trained models and visualizations under interact/exp/{exp_name}.
To train the reasoning module, run the following command
cd reason
bash scripts/train_board.sh --exp {exp_name}
You can access the trained models under reason/model/{exp_name}.
For future state prediction, we evaluate the state accuracy of objects. To map object features extracted from images to object states, we train a decoder for each object category and provide the pre-trained models under reason/decoders/.
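The released decoders should be used as-is. Purely for illustration, below is a minimal PyTorch sketch of what such a per-category decoder could look like, mapping an object feature vector to a state value; the feature and state dimensions and the network architecture are assumptions and do not come from the released code.

import torch
import torch.nn as nn

class StateDecoder(nn.Module):
    """Illustrative per-category decoder: object feature -> object state.

    feature_dim and state_dim are assumed values, not the ones used by the
    released models under reason/decoders/.
    """

    def __init__(self, feature_dim: int = 512, state_dim: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 128),
            nn.ReLU(),
            nn.Linear(128, state_dim),
        )

    def forward(self, feature: torch.Tensor) -> torch.Tensor:
        return self.net(feature)

# Example: decode a batch of 4 object features into scalar states.
decoder = StateDecoder()
states = decoder(torch.randn(4, 512))
print(states.shape)  # torch.Size([4, 1])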
To run a demo of the trained reasoning model given a single interaction sequence, run the following command
cd reason
bash scripts/demo_board.sh
If you want to extract image features from your own interaction dataset, we provide a script for that as well
cd reason
bash scripts/feature_extract.sh
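As a rough illustration of what feature extraction involves, the sketch below encodes the fig_*.png frames of one scene with an off-the-shelf torchvision ResNet-18; the backbone, preprocessing, output format, and example scene path are assumptions and do not necessarily match what scripts/feature_extract.sh does.

from pathlib import Path

import torch
from PIL import Image
from torchvision import models, transforms

# Off-the-shelf ResNet-18 with its classification head removed; the actual
# script may use a different backbone and preprocessing.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

scene_dir = Path("interact/data/train/interact-data/1")  # assumed example scene
frame_paths = sorted(scene_dir.glob("fig_*.png"),
                     key=lambda p: int(p.stem.split("_")[1]))

features = []
with torch.no_grad():
    for img_path in frame_paths:
        img = preprocess(Image.open(img_path).convert("RGB")).unsqueeze(0)
        features.append(backbone(img).squeeze(0))

print(torch.stack(features).shape)  # e.g. torch.Size([30, 512]) for a 30-frame scene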
To run evaluation on goal-conditioned tasks, run the following command
cd plan
bash scripts/planning.sh
When developing this codebase, we referred to UMPNet by Zhenjia Xu for the interaction module and V-CDN by Yunzhu Li for the reasoning module.
If you find this codebase useful, consider citing:
@inproceedings{liu2023busybot,
title={BusyBot: Learning to Interact, Reason, and Plan in a BusyBoard Environment},
author={Liu, Zeyi and Xu, Zhenjia and Song, Shuran},
booktitle={Conference on Robot Learning},
pages={505--515},
year={2023},
organization={PMLR}
}