We propose a novel approach called Graph-based Robotic Instruction Decomposer (GRID), which leverages scene graphs instead of images to perceive global scene information and continuously plans subtasks at each stage for a given instruction.
For details, see the paper "GRID: Scene-Graph-based Instruction-driven Robotic Task Planning" and the project website.
Clone the repository
git clone https://github.com/jackyzengl/GRID.git
cd ./GRID
Set up the conda environment
conda create --name grid python=3.8.16
conda activate grid
pip install -r requirements.txt
Install instructor from source code
pip install -e instructor-embedding/.
Download our dataset to the path dataset/.
The parameters required by the data preprocessor are defined in the [data_preprocessor] section of ${workspace}/hparams.cfg.
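If you want to check what the preprocessor will read, a quick way (a sketch, assuming a standard INI-style file; the key names in your hparams.cfg may differ) is:

```python
# Inspect the [data_preprocessor] section with Python's standard configparser.
# Only the section name comes from this README; keys vary by repo version.
import configparser

cfg = configparser.ConfigParser()
cfg.read("hparams.cfg")
for key, value in cfg["data_preprocessor"].items():
    print(f"{key} = {value}")
```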
Run the data preprocessor to obtain all text embeddings required during training and save them to disk. The path of the data to be preprocessed is ${workspace}/dataset/ by default. The preprocessor saves each processed sample as a .pt file under the directory ${workspace}/preprocess_data/.
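To sanity-check the output, you can load one of the saved samples with torch. This is a sketch: the .pt files are standard torch saves, but the per-sample structure assumed here (a dict of tensors) is an assumption, not the repo's documented format.

```python
# Load one preprocessed sample and print its top-level structure.
import glob
import torch

sample_path = glob.glob("preprocess_data/**/*.pt", recursive=True)[0]
sample = torch.load(sample_path, map_location="cpu")
if isinstance(sample, dict):
    for key, value in sample.items():
        print(key, getattr(value, "shape", type(value)))
else:
    print(type(sample))
```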
Please specify only one device, as instructor does not support multiple devices.
python run_preprocess_data.py --gpu_devices <your device>
If you want to specify the dataset location, pass it with the --data_path argument:
python run_preprocess_data.py --gpu_devices <your device> --data_path /path/to/data
Use python run_preprocess_data.py --help to see more available options.
Set up the configuration in hparams.cfg.
This will fit a model on the data preprocessed in ${workspace}/preprocess_data/ and run prediction:
python train.py
If your preprocessed dataset is located elsewhere, pass the data location to the script:
python train.py --preprocessed_data_path /path/to/data
Use python train.py --help to see more arguments, such as setting your GPU devices for training.
This will resume training from a checkpoint and predict output from the training result.
A checkpoint generated by a trained model is saved to ${workspace}/logs/${experiment_name}/${version_name}/checkpoints/{checkpoint_name} with the extension .ckpt by default.
To run from the checkpoint, the model automatically reads the hyper-parameters that were generated along with the checkpoint, usually saved in ${workspace}/logs/${experiment_name}/${version_name}/hparams.yaml.
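For reference, the default log layout therefore looks roughly like this (the checkpoint filename is a placeholder):

```
logs/
└── ${experiment_name}/
    └── ${version_name}/
        ├── hparams.yaml
        └── checkpoints/
            └── {checkpoint_name}.ckpt
```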
With the default preprocessed data location, use:
python train.py --fit_flag True --from_ckpt_flag True --ckpt_path /path/to/checkpoint
Specify the dataset location with the --preprocessed_data_path argument if necessary.
This will not train a model; it directly predicts results from the checkpoint.
The checkpoint file path requires a hparams.cfg or hparams.yaml at the relative path checkpoints/../.. Typically, hparams.yaml is generated automatically when running train.py.
With the default data location, use:
python train.py --fit_flag False --from_ckpt_flag True --ckpt_path /path/to/checkpoint
Specify the dataset location with the --preprocessed_data_path argument if necessary.
Our network takes an instruction, a scene graph and a robot graph as input, and predicts a subtask which consists of an action and an object. We use subtask accuracy to evaluate the accuracy of each prediction.
Running a prediction automatically calculates the subtask accuracy.
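As a rough illustration of the metric (not the repo's exact implementation), a prediction counts as correct only when both the action and the object match the label:

```python
# Sketch of subtask accuracy over (action, object) pairs; data is hypothetical.
def subtask_accuracy(predictions, labels):
    correct = sum(p == l for p, l in zip(predictions, labels))
    return correct / len(labels)

preds = [("pick", "cup"), ("place", "table")]
gold = [("pick", "cup"), ("place", "shelf")]
print(subtask_accuracy(preds, gold))  # 0.5: the second object is wrong
```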
Please use only one GPU device for inference so that the complete prediction labels are logged.
The prediction creates the sub_task_label.xlsx and accuracy.xlsx files in the new checkpoint directory logs/{experiment_name}/version_{version_number}/.
sub_task_label.xlsx shows the prediction result for each sample, and accuracy.xlsx gives the overall accuracy calculation for all subtasks.
Each task is associated with a series of subtasks. A task is considered successful when all predicted subtasks associated with it are correct.
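A minimal sketch of that rule (the grouping format is hypothetical, not the repo's actual data layout):

```python
# Task accuracy: a task succeeds only if all of its subtask predictions are correct.
def task_accuracy(results):
    # results maps task_id -> list of per-subtask correctness flags
    return sum(all(flags) for flags in results.values()) / len(results)

print(task_accuracy({"task_1": [True, True], "task_2": [True, False]}))  # 0.5
```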
To obtain task accuracy, change the raw_data_path_ and output_root_path_ variables in your run_evaluation.py file:
raw_data_path_: the path of the raw dataset, not the preprocessed data.
output_root_path_: the folder containing the subtask accuracy file generated by the prediction in the previous step.
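For example (the paths below are placeholders; use your own dataset and log locations):

```python
# In run_evaluation.py -- example values only:
raw_data_path_ = "dataset/"                  # raw dataset, not the preprocessed .pt files
output_root_path_ = "logs/grid/version_0/"   # folder with accuracy.xlsx from the prediction step
```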
Run:
python run_evaluation.py
The task accuracy is updated in accuracy.xlsx.
The new file sub_task_accuracy.xlsx gives the task accuracies for tasks associated with different numbers of subtasks.
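If you prefer to post-process the results, the sheets can be read with pandas (a sketch assuming openpyxl is installed; the paths and column names depend on your run):

```python
# Load the generated accuracy sheets for further analysis.
import pandas as pd

acc = pd.read_excel("logs/grid/version_0/accuracy.xlsx")
per_count = pd.read_excel("logs/grid/version_0/sub_task_accuracy.xlsx")
print(acc.head())
print(per_count.head())
```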
- Release training code.
- Release checkpoints.
- Release inference code.
If you find the code useful, please consider citing:
@article{ni2024grid,
title={GRID: Scene-Graph-based Instruction-driven Robotic Task Planning},
author={Zhe Ni and Xiaoxin Deng and Cong Tai and Xinyue Zhu and Qinghongbing Xie and Weihang Huang and Xiang Wu and Long Zeng},
journal={arXiv preprint arXiv:2309.07726},
year={2024}
}