
3D Implicit Transporter for Temporally Consistent Keypoint Discovery

This repository contains the implementation of the following paper:

3D Implicit Transporter for Temporally Consistent Keypoint Discovery

Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

In ICCV 2023 (Oral)

If you find our code or paper useful, please consider citing:

@inproceedings{zhong20233d,
  title={3D Implicit Transporter for Temporally Consistent Keypoint Discovery},
  author={Zhong, Chengliang and Zheng, Yuhang and Zheng, Yupeng and Zhao, Hao and Yi, Li and Mu, Xiaodong and Wang, Ling and Li, Pengfei and Zhou, Guyue and Yang, Chao and others},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3869--3880},
  year={2023}
}

Overview

The architecture of our model is as follows:

Discover keypoints for rigid cases (simulator and real-world):

Discover keypoints for non-rigid cases (rodents and humans):

Using temporally consistent keypoints for closed-loop manipulation:

Datasets

The PartNet-Mobility dataset is provided by UMPNET and can be downloaded from here.

We use the PyBullet simulator and object models from the PartNet-Mobility dataset to generate training and test data, which can be downloaded from here. After downloading, move the data into the 'data' folder.

Our 'data' folder structure is as follows:

data
  ├── bullet_multi_joint_train
  │    ├── FoldingChair
  │    ...
  │    ├── Window
  ├── bullet_multi_joint_test
  │    ├── Box
  │    ...  
  │    ├── Window
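
For example, assuming the downloaded archives extract to the directory names shown above, they can be arranged with:

mkdir -p data
mv bullet_multi_joint_train bullet_multi_joint_test data/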
  

Installation

Make sure that you have all dependencies in place. The simplest way to do so is to use Anaconda.

You can create an Anaconda environment called 3d_transporter with:

conda create --name 3d_transporter python=3.7
conda activate 3d_transporter

Note: install the Python packages that match the CUDA version on your machine:

# CUDA >= 11.0
pip install -r requirements_cu11.txt 
pip install torch-scatter==2.0.9
# CUDA < 11.0
pip install -r requirements_cu10.txt 
pip install torch-scatter==2.0.4

Next, compile the extension modules. You can do this via:

python setup.py build_ext --inplace
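
As an optional sanity check, you can verify that the installed PyTorch build can see CUDA:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"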

Training

If the directory for storing training results does not exist, create it:

mkdir -p exp/train0901

The name 'train0901' can be changed at your discretion.

To train on a single GPU, run:

sh exp/train0901/train_single.sh

To train on multiple GPUs, set 'CUDA_VISIBLE_DEVICES' and 'nproc_per_node' in 'train_multi.sh' according to the number of GPUs available, then run:

sh exp/train0901/train_multi.sh
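
For reference, multi-GPU launch scripts of this kind are typically built on torch.distributed.launch; the sketch below shows where the two values to edit appear. The entry-point name 'train.py' and its '--config' flag are assumptions, not taken from this repository:

# Sketch of a typical train_multi.sh (entry point and flags are assumed).
export CUDA_VISIBLE_DEVICES=0,1,2,3
# --nproc_per_node must match the number of GPUs listed above.
python -m torch.distributed.launch --nproc_per_node=4 train.py --config config.yaml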

Extract and Save Keypoints

For seen data:

sh exp/train0901/test_seen.sh save_kpts

For unseen data:

sh exp/train0901/test_unseen.sh save_kpts
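
Judging from the evaluation commands below, the extracted keypoints are saved under the experiment's 'test_result' directory, i.e. a layout along these lines (assumed, not verified):

exp/train0901
  └── test_result
       ├── seen
       └── unseen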

Evaluate

1. Perception

Test for seen data:

python tools/eval_repeat.py \
    --dataset_root data/bullet_multi_joint_test \
    --test_root ${scriptDir}/test_result/seen/ \
    --test_type seen

Test for unseen data:

python tools/eval_repeat.py \
    --dataset_root data/bullet_multi_joint_test \
    --test_root ${scriptDir}/test_result/unseen/ \
    --test_type unseen
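
Note that ${scriptDir} is a shell variable set inside the test scripts; when invoking tools/eval_repeat.py directly, point it at your experiment directory first (the path below assumes the 'train0901' naming used above):

scriptDir=exp/train0901   # adjust to your experiment directory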

2. Manipulation

After training, you will find 'model_best.pth' in 'exp/train0901/checkpoints'. Link it into the manipulation module and run the evaluation:

ln -s exp/train0901/checkpoints/model_best.pth manipulation/ckpts/
cd manipulation
sh eval.sh
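
Note that ln -s resolves a relative target against the directory of the link itself, and the 'ckpts' directory must exist before linking. A defensive variant of the commands above (same paths, absolute target) is:

mkdir -p manipulation/ckpts
ln -s "$(pwd)/exp/train0901/checkpoints/model_best.pth" manipulation/ckpts/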

Pretrained Models

We provide pretrained models on Google Drive. Move the models to exp/train0901/checkpoints/.
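
Assuming the downloaded checkpoint keeps the 'model_best.pth' name used elsewhere in this README, this amounts to:

mkdir -p exp/train0901/checkpoints
mv model_best.pth exp/train0901/checkpoints/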

License

Our repo is released under the MIT License.

Acknowledgment

We would like to thank the authors of UMPNET, Transporter, Ditto, SNAKE, and D3Feat for their open-source code.
