
IMHD$^2$: Inertial and Multi-view Highly Dynamic human-object interactions Dataset

arXiv | Project Page

I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions
Chengfeng Zhao, Juze Zhang, Jiashen Du, Ziwei Shan, Junye Wang, Jingyi Yu, Jingya Wang, Lan Xu*



🔥 News

  • May, 2024: 🔈🔈 The 32-view 2D & 3D human keypoints have been released!
  • March, 2024: 🎉🎉 I'M HOI is accepted to CVPR 2024!
  • Jan. 04, 2024: 🔈🔈 Fill out the form to have access to IMHD$^2$!


Dataset Features

IMHD$^2$ features:

  • Human motion annotation in SMPL-H format, built on EasyMocap
  • Object motion annotation, built on PHOSA
  • Well-scanned object geometry, using Polycam
  • Object-mounted IMU sensor measurement, using Movella DOT
  • 32-view RGB videos & instance-level segmentations, built on SAM, Track-Anything and XMem
  • 32-view 2D & 3D human keypoint detections, using ViTPose and MediaPipe

Dataset Structure

data/
|--calibrations/           # camera intrinsics and world-to-cam extrinsics
|--object_templates/       # raw and downsampled geometry
|--imu_preprocessed/       # pre-processed IMU signal
|--keypoints2d/            # body keypoints in OP25 format and hand keypoints in MediaPipe format
|--keypoints3d/            # body keypoints in OP25 format and hand keypoints in MediaPipe format
|--ground_truth/           # human motion in SMPL-H format and rigid object motion
|----<date>/
|------<segment_name>/
|--------<sequence_name>/
|----------gt_<part_id>_<start>_<end>.pkl

All sub-folders follow the same detailed structure as the ground_truth/ example shown above. Because the motion annotations of some parts of some sequences are not ideal, a single sequence folder may contain several .pkl files. Leaf .pkl file names follow the pattern gt_<part_id>_<start>_<end>.pkl; for example, gt_0_10_100.pkl is the first motion part, starting at frame 10 and ending at frame 100.
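For instance, the fields of a leaf file name can be recovered programmatically (a minimal sketch; parse_gt_name is a hypothetical helper, not part of the released code):

import re
from pathlib import Path

def parse_gt_name(path):
    """Parse gt_<part_id>_<start>_<end>.pkl into its three integer fields."""
    m = re.fullmatch(r"gt_(\d+)_(\d+)_(\d+)\.pkl", Path(path).name)
    if m is None:
        raise ValueError(f"unexpected ground-truth file name: {path}")
    part_id, start, end = map(int, m.groups())
    return part_id, start, end

# gt_0_10_100.pkl -> (0, 10, 100): part 0, starting at frame 10, ending at frame 100
print(parse_gt_name("gt_0_10_100.pkl"))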

Getting Started

We tested our code on Windows 10, Windows 11, Ubuntu 18.04 LTS and Ubuntu 20.04 LTS.

All dependencies:

python>=3.8
CUDA=11.7
torch=1.13.0
pytorch3d
opencv-python
matplotlib
smplx

For Windows

conda create -n imhd2 python=3.8 -y
conda activate imhd2
conda install pytorch=1.13.0 torchvision pytorch-cuda=11.7 -c pytorch -c nvidia
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d && pip install -e . --ignore-installed PyYAML

For Ubuntu

conda create -n imhd2 python=3.8 -y
conda activate imhd2
conda install --file conda_install_cuda117_pakage.txt -c nvidia
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

How to use

  1. Prepare data. Download IMHD$^2$ from here and place it under the root directory in the pre-defined structure.
  2. Prepare body model. Please refer to body_model.
  3. Run python visualization.py to see how to load and visualize IMHD$^2$. Results will be saved in visualizations/. A minimal loading sketch is also shown below.
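To peek at a sequence without the visualization script, a ground-truth file can be loaded directly with pickle (a minimal sketch; the path is a hypothetical example and the dictionary keys are simply printed rather than assumed):

import pickle

# Hypothetical example path following the structure in "Dataset Structure".
gt_path = "data/ground_truth/20230825/some_segment/some_sequence/gt_0_10_100.pkl"

with open(gt_path, "rb") as f:
    gt = pickle.load(f)

# List what the annotation stores (SMPL-H human motion and rigid object motion).
if isinstance(gt, dict):
    for key, value in gt.items():
        print(key, type(value).__name__, getattr(value, "shape", ""))
else:
    print(type(gt).__name__)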

FAQs

Q1: Which coordinate frame are the ground-truth motions in? How can motions from different dates be aligned?

A1: The ground-truth motions are in the world coordinate frame, which was calibrated with the multi-camera system and may differ across dates. To align them, you can use the camera parameters provided in calibrations/ to transform all motion data into the camera coordinate frame.
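As an illustration, assuming the calibration of a chosen camera gives a world-to-camera rotation R (3x3) and translation T (3,), motion data can be mapped into that camera's frame as follows (a minimal sketch with toy inputs; adapt it to the actual file format in calibrations/):

import numpy as np

def world_to_camera(points_world, R, T):
    """Map Nx3 world-space points to camera space via X_cam = R @ X_world + T."""
    return points_world @ R.T + T

# Toy calibration: identity rotation, camera 1 m in front of the world origin.
R = np.eye(3)
T = np.array([0.0, 0.0, 1.0])
joints_world = np.zeros((52, 3))  # e.g. SMPL-H joints in world coordinates
print(world_to_camera(joints_world, R, T).shape)  # (52, 3)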

Q2: Which object category do the motions named 'bat' in 20230825/ and 20230827/ interact with?

A2: The motions named 'bat' in 20230825/ and 20230827/ interact with a baseball bat, which corresponds to 'baseball' in the object_templates/ folder.

Q3: Which camera serves as the main view?

A3: The main view is the camera labeled '1' (camera indices start from 0).

Citation

If you find our data or paper helpful, please consider citing:

@article{zhao2023imhoi,
  title={I'M HOI: Inertia-aware Monocular Capture of 3D Human-Object Interactions},
  author={Zhao, Chengfeng and Zhang, Juze and Du, Jiashen and Shan, Ziwei and Wang, Junye and Yu, Jingyi and Wang, Jingya and Xu, Lan},
  journal={arXiv preprint arXiv:2312.08869},
  year={2023}
}

Acknowledgement

This work was supported by the National Key R&D Program of China (2022YFF0902301) and the Shanghai Local College Capacity Building Program (22010502800). We also acknowledge support from the Shanghai Frontiers Science Center of Human-centered Artificial Intelligence (ShangHAI).

We thank Jingyan Zhang and Hongdi Yang for setting up the capture system. We thank Jingyan Zhang, Zining Song, Jierui Xu, Weizhi Wang, Gubin Hu, Yelin Wang, Zhiming Yu, Xuanchen Liang, af and zr for data collection. We thank Xiao Yu, Yuntong Liu and Xiaofan Gu for data checking and annotations.

Licenses

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
