
A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion

This repository contains the source code for our ACM Multimedia 2022 paper on multi-view, multi-person 3D pose estimation. The preprint is available on arXiv (arXiv:2207.07381). The project webpage is available here, and the dataset presented in the paper is available here; please refer to it for more details.

Dependencies

The code is tested on Windows with

pytorch                   1.10.2
torchvision               0.11.3
CUDA                      11.3.1

We suggest maintaining the project in a virtual environment, using an easy-to-use package/environment manager such as conda.

conda create -n dmaeMocap python=3.6
conda activate dmaeMocap
# install pytorch
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
# install the rest of the dependencies
pip install -r requirements.txt

Data preparation

Follow the instructions below to prepare the necessary data:

  • Shelf: Download the bundle from here and unzip it to /data/shelf.
    • The bundle consists of the pretrained model, multi-view RGB images, 2D pose detection results, camera matrices, and 3D ground truth. Except for the model, these are credited to 4D Association Graph.
    • The RGB images are decoded by ffmpeg at low quality. If you need high-quality images, please refer to the Shelf webpage.
    • The 2D pose detector is OpenPose.
    • We convert the data from txt to numpy format; the conversion code is at util/gizmo/shelf_makeup.py.

Alternatively, generate the 2D poses on your own; we provide instructions at util/gizmo/data_makeup.

Data should be organized as follows:

ROOT/
    └── data/
        └── shelf/
            └── sequences/
                └── img_0/
                └── .../
                └── img_4/
            └── camera_params.npy
            └── checkpoint-best.pth
            └── shelf_eval_2d_detection_dict.npy
    └── ...
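Before running inference, it can help to sanity-check the layout above. The file and directory names below come from the tree; the helper itself is our illustrative sketch, not part of the repository.

```python
from pathlib import Path

# Entries expected directly under data/shelf/, per the tree above.
EXPECTED = [
    "sequences",
    "camera_params.npy",
    "checkpoint-best.pth",
    "shelf_eval_2d_detection_dict.npy",
]

def check_shelf_layout(root="data/shelf"):
    """Return the list of expected entries missing under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty return value means the Shelf bundle is unpacked where the scripts expect it.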

Inference

We provide the following script to reconstruct and complete 3D skeletons from multi-view RGB video sequences.

python inference.py

The triangulation configuration can be found and modified in util/config.py. Setting self.snapshot_flag = True (Line 18) visualizes the reconstruction results; the default is self.snapshot_flag = False.
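For reference, reconstructing a joint from multiple calibrated views is typically done with linear (DLT) triangulation. The sketch below is a generic implementation under that assumption, not the repository's exact code.

```python
import numpy as np

def triangulate_point(projs, points_2d):
    """Linear (DLT) triangulation of one joint from multiple views.

    projs:     list of 3x4 camera projection matrices
    points_2d: list of (x, y) image coordinates, one per view
    Returns the 3D point in world coordinates.
    """
    rows = []
    for P, (x, y) in zip(projs, points_2d):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector of A with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

With noiseless detections from two or more views, this recovers the joint exactly; with real 2D detections it gives the least-squares solution.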

You can run python inference.py --no-dmae to disable D-MAE motion completion, and add --snapshot to enable snapshots.
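The two flags above can be mirrored with a minimal argparse setup. The parser below is an illustrative sketch; the repository's actual argument handling may differ.

```python
import argparse

def build_parser():
    # Sketch of the two command-line flags described above.
    p = argparse.ArgumentParser(description="D-MAE inference (sketch)")
    p.add_argument("--no-dmae", dest="no_dmae", action="store_true",
                   help="disable D-MAE motion completion")
    p.add_argument("--snapshot", action="store_true",
                   help="visualize and save reconstruction snapshots")
    return p
```

For example, `build_parser().parse_args(["--no-dmae"])` yields a namespace with `no_dmae=True` and `snapshot=False`.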

Evaluate

python evaluate.py

Like the inference script, the evaluation script is configured via util/config.py. By default, the inference results and the ground truth are visualized in the data/shelf/output/eval_snapshot directory. The metrics are printed to the console and also saved to data/shelf/output/eval.log. To evaluate the framework without D-MAE, append --no-dmae to the command, i.e. python evaluate.py --no-dmae.

Overall, the output data is organized as follows:

ROOT/
    └── data/
        └── shelf/
            └── output/
                └── eval_snapshot/
                └── npy/
                └── eval.log
            └── ...
    └── ...

Train the D-MAE

This short guide focuses on human pose estimation (HPE) reconstruction and completion with the pretrained model. If you want to reproduce the pretrained model's results, please refer to training/README.md.
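As background for training, the dual masking idea pairs temporal masking (dropping whole frames) with spatial masking (dropping individual joint tokens), and the auto-encoder learns to complete the masked skeletal tokens. The sketch below illustrates that masking scheme; the ratios and sampling strategy are our assumptions, not the paper's exact recipe.

```python
import numpy as np

def dual_mask(num_frames, num_joints, t_ratio=0.5, s_ratio=0.3, seed=0):
    """Sketch of a dual (temporal + spatial) token mask, MAE-style.

    Returns a boolean (num_frames, num_joints) array; True marks masked tokens.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((num_frames, num_joints), dtype=bool)
    # Temporal masking: drop whole frames.
    t_drop = rng.choice(num_frames, int(num_frames * t_ratio), replace=False)
    mask[t_drop, :] = True
    # Spatial masking: drop random joints within the remaining frames.
    for f in np.setdiff1d(np.arange(num_frames), t_drop):
        j_drop = rng.choice(num_joints, int(num_joints * s_ratio), replace=False)
        mask[f, j_drop] = True
    return mask
```

During training, the encoder would see only the unmasked tokens and the decoder would be supervised to reconstruct the masked ones.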

Bibtex

If you use our code/models in your research, please cite our paper:

@inproceedings{jiang2022dmae,
  title={A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion},
  author={Jiang, Junkun and Chen, Jie and Guo, Yike},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  year={2022}
}

Acknowledgement

Many thanks to the following open-source repositories for their help in developing D-MAE.
