smplbody/hmr-benchmarks

Benchmarking 3D Pose and Shape Estimation Beyond Algorithms

S-Lab, Nanyang Technological University

NeurIPS 2022

Getting started

Experiments

Introduction

This repository builds upon MMHuman3D, an open-source PyTorch-based codebase for the use of 3D human parametric models in computer vision and computer graphics. MMHuman3D is part of the OpenMMLab project. The main branch works with PyTorch 1.7+.

These features will be contributed to MMHuman3D at a later date.

Major Features added to MMHuman3D

We have added multiple major features on top of MMHuman3D.

  • Benchmarks on 31 datasets
  • Benchmarks on 11 dataset combinations
  • Benchmarks on 9 backbones with different initialisations
  • Benchmarks on 9 augmentation techniques
  • Trained models on optimal configurations for inference
  • Evaluation on 5 test sets
  • FLOPs calculation

Additional:

  • Train annotation files for 31 datasets will be provided in the future
  • Future work can easily obtain HMR benchmarks for baseline comparison on selected dataset mixes and partitions using our provided pipeline and annotation files.

Experiments

Single-datasets

Supported datasets:

  1. AGORA (CVPR'2021)
  2. AI Challenger (ICME'2019)
  3. COCO (ECCV'2014)
  4. COCO-WholeBody (ECCV'2020)
  5. EFT-COCO-Part (3DV'2021)
  6. EFT-COCO (3DV'2021)
  7. EFT-LSPET (3DV'2021)
  8. EFT-OCHuman (3DV'2021)
  9. EFT-PoseTrack (3DV'2021)
  10. EFT-MPII (3DV'2021)
  11. Human3.6M (TPAMI'2014)
  12. InstaVariety (CVPR'2019)
  13. LIP (CVPR'2017)
  14. LSP (BMVC'2010)
  15. LSP-Extended (CVPR'2011)
  16. MPI-INF-3DHP (3DV'2017)
  17. MPII (CVPR'2014)
  18. MTP (CVPR'2021)
  19. MuCo-3DHP (3DV'2018)
  20. MuPoTs-3D (3DV'2018)
  21. OCHuman (CVPR'2019)
  22. 3DOH50K (CVPR'2020)
  23. Penn Action (ICCV'2013)
  24. 3D-People (ICCV'2019)
  25. PoseTrack18 (CVPR'2018)
  26. PROX (ICCV'2019)
  27. 3DPW (ECCV'2018)
  28. SURREAL (CVPR'2017)
  29. UP-3D (CVPR'2017)
  30. VLOG (CVPR'2019)
  31. CrowdPose (CVPR'2019)

Please refer to datasets.md for training configs and results.
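
For orientation, the fragment below sketches how a single-dataset run selects its training set in an MMHuman3D-style config. It is illustrative only: the dataset name and annotation file are placeholders, and the exact entries used for the 31 benchmarks are in datasets.md.

# Minimal sketch of a single-dataset training entry (MMHuman3D convention).
# Dataset and annotation names below are placeholders, not the exact configs.
train_pipeline = []  # image loading, augmentation and formatting, defined in the full config
data = dict(
    samples_per_gpu=64,
    workers_per_gpu=1,
    train=dict(
        type='HumanImageDataset',
        dataset_name='eft_coco',            # placeholder: one of the 31 supported datasets
        data_prefix='data',
        pipeline=train_pipeline,
        ann_file='eft_coco_train.npz'))     # placeholder annotation file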

Mixed-datasets

  1. Mix 1: H36M, MI, COCO
  2. Mix 2: H36M, MI, EFT-COCO
  3. Mix 3: H36M, MI, EFT-COCO, MPII
  4. Mix 4: H36M, MuCo, EFT-COCO
  5. Mix 5: H36M, MI, COCO, LSP, LSPET, MPII
  6. Mix 6: EFT-[COCO, MPII, LSPET], SPIN-MI, H36M
  7. Mix 7: EFT-[COCO, MPII, LSPET], MuCo, H36M, PROX
  8. Mix 8: EFT-[COCO, PT, LSPET], MI, H36M
  9. Mix 9: EFT-[COCO, PT, LSPET, OCH], MI, H36M
  10. Mix 10: PROX, MuCo, EFT-[COCO, PT, LSPET, OCH], UP-3D, MTP, CrowdPose
  11. Mix 11: EFT-[COCO, MPII, LSPET], MuCo, H36M

Please refer to mixed-datasets.md for training configs and results.
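
In the mix names above, MI abbreviates MPI-INF-3DHP, PT abbreviates PoseTrack, and OCH abbreviates OCHuman. As a rough illustration of how a mix is expressed, the fragment below follows the MMHuman3D MixedDataset convention, where per-dataset sampling weights are given by a partition list. The annotation files and partition values are placeholders; the exact settings for each mix are in mixed-datasets.md.

# Illustrative sketch of a dataset mix (e.g. Mix 1: H36M, MI, COCO), assuming the
# MMHuman3D MixedDataset convention. Values below are placeholders.
train_pipeline = []  # defined in the full config
train = dict(
    type='MixedDataset',
    configs=[
        dict(type='HumanImageDataset', dataset_name='h36m',
             data_prefix='data', pipeline=train_pipeline, ann_file='h36m_train.npz'),
        dict(type='HumanImageDataset', dataset_name='mpi_inf_3dhp',
             data_prefix='data', pipeline=train_pipeline, ann_file='mpi_inf_3dhp_train.npz'),
        dict(type='HumanImageDataset', dataset_name='coco',
             data_prefix='data', pipeline=train_pipeline, ann_file='coco_2014_train.npz'),
    ],
    partition=[0.5, 0.3, 0.2])  # placeholder sampling weights for the three datasets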

Backbones

  • ResNet-50, -101, -152 (CVPR'2016)
  • ResNeXt (CVPR'2017)
  • HRNet (CVPR'2019)
  • EfficientNet (ICML'2019)
  • ViT (ICLR'2021)
  • Swin (ICCV'2021)
  • Twins (NeurIPS'2021)

Please refer to backbone.md for training configs and results.
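
To give a sense of how a backbone is swapped, the fragment below follows the usual mm-style convention of configuring the encoder through model.backbone; replacing this dict (for example with an HRNet or Twins-SVT config) switches the backbone. Field values are placeholders; the backbone configs actually benchmarked are in backbone.md.

# Illustrative sketch: the backbone is swapped by replacing model['backbone'].
# Values are placeholders; see backbone.md for the configs used in the benchmarks.
model = dict(
    type='ImageBodyModelEstimator',   # HMR-style estimator in MMHuman3D
    backbone=dict(
        type='ResNet',                # e.g. replaced by an HRNet / Twins-SVT dict
        depth=50,
        out_indices=[3],
        norm_cfg=dict(type='BN', requires_grad=True)),
    # head, body model and losses stay as in the baseline HMR config
)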

Backbone-initialisation

We find that transferring knowledge from a pose estimation model gives more competitive performance.

Initialised backbones:

  1. ResNet-50 ImageNet (default)
  2. ResNet-50 MPII
  3. ResNet-50 COCO
  4. HRNet-W32 ImageNet
  5. HRNet-W32 MPII
  6. HRNet-W32 COCO
  7. Twins-SVT ImageNet
  8. Twins-SVT MPII
  9. Twins-SVT COCO

Please refer to backbone.md for training configs and results.
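
The sketch below shows the general idea, assuming the standard mmcv init_cfg mechanism for loading pretrained weights into the backbone; the checkpoint path is hypothetical, and the actual ImageNet/MPII/COCO checkpoints used are listed in backbone.md.

# Minimal sketch of initialising the backbone from a pose-estimation checkpoint
# instead of ImageNet, via the standard mmcv `init_cfg`. The path is a placeholder.
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='pretrained/resnet50_coco_pose.pth')))  # hypothetical path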

Augmentations

New augmentations:

  1. Coarse dropout
  2. Grid dropout
  3. Photometric distortion
  4. Random crop
  5. Hard erasing
  6. Soft erasing
  7. Self-mixing
  8. Synthetic occlusion
  9. Synthetic occlusion over keypoints

Please refer to augmentation.md for training configs and results.
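
As a rough sketch of where these augmentations plug in, the fragment below inserts one transform into an HMR-style training pipeline. The transform name and its argument are hypothetical placeholders rather than the registered names; see augmentation.md for the actual transforms and settings.

# Illustrative training pipeline with one extra augmentation inserted.
# 'SyntheticOcclusion' and its argument are hypothetical placeholders.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='GetRandomScaleRotation', rot_factor=30, scale_factor=0.25),
    dict(type='SyntheticOcclusion', occluders_file='occluders.npy'),  # placeholder
    dict(type='MeshAffine', img_res=224),
    # ... normalisation, formatting and collection as in the baseline config
]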

Losses

We find that training with L1 loss gives more competitive performance. Please refer to mixed-datasets-l1.md for training configs and results.
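
As a minimal sketch, assuming the loss modules registered in MMHuman3D, switching to L1 amounts to changing the loss types on the estimator; the weights below are placeholders, and the exact settings are in mixed-datasets-l1.md.

# Minimal sketch: keypoint losses switched to L1 (weights are placeholders).
model = dict(
    loss_keypoints3d=dict(type='L1Loss', loss_weight=100),
    loss_keypoints2d=dict(type='L1Loss', loss_weight=10))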

Downloads

We provide trained models from the optimal configurations for download and inference. Please refer to combine.md for training configs and results.

Dataset | Backbone | 3DPW PA-MPJPE (mm) | Download
H36M, MI, COCO, LSP, LSPET, MPII | ResNet-50 | 51.66 | model
H36M, MI, COCO, LSP, LSPET, MPII | HRNet-W32 | 49.18 | model
H36M, MI, COCO, LSP, LSPET, MPII | Twins-SVT | 48.77 | model
H36M, MI, COCO, LSP, LSPET, MPII | Twins-SVT | 47.70 | model
EFT-[COCO, LSPET, MPII], H36M, SPIN-MI | HRNet-W32 | 47.68 | model
EFT-[COCO, LSPET, MPII], H36M, SPIN-MI | Twins-SVT | 47.31 | model
H36M, MI, EFT-COCO | HRNet-W32 | 48.08 | model
H36M, MI, EFT-COCO | Twins-SVT | 48.27 | model
H36M, MuCo, EFT-COCO | Twins-SVT | 47.92 | model

Algorithms

We validate our major findings on several algorithms and hope to add more in the future. Please refer to algorithms.md for training configs and logs.

  1. SPIN
  2. GraphCMR
  3. PARE
  4. Mesh Graphormer

Installation

General set-up instructions follow those of MMHuman3D. Please refer to install.md for installation.

Train

Training with a single / multiple GPUs

python tools/train.py ${CONFIG_FILE} ${WORK_DIR} --no-validate

Example: using 1 GPU to train HMR.

python tools/train.py ${CONFIG_FILE} ${WORK_DIR} --gpus 1 --no-validate

Training with Slurm

If you can run MMHuman3D on a cluster managed with slurm, you can use the script slurm_train.sh.

./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} ${GPU_NUM} --no-validate

Common optional arguments include:

  • --resume-from ${CHECKPOINT_FILE}: Resume from a previous checkpoint file.
  • --no-validate: Do not evaluate the checkpoint during training.

Example: using 8 GPUs to train HMR on a slurm cluster.

./tools/slurm_train.sh my_partition my_job configs/hmr/resnet50_hmr_pw3d.py work_dirs/hmr 8 --no-validate

You can check slurm_train.sh for full arguments and environment variables.

Evaluation

There are five benchmarks for evaluation:

  • 3DPW-test (P2)
  • H36M-test (P2)
  • EFT-COCO-val
  • EFT-LSPET-test
  • EFT-OCHuman-test

Evaluate with a single GPU / multiple GPUs

python tools/test.py ${CONFIG} --work-dir=${WORK_DIR} ${CHECKPOINT} --metrics=${METRICS}

Example:

python tools/test.py configs/hmr/resnet50_hmr_pw3d.py --work-dir=work_dirs/hmr work_dirs/hmr/latest.pth --metrics pa-mpjpe mpjpe

Evaluate with slurm

If you can run MMHuman3D on a cluster managed with slurm, you can use the script slurm_test.sh.

./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG} ${WORK_DIR} ${CHECKPOINT} --metrics ${METRICS}

Example:

./tools/slurm_test.sh my_partition test_hmr configs/hmr/resnet50_hmr_pw3d.py work_dirs/hmr work_dirs/hmr/latest.pth 8 --metrics pa-mpjpe mpjpe

FLOPs

tools/get_flops.py is a script adapted from flops-counter.pytorch and MMDetection to compute the FLOPs and params of a given model.

python tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]

You will get results like this:

==============================
Input shape: (3, 1280, 800)
Flops: 239.32 GFLOPs
Params: 37.74 M
==============================

Note: This tool is still experimental and we do not guarantee that the number is absolutely correct. You may use the result for simple comparisons, but double-check it before adopting it in technical reports or papers.

  1. FLOPs are related to the input shape while parameters are not. The default input shape is (1, 3, 224, 224).
  2. Some operators are not counted into FLOPs like GN and custom operators. Refer to mmcv.cnn.get_model_complexity_info() for details.

Citation

If you find our work useful for your research, please consider citing the paper:

@inproceedings{pang2022benchmarking,
  title={Benchmarking and Analyzing 3D Human Pose and Shape Estimation Beyond Algorithms},
  author={Pang, Hui En and Cai, Zhongang and Yang, Lei and Zhang, Tianwei and Liu, Ziwei},
  booktitle={NeurIPS},
  year={2022}
}

License

Distributed under the S-Lab License. See LICENSE for more information.

Acknowledgements

This study is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).