3D Human Pose Estimation with Spatial and Temporal Transformers

This repo is the official implementation of CrossFormer: Cross Spatio-Temporal Transformer for 3D Human Pose Estimation.

Our code is built on top of VideoPose3D.

Environment

The code was developed and tested in the following environment:

Python 3.8.2
PyTorch 1.7.1
CUDA 11.0

You can create the environment with conda:

conda env create -f crossformer.yml

Dataset

Our code is compatible with the dataset setup introduced by Martinez et al. and Pavllo et al. Please refer to VideoPose3D to set up the Human3.6M dataset (./data directory).
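Before training or evaluating, it can help to confirm that the dataset files are where the script will look for them. The following is a minimal sketch, not part of this repo: the helper name `missing_dataset_files` is hypothetical, and the file names assume the VideoPose3D convention (`data_3d_h36m.npz` plus `data_2d_h36m_<keypoints>.npz`), which should be checked against the VideoPose3D dataset instructions.

```python
import os


def missing_dataset_files(data_dir, keypoints):
    """Return the expected dataset files that are absent from data_dir.

    The file names are an assumption based on the VideoPose3D layout;
    `keypoints` is the value passed to -k (e.g. "cpn_ft_h36m_dbb" or "gt").
    """
    expected = [
        "data_3d_h36m.npz",               # 3D ground-truth poses (assumed name)
        f"data_2d_h36m_{keypoints}.npz",  # 2D input poses (assumed name)
    ]
    return [
        name for name in expected
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_dataset_files("data", "cpn_ft_h36m_dbb")
    if missing:
        print("Missing files in ./data:", ", ".join(missing))
    else:
        print("All expected dataset files found.")
```

An empty list means both files were found under the given directory.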

Evaluating pre-trained models

We provide the pre-trained 81-frame model (CPN detected 2D pose as input) here. To evaluate it, put it into the ./checkpoint directory and run:

python run_crossformer.py -k cpn_ft_h36m_dbb -f 81 -c checkpoint --evaluate best_epoch44.4.bin

We also provide a pre-trained 81-frame model (ground truth 2D pose as input) here. To evaluate it, put it into the ./checkpoint directory and run:

python run_crossformer.py -k gt -f 81 -c checkpoint --evaluate best_epoch_gt_28.5.bin

Training new models

To train a model from scratch (CPN detected 2D pose as input), run:

python run_crossformer.py -k cpn_ft_h36m_dbb -f 27 -lr 0.00004 -lrd 0.99

To train a model from scratch (ground truth 2D pose as input), run:

python run_crossformer.py -k gt -f 81 -lr 0.0004 -lrd 0.99
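To get a feel for what the -lr and -lrd values in the commands above imply, here is a small sketch of the resulting learning-rate schedule. It assumes the flags behave as in VideoPose3D, where the learning rate is multiplied by the decay factor once per epoch; the function name `lr_schedule` is hypothetical and not part of this repo.

```python
def lr_schedule(initial_lr, decay, num_epochs):
    """Learning rate at the start of each epoch under per-epoch
    exponential decay (assumed behaviour of -lr / -lrd)."""
    return [initial_lr * decay ** epoch for epoch in range(num_epochs)]


if __name__ == "__main__":
    # Values taken from the ground-truth training command: -lr 0.0004 -lrd 0.99
    for epoch, lr in enumerate(lr_schedule(0.0004, 0.99, 5)):
        print(f"epoch {epoch}: lr = {lr:.8f}")
```

With a decay of 0.99 the rate shrinks slowly: under this assumption it takes roughly 69 epochs to halve.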

With 81 input frames and ground truth 2D poses, the model achieves an MPJPE of 28.5 mm.
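MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joint positions. The sketch below illustrates the metric itself in plain Python; it is not the evaluation code used by this repo, the function name `mpjpe` is hypothetical, and it assumes both inputs are already expressed in the same (for example root-relative) coordinate frame and in millimetres.

```python
import math


def mpjpe(predicted, target):
    """Mean Euclidean distance over all frames and joints.

    Both arguments are sequences of frames; each frame is a sequence
    of (x, y, z) joint positions in the same units.
    """
    total = 0.0
    count = 0
    for pred_frame, gt_frame in zip(predicted, target):
        for pred_joint, gt_joint in zip(pred_frame, gt_frame):
            total += math.dist(pred_joint, gt_joint)  # requires Python 3.8+
            count += 1
    if count == 0:
        raise ValueError("mpjpe() needs at least one joint")
    return total / count


if __name__ == "__main__":
    # One frame, two joints: errors of 0 mm and 5 mm average to 2.5 mm.
    pred = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
    gt = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
    print(mpjpe(pred, gt))
```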

Visualization and other functions

We keep our code consistent with VideoPose3D. Please refer to their project page for further information.

Acknowledgement

Part of our code is borrowed from VideoPose3D. We thank the authors for releasing their code.
