HolisticTracker is a 3-stage expressive human body tracking pipeline that estimates SMPL-X + FLAME mesh parameters from monocular video. It was developed for building the TEDWB1k dataset and training HolisticAvatar.
The pipeline adopts the Expressive Human Model (EHM) representation from GUAVA, which jointly optimizes SMPL-X body and FLAME head parameters in a unified mesh. The tracker extends this with multi-stage optimization and temporal smoothness for robust video-level tracking.
- Track Base (`infer_ehmx_track_base.py`): Per-frame perception: body bounding box, Sapiens-1B 133-keypoint detection, HaMeR for per-hand MANO regression, MediaPipe FaceMesh for 478-point face landmarks, plus 70- and 203-point face landmark models. SMPL-X is initialized with PIXIE, and face/hand crops are computed from the keypoints.
- FLAME Refinement (`infer_ehmx_flame.py`): Refines head, jaw, expression, eye, and eyelid parameters using FLAME-specific landmarks and pixel3dmm for dense face alignment.
- SMPL-X Optimization (`infer_ehmx_smplx.py`): Jointly optimizes full-body SMPL-X parameters (body, hands, expression) with per-frame and temporal losses, keeping the result consistent with the FLAME face fit.
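Chained per video, the three stages reduce to a simple composition. A minimal sketch with placeholder stage functions (the function names below are illustrative stand-ins for the three scripts, which the real pipeline runs as separate processes):

```python
# Illustrative composition of the three stages described above.
# Each stage function is a hypothetical stand-in for its script.

def track_base(frames):
    # Stage 1: per-frame perception and SMPL-X initialization.
    return [{"frame_idx": i, "smplx_init": True} for i, _ in enumerate(frames)]

def refine_flame(track):
    # Stage 2: per-frame head/jaw/expression/eye refinement.
    for rec in track:
        rec["flame"] = "refined"
    return track

def optimize_smplx(track):
    # Stage 3: joint full-body optimization with temporal losses.
    for rec in track:
        rec["smplx"] = "optimized"
    return track

def run_pipeline(frames):
    return optimize_smplx(refine_flame(track_base(frames)))
```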
- Joint SMPL-X + FLAME body model (EHM) with hand/head scale optimization
- Sapiens-based face and body landmark detection
- PIXIE body estimation for robust initialization
- Per-shot temporal smoothness regularization
- Multi-GPU parallel processing with configurable workers per GPU
- Dataset merging with train/val/test split support
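The temporal smoothness term is, in its simplest form, a first-order difference penalty on per-frame parameter sequences. A minimal sketch of such a loss (the actual terms and weights used in the repo may differ):

```python
def temporal_smoothness_loss(seq):
    """First-order difference penalty over a sequence of per-frame
    parameter vectors (e.g. pose or expression coefficients).

    Penalizes frame-to-frame jitter; illustrative only, the repo's
    actual smoothness weighting may differ.
    """
    total, count = 0.0, 0
    for prev, cur in zip(seq, seq[1:]):
        for a, b in zip(prev, cur):
            total += (b - a) ** 2
            count += 1
    return total / max(count, 1)
```

A perfectly static sequence incurs zero penalty, while jittery parameters are pushed toward their temporal neighbors.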
Tested on Ubuntu 20.04 with CUDA 11.8+ and Python 3.10+.
```bash
git clone https://github.com/initialneil/HolisticTracker.git
cd HolisticTracker
conda create -n holistic_tracker python=3.10
conda activate holistic_tracker
pip install -r requirements.txt
pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.7.7"
```

Place body model files under `data/body_models/`:

- SMPL-X: download `SMPLX_NEUTRAL_2020.npz` from the SMPL-X website
- FLAME: download `generic_model.pkl` from FLAME2020
Download the pretrained weights and extract them to `pretrained/`.
```bash
export PYTHONPATH='.'
python tracking_video.py \
    --in_root /path/to/videos \
    --output_dir /path/to/output \
    --save_vis_video --save_images \
    --body_estimator_type pixie \
    --check_hand_score 0.0 -n 1 -v 0
```

For multi-GPU batch processing:

```bash
export PYTHONPATH='.'
python infer_ehmx_parallel.py \
    --video_root /path/to/videos \
    --output_dir /path/to/output \
    --distribute 0,1,2,3,4,5,6,7 \
    --workers_per_gpu 2 \
    --body_estimator_type pixie
```

After tracking, merge results into a training-ready format:
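With `--distribute` taking a comma-separated GPU list and `--workers_per_gpu` spawning several workers per device, the scheduling amounts to distributing videos across (gpu, worker) slots. A round-robin sketch (illustrative only; the repo's actual scheduler may batch differently):

```python
def assign_videos(videos, gpu_ids, workers_per_gpu):
    """Round-robin videos across (gpu, worker) slots.

    Returns {(gpu_id, worker_idx): [video, ...]}.
    Illustrative sketch, not the repo's actual scheduler.
    """
    slots = [(g, w) for g in gpu_ids for w in range(workers_per_gpu)]
    buckets = {slot: [] for slot in slots}
    for i, video in enumerate(videos):
        # Cycle through slots so load stays balanced across GPUs.
        buckets[slots[i % len(slots)]].append(video)
    return buckets
```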
```bash
python merge_ehmx_dataset.py \
    --dataset_dir /path/to/tracked_output \
    --test_list /path/to/test.txt \
    --images_dir /path/to/source_images \
    --mattes_dir /path/to/alpha_mattes
```

This produces `optim_tracking_ehm.pkl`, `id_share_params.pkl`, `videos_info.json`, `dataset_frames.json`, and `extra_info.json` for use with HolisticAvatar's TrackedData loader.
This project is released under the MIT License.