Humanoid-GPT is the first GPT-style humanoid motion Transformer trained with causal attention on a billion-scale motion corpus for whole-body control. Unlike prior shallow MLP trackers constrained by scarce data and an agility–generalization trade-off, Humanoid-GPT is pre-trained on a 2B-frame retargeted corpus that unifies all major mocap datasets with large-scale in-house recordings.
🔬 Key Contributions
- Billion-Scale Pre-Training: First to scale humanoid motion learning to 2B frames
- GPT-Style Architecture: Causal Transformer with Rotary Position Embeddings (RoPE)
- Zero-Shot Generalization: Track arbitrary unseen motions without fine-tuning
| Feature | Description |
|---|---|
| 🧠 Architecture | Causal Transformer with RoPE, supporting variable-length motion sequences |
| 📊 Scale | Pre-trained on 2B motion frames from unified mocap datasets |
| 🎯 Zero-Shot | Generalizes to unseen motions and tasks without fine-tuning |
| 🤖 Platform | Optimized for Unitree G1 humanoid robot (29 DOF whole-body) |
| ⚡ Speed | GPU-accelerated simulation with MuJoCo-MJX |
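To make the "Causal Transformer with RoPE" row concrete, here is a minimal NumPy sketch of single-head causal attention with rotary position embeddings. It is an illustration of the general technique, not the released model code: the function names, the frequency base of 10000, the half-split pairing of features, and the toy shapes are all assumptions.

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per feature pair (base value is an assumed default).
    inv_freq = base ** (-np.arange(half) / half)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def causal_attention(q, k, v):
    """Single-head attention in which frame t attends only to frames <= t."""
    seq_len, dim = q.shape
    scores = rope(q) @ rope(k).T / np.sqrt(dim)
    # True above the diagonal marks future frames, which are masked out.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy input: 6 motion frames with 8 features each (hypothetical sizes).
rng = np.random.default_rng(0)
frames = rng.standard_normal((6, 8))
out, weights = causal_attention(frames, frames, frames)
```

Because both the rotation angles and the mask are derived from the sequence length at call time, the same functions accept clips of any length, which is one way a model of this kind can support variable-length motion sequences.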
Release status (✅ marks items already released; the others are planned for a future release):
- ✅ Inference & deployment code.
- ✅ Pre-trained model checkpoints ([storage/ckpts/pns_wo_priv216.onnx](storage/ckpts/pns_wo_priv216.onnx)).
- Training code.
- Training data.
- NVIDIA GPU with CUDA 12.x
- macOS is also supported for testing if you skip jax[cuda12] and use mjpython (e.g. mjpython -m scripts.app).
- Conda / Miniconda
git clone https://github.com/GalaxyGeneralRobotics/Humanoid-GPT.git
cd Humanoid-GPT
conda create -n h-gpt python=3.12 -y
conda activate h-gpt
pip install -e ".[cuda]" # or ".[cpu]" on macOS, or "." for real-robot deploy only

On macOS, use mjpython instead of python for the MuJoCo viewer (e.g. mjpython -m scripts.app).
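If you wrap these commands in your own tooling, the macOS rule above can be encoded once. The helper below is a hypothetical convenience, not part of the repository; it only picks the launcher name and builds the argument list.

```python
import platform

def module_command(module, system=None):
    """Build the argv for running a module, using mjpython on macOS."""
    # platform.system() returns "Darwin" on macOS.
    system = platform.system() if system is None else system
    launcher = "mjpython" if system == "Darwin" else "python"
    return [launcher, "-m", module]

# Example: the argv for the Gradio demo on the current machine.
demo_cmd = module_command("scripts.app")
```

The resulting list can be passed directly to `subprocess.run`.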
We support multiple Unitree G1 hardware versions via the G1_VERSION env var (default 5010). The asset folder storage/assets/unitree_g1_${G1_VERSION}/ is selected automatically:
G1_VERSION=5010 python -m scripts.inference ... # default: 5010

A pre-trained tracking policy (.onnx) and a sample trajectory under storage/test/ are all you need to get started.
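The G1_VERSION lookup described above amounts to a small path computation. The sketch below mirrors the documented behaviour (default 5010, folder storage/assets/unitree_g1_${G1_VERSION}/); the function name and its parameters are hypothetical, and the repository's own resolution code may differ.

```python
import os
from pathlib import Path

def g1_asset_dir(root="storage/assets", env=None):
    """Return the asset folder for the G1 version named by G1_VERSION."""
    env = os.environ if env is None else env
    # Fall back to the documented default when the variable is unset.
    version = env.get("G1_VERSION", "5010")
    return Path(root) / f"unitree_g1_{version}"

# Resolve the folder for the current environment and check that it exists.
asset_dir = g1_asset_dir()
has_assets = asset_dir.is_dir()
```

Checking `has_assets` before launching a script gives a clearer error than a failed model load when G1_VERSION names a version with no asset folder.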
# Interactive Gradio demo
python -m scripts.app
# Track a single motion / a folder of motions
python -m scripts.inference --load_path storage/ckpts/pns_wo_priv216.onnx --mocap_path storage/test
# Parallel evaluation over a folder of trajectories
python -m scripts.eval_parallel --load_path storage/ckpts/pns_wo_priv216.onnx \
--mocap_path storage/test --workers 32 --privileged
# Visualize a reference trajectory
python -m scripts.vis --mocap_path storage/test

The expected motion format is a .npz containing either qpos directly, or root_pos / root_rot / dof_pos arrays. To convert retargeted mocap into the keypoint representation the policy consumes:
python tracking/convert_qpos2kpt.py --mocap_npz <mocap_path.npz> --debug # single file (debug viz)
python tracking/convert_parallel.py --src_dir <in_dir> --save_dir <out_dir> --num_workers 32

Deployment on Unitree G1 is split into sub-modules under deploy/. Start with deploy/DEPLOY.md for install / SDK setup, then:
# Simulation
python -m deploy.play_track --track-dir storage/test
# Real robot
python -m deploy.play_track --real --net <nic_name>

- 🖥️ onboard_deploy/: on-board (Jetson Orin) deploy.
- 🖥️ onboard_deploy_wo_GMR/: on-board variant that streams retargeting from a host.
- ✋ brainco/: BrainCo dexterous-hand tracking variant.
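Before tracking or converting your own recordings, it can help to confirm which of the two motion layouts described earlier a file uses (qpos directly, or root_pos / root_rot / dof_pos). The following is a minimal sketch using NumPy; the function name is hypothetical, and it only inspects key names and array shapes without assuming any particular dimensions or rotation convention.

```python
import numpy as np

def motion_layout(path):
    """Report which documented layout an .npz motion file uses, with array shapes."""
    with np.load(path) as data:
        keys = set(data.files)
        if "qpos" in keys:
            return "qpos", {"qpos": data["qpos"].shape}
        parts = ("root_pos", "root_rot", "dof_pos")
        missing = [k for k in parts if k not in keys]
        if missing:
            raise KeyError(f"{path}: expected 'qpos' or {parts}, missing {missing}")
        return "components", {k: data[k].shape for k in parts}
```

Calling `motion_layout` on each file in a folder before a batch run surfaces malformed recordings early, rather than partway through a long conversion or evaluation job.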
Humanoid-GPT/
├── 📂 tracking/ # Inference core: constants, infer_utils, ONNX policy wrapper (policy.py),
│ # keypoint conversion (convert_qpos2kpt.py) and tracking metrics
├── 📂 scripts/ # inference.py · eval_parallel.py · vis.py · app.py (gradio demo)
├── 📂 deploy/ # Real-robot deployment — see deploy/DEPLOY.md
│ ├── onboard_deploy/ # On-board (Jetson) SSH deployment
│ ├── onboard_deploy_wo_GMR/ # On-board variant with host-side retargeting
│ └── brainco/ # BrainCo dexterous-hand tracking variant
├── 📂 projects/ # Optional side modules
│ ├── hme/ # Harmonic Motion Encoder (Periodic Autoencoder)
│ ├── gqs/ # General Quality Selection (physics + diversity scoring)
│ └── tracking_transformer/ # Transformer tracking policy (inference / deploy)
├── 📂 utils/ # MuJoCo / MJX simulation, transforms, video rendering
└── 📂 storage/ # Assets, configs, sample trajectory, released checkpoints
@article{humanoid-gpt26,
title = {Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking},
author = {Qi, Zekun and Chen, Xuchuan and Liu, Dairu and Lin, Chenghuai and Lian, Yunrui and
Liang, Sikai and Zhang, Zhikai and Guan, Yu and Wang, Jilong and Zhang, Wenyao and
Yu, Xinqiang and Wang, He and Yi, Li},
journal = {arXiv preprint arXiv:2606.03985},
year = {2026}
}

Licensed under Apache 2.0. Built on top of MuJoCo, Brax and the Unitree G1 platform.
