Chen Si1 · Yulin Liu1 · Bo Ai1 · Jianwen Xie2 · Rolandos Alexandros Potamias3 · Chuanxia Zheng4 · Hao Su1
1UC San Diego · 2Lambda, Inc · 3Imperial College London · 4Nanyang Technological University
AnyHand is a large-scale synthetic RGB-D dataset for 3D hand pose estimation, containing 2.5M single-hand and 4.1M hand-object interaction images with full geometric annotations (RGB, depth, mask, 3D pose/shape, camera intrinsics).
This repository releases fine-tuned checkpoints of HaMeR and WiLoR co-trained with AnyHand, which achieve consistent improvements on standard benchmarks (FreiHAND, HO-3D) and better generalization to out-of-domain scenes. More components are coming; see the roadmap below.
| Component | Status |
|---|---|
| Fine-tuned HaMeR & WiLoR checkpoints + unified AnyHandPredictor | ✅ Released |
| AnyHandNet-D | 🚧 Coming soon |
| AnyHand generation pipeline | 🚧 Coming soon |
| AnyHand dataset | 🚧 Coming soon |
We release improved checkpoints for both HaMeR and WiLoR, co-trained with AnyHand. These are drop-in replacements for the original checkpoints; no architecture changes are needed.
Both WiLoR and HaMeR are included as git submodules.
```bash
git clone --recurse-submodules https://github.com/chen-si-cs/AnyHand.git
cd AnyHand
```

If you already cloned without `--recurse-submodules`:

```bash
git submodule update --init --recursive
```

This populates `WiLoR/` (the WiLoR codebase) and `third_party/hamer/` (the HaMeR codebase).
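If you are unsure whether the submodules were actually fetched, a quick check helps before installing anything. The helper below is a hypothetical convenience, not part of the repo:

```python
from pathlib import Path

def submodules_populated(repo_root: str) -> list[str]:
    """Return the expected submodule directories that are missing or empty."""
    expected = ["WiLoR", "third_party/hamer"]
    missing = []
    for rel in expected:
        d = Path(repo_root) / rel
        # An unfetched submodule is either absent or an empty directory.
        if not d.is_dir() or not any(d.iterdir()):
            missing.append(rel)
    return missing
```

An empty return value means both codebases are in place; otherwise re-run `git submodule update --init --recursive`.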
```bash
conda create -n anyhand python=3.10 -y
conda activate anyhand
```

Install PyTorch (adjust the CUDA version to your system; see pytorch.org):

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```

Then run the preparation scripts for whichever backend(s) you need.

WiLoR only (recommended for most users):
```bash
bash scripts/prepare_wilor.sh
```

HaMeR only:

```bash
bash scripts/prepare_hamer.sh
```

Both:

```bash
bash scripts/prepare_wilor.sh
bash scripts/prepare_hamer.sh
```
Each script installs the corresponding Python package, downloads the AnyHand checkpoint, and prints a checklist of remaining manual steps.
WiLoR requires the MANO hand model, which must be downloaded manually due to its license.
- Register and download from the MANO website
- Unzip and place the right hand model at:
```
AnyHand/
└── mano_data/
    └── MANO_RIGHT.pkl
```
Note: By using MANO, you agree to the MANO license terms.
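Since this step is manual, it is easy to miss. A small pre-flight check like the one below (hypothetical helper, not shipped with the repo) confirms the model file is where the loaders expect it:

```python
from pathlib import Path

def check_mano(root: str = ".") -> Path:
    """Verify the manually-downloaded MANO right-hand model is in place."""
    path = Path(root) / "mano_data" / "MANO_RIGHT.pkl"
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found. Register on the MANO website, download the "
            "model archive, and place MANO_RIGHT.pkl at this path."
        )
    return path
```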
The prepare scripts above handle this automatically once you fill in
your HuggingFace username. See the scripts for manual wget alternatives.
After running, your layout will be:

```
pretrained_models/
├── anyhand_wilor.ckpt          ← AnyHand fine-tuned WiLoR
├── model_config_wilor.yaml     ← WiLoR config
├── detector.pt                 ← YOLO hand detector (shared by both)
└── hamer_ckpts/
    └── checkpoints/
        ├── anyhand_hamer.ckpt  ← AnyHand fine-tuned HaMeR
        └── model_config.yaml   ← HaMeR config
```
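To confirm the prepare scripts produced this layout, you can check for the expected files programmatically. This is an illustrative sketch (the function name is ours, not the repo's):

```python
from pathlib import Path

# Relative paths from the layout above.
EXPECTED = [
    "anyhand_wilor.ckpt",
    "model_config_wilor.yaml",
    "detector.pt",
    "hamer_ckpts/checkpoints/anyhand_hamer.ckpt",
    "hamer_ckpts/checkpoints/model_config.yaml",
]

def missing_artifacts(models_dir: str = "pretrained_models") -> list[str]:
    """Return the expected files that are absent under models_dir."""
    root = Path(models_dir)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```

If you only ran one prepare script, files belonging to the other backend will be reported as missing, which is expected.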
We provide AnyHandPredictor, a single class that wraps both models behind one consistent API. It always uses WiLoR's YOLO hand detector for bbox detection, then dispatches to whichever reconstruction backbone you choose.
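Conceptually, the wrapper runs one detection pass and fans the resulting boxes out to the chosen reconstruction backend(s). The schematic below illustrates that dispatch pattern only; the names and structure do not match the actual implementation:

```python
def run_pipeline(image, detect, backends, choice):
    """One shared detection pass, then per-backend reconstruction.

    detect:   callable image -> list of bounding boxes
    backends: dict of name -> callable (image, bbox) -> prediction
    choice:   'wilor', 'hamer', or 'both'
    """
    boxes = detect(image)  # single YOLO pass, shared by all backends
    names = list(backends) if choice == "both" else [choice]
    results = []
    for name in names:
        for box in boxes:
            # Each backend reconstructs from the same crop region,
            # so predictions are directly comparable per detection.
            results.append({"backend": name,
                            "bbox": box,
                            "pred": backends[name](image, box)})
    return results
```

Sharing the detector is what makes `backend='both'` cheap: detection runs once, and only the reconstruction step is duplicated.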
Predict
```python
from anyhand import AnyHandPredictor

# WiLoR backend (default)
predictor = AnyHandPredictor(backend='wilor')
hands = predictor.predict('path/to/image.jpg')
for hand in hands:
    print(f"{'Right' if hand.is_right else 'Left'} hand score={hand.score:.2f}")
    print(f"  MANO pose  : {hand.mano_pose.shape}")    # (48,)
    print(f"  MANO shape : {hand.mano_shape.shape}")   # (10,)
    print(f"  Vertices   : {hand.vertices.shape}")     # (778, 3)
    print(f"  Keypoints3D: {hand.keypoints_3d.shape}") # (21, 3)
    print(f"  Keypoints2D: {hand.keypoints_2d.shape}") # (21, 2)
    print(f"  Cam translation : {hand.cam_t}")         # (3,)

# HaMeR backend
predictor = AnyHandPredictor(backend='hamer')
hands = predictor.predict('path/to/image.jpg')

# Both at once: same bboxes, two sets of predictions;
# hand.backend == 'wilor' or 'hamer' tells you which is which
predictor = AnyHandPredictor(backend='both')
hands = predictor.predict('path/to/image.jpg')

# Batch of images
import cv2
imgs = [cv2.imread(p) for p in ['img1.jpg', 'img2.jpg']]
batch_results = predictor.predict(imgs)  # List[List[HandPrediction]]

# Override backend per call
hands = predictor.predict('photo.jpg', backend='wilor')
```

Render mesh overlay
```python
# Renders all detected hand meshes overlaid on the image.
# Returns a BGR uint8 numpy array ready for cv2.imwrite().
overlay = predictor.render_overlay('photo.jpg', hands)
cv2.imwrite('out.jpg', overlay)

# Custom mesh colour (float RGB in [0, 1])
overlay = predictor.render_overlay('photo.jpg', hands, mesh_color=(0.9, 0.4, 0.2))
```

Save per-hand .obj meshes
```python
# Saves <prefix>_0.obj, <prefix>_1.obj, … into out_dir/.
# Returns the list of absolute paths written.
paths = predictor.save_meshes(hands, out_dir='out/meshes', prefix='frame0042')
print(paths)
# ['…/out/meshes/frame0042_0.obj', '…/out/meshes/frame0042_1.obj']
```

Project 3D points to 2D pixels
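This is a standard pinhole projection: add the camera translation, divide by depth, scale by the focal length, and shift by the principal point. The NumPy sketch below illustrates the math only; it is not the library's code, and it assumes the principal point sits at the image centre:

```python
import numpy as np

def pinhole_project(points, cam_t, focal_length, img_size):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates."""
    pts = np.asarray(points, dtype=np.float32) + np.asarray(cam_t, dtype=np.float32)
    w, h = img_size
    cx, cy = w / 2.0, h / 2.0  # principal point assumed at image centre
    # Perspective divide by depth, then scale and shift into pixel space.
    x = focal_length * pts[:, 0] / pts[:, 2] + cx
    y = focal_length * pts[:, 1] / pts[:, 2] + cy
    return np.stack([x, y], axis=1)
```

A point on the optical axis lands exactly at the image centre, and lateral offsets shrink with depth, which is the behaviour you should see from the predictor's projected keypoints as well.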
```python
img = cv2.imread('photo.jpg')
for hand in hands:
    # Works on vertices (778, 3) or keypoints (21, 3): any (N, 3) array
    kpts_2d = AnyHandPredictor.project_3d_to_2d(
        hand.keypoints_3d,
        hand.cam_t,
        hand.focal_length,
        img_size=(img.shape[1], img.shape[0]),
    )  # (21, 2) float32 pixel coordinates
    for x, y in kpts_2d.astype(int):
        cv2.circle(img, (x, y), 4, (0, 255, 0), -1)
cv2.imwrite('keypoints.jpg', img)
```

Custom checkpoint paths
```python
predictor = AnyHandPredictor(
    backend        = 'wilor',
    wilor_ckpt     = '/path/to/anyhand_wilor.ckpt',
    wilor_cfg      = '/path/to/model_config_wilor.yaml',
    detector_pt    = '/path/to/detector.pt',
    device         = 'cuda:0',
    det_conf       = 0.4,  # YOLO detection confidence threshold
    rescale_factor = 2.0,  # bbox padding factor
    batch_size     = 32,
)
```

Command-line demo (WiLoR's original script, with the AnyHand checkpoint)
```bash
python WiLoR/demo.py \
    --img_folder demo_img \
    --out_folder demo_out \
    --checkpoint pretrained_models/anyhand_wilor.ckpt \
    --cfg pretrained_models/model_config_wilor.yaml \
    --save_mesh
```

Coming soon.
Coming soon.
Coming soon.
- AnyHand checkpoints: CC-BY-NC-ND
- WiLoR codebase: CC-BY-NC-ND
- MANO model: MANO license
- Detector (`detector.pt`): Ultralytics license
If you find AnyHand useful, please cite our work and the baselines:
@misc{si2026anyhand,
title = {AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation},
author = {Si, Chen and Liu, Yulin and Ai, Bo and Xie, Jianwen and
Potamias, Rolandos Alexandros and Zheng, Chuanxia and Su, Hao},
year = {2026},
eprint = {XXXX.XXXXX},
archivePrefix = {arXiv},
primaryClass = {cs.CV}
}
@misc{potamias2024wilor,
title={WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild},
author={Rolandos Alexandros Potamias and Jinglei Zhang and Jiankang Deng and Stefanos Zafeiriou},
year={2024},
eprint={2409.12259},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@inproceedings{pavlakos2024reconstructing,
title={Reconstructing Hands in 3D with Transformers},
author={Pavlakos, Georgios and Shan, Dandan and Radosavovic, Ilija and Kanazawa, Angjoo and Fouhey, David and Malik, Jitendra},
booktitle={CVPR},
year={2024}
}