Skip to content

KevinyWu/hug

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HUG teaser

Human Universal Grasping

Project Website arXiv Paper PDF Weights

Trained solely on real-world human grasping data, HUG generates diverse human hand grasps for any user-selected object in a single RGB-D image captured from a stereo camera. HUG works with any stereo camera, anywhere, out of the box.

HUG demo

🗓️ Release

  • Paper and website
  • Inference + visualization code
  • 1M-HUGs dataset (planned 2026/06/29)
  • HUG-Bench benchmark, assets + sim eval (planned 2026/06/29)
  • Training code (planned 2026/06/29)

📦 Installation

Tested on Ubuntu 22.04/24.04, CUDA 12.8, PyTorch 2.9.1, Python 3.10.

# 1) Environment
conda env create -f environment.yaml && conda activate hug
pip install torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu128
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.9.1+cu128.html
pip install --no-build-isolation git+https://github.com/mattloper/chumpy.git@580566e
pip install -e .

# 2) Download required assets listed below
  • MANO: Register → download and unzip the MANO models → copy contents of mano_v*_*/ to assets/mano/
  • DINOv2: Auto-downloads on first use
  • HUG weights: hf download kevinywu/hug hug_full.safetensors --local-dir checkpoints/

🚀 Usage

HUG predicts human grasps in MANO form for selected objects in the camera frame. Currently, only inference is supported. We provide sample inputs of one image from each scene in HUG-Bench.

CKPT=checkpoints/hug_full.safetensors
DATA=data/hug_bench/

# App: click an object to predict a grasp
# --save-pred saves each clicked prediction to $DATA/grasp_pred/
python -m hug.app --checkpoint-path "$CKPT" --dataset-path "$DATA" --save-pred

If predictions are saved with --save-pred, you can visualize them with:

python -m hug.visualize_predictions --dataset-path "$DATA"

Custom inputs

You can also run inference on your own captures. Put three files in one folder, we provide an example in data/custom/ for a ZED 2i output.

  • RGB: 8-bit image ("rgb.png"/"rgb.jpg"), any H×W, grayscale is also supported.
  • Depth: 16-bit single-channel PNG ("depth.png" in uint16), millimeter units, same H×W as RGB and registered to it. Use S2M2 to estimate depth for best results.
  • Intrinsics: text file ("intrinsics.txt") at the RGB resolution: either four numbers fx fy cx cy or a 3×3 K matrix. .npy/.json also accepted.
# Prepare inputs writes <stem>.pkl into the folder
python -m hug.prepare_inputs --dataset-path data/custom
python -m hug.app --checkpoint-path "$CKPT" --dataset-path data/custom --save-pred

📝 Citation

If you find our work useful, please consider citing our paper:

@article{wu2026hug,
  title={Human Universal Grasping},
  author={Kevin Yuanbo Wu and Tianxing Zhou and Isaac Tu and Billy Yan and Irmak Guzey and David Fouhey and Dandan Shan and Lerrel Pinto},
  journal={arXiv preprint arXiv:2606.17054},
  year={2026}
}

About

Official Implementation of "Human Universal Grasping".

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages