Human Universal Grasping

Trained solely on real-world human grasping data, HUG generates diverse human hand grasps for any user-selected object in a single RGB-D image captured from a stereo camera. HUG is designed to work out of the box with any stereo camera, in any environment.

🗓️ Release

Paper and website
Inference + visualization code
1M-HUGs dataset (planned 2026/06/29)
HUG-Bench benchmark, assets + sim eval (planned 2026/06/29)
Training code (planned 2026/06/29)

📦 Installation

Tested on Ubuntu 22.04/24.04, CUDA 12.8, PyTorch 2.9.1, Python 3.10.

# 1) Environment
conda env create -f environment.yaml && conda activate hug
pip install torch==2.9.1 torchvision==0.24.1 torchaudio==2.9.1 --index-url https://download.pytorch.org/whl/cu128
pip install torch-cluster -f https://data.pyg.org/whl/torch-2.9.1+cu128.html
pip install --no-build-isolation git+https://github.com/mattloper/chumpy.git@580566e
pip install -e .

# 2) Download required assets listed below

MANO: Register → download and unzip the MANO models → copy contents of mano_v*_*/ to assets/mano/
DINOv2: Auto-downloads on first use
HUG weights: hf download kevinywu/hug hug_full.safetensors --local-dir checkpoints/
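
Once the steps above are done, a quick sanity check can confirm that the packages and assets are where the commands expect them. The following is a minimal sketch, not part of the HUG codebase: `installed_version` and `check_setup` are hypothetical helper names, and the only paths used are the ones given above (`assets/mano/` and `checkpoints/hug_full.safetensors`).

```python
from __future__ import annotations

from importlib import metadata
from pathlib import Path


def installed_version(package: str) -> str | None:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_setup(root: str = ".") -> dict[str, bool]:
    """Report whether the assets listed above are in place under root."""
    base = Path(root)
    mano_dir = base / "assets" / "mano"
    weights = base / "checkpoints" / "hug_full.safetensors"
    return {
        # The MANO folder must exist and contain at least one entry.
        "mano_assets": mano_dir.is_dir() and any(mano_dir.iterdir()),
        "hug_weights": weights.is_file(),
    }


if __name__ == "__main__":
    for name in ("torch", "torchvision"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
    for item, ok in check_setup().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

Run it from the repository root. It only checks that files exist, not that their contents are valid.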

🚀 Usage

HUG predicts human grasps in MANO form for selected objects in the camera frame. Currently, only inference is supported. We provide sample inputs: one image from each scene in HUG-Bench.

CKPT=checkpoints/hug_full.safetensors
DATA=data/hug_bench/

# App: click an object to predict a grasp
# --save-pred saves each clicked prediction to $DATA/grasp_pred/
python -m hug.app --checkpoint-path "$CKPT" --dataset-path "$DATA" --save-pred

If predictions are saved with --save-pred, you can visualize them with:

python -m hug.visualize_predictions --dataset-path "$DATA"

Custom inputs

You can also run inference on your own captures. Put three files in one folder; an example from a ZED 2i capture is provided in data/custom/.

RGB: 8-bit image ("rgb.png"/"rgb.jpg") of any H×W; grayscale is also supported.
Depth: 16-bit single-channel PNG ("depth.png" in uint16), millimeter units, same H×W as RGB and registered to it. Use S2M2 to estimate depth for best results.
Intrinsics: text file ("intrinsics.txt") at the RGB resolution: either four numbers fx fy cx cy or a 3×3 K matrix. .npy/.json also accepted.
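
To illustrate how the three inputs fit together, here is a minimal sketch that reads either intrinsics layout and back-projects one pixel into a 3D point using the standard pinhole model. It is not HUG's own loader: `parse_intrinsics` and `backproject` are hypothetical names, comma-separated values are tolerated as an assumption, skew is ignored, and the x-right, y-down, z-forward axis convention is the common pinhole one rather than something the README confirms.

```python
from __future__ import annotations


def parse_intrinsics(text: str) -> tuple[float, float, float, float]:
    """Parse 'fx fy cx cy' or a row-major 3x3 K matrix into (fx, fy, cx, cy)."""
    values = [float(tok) for tok in text.replace(",", " ").split()]
    if len(values) == 4:
        fx, fy, cx, cy = values
    elif len(values) == 9:
        # K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] flattened row by row.
        fx, cx, fy, cy = values[0], values[2], values[4], values[5]
    else:
        raise ValueError(f"expected 4 or 9 numbers, got {len(values)}")
    return fx, fy, cx, cy


def backproject(u: float, v: float, depth_mm: float,
                intrinsics: tuple[float, float, float, float]) -> tuple[float, float, float]:
    """Map pixel (u, v) with depth in millimeters to a camera-frame point in meters."""
    fx, fy, cx, cy = intrinsics
    z = depth_mm / 1000.0  # depth.png stores millimeters
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```

For example, with a 3×3 matrix and a pixel 100 px to the right of the principal point at 0.5 m:

```python
k = parse_intrinsics("1000 0 640\n0 1000 360\n0 0 1")
print(backproject(740, 360, 500, k))  # (0.05, 0.0, 0.5)
```

This is why the intrinsics must match the RGB resolution and the depth must be registered to the RGB image: the same pixel coordinates index both.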

# prepare_inputs writes <stem>.pkl into the folder
python -m hug.prepare_inputs --dataset-path data/custom
python -m hug.app --checkpoint-path "$CKPT" --dataset-path data/custom --save-pred
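
Before running the commands above on a new capture, it can help to confirm that the folder matches the input format. The sketch below is a hypothetical pre-flight check, not part of HUG: it reads only the PNG header (the IHDR chunk defined by the PNG specification) to verify that depth.png is 16-bit grayscale and, when the RGB image is also a PNG, that both have the same resolution. It assumes that .npy/.json intrinsics keep the stem "intrinsics", which the README does not state.

```python
from __future__ import annotations

import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(path) -> tuple[int, int, int, int]:
    """Return (width, height, bit_depth, color_type) from a PNG's IHDR chunk."""
    with open(path, "rb") as f:
        # 8-byte signature + 4-byte length + 4-byte chunk type + first 10 IHDR bytes
        head = f.read(26)
    if len(head) < 26 or head[:8] != PNG_SIGNATURE or head[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG file")
    return struct.unpack(">IIBB", head[16:26])


def check_custom_folder(folder) -> list[str]:
    """List problems with a custom input folder; an empty list means it looks fine."""
    folder = Path(folder)
    problems = []
    rgb = next((folder / n for n in ("rgb.png", "rgb.jpg") if (folder / n).is_file()), None)
    if rgb is None:
        problems.append("missing rgb.png or rgb.jpg")
    # Assumption: alternative intrinsics formats share the stem "intrinsics".
    if not any(folder.glob("intrinsics.*")):
        problems.append("missing intrinsics file")
    depth = folder / "depth.png"
    if not depth.is_file():
        problems.append("missing depth.png")
        return problems
    width, height, bit_depth, color_type = read_png_header(depth)
    if bit_depth != 16 or color_type != 0:  # color type 0 means grayscale
        problems.append("depth.png is not a 16-bit single-channel PNG")
    if rgb is not None and rgb.suffix == ".png":
        rgb_w, rgb_h, _, _ = read_png_header(rgb)
        if (rgb_w, rgb_h) != (width, height):
            problems.append("rgb and depth resolutions differ")
    return problems
```

Calling `check_custom_folder("data/custom")` returns a list of human-readable problems. The check cannot tell whether depth values are really in millimeters or whether the depth is registered to the RGB image, so those still need to be verified against the camera's export settings.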

📝 Citation

If you find our work useful, please consider citing our paper:

@article{wu2026hug,
  title={Human Universal Grasping},
  author={Kevin Yuanbo Wu and Tianxing Zhou and Isaac Tu and Billy Yan and Irmak Guzey and David Fouhey and Dandan Shan and Lerrel Pinto},
  journal={arXiv preprint arXiv:2606.17054},
  year={2026}
}
