GraspGenX is a cross-embodiment grasp generation framework that produces high-quality 6-DOF grasps for any robot gripper — including novel grippers not seen during training. Unlike GraspGen, which trains a separate model per gripper, GraspGenX trains a single model conditioned on a gripper representation derived from the gripper's swept volume. This enables generalization across grippers with different morphologies, kinematics, and degrees of freedom. We release pretrained model checkpoints trained with a large-scale simulated grasp dataset, spanning over 2 billion grasps computed across 32 procedurally generated grippers in 6 kinematic families and 8000+ objects.
![]() |
![]() |
![]() |
![]() |
![]() |
- Release News
- Upcoming Features
- Installation
- Inference Demos
- End-to-End Demo And VLA Data Generation
- Integrating a New Gripper
- Agentic Workflows
- FAQ
- License
- Citation
- Contact
- [06/01/2026] Initial code and model release!
- Support for YAM gripper (feel free to contribute and make a PR!)
- DROID VLA pick-and-place data generation (feel free to contribute and make a PR!)
- Grasp Data Generation in Simulation (Newton)
- Full Training Dataset release
Choose your preferred installation method. For training, we recommend Docker. For inference, uv is sufficient.
| Method | Use Case | Complexity |
|---|---|---|
| Docker | Training + Inference | ⭐⭐⭐ |
| uv | Inference | ⭐ |
Install uv if you don't have it, then:
git clone https://github.com/NVlabs/GraspGenX.git && cd GraspGenX && uv syncNote:
pip install -e .also works inside a conda or Python virtual environment with python=3.10/3.11 .
Training has only been tested inside the Docker.
git clone https://github.com/NVlabs/GraspGenX.git && cd GraspGenX && bash docker/build.shThe GraspGenX model checkpoints are downloaded automatically on the first import from https://huggingface.co/adithyamurali/GraspGenXModel and cloned to <repo_root>/ext/graspgenx_checkpoints. If you have already cloned the checkpoints elsewhere, specify the path by setting the env variable:
export GRASPGENX_CHECKPOINT_DIR=/path/to/your/graspgenx_checkpointsEach gripper needs a corresponding config file (with sweep volume information etc.) and URDF as an input to GraspGenX. We have already curated such gripper meta data for a host of popular grippers (see gripper_descriptions). This package is needed at runtime for running inference on supported grippers. This package is automatically cloned on the first import to <repo_root>/ext/gripper_descriptions, but if you have downloaded it elsewhere you'll have to specify the following ENV variable:
export GRASPGENX_GRIPPER_CFG_DIR=/path/to/your/gripper_descriptionsVerify the assets can be imported properly:
uv run python -c "from graspgenx import get_gripper_descriptions_root; print(get_gripper_descriptions_root())"To list all the available grippers:
uv run python scripts/list_grippers.pyGraspGenX supports two input modalities:
- Partial point cloud observations — e.g. from a depth camera
- Object meshes —
.obj,.stl,.ply, or USD formats
The demo script generates grasp poses for segmented object point clouds and visualizes the results using viser. You can access the GUI by visiting http://localhost:8080 on a browser. Note that exact port may be different sometimes (it will be specified in the log output). Click Next to cycle through the objects. The gripper mesh for the top scoring grasp is visualized in blue.
uv run python scripts/demo_object_pc.py \
--sample_data_dir assets/sample_data/real_world \
--gripper_name robotiq_2f_85 \
--plot_top_mesh![]() |
![]() |
![]() |
![]() |
Replace --gripper_name with any supported gripper, e.g. franka_umi, franka_panda, robotiq_2f_85, robotiq_2f_140, unitree_g1, inspire_hand, surge_hand, barrett_hand, galaxea_g1, ezgripper. See Supported Grippers for the full list. By default the demo uses the GraspMoE planner (grasps sampled with diffusion ∪ Oriented-Bounding grasps, all scored by the discriminator). To fall back to the diffusion-only planner: --planner diffusion. To restrict the GraspMoE OBB sweep to the top face only: --moe_obb_density sparse (single centroid pose) or --moe_obb_density dense (positions along the OBB's longer XY axis, top-down only).
The scene point cloud demo loads full-scene point clouds (with per-object segmentation), runs GraspGenX inference on every segmented object, and visualizes the predicted grasps in the context of the full scene. Grasps that would collide with the rest of the scene are filtered out. You can access the GUI by visiting http://localhost:8080 on a browser.
uv run python scripts/demo_scene_pc.py \
--sample_data_dir assets/sample_data/real_world \
--gripper_name robotiq_2f_85![]() |
![]() |
![]() |
![]() |
By default, the model runs GraspMoE (Grasp Mixture-of-Experts) sampling with both the diffusion model or Oriented-Bounding-Box (OBB)-based heuristics. To use just the diffusion model, you can use --planner diffusion and to remove collision checking, you can add --no-filter_collisions.
The script samples points from the mesh surface, runs GraspGenX inference, and visualizes both the mesh and predicted grasps. You can access the GUI by visiting http://localhost:8080 on a browser.
uv run python scripts/demo_object_mesh.py \
--mesh_file assets/sample_data/object_mesh/banana.obj \
--mesh_scale 1.0 \
--gripper_name inspire_hand \
--grasp_threshold -1.0 --return_topk --topk_num_grasps 100 --plot_top_meshSupported mesh formats: .obj, .stl, .ply, and USD (.usd, .usda, .usdc, .usdz). USD files are loaded via scene_synthesizer. For headless/batch inference (no visualization), add --no-visualization --output_file /tmp/grasps.yml.
![]() |
![]() |
![]() |
Beyond predicting grasps, end2end/ wires GraspGenX into
a full closed-loop pick-and-place pipeline: GraspGenX grasps → cuRobo
collision-free motion planning → Newton/MuJoCo physics replay (gravity +
contacts) → MP4 render + USD export for IsaacSim / Omniverse. Four
example demos ship out of the box — a Franka Panda (single object → bin, and a
3-object clutter scene → bin) and a UR10e with two multi-finger hands
(arx_x5 pick-and-lift, surge_hand pick → bin).
This pipeline needs an extra dependency stack (cuRobo, Newton, MuJoCo, CoACD, USD, fcl):
uv sync --extra end2endSee end2end/README.md for the run command, config
layout, and visualization tools.
GraspGenX runs zero-shot on any gripper given a URDF + meshes. The only thing GraspGenX needs that a stock URDF does not provide is a config.json describing the open/close joint states and the swept-volume bounding boxes that condition the model. The interactive gripper config wizard generates that file (and copies the URDF/meshes into the right place) for you; everything else — visual mesh, collision mesh, point-cloud cache, etc. — is produced automatically downstream.
uv run python scripts/gripper_config_wizard.py \
--urdf /path/to/your/gripper.urdf \
--name <gripper_name> \
--port 8081For example, to onboard the right Sharpa hand from a sibling gripper_descriptions checkout:
uv run python scripts/gripper_config_wizard.py \
--urdf ../gripper_descriptions/gripper_descriptions/assets/sharpa/right_sharpa_wave.urdf \
--name sharpa_right \
--port 8081Open http://localhost:8081 and step through the GUI. There are six panels — the wizard prompts you in order and you click "Confirm" to advance:
- Align Base Frame. Rotate so
+Zis the approach axis (along the fingers) and+Xis the closing direction. - Confirm Open Configuration. Drive joint sliders to the fully-open pose.
- Annotate Open Sweep Volume. Drag the blue box to enclose the inner finger volume at the open state.
- Annotate Half-Open Sweep Volume. Set the closed pose, then drag the orange box to enclose the inner volume at the half-open state.
- Review Animation. Watch open ↔ close with both boxes overlaid; go back if anything looks off.
- Confirm and Save. Pick the gripper type (
parallel_2f,revolute_2f,revolute_3f) and symmetry flag, then save.
On save, the wizard writes a fresh directory at <gripper_descriptions_root>/assets/x_grippers/<name>/ (resolved via $GRASPGENX_GRIPPER_CFG_DIR, or the auto-cloned <repo>/ext/gripper_descriptions/ — see Gripper Descriptions Asset Repo) containing:
gripper.urdfandmeshes/— copied from your source URDF.config.json— the open/close joint states, fingertip, swept-volume boxes, gripper type, and base rotation you annotated in the GUI. This is the file the model conditions on.vis_mesh.obj— merged visual mesh of the gripper at the open pose.
The collision mesh and point-cloud caches needed at inference time are generated lazily on first use; you don't need to run anything else.
Spin up the descriptor viewer to sanity-check the saved config (animates open ↔ close with both sweep volumes overlaid):
uv run python scripts/vis_gripper_desc.py --gripper <gripper_name> --port 8081Then run inference against a point cloud as you would for any built-in gripper. Any of the demo scripts should work:
uv run python scripts/demo_object_pc.py \
--sample_data_dir assets/sample_data/real_world \
--gripper_name <gripper_name> \
--plot_top_meshIf you've onboarded a gripper that isn't already in gripper_descriptions, please consider opening a PR with the saved assets/x_grippers/<name>/ directory. The community benefits enormously from a growing library of curated grippers, and your config will let everyone else skip the wizard for that hand.
Procedural gripper meshes and URDFs are stored under assets/proc_grippers/, organized by kinematic family. Each gripper directory contains the URDF, collision meshes, and swept volume point cloud.
See mcp/ and client-server/.
The end2end takes a while longer and can be skipped:
uv run --no-sync pytest -m "not end2end"GraspGen trains a separate model per gripper. GraspGenX trains a single model that generalizes across grippers by conditioning on a swept-volume representation of the gripper. This means GraspGenX can generate grasps for novel grippers not seen during training, without re-training.
GraspGenX supports any gripper for which you can compute a swept volume. The released model was trained on 32 procedural grippers spanning 6 kinematic families (parallel 2-finger, revolute 2-finger, revolute 3-finger — each with two design variants). It generalizes zero-shot to multiple real-world grippers.
See the Integrating a New Gripper section. The interactive wizard generates config.json (and copies the URDF/meshes) for any gripper given just its URDF.
No. GraspGen and GraspGenX use different model architectures (per-gripper vs. cross-embodiment conditioning). You need GraspGenX-specific checkpoints.
Please post a GitHub issue and we will follow up, or email us directly.
This project is licensed under the Apache License 2.0. The model checkpoints are released under the NVIDIA Open Model License.
Please email us (admurali@nvidia.com) if you are using this repository for commerical deployment and require any support.
If you found this work useful, please consider citing:
@inproceedings{graspgenx2026,
title = {GraspGen-X: Cross-Embodiment 6-DOF Diffusion-based Grasping},
author = {Han, Beining and Chao, Yu-Wei and Coumans, Erwin and Eppner, Clemens and Sundaralingam, Balakumar and Deng, Jia and Birchfield, Stan and Murali, Adithyavairavan},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2026},
}
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
Please reach out to Adithya Murali (admurali@nvidia.com) for further enquiries.
















