Official repository for the paper "DragMesh: Interactive 3D Generation Made Easy".
Tianshan Zhang*, Zeyu Zhang*β , Hao Tang#
*Equal contribution. β Project lead. #Corresponding author.
Note
GAPartNet (https://pku-epic.github.io/GAPartNet/) is the canonical dataset source for all articulated assets used in DragMesh.
teaser.mp4
If you find DragMesh helpful, please cite:
While generative models have excelled at creating static 3D content, the pursuit of systems that understand how objects move and respond to interactions remains a fundamental challenge. Current methods for articulated motion lie at a crossroads: they are either physically consistent but too slow for real-time use, or generative but violate basic kinematic constraints. We present DragMesh, a robust framework for real-time interactive 3D articulation built around a lightweight motion generation core. Our core contribution is a novel decoupled kinematic reasoning and motion generation framework. First, we infer the latent joint parameters by decoupling semantic intent reasoning (which determines the joint type) from geometric regression (which determines the axis and origin using our Kinematics Prediction Network (KPP-Net)). Second, to leverage the compact, continuous, and singularity-free properties of dual quaternions for representing rigid body motion, we develop a novel Dual Quaternion VAE (DQ-VAE). This DQ-VAE receives these predicted priors, along with the original user drag, to generate a complete, plausible motion trajectory. To ensure strict adherence to kinematics, we inject the joint priors at every layer of the DQ-VAE's non-autoregressive Transformer decoder using FiLM (Feature-wise Linear Modulation) conditioning. This persistent, multi-scale guidance is complemented by a numerically-stable cross-product loss to guarantee axis alignment. This decoupled design allows DragMesh to achieve real-time performance and enables plausible, generative articulation on novel objects without retraining, offering a practical step toward generative 3D intelligence.
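For a concrete picture of two mechanisms named above, the sketch below shows FiLM conditioning driven by a joint-prior embedding and a cross-product axis penalty in plain PyTorch. It is illustrative only: tensor shapes and module boundaries are assumptions, and the actual implementations live in modules/model_v2.py and modules/loss.py.

```python
# Illustrative sketch only; the real layers live in modules/model_v2.py and the real
# objectives in modules/loss.py. Shapes and module boundaries here are assumptions.
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: scale/shift decoder features with joint priors."""
    def __init__(self, cond_dim: int, hidden_dim: int):
        super().__init__()
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * hidden_dim)

    def forward(self, h: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # h: (B, T, hidden) decoder features; cond: (B, cond_dim) joint-prior embedding.
        gamma, beta = self.to_gamma_beta(cond).chunk(2, dim=-1)
        return gamma.unsqueeze(1) * h + beta.unsqueeze(1)

def axis_alignment_loss(pred_axis: torch.Tensor, gt_axis: torch.Tensor, eps: float = 1e-8):
    # |a x b| = |sin(theta)| for unit vectors, so the penalty stays smooth near perfect
    # alignment instead of relying on acos(), whose gradient blows up there.
    pred = pred_axis / (pred_axis.norm(dim=-1, keepdim=True) + eps)
    gt = gt_axis / (gt_axis.norm(dim=-1, keepdim=True) + eps)
    return torch.cross(pred, gt, dim=-1).norm(dim=-1).mean()
```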
- Upload the DragMesh paper and project page.
- Release the training and inference code.
- Provide GAPartNet processing pipeline and LMDB builder.
- Share checkpoints on Hugging Face.
- Create an interactive presentation.
- Publish a Hugging Face Space for browser-based manipulation.
We ship a full Conda specification in environment.yml (environment name: dragmesh). It targets Python 3.10, CUDA 12.1, and PyTorch 2.4.1. Create or update via:
conda env create -f environment.yml
conda activate dragmesh
# or update an existing env
conda env update -f environment.yml --prune

The spec already installs trimesh, pyrender, pygltflib, viser, Objaverse, SAPIEN, pytorch3d, and tiny-cuda-nn. If you prefer a minimal setup, install those packages manually before running the scripts.
Chamfer distance kernels are required for the VAE loss. Clone and build the upstream project:
git clone https://github.com/ThibaultGROUEIX/ChamferDistancePytorch.git
cd ChamferDistancePytorch
python setup.py install
cd ..

- Visit https://pku-epic.github.io/GAPartNet/ and download the articulated assets for the categories listed in config/category_split_v2.json.
- Arrange files so that each object folder contains mobility_annotation_gapartnet.urdf, meta.json, and textured meshes (*.obj). Example:
  data/gapartnet/<object_id>/
  |- mobility_annotation_gapartnet.urdf
  |- meta.json
  |- textured_objs/*.obj
- Convert to LMDB for fast training IO:
  python utils/build_lmdb.py \
    --dataset_root data/gapartnet \
    --output_prefix data/dragmesh \
    --config config/category_split_v2.json \
    --num_frames 16 \
    --num_points 4096
  # Produces data/dragmesh_train.lmdb and data/dragmesh_val.lmdb
- Use utils/balanced_dataset_utils.get_motion_type_weights with WeightedRandomSampler if you need balanced revolute/prismatic sampling (see the sketch below).
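A minimal sketch of balanced sampling, assuming get_motion_type_weights returns one weight per sample and that the LMDB-backed dataset object has already been constructed (the exact class and helper signatures live in utils/balanced_dataset_utils.py):

```python
# Sketch only: exact dataset class and helper signature are in utils/balanced_dataset_utils.py.
from torch.utils.data import DataLoader, WeightedRandomSampler
from utils.balanced_dataset_utils import get_motion_type_weights

# `train_dataset` is assumed to be the LMDB-backed dataset built from data/dragmesh_train.lmdb.
weights = get_motion_type_weights(train_dataset)          # one weight per sample (assumed)
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
loader = DataLoader(train_dataset, batch_size=16, sampler=sampler, num_workers=4)
```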
python scripts/train_vae_v2.py \
--lmdb_train_path data/dragmesh_train.lmdb \
--lmdb_val_path data/dragmesh_val.lmdb \
--data_split_json_path config/category_split_v2.json \
--output_dir outputs/vae \
--num_epochs 300 \
--batch_size 16 \
--latent_dim 256 \
--num_frames 16 \
--mesh_recon_weight 10.0 \
--cd_weight 30.0 \
--kl_weight 0.001 \
--kl_anneal_epochs 80 \
--use_tensorboard --use_wandb

python scripts/train_predictor.py \
--lmdb_train_path data/dragmesh_train.lmdb \
--lmdb_val_path data/dragmesh_val.lmdb \
--data_split_json_path config/category_split_v2.json \
--output_dir outputs/kpp \
--batch_size 32 \
--num_epochs 200 \
--encoder_type attention \
--head_type decoupled \
--predict_type True

Both scripts log to TensorBoard and, optionally, Weights & Biases. Check modules/loss.py and modules/predictor_loss.py for objective details.
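One detail worth spelling out is how --kl_weight and --kl_anneal_epochs in the VAE command above interact. A common interpretation is a linear warm-up of the KL term toward its target weight; this is an assumption, not verified against scripts/train_vae_v2.py:

```python
# Assumed linear KL warm-up; verify against scripts/train_vae_v2.py before relying on it.
def kl_weight_at(epoch: int, target: float = 1e-3, anneal_epochs: int = 80) -> float:
    """Ramp the KL weight linearly from 0 to `target` over `anneal_epochs` epochs."""
    return target * min(1.0, epoch / max(1, anneal_epochs))
```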
python inference_animation.py \
--dataset_root data/gapartnet \
--checkpoint best_model.pth \
--sample_id 40261 \
--output_dir results_deterministic \
--num_samples 5 \
--num_frames 16

Outputs MP4, GIF, and animated GLB per object. If you plan to process a large dataset using dual-quaternion ground truth (no manual drags), prefer this script: running only KPP predictions frame-by-frame may introduce cumulative drift that eventually breaks physical alignment.
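The dual-quaternion ground truth mentioned above is built in modules/data_loader_v2.py. As a rough, self-contained illustration of the representation (helper names here are hypothetical, not the repository's API), a single revolute step defined by an axis, an origin, and an angle can be packed into a unit dual quaternion like this:

```python
# Illustrative only: how a revolute step (axis, origin, angle) maps to a dual quaternion.
# Helper names are hypothetical; see modules/data_loader_v2.py for the real label builder.
import numpy as np

def quat_mul(a, b):
    # Hamilton product, (w, x, y, z) convention.
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return np.array([
        aw*bw - ax*bx - ay*by - az*bz,
        aw*bx + ax*bw + ay*bz - az*by,
        aw*by - ax*bz + ay*bw + az*bx,
        aw*bz + ax*by - ay*bx + az*bw,
    ])

def revolute_to_dual_quaternion(axis, origin, angle):
    axis = axis / np.linalg.norm(axis)
    q_r = np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])  # real (rotation) part
    # Rotating about an axis through `origin` equals a rotation plus translation t = origin - R @ origin.
    w, x, y, z = q_r
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    t = origin - R @ origin
    q_d = 0.5 * quat_mul(np.concatenate([[0.0], t]), q_r)                  # dual (translation) part
    return q_r, q_d
```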
python inference_pipeline.py \
--mesh_file assets/cabinet.obj \
--mask_file assets/cabinet_vertex_labels.npy \
--mask_format vertex \
--drag_point 0.12,0.48,0.05 \
--drag_vector 0.0,0.0,0.2 \
--manual_joint_type revolute \
--kpp_checkpoint best_model_kpp.pth \
--vae_checkpoint best_model.pth \
--output_dir outputs/cabinet_demo \
--num_samples 3

In the example above, --drag_point is an x,y,z point on the movable part and --drag_vector encodes the direction and magnitude of the drag. Supply drag points/vectors directly through the CLI (no viewer UI). Use --manual_joint_type revolute or --manual_joint_type prismatic to force a specific motion family when needed. If you omit the manual override, the pipeline first trusts KPP-Net and, when --llm_endpoint and --llm_api_key are provided, falls back to the LLM-based classifier described in inference_pipeline.py. Outputs share the same MP4/GIF/GLB format as the batch pipeline.
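If you need to create the per-vertex mask yourself, a minimal sketch with trimesh and numpy might look like the following. The label convention expected by --mask_file (values, dtype) is an assumption here; check inference_pipeline.py for the exact format.

```python
# Sketch: build a per-vertex label array marking the movable part of a mesh.
# The exact values/dtype expected by --mask_file are assumptions; see inference_pipeline.py.
import numpy as np
import trimesh

mesh = trimesh.load("assets/cabinet.obj", force="mesh")
labels = np.zeros(len(mesh.vertices), dtype=np.int64)

# Example heuristic: mark every vertex inside an axis-aligned box around the door/drawer.
lo, hi = np.array([0.0, 0.3, -0.2]), np.array([0.4, 0.9, 0.3])   # hypothetical bounds
inside = np.all((mesh.vertices >= lo) & (mesh.vertices <= hi), axis=1)
labels[inside] = 1

np.save("assets/cabinet_vertex_labels.npy", labels)
```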
- GIF/MP4 exports rely on pyrender and imageio. For headless servers, set PYOPENGL_PLATFORM=osmesa (see the sketch below).
- inference_animation.py also exports animated GLB files for direct use in GLTF viewers.
- For additional visualization tooling (e.g., rerun or Blender scripts), see inference_animation.py and inference_pipeline.py.
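If you drive pyrender from your own Python wrapper instead of the shell, the platform override has to happen before the import. A minimal sketch, assuming OSMesa is installed:

```python
# Sketch for headless rendering: the platform must be selected before pyrender is imported.
import os
os.environ["PYOPENGL_PLATFORM"] = "osmesa"  # requires an OSMesa installation

import pyrender  # noqa: E402  (imported after setting the platform on purpose)
```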
| Scenario | Description |
|---|---|
| Drawer opening | Translational motion predicted entirely from drag cues. |
| Microwave door | Revolute joint inference with FiLM conditioned motion generation. |
| Bucket handle | High curvature rotations showing the benefit of dual quaternions. |
Translational drags: six demo clips (predicted_z_0.mp4 per object).

Rotational drags: six demo clips (predicted_z_0.mp4 per object).

Self-spin / free-spin: six demo clips (predicted_z_0.mp4 per object).
| Path | Content |
|---|---|
| modules/model_v2.py | Dual Quaternion VAE (encoder, decoder, FiLM Transformer). |
| modules/predictor.py | KPP-Net architecture. |
| modules/data_loader_v2.py | GAPartNet parsing and dual quaternion labels. |
| utils/balanced_dataset_utils.py | LMDB dataset builder and balanced sampling utilities. |
| scripts/train_vae_v2.py, scripts/train_predictor.py | Training entry points. |
| inference_animation*.py, inference_pipeline.py | Inference pipelines (batch and interactive). |
| ChamferDistancePytorch/ | CUDA kernels for Chamfer distance and auxiliary metrics. |
DragMesh/
├── assets/                        # Logos, teaser figures, future demo media
│   ├── dragmesh_logo.png
│   └── teaser.png
├── checkpoints/
│   ├── dqvae.pth
│   └── kpp.pth
├── ChamferDistancePytorch/        # CUDA/C++ Chamfer distance implementation (build with setup.py)
├── config/
│   └── category_split_v2.json     # GAPartNet in-domain split definition
├── modules/
│   ├── model_v2.py                # Dual Quaternion VAE architecture
│   ├── predictor.py               # KPP-Net for kinematic reasoning
│   ├── loss.py                    # VAE objectives (Chamfer, dual quaternions, constraints)
│   ├── predictor_loss.py          # Loss terms for KPP-Net
│   └── data_loader_v2.py          # GAPartNet loader + dual quaternion ground truth builder
├── scripts/
│   ├── train_vae_v2.py            # Training loop for the VAE motion prior
│   └── train_predictor.py         # Training loop for KPP-Net
├── utils/
│   ├── balanced_dataset_utils.py  # LMDB dataset class + balanced sampling helper
│   ├── dataset_utils.py           # Category-aware dataset wrappers
│   └── build_lmdb.py              # CLI to build LMDBs from GAPartNet folders
├── partnet/
│   └── Hunyuan3D-Part/            # External resources (P3-SAM, XPart docs)
├── results_deterministic/         # Placeholder for inference outputs (MP4/GIF/GLB)
├── inference_animation.py         # Batch evaluation + GLB export
├── inference_animation_kpp.py     # Dataset-driven animation tests (legacy interface)
├── inference_glb.py               # Helper for converting trajectories to GLB
├── inference_pipeline.py          # Interactive mesh manipulation pipeline
├── environment.yml                # Conda environment (name: dragmesh)
└── README.md
We thank the GAPartNet team for the articulated dataset, and upstream projects such as ChamferDistancePytorch, Objaverse, SAPIEN, and PyTorch3D for their open-source contributions.
