DragMesh: Interactive 3D Generation Made Easy

Official repository for the paper "DragMesh: Interactive 3D Generation Made Easy".

Tianshan Zhang*, Zeyu Zhang*†, Hao Tang#

*Equal contribution. †Project lead. #Corresponding author.

Paper (PDF) | Project Website | Hugging Face Models

Note

GAPartNet is the canonical dataset source for all articulated assets used in DragMesh; the download link is given in the Data Preparation section below.

teaser.mp4

🧾 Citation

If you find DragMesh helpful, please cite:


✨ Intro

While generative models have excelled at creating static 3D content, building systems that understand how objects move and respond to interaction remains a fundamental challenge. Current methods for articulated motion sit at a crossroads: they are either physically consistent but too slow for real-time use, or generative but prone to violating basic kinematic constraints. We present DragMesh, a robust framework for real-time interactive 3D articulation built around a lightweight motion generation core.

Our core contribution is a decoupled kinematic reasoning and motion generation framework. First, we infer the latent joint parameters by separating semantic intent reasoning (which determines the joint type) from geometric regression (which determines the axis and origin via our Kinematics Prediction Network, KPP-Net). Second, we develop a Dual Quaternion VAE (DQ-VAE) that exploits the compact, continuous, and singularity-free properties of dual quaternions for representing rigid-body motion. The DQ-VAE takes these predicted priors, together with the original user drag, and generates a complete, plausible motion trajectory.

To enforce strict adherence to kinematics, we inject the joint priors at every layer of the DQ-VAE's non-autoregressive Transformer decoder using FiLM (Feature-wise Linear Modulation) conditioning. This persistent, multi-scale guidance is complemented by a numerically stable cross-product loss that guarantees axis alignment. The decoupled design lets DragMesh run in real time and produce plausible, generative articulation on novel objects without retraining, offering a practical step toward generative 3D intelligence.
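
To make the FiLM conditioning described above concrete, here is a minimal sketch of a FiLM block: a joint-prior embedding predicts a per-channel scale and shift that modulate the decoder's token features at every layer. The module and tensor names are illustrative assumptions, not the repository's actual implementation.

import torch
import torch.nn as nn

class FiLMBlock(nn.Module):
    """Illustrative FiLM layer: modulate token features with a joint-prior embedding."""
    def __init__(self, feat_dim: int, cond_dim: int):
        super().__init__()
        # A single linear map predicts per-channel gamma (scale) and beta (shift).
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * feat_dim)

    def forward(self, tokens: torch.Tensor, joint_prior: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_frames, feat_dim); joint_prior: (batch, cond_dim)
        gamma, beta = self.to_gamma_beta(joint_prior).chunk(2, dim=-1)
        return gamma.unsqueeze(1) * tokens + beta.unsqueeze(1)

# Hypothetical usage inside each decoder layer: tokens = film(tokens, joint_prior)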

📰 News

✅ TODO

  • Upload the DragMesh paper and project page.
  • Release the training and inference code.
  • Provide GAPartNet processing pipeline and LMDB builder.
  • Share checkpoints on Hugging Face.
  • Create an interactive presentation.
  • Publish a Hugging Face Space for browser-based manipulation.

⚡ Quick Start

🧩 Environment Setup

We ship a full Conda specification in environment.yml (environment name: dragmesh). It targets Python 3.10, CUDA 12.1, and PyTorch 2.4.1. Create or update via:

conda env create -f environment.yml
conda activate dragmesh
# or update an existing env
conda env update -f environment.yml --prune

The spec installs trimesh, pyrender, pygltflib, viser, Objaverse, SAPIEN, pytorch3d, and tiny-cuda-nn. If you would rather not use the full Conda environment, install those packages manually before running the scripts.

🛠️ Native Extensions

Chamfer distance kernels are required for the VAE loss. Clone and build the upstream project:

git clone https://github.com/ThibaultGROUEIX/ChamferDistancePytorch.git
cd ChamferDistancePytorch
python setup.py install
cd ..
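
As a quick sanity check that the kernels built and load correctly, you can call them directly from Python. The import path below follows the upstream project's documented layout but may differ depending on how the extension installs into your environment.

import torch
# Import path from the upstream ChamferDistancePytorch repo (may vary by install).
from chamfer3D.dist_chamfer_3D import chamfer_3DDist

chamfer = chamfer_3DDist()
a = torch.rand(2, 4096, 3, device="cuda")  # e.g. reconstructed point cloud
b = torch.rand(2, 4096, 3, device="cuda")  # e.g. reference point cloud
dist_a, dist_b, idx_a, idx_b = chamfer(a, b)
print((dist_a.mean() + dist_b.mean()).item())  # symmetric Chamfer distance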

📦 Data Preparation (GAPartNet)

  1. Visit https://pku-epic.github.io/GAPartNet/ and download the articulated assets for the categories listed in config/category_split_v2.json.
  2. Arrange files so that each object folder contains mobility_annotation_gapartnet.urdf, meta.json, and textured meshes (*.obj). Example:
    data/gapartnet/<object_id>/
      |- mobility_annotation_gapartnet.urdf
      |- meta.json
      |- textured_objs/*.obj
    
  3. Convert to LMDB for fast training IO:
    python utils/build_lmdb.py \
      --dataset_root data/gapartnet \
      --output_prefix data/dragmesh \
      --config config/category_split_v2.json \
      --num_frames 16 \
      --num_points 4096
    # Produces data/dragmesh_train.lmdb and data/dragmesh_val.lmdb
  4. Use utils/balanced_dataset_utils.get_motion_type_weights with WeightedRandomSampler if you need balanced revolute/prismatic sampling; a usage sketch follows this list.
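
A minimal sketch of step 4, assuming get_motion_type_weights returns one sampling weight per dataset item; the dataset construction itself is elided here.

from torch.utils.data import DataLoader, WeightedRandomSampler
from utils.balanced_dataset_utils import get_motion_type_weights

# `train_dataset` is the LMDB-backed dataset built above (construction elided).
weights = get_motion_type_weights(train_dataset)  # assumed: one weight per sample
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
train_loader = DataLoader(train_dataset, batch_size=16, sampler=sampler)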

🧠 Training

Dual Quaternion VAE

python scripts/train_vae_v2.py \
  --lmdb_train_path data/dragmesh_train.lmdb \
  --lmdb_val_path data/dragmesh_val.lmdb \
  --data_split_json_path config/category_split_v2.json \
  --output_dir outputs/vae \
  --num_epochs 300 \
  --batch_size 16 \
  --latent_dim 256 \
  --num_frames 16 \
  --mesh_recon_weight 10.0 \
  --cd_weight 30.0 \
  --kl_weight 0.001 \
  --kl_anneal_epochs 80 \
  --use_tensorboard --use_wandb

Kinematics Prediction Pipeline (KPP-Net)

python scripts/train_predictor.py \
  --lmdb_train_path data/dragmesh_train.lmdb \
  --lmdb_val_path data/dragmesh_val.lmdb \
  --data_split_json_path config/category_split_v2.json \
  --output_dir outputs/kpp \
  --batch_size 32 \
  --num_epochs 200 \
  --encoder_type attention \
  --head_type decoupled \
  --predict_type True

Both scripts log to TensorBoard and optionally Weights & Biases. Check modules/loss.py and modules/predictor_loss.py for objective details.
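
As one concrete example of those objectives, the cross-product axis-alignment term mentioned in the intro can be sketched as follows. This is an illustrative reconstruction, not a copy of modules/loss.py.

import torch
import torch.nn.functional as F

def axis_alignment_loss(pred_axis, gt_axis, eps=1e-8):
    """Penalize the sine of the angle between predicted and ground-truth joint axes."""
    pred = F.normalize(pred_axis, dim=-1, eps=eps)
    gt = F.normalize(gt_axis, dim=-1, eps=eps)
    # ||pred x gt||^2 = sin^2(theta): smooth and numerically stable near alignment.
    return torch.linalg.cross(pred, gt, dim=-1).pow(2).sum(dim=-1).mean()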

🧪 Inference

Batch Sweep (dataset mode)

python inference_animation.py \
  --dataset_root data/gapartnet \
  --checkpoint best_model.pth \
  --sample_id 40261 \
  --output_dir results_deterministic \
  --num_samples 5 \
  --num_frames 16

Outputs MP4, GIF, and animated GLB files per object. If you are processing a large dataset with dual-quaternion ground truth (no manual drags), prefer this script: running KPP predictions frame by frame can accumulate drift that eventually breaks physical alignment.

Custom Mesh Manipulation (manual input)

python inference_pipeline.py \
  --mesh_file assets/cabinet.obj \
  --mask_file assets/cabinet_vertex_labels.npy \
  --mask_format vertex \
  --drag_point 0.12,0.48,0.05 \
  --drag_vector 0.0,0.0,0.2 \
  --manual_joint_type revolute \
  --kpp_checkpoint best_model_kpp.pth \
  --vae_checkpoint best_model.pth \
  --output_dir outputs/cabinet_demo \
  --num_samples 3

Supply the drag point (an x,y,z location on the movable part) and drag vector (direction and magnitude of the drag) directly through the CLI; there is no viewer UI. Use --manual_joint_type revolute or --manual_joint_type prismatic to force a specific motion family when needed. If you omit the manual override, the pipeline first relies on KPP-Net and, when --llm_endpoint and --llm_api_key are provided, falls back to the LLM-based classifier described in inference_pipeline.py. Outputs use the same MP4/GIF/GLB formats as the batch pipeline.
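
To make the kinematic interpretation concrete, the sketch below shows how a revolute joint (axis plus origin) turns a drag into a single rigid rotation of the movable vertices. It illustrates the underlying math only; it is not the pipeline's code, and deriving the angle from the drag magnitude is left out.

import numpy as np

def rotate_about_joint(vertices, axis, origin, angle):
    """Rotate (N, 3) vertices by `angle` radians about a joint axis through `origin`."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)  # Rodrigues formula
    return (vertices - origin) @ R.T + origin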

👀 Visualization

  • GIF/MP4 exports rely on pyrender and imageio. For headless servers, set PYOPENGL_PLATFORM=osmesa (see the snippet after this list).
  • inference_animation.py also exports animated GLB files for direct use in GLTF viewers.
  • For additional visualization tooling (e.g., rerun or Blender scripts), see inference_animation.py and inference_pipeline.py.
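
For headless rendering, PYOPENGL_PLATFORM must be set before pyrender (or anything that imports OpenGL) is loaded, and the OSMesa system libraries must be available. One way to do this from Python:

import os
os.environ.setdefault("PYOPENGL_PLATFORM", "osmesa")  # must run before the pyrender import
import pyrender  # noqa: E402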

👩‍💻 Case Study

Scenario | Description
Drawer opening | Translational motion predicted entirely from drag cues.
Microwave door | Revolute joint inference with FiLM-conditioned motion generation.
Bucket handle | High-curvature rotations showing the benefit of dual quaternions.

🎬 Demo Gallery

Translational drags

predicted_z_0.mp4 (six demo clips)

Rotational drags

predicted_z_0.mp4 (six demo clips)

Self-spin / free-spin

predicted_z_0.mp4 (six demo clips)

🗂️ Repository Tour

Path | Content
modules/model_v2.py | Dual Quaternion VAE (encoder, decoder, FiLM Transformer).
modules/predictor.py | KPP-Net architecture.
modules/data_loader_v2.py | GAPartNet parsing and dual quaternion labels.
utils/balanced_dataset_utils.py | LMDB dataset builder and balanced sampling utilities.
scripts/train_vae_v2.py, scripts/train_predictor.py | Training entry points.
inference_animation*.py, inference_pipeline.py | Inference pipelines (batch and interactive).
ChamferDistancePytorch/ | CUDA kernels for Chamfer distance and auxiliary metrics.

🌳 Project Tree (annotated)

DragMesh/
├── assets/                       # Logos, teaser figures, future demo media
│   ├── dragmesh_logo.png
│   └── teaser.png
├── checkpoints/
│   ├── dqvae.pth
│   └── kpp.pth
├── ChamferDistancePytorch/       # CUDA/C++ Chamfer distance implementation (build with setup.py)
├── config/
│   └── category_split_v2.json    # GAPartNet in-domain split definition
├── modules/
│   ├── model_v2.py               # Dual Quaternion VAE architecture
│   ├── predictor.py              # KPP-Net for kinematic reasoning
│   ├── loss.py                   # VAE objectives (Chamfer, dual quaternions, constraints)
│   ├── predictor_loss.py         # Loss terms for KPP-Net
│   └── data_loader_v2.py         # GAPartNet loader + dual quaternion ground truth builder
├── scripts/
│   ├── train_vae_v2.py           # Training loop for the VAE motion prior
│   └── train_predictor.py        # Training loop for KPP-Net
├── utils/
│   ├── balanced_dataset_utils.py # LMDB dataset class + balanced sampling helper
│   ├── dataset_utils.py          # Category-aware dataset wrappers
│   └── build_lmdb.py             # CLI to build LMDBs from GAPartNet folders
├── partnet/
│   └── Hunyuan3D-Part/           # External resources (P3-SAM, XPart docs)
├── results_deterministic/        # Placeholder for inference outputs (MP4/GIF/GLB)
├── inference_animation.py        # Batch evaluation + GLB export
├── inference_animation_kpp.py    # Dataset-driven animation tests (legacy interface)
├── inference_glb.py              # Helper for converting trajectories to GLB
├── inference_pipeline.py         # Interactive mesh manipulation pipeline
├── environment.yml               # Conda environment (name: dragmesh)
└── README.md

🙏 Acknowledgement

We thank the GAPartNet team for the articulated dataset, and upstream projects such as ChamferDistancePytorch, Objaverse, SAPIEN, and PyTorch3D for their open-source contributions.
