- Clone the repository:

  git clone https://github.com/URDF-Anything-plus/URDF-Anything-plus.git
  cd URDF-Anything-plus
- Create a conda environment:

  conda create -n urdf-anything python=3.10 -y
  conda activate urdf-anything
- Install PyTorch:

  pip install torch==2.6.0 torchvision==0.21.0
- Install dependencies:

  pip install -r requirements.txt -i https://pypi.org/simple/
- Install torch-cluster (must be installed after PyTorch):

  pip install torch-cluster --no-build-isolation
- Install diso (TripoSG mesh extraction uses it; must be installed after PyTorch):

  pip install diso --no-build-isolation
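Before moving on, it can be worth sanity-checking that the packages above actually import. A minimal sketch (the module list simply mirrors the install steps; note torch-cluster imports as torch_cluster):

```python
import importlib.util

def missing_modules(names):
    """Return the subset of module names that cannot be found by the import system."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Packages installed in the steps above.
required = ["torch", "torchvision", "torch_cluster", "diso"]
missing = missing_modules(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages found.")
```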
- Hugging Face authentication (recommended before downloading models):

  huggingface-cli login
Setup: clone TripoSG (used for 3D geometry) and download the weights into TripoSG/pretrained_weights/:
# 1) Clone TripoSG code
git clone https://github.com/VAST-AI-Research/TripoSG.git
# 2) Download TripoSG main model (contains transformer / vae / model_index.json etc.)
huggingface-cli download VAST-AI/TripoSG --local-dir TripoSG/pretrained_weights/TripoSG
# 3) Download RMBG-1.4 background removal model
huggingface-cli download briaai/RMBG-1.4 --local-dir TripoSG/pretrained_weights/RMBG-1.4
# 4) Download DINOv3 image encoder (used for cache building and inference)
huggingface-cli download facebook/dinov3-vith16plus-pretrain-lvd1689m --local-dir DINOv3

If huggingface-cli is not installed, you can also download the models using Python:
python -c "
from huggingface_hub import snapshot_download
# TripoSG
snapshot_download(repo_id='VAST-AI/TripoSG', local_dir='TripoSG/pretrained_weights/TripoSG')
# RMBG-1.4
snapshot_download(repo_id='briaai/RMBG-1.4', local_dir='TripoSG/pretrained_weights/RMBG-1.4')
# DINOv3
snapshot_download(repo_id='facebook/dinov3-vith16plus-pretrain-lvd1689m', local_dir='DINOv3')
"

Important: TripoSG/triposg/models/autoencoders/autoencoder_kl_triposg.py ships with line 15 (from torch_cluster import fps) commented out; you need to uncomment it.
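If you prefer to apply that one-line fix programmatically, a small helper can do it (a sketch, assuming the line appears in the file exactly as "# from torch_cluster import fps"):

```python
from pathlib import Path

def uncomment_fps_import(path):
    """Uncomment the fps import if it is still commented out.

    Returns True if the file was modified, False if it was already fixed.
    Assumes the commented line is spelled "# from torch_cluster import fps".
    """
    p = Path(path)
    text = p.read_text()
    fixed = text.replace("# from torch_cluster import fps",
                         "from torch_cluster import fps")
    if fixed != text:
        p.write_text(fixed)
        return True
    return False

# Usage (path from the note above):
# uncomment_fps_import("TripoSG/triposg/models/autoencoders/autoencoder_kl_triposg.py")
```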
Download the dataset from Hugging Face and unzip it to data_normalized/.
The structure of the dataset is as follows:
URDF-Anything-plus/
├── data_normalized/
│ ├── Laptop_urdf/
│ │ ├── <id>/
│ │ │ ├── images/
│ │ │ ├── xxx.obj
│ │ │ ├── test.urdf
│ │ │ └── info.json
│ ├── Refrigerator_urdf/
│ │ ├── <id>/
│ │ │ ├── images/
│ │ │ ├── xxx.obj
│ │ │ ├── test.urdf
│ │ │ └── info.json
│ └── ...
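After unzipping, you can sanity-check that each object folder matches the layout above. A minimal sketch (the expected names images/, test.urdf, and info.json come from the tree; the .obj filename varies per object):

```python
from pathlib import Path

def check_dataset(root="data_normalized"):
    """Scan <category>/<id> folders and report any missing expected files."""
    problems = []
    for category in sorted(Path(root).iterdir()):
        if not category.is_dir():
            continue
        for item in sorted(category.iterdir()):
            if not item.is_dir():
                continue
            if not (item / "images").is_dir():
                problems.append(f"{item}: missing images/")
            if not list(item.glob("*.obj")):  # any mesh file counts
                problems.append(f"{item}: missing .obj mesh")
            for fname in ("test.urdf", "info.json"):
                if not (item / fname).is_file():
                    problems.append(f"{item}: missing {fname}")
    return problems

# Usage: print(check_dataset()) after unzipping; an empty list means all folders are complete.
```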
Then run the following command to build the cache:

python scripts/build_cache.py

To launch training, run:

bash scripts/run_multi_node_training.sh [node_rank] [master_addr] [nproc_per_node] [training parameters...]

For example, to train on 1 machine with 8 GPUs, you can run:

bash scripts/run_multi_node_training.sh 0 localhost 8

You can adjust the training parameters in scripts/run_multi_node_training.sh.
In the pretraining stage, we use the following hyperparameters:

--init_mode train_from_scratch

In the finetuning stage, we use the following hyperparameters:

--init_mode resume_from_ckpt
--checkpoint_path <checkpoint path from the pretraining stage>
--train_urdf_params True
--train_eot True

You can try our inference script:
bash scripts/inference.sh

If you are in 'in_the_wild' mode, make sure the object is oriented towards the positive z direction. See the examples below; the z-axis is the blue line.
Optional: you can rotate the mesh in the terminal to check and fix its orientation. You can rotate multiple times; press Enter to finish. The prompt is:

Enter the rotation number and press Enter (pressing Enter is equivalent to ending) [0-6]:

0: no rotation (equivalent to pressing Enter)
1: rotate around X axis 90°
2: rotate around X axis -90°
3: rotate around Y axis 90°
4: rotate around Y axis -90°
5: rotate around Z axis 90°
6: rotate around Z axis -90°

Make sure the object ends up oriented towards the positive z direction.
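What each option does can be illustrated with plain axis-aligned rotations (an illustrative sketch, not the repository's implementation; the cosine/sine values are rounded, which is exact only for the ±90° steps offered by the prompt):

```python
import math

def rotate_vertex(v, axis, degrees):
    """Rotate a 3D point around the X, Y, or Z axis by a multiple of 90 degrees."""
    x, y, z = v
    # Rounding makes 90-degree rotations exact (cos(90°) == 0, sin(90°) == 1).
    c = round(math.cos(math.radians(degrees)))
    s = round(math.sin(math.radians(degrees)))
    if axis == "X":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "Y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "Z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError("axis must be 'X', 'Y', or 'Z'")

# Option 1 (rotate around X axis 90°) takes a +y-facing point to +z:
print(rotate_vertex((0, 1, 0), "X", 90))  # (0, 0, 1)
```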


