This is the official implementation of LDM: Large Tensorial SDF Model for Textured Mesh Generation.
- 🔥 Release Hugging Face Gradio demo.
- 🔥 Release inference and training code.
- 🔥 Release pretrained models.
- Release the training data list.
- Support text to 3D generation.
- Support image to 3D generation using various multi-view diffusion models, including Imagedream and Zero123plus.
# xformers is required! Please refer to https://github.com/facebookresearch/xformers for details.
# We recommend using `Python>=3.10`, `PyTorch>=2.1.0`, and `CUDA>=12.1`.
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121
pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu121
# other dependencies
pip install -r requirements.txt
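After installing, it can help to confirm that the packages named above are actually visible to the interpreter. The following is a minimal sketch, not part of the repository: the helper names `installed_versions` and `missing_packages` are hypothetical, and the check only reads package metadata, so it also runs on a machine without PyTorch.

```python
from importlib import metadata

# Packages named in the install commands above.
REQUIRED = ("torch", "torchvision", "torchaudio", "xformers")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def missing_packages(versions):
    """Return the names whose version lookup failed."""
    return [name for name, version in versions.items() if version is None]


if __name__ == "__main__":
    found = installed_versions()
    for name, version in found.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
    if found["torch"] is not None:
        import torch  # imported lazily so the script still runs without PyTorch

        print("CUDA available:", torch.cuda.is_available())
        print("CUDA build:", torch.version.cuda)
```

If `missing_packages(installed_versions())` returns a non-empty list, rerun the corresponding `pip install` command before continuing.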
Our pretrained weights can be downloaded from Hugging Face.
For example, to download the fp16 model for inference:
mkdir pretrained && cd pretrained
wget https://huggingface.co/rgxie/LDM/resolve/main/LDM_6V_SDF.ckpt
cd ..
The weights of the diffusion model will be downloaded automatically.
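Where `wget` is not available, the same download can be sketched with the Python standard library. This is an illustrative alternative, not the repository's own tooling: `weight_url` and `download_weights` are hypothetical helper names, and the URL pattern is simply copied from the `wget` command above.

```python
from pathlib import Path
from urllib.request import urlretrieve

REPO_ID = "rgxie/LDM"         # repository from the wget command above
FILENAME = "LDM_6V_SDF.ckpt"  # checkpoint from the wget command above


def weight_url(repo_id=REPO_ID, filename=FILENAME, revision="main"):
    """Build the direct-download URL in the same form the wget command uses."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_weights(target_dir="pretrained", filename=FILENAME):
    """Download the checkpoint unless it is already present; return its path."""
    target = Path(target_dir) / filename
    if target.exists():
        return target  # skip the (large) download when the file is already there
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(weight_url(filename=filename), target)
    return target
```

Calling `download_weights()` from the repository root should leave the checkpoint at `pretrained/LDM_6V_SDF.ckpt`, the path the inference commands expect.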
### Gradio app for both text-to-3D and image-to-3D. The weights of our model will be downloaded automatically.
python app.py
# image to 3d
# --workspace: folder to save output (*.obj,*.jpg)
# --test_path: path to a folder containing images, or a single image
python infer.py tiny_trf_trans_sdf_123plus --resume pretrained/LDM_6V_SDF.ckpt --workspace workspace_test --test_path example --seed 0
# text to 3d
# --workspace: folder to save output (*.obj,*.jpg)
python infer.py tiny_trf_trans_sdf --resume pretrained/LDM_6V_SDF.ckpt --workspace workspace_test --txt_or_image True --mvdream_or_zero123 True --text_prompt 'a hamburger' --seed 0
For more options, please check options. If the output is unsatisfactory, try a different multi-view diffusion model or a different seed!
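Trying several seeds by hand gets tedious, so one option is a small wrapper that repeats the image-to-3D command once per seed. The sketch below only uses the flags shown above; the function names, the `dry_run` switch, and the per-seed workspace naming (`workspace_test_seed<N>`) are assumptions made for illustration, not conventions of this repository.

```python
import subprocess


def build_image_to_3d_command(seed, test_path="example",
                              checkpoint="pretrained/LDM_6V_SDF.ckpt",
                              workspace_prefix="workspace_test"):
    """Assemble the image-to-3D command from above for a single seed."""
    return [
        "python", "infer.py", "tiny_trf_trans_sdf_123plus",
        "--resume", checkpoint,
        # Hypothetical naming: one workspace per seed so outputs are not overwritten.
        "--workspace", f"{workspace_prefix}_seed{seed}",
        "--test_path", test_path,
        "--seed", str(seed),
    ]


def run_seed_sweep(seeds, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    commands = [build_image_to_3d_command(seed) for seed in seeds]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_seed_sweep(range(4))  # dry run: prints four commands without executing them
```

Pass `dry_run=False` to actually launch the runs, then compare the meshes saved in each workspace.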
Preparing:
Training dataset: Our training dataset is based on GObjaverse, which can be downloaded from here. Specifically, we use the filtered subset of about 80K objects from LGM. The data list can be found here. After downloading, configure the following options:
- data_path: The directory where your downloaded dataset is stored.
- data_list_path: The path to the data list file.
- The structure of the dataset:
    |-- data_path
        |-- dictionary_id
            |-- instance_id.rar
            |-- ...
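Before launching training, it may be worth checking that the downloaded data matches the layout above. The following is a minimal sketch under the assumption that every instance is a `.rar` archive exactly one directory below `data_path`; `list_instances` is a hypothetical helper, and the format of the data list file is not assumed here.

```python
from pathlib import Path


def list_instances(data_path):
    """Return sorted (dictionary_id, instance_id) pairs found under data_path."""
    root = Path(data_path)
    pairs = []
    # Match data_path/<dictionary_id>/<instance_id>.rar and nothing deeper.
    for archive in root.glob("*/*.rar"):
        pairs.append((archive.parent.name, archive.stem))
    return sorted(pairs)


if __name__ == "__main__":
    import sys

    found = list_instances(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"found {len(found)} instance archives")
    for dictionary_id, instance_id in found[:5]:
        print(f"  {dictionary_id}/{instance_id}.rar")
```

A count far below the size of the data list suggests an incomplete download or a `data_path` that points one level too high or too low.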
Pretrained model: Our model is trained starting from the pretrained OpenLRM model. Please download the pretrained model here and place it in the `pretrained` directory.
Training: The minimum recommended configuration for training is 8 A6000 GPUs, each with 48 GB of memory.
# step 1: To speed up convergence, we first train without patch cropping, using a lower resolution and a larger batch size.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf --output_size 64 --batch_size 4 --lr 4e-4 --num_epochs 50 --is_crop False --resume pretrained/openlrm_m_l.safetensors --workspace workspace_nocrop
# step 2: Next, we introduce patch cropping and increase the patch resolution to capture finer details.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf --output_size 128 --batch_size 1 --gradient_accumulation_steps 2 --lr 2e-5 --num_epochs 50 --is_crop True --resume workspace_nocrop/last.ckpt --workspace workspace_crop
# (optional) step 3: To adapt the model to the 6-view inputs from Zero123plus, we refine the model obtained in the earlier stages.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_sdf_123plus --output_size 128 --batch_size 1 --gradient_accumulation_steps 2 --lr 1e-5 --num_epochs 20 --resume workspace_crop/last.ckpt --workspace workspace_refine
# (optional) step 4: Use the FlexiCubes layer to further improve the texture details.
accelerate launch --config_file acc_configs/gpu8.yaml main.py tiny_trf_trans_mesh --output_size 512 --batch_size 1 --gradient_accumulation_steps 1 --lr 1e-5 --num_epochs 20 --resume the_path_of_sdf_ckpt/last.ckpt --workspace workspace_mesh
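When adapting these commands to fewer GPUs, a useful quantity to track is the effective batch size per optimizer step. The sketch below computes it for the four stages under three assumptions that the commands above do not confirm: `acc_configs/gpu8.yaml` launches 8 processes (one per GPU), `--batch_size` is the per-process batch size, and `--gradient_accumulation_steps` defaults to 1 when omitted, as in step 1.

```python
def effective_batch_size(num_gpus, batch_size, gradient_accumulation_steps=1):
    """Samples contributing to one optimizer step across all processes."""
    return num_gpus * batch_size * gradient_accumulation_steps


# Flag values copied from the four training commands above.
STAGES = {
    "step 1 (no crop, 64)": {"batch_size": 4},  # accumulation assumed to default to 1
    "step 2 (crop, 128)": {"batch_size": 1, "gradient_accumulation_steps": 2},
    "step 3 (6-view refine)": {"batch_size": 1, "gradient_accumulation_steps": 2},
    "step 4 (FlexiCubes, 512)": {"batch_size": 1, "gradient_accumulation_steps": 1},
}

if __name__ == "__main__":
    for name, flags in STAGES.items():
        print(f"{name}: {effective_batch_size(8, **flags)}")
```

With fewer GPUs, one way to keep a stage's effective batch size unchanged is to raise `--gradient_accumulation_steps` proportionally; whether the learning rates above remain appropriate in that case is not something this sketch can tell you.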
This work is built on many amazing research works and open-source projects. Thanks a lot to all the authors for sharing!
@article{xie2024ldm,
  title={LDM: Large Tensorial SDF Model for Textured Mesh Generation},
  author={Xie, Rengan and Zheng, Wenting and Huang, Kai and Chen, Yizheng and Wang, Qi and Ye, Qi and Chen, Wei and Huo, Yuchi},
  journal={arXiv preprint arXiv:2405.14580},
  year={2024}
}