Large-Vocabulary 3D Diffusion Model with Transformer

¹S-Lab, Nanyang Technological University; ²The Chinese University of Hong Kong; ³Shanghai AI Laboratory

Official PyTorch implementation of DiffTF (accepted by ICLR 2024). DiffTF can generate large-vocabulary 3D objects with rich semantics and realistic textures.

📖 For more visual results, check out our project page.

Installation

Clone this repository and navigate to it in your terminal. Then run:

bash install_difftf.sh

This installs the Python packages that the scripts depend on.
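
For reference, here is a minimal sketch of the kind of setup such a script typically performs, assuming it creates the difftf conda environment that the commands below activate (the actual steps live in install_difftf.sh):

# Hypothetical outline; see install_difftf.sh for the authoritative steps.
conda create -n difftf python=3.8 -y
conda activate difftf
pip install torch torchvision   # plus the repository's remaining requirements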

Preparing data

Training

I. Triplane fitting

1. Training the shared decoder

conda activate difftf
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Omniobject3D (--datadir: dataset path; --basedir: output base path;
# the checkpoint is saved in ./Checkpoint/omni_sharedecoder)
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
    --config ./Triplanerecon/configs/omni/train.txt \
    --datadir ./dataset/Omniobject3D/renders \
    --basedir ./Checkpoint \
    --expname omni_sharedecoder

# ShapeNet (the checkpoint is saved in ./Checkpoint/shapenet_sharedecoder)
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
    --config ./Triplanerecon/configs/shapenet_car/train.txt \
    --datadir ./dataset/ShapeNet/renders_car \
    --basedir ./Checkpoint \
    --expname shapenet_sharedecoder
2. Triplane fitting

conda activate difftf

# Omniobject3D (--num_gpu 1 --idx 0: fit all triplanes with a single GPU;
# --decoderdir: checkpoint of the shared decoder;
# fitted triplanes are saved in ./Checkpoint/omni_triplane)
python ./Triplanerecon/train_single_omni.py \
    --config ./Triplanerecon/configs/omni/train_single.txt \
    --num_gpu 1 --idx 0 \
    --datadir ./dataset/Omniobject3D/renders \
    --basedir ./Checkpoint \
    --expname omni_triplane \
    --decoderdir ./Checkpoint/omni_sharedecoder/300000.tar

# ShapeNet (fitted triplanes are saved in ./Checkpoint/shapenet_triplane)
python ./Triplanerecon/train_single_shapenet.py \
    --config ./Triplanerecon/configs/shapenet_car/train_single.txt \
    --num_gpu 1 --idx 0 \
    --datadir ./dataset/ShapeNet/renders_car \
    --basedir ./Checkpoint \
    --expname shapenet_triplane \
    --decoderdir ./Checkpoint/shapenet_sharedecoder/300000.tar

# Fitting with 8 GPUs instead (see the sketch below):
bash multi_omni.sh 8
bash multi_shapenet.sh 8
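
The multi-GPU scripts shard the per-object fitting jobs across devices. A minimal sketch of what multi_omni.sh plausibly does, assuming it launches one process per GPU via the --num_gpu/--idx flags shown above (the script shipped in the repo is authoritative):

# Hypothetical launcher sketch, not the repo's actual multi_omni.sh.
NUM_GPU=${1:-8}
for ((i=0; i<NUM_GPU; i++)); do
    # each process fits the idx-th shard of the object list on its own GPU
    CUDA_VISIBLE_DEVICES=$i python ./Triplanerecon/train_single_omni.py \
        --config ./Triplanerecon/configs/omni/train_single.txt \
        --num_gpu $NUM_GPU --idx $i \
        --datadir ./dataset/Omniobject3D/renders \
        --basedir ./Checkpoint \
        --expname omni_triplane \
        --decoderdir ./Checkpoint/omni_sharedecoder/300000.tar &
done
wait   # block until every per-GPU fitting process has finished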

Note: The related hyperparameters and settings are provided in the config files; you can find them under ./Triplanerecon/configs/omni and ./Triplanerecon/configs/shapenet_car.
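
For orientation, here is a hypothetical excerpt of such a config file. The key = value format mirrors the command-line flags above (as in nerf-pytorch-style configs); the actual files under ./Triplanerecon/configs are authoritative:

# Hypothetical excerpt of ./Triplanerecon/configs/omni/train_single.txt
datadir = ./dataset/Omniobject3D/renders
basedir = ./Checkpoint
expname = omni_triplane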

3. Preparing triplanes for diffusion

# Gather the fitted triplanes (--basepath) into a dataset for diffusion
# training (--newpath); --mode is the dataset name (omni or shapenet).
python ./Triplanerecon/extract.py \
    --basepath ./Checkpoint/omni_triplane \
    --mode omni \
    --newpath ./Checkpoint/omni_triplane_fordiffusion

II. Training Diffusion

cd ./3dDiffusion
export PYTHONPATH=$PWD:$PYTHONPATH
conda activate difftf
cd scripts

# --datasetdir: path to the fitted triplanes; checkpoints are saved in
# ./Checkpoint/difftf_omni (named by --expname)
python image_train.py \
    --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
    --expname difftf_omni

You may also want to train in a distributed manner. In this case, run the same command with mpiexec:

mpiexec -n 8 python image_train.py \
    --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
    --expname difftf_omni

Note: Training hyperparameters are set in image_train.py, while architecture hyperparameters are set in ./improved_diffusion/script_util.py.
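
Since the training code builds on improved-diffusion, those defaults can typically also be overridden from the command line; a hypothetical example (the flag names follow improved-diffusion's image_train.py and may differ in this repo, so check the script's argument parser):

# Hypothetical overrides; verify the exact flags in image_train.py.
python image_train.py \
    --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
    --expname difftf_omni \
    --lr 1e-4 --batch_size 8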

Note: Our fitted triplanes can be downloaded via this link.

Inference

I. Sampling triplanes using the trained diffusion model

Our pre-trained model can be found in difftf_checkpoint/omni.

# --model_path: diffusion checkpoint; the generated triplanes are written to
# --save_path as an .npz archive (e.g. samples_5000x18x256x256.npz)
python image_sample.py \
    --model_path ./Checkpoint/difftf_omni/model.pt \
    --num_samples 5000 \
    --save_path ./Checkpoint/difftf_omni
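
Before rendering, the sampled triplanes can be sanity-checked directly; the filename encodes the array shape (num_samples x 18 channels x 256 x 256). The one-liner below only assumes the output is a standard NumPy .npz archive:

# Print every array key and its shape in the generated archive.
python -c "import numpy as np; d = np.load('./Checkpoint/difftf_omni/samples_5000x18x256x256.npz'); print({k: d[k].shape for k in d.files})"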

II. Rendering triplanes using the shared decoder

Our pre-trained shared decoder can be found in difftf_checkpoint/triplane decoder.zip.

# --ft_path: checkpoint of the shared decoder
# --triplanepath: the triplanes generated in the previous step
# --mesh 0: whether to export meshes; --testvideo: also save all rendered images as a video
# Omniobject3D: rendered results are saved in ./Checkpoint/ddpm_omni_vis
python ddpm_vis.py \
    --config ./configs/omni/ddpm.txt \
    --ft_path ./Checkpoint/omni_triplane_fordiffusion/003000.tar \
    --triplanepath ./Checkpoint/difftf_omni/samples_5000x18x256x256.npz \
    --basedir ./Checkpoint \
    --expname ddpm_omni_vis \
    --mesh 0 \
    --testvideo

# ShapeNet: rendered results are saved in ./Checkpoint/ddpm_shapenet_vis
python ddpm_vis.py \
    --config ./configs/shapenet_car/ddpm.txt \
    --ft_path ./Checkpoint/shapenet_car_triplane_fordiffusion/003000.tar \
    --triplanepath ./Checkpoint/difftf_shapenet/samples_5000x18x256x256.npz \
    --basedir ./Checkpoint \
    --expname ddpm_shapenet_vis \
    --mesh 0 \
    --testvideo

References

If you find DiffTF useful for your work, please cite:

@article{cao2023large,
  title={Large-Vocabulary 3D Diffusion Model with Transformer},
  author={Cao, Ziang and Hong, Fangzhou and Wu, Tong and Pan, Liang and Liu, Ziwei},
  journal={arXiv preprint arXiv:2309.07920},
  year={2023}
}
Acknowledgement

The code builds on improved-diffusion and nerf-pytorch. We would like to express our sincere thanks to their contributors.

🗞️ License

Distributed under the S-Lab License. See LICENSE for more information.

