PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation

Zhaoxi Chen¹ Fangzhou Hong¹ Haiyi Mei² Guangcong Wang¹ Lei Yang² Ziwei Liu¹

¹S-Lab, Nanyang Technological University ²SenseTime Research

NeurIPS 2023

TL;DR

PrimDiffusion generates 3D humans by denoising a set of volumetric primitives.
Our method enables explicit pose, view, and shape control with real-time rendering at high resolution.

Paper | Project Page | Video

Updates

[12/2023] Source code released! 🤩

[09/2023] PrimDiffusion has been accepted to NeurIPS 2023! 🥳

Citation

If you find our work useful for your research, please consider citing this paper:

@inproceedings{
chen2023primdiffusion,
title={PrimDiffusion: Volumetric Primitives Diffusion for 3D Human Generation},
author={Zhaoxi Chen and Fangzhou Hong and Haiyi Mei and Guangcong Wang and Lei Yang and Ziwei Liu},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023}
}

Installation

We highly recommend using Anaconda to manage your Python environment. You can set up the required environment with the following commands:

# clone this repo
git clone https://github.com/FrozenBurning/PrimDiffusion
cd PrimDiffusion

# install python dependencies
conda env create -f environment.yaml
conda activate primdiffusion
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d

Build raymarching extensions:

cd dva
git clone https://github.com/facebookresearch/mvp
cd mvp/extensions/mvpraymarch
make -j4

Install EasyMocap:

git clone https://github.com/zju3dv/EasyMocap
cd EasyMocap
pip install --user .

Install xformers for a speedup (optional): please refer to the official repo for installation instructions.

Inference

Download Pretrained Models

Download the sample data, necessary assets, and the pretrained model from Google Drive.

Register and download SMPL models here. Please store the SMPL model together with downloaded files as follows:

├── ...
└── PrimDiffusion
    ├── visualize.py
    ├── README.md
    └── data
        ├── checkpoints
        │   └── primdiffusion.pt
        ├── smpl
        │   ├── basicModel_ft.npy
        │   ├── basicModel_vt.npy
        │   └── SMPL_NEUTRAL.pkl
        └── render_people
    ...
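
Before running inference, it can help to confirm that the downloaded files ended up in the layout shown above. The following is a minimal sketch, not part of this repository: the helper name `missing_assets` and the constant names are hypothetical, and the paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths copied from the directory tree in this README.
EXPECTED_FILES = [
    "data/checkpoints/primdiffusion.pt",
    "data/smpl/basicModel_ft.npy",
    "data/smpl/basicModel_vt.npy",
    "data/smpl/SMPL_NEUTRAL.pkl",
]
EXPECTED_DIRS = ["data/render_people"]


def missing_assets(repo_root):
    """Return the expected paths that are absent under repo_root (hypothetical helper)."""
    root = Path(repo_root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    # Run from the PrimDiffusion repository root.
    for path in missing_assets("."):
        print(f"missing: {path}")
```

If the script prints nothing, every file and folder listed in the tree was found.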

Visualize Denoising Process and Novel Views

You can run the following script to generate 3D humans with PrimDiffusion:

python visualize.py configs/primdiffusion_inference.yml ddim=False

Please specify the path to the pretrained model as checkpoint_path in the config file. To use the 100-step DDIM sampler, pass ddim=True instead. The script renders and saves videos under output_dir, which is specified in the config file.
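
For intuition about why a DDIM sampler with 100 steps is faster, the sketch below shows how a DDIM-style sampler typically visits an evenly spaced subset of the training timesteps. It is a generic illustration, not the sampler code in this repository: the function name `ddim_timesteps` is hypothetical, and the training length of 1000 steps is an assumption that this README does not state.

```python
def ddim_timesteps(num_train_steps, num_sample_steps):
    """Evenly spaced timesteps, ordered from most to least noisy (illustrative only)."""
    if num_sample_steps <= 0 or num_train_steps % num_sample_steps != 0:
        raise ValueError("num_sample_steps must be positive and divide num_train_steps")
    stride = num_train_steps // num_sample_steps
    # Keep every stride-th timestep, then reverse so sampling starts from noise.
    return list(range(0, num_train_steps, stride))[::-1]


if __name__ == "__main__":
    # 1000 training steps is an assumed value, not taken from the config file.
    steps = ddim_timesteps(1000, 100)
    print(len(steps), steps[:3], steps[-1])
```

Under that assumption, the denoiser is evaluated 100 times per sample rather than once per training timestep.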

Training

Data Preparation

You can refer to the downloaded sample data at ./data/render_people to prepare your own multi-view dataset, then modify the corresponding path in the config file.

Stage I Training

torchrun --nnodes=1 --nproc_per_node=8 --master_port=6666 train_stage1.py configs/renderpeople_stage1_fitting.yml

This will create a folder with checkpoints, the config, and a monitoring image in the output_dir specified in the config file.

Stage II Training

Please run the following command to launch training of the diffusion model. Set pretrained_encoder to the path of the latest checkpoint from Stage I. Training with mixed precision is supported by default; please modify train.amp in the config file according to your usage.

torchrun --nnodes=1 --nproc_per_node=8 --master_port=6666 train_stage2.py configs/renderpeople_stage2_primdiffusion.yml

Note that we use 8 GPUs for training by default. Please adjust --nproc_per_node to the number of GPUs you want to use.
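
As an illustration of adjusting the GPU count, the sketch below assembles the same single-node torchrun command shown above for an arbitrary number of GPUs. The helper name `torchrun_command` is hypothetical and not part of this repository; it only uses the flags that already appear in this README.

```python
import shlex


def torchrun_command(script, config, num_gpus, master_port=6666):
    """Build the single-node torchrun command from this README (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={num_gpus}",
        f"--master_port={master_port}",
        script,
        config,
    ]


if __name__ == "__main__":
    # Example: launch Stage II training on 4 GPUs instead of the default 8.
    cmd = torchrun_command(
        "train_stage2.py", "configs/renderpeople_stage2_primdiffusion.yml", num_gpus=4
    )
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` directly.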

License

Distributed under the S-Lab License. See LICENSE for more information. Parts of the code are also subject to the LICENSE of DVA.

Acknowledgements

PrimDiffusion is implemented on top of DVA and Latent-Diffusion.
