[CVPR 2024 Highlight (11.9%)] LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

Yixun Liang$^{\color{red}{*}}$, Xin Yang$^{\color{red}{*}}$, Jiantao Lin, Haodong Li, Xiaogang Xu, Yingcong Chen$^{**}$

$\color{red}{*}$: Equal contribution. **: Corresponding author.

Paper PDF (arXiv) | Project Page (Coming Soon) | Gradio Demo

The paper has been accepted to CVPR 2024 as a poster (highlight, 11.9%).

Note: we compress these animations for faster previewing.

Examples of text-to-3D content created with our framework, LucidDreamer, in about 35 minutes on an A100 GPU.

📺 Video

Please click to watch the 3-minute video introduction of our project.

🎏 Abstract

We present a text-to-3D generation framework, named LucidDreamer, that distills high-fidelity textures and shapes from pretrained 2D diffusion models.


Recent advances in text-to-3D generation mark a significant milestone for generative models, unlocking new possibilities for creating imaginative 3D assets across various real-world scenarios. Despite this promise, existing methods often fall short in rendering detailed, high-quality 3D models. The problem is especially prevalent because many methods are built on Score Distillation Sampling (SDS). This paper identifies a notable deficiency in SDS: it produces inconsistent, low-quality update directions for the 3D model, which causes over-smoothing. To address this, we propose a novel approach called Interval Score Matching (ISM). ISM employs deterministic diffusing trajectories and uses interval-based score matching to counteract over-smoothing. Furthermore, we incorporate 3D Gaussian Splatting into our text-to-3D generation pipeline. Extensive experiments show that our model largely outperforms the state of the art in quality and training efficiency.
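To make the contrast between SDS and ISM more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the released training code. It assumes that SDS compares the conditional noise prediction at a noisy latent $x_t$ with freshly sampled random noise, whereas ISM compares it with the unconditional prediction at an earlier point $x_s$ reached along a deterministic DDIM inversion trajectory. The function names (`ddim_invert_step`, `sds_direction`, `ism_direction`) and the `weight` argument are hypothetical, and the noise predictions are passed in as plain arrays instead of coming from a real diffusion model.

```python
import numpy as np


def ddim_invert_step(x_s, eps_s, alpha_bar_s, alpha_bar_t):
    """One deterministic DDIM inversion step from noise level s to a noisier level t.

    x_s         : latent at step s
    eps_s       : noise predicted for x_s (assumed to be the unconditional prediction)
    alpha_bar_* : cumulative noise-schedule products, with alpha_bar_t < alpha_bar_s
    """
    # Estimate the clean latent implied by x_s and its predicted noise.
    x0_hat = (x_s - np.sqrt(1.0 - alpha_bar_s) * eps_s) / np.sqrt(alpha_bar_s)
    # Re-noise that estimate to level t with the same predicted noise (no sampling).
    return np.sqrt(alpha_bar_t) * x0_hat + np.sqrt(1.0 - alpha_bar_t) * eps_s


def sds_direction(eps_pred_t, eps_random, weight=1.0):
    """SDS-style update direction: conditional prediction minus the sampled noise."""
    return weight * (eps_pred_t - eps_random)


def ism_direction(eps_pred_t, eps_pred_s, weight=1.0):
    """ISM-style update direction: conditional prediction at x_t minus the
    unconditional prediction at x_s (the interval between two trajectory points)."""
    return weight * (eps_pred_t - eps_pred_s)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal((4, 4))      # stand-in for a rendered view
    eps_s = rng.standard_normal((4, 4))   # stand-in for the prediction at step s
    alpha_bar_s, alpha_bar_t = 0.9, 0.5   # illustrative schedule values

    x_s = np.sqrt(alpha_bar_s) * x0 + np.sqrt(1.0 - alpha_bar_s) * eps_s
    x_t = ddim_invert_step(x_s, eps_s, alpha_bar_s, alpha_bar_t)

    # A real pipeline would query the diffusion model at (x_t, t, prompt);
    # a random array stands in for that prediction here.
    eps_t = rng.standard_normal((4, 4))
    print(ism_direction(eps_t, eps_s).shape)
```

In this sketch the only difference between the two directions is the reference term: SDS subtracts random noise, so the target changes with every sample, while ISM subtracts a prediction taken on the same deterministic trajectory.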

🔧 Training Instructions

Our code is now released! Please refer to this link for detailed training instructions.

🤗 Gradio Demo

We are currently building an online demo of LucidDreamer with Gradio. You can check it out by clicking this link. It is still under development, so the service may be unavailable from time to time.

🚧 Todo

Release the basic training code
Release the guidance documents
Release the training code for more applications

📍 Citation

@misc{EnVision2023luciddreamer,
      title={LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching}, 
      author={Yixun Liang and Xin Yang and Jiantao Lin and Haodong Li and Xiaogang Xu and Yingcong Chen},
      year={2023},
      eprint={2311.11284},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

This work is built on many amazing research works and open-source projects:

We thank the authors for their excellent work and great contributions to the 3D generation area.
