3DEnhancer employs a multi-view diffusion model to enhance multi-view images, thus improving 3D models.

Introducing 3DEnhancer

Despite advances in neural rendering, the scarcity of high-quality 3D datasets and the inherent limitations of multi-view diffusion models restrict view synthesis and 3D model generation to low resolutions with suboptimal multi-view consistency. In this study, we present a novel 3D enhancement pipeline, dubbed 3DEnhancer, which employs a multi-view latent diffusion model to enhance coarse 3D inputs while preserving multi-view consistency. Our method includes a pose-aware encoder and a diffusion-based denoiser to refine low-quality multi-view images, along with data augmentation and a multi-view attention module with epipolar aggregation to maintain consistent, high-quality 3D outputs across views. Unlike existing video-based approaches, our model supports seamless multi-view enhancement with improved coherence across diverse viewing angles. Extensive evaluations show that 3DEnhancer significantly outperforms existing methods, boosting both multi-view enhancement and per-instance 3D optimization tasks.
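As an illustration of the cross-view idea, below is a minimal PyTorch sketch (our own simplification, not the repository's code) of an attention layer that mixes features across views at corresponding token positions; the actual 3DEnhancer module additionally aggregates along epipolar lines, which this sketch omits:

import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    # Illustrative only: for every spatial token, attend across the V views.
    # 3DEnhancer's real module aggregates along epipolar lines rather than
    # only same-position tokens; names and shapes here are our assumptions.
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):  # x: (B, V, N, C) with N = H*W tokens per view
        B, V, N, C = x.shape
        # Each spatial token contributes a length-V sequence of per-view features
        tokens = x.permute(0, 2, 1, 3).reshape(B * N, V, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return out.reshape(B, N, V, C).permute(0, 2, 1, 3)

# e.g. 2 scenes, 4 views, 16x16 latent tokens, 64 channels:
# y = CrossViewAttention(64)(torch.randn(2, 4, 256, 64))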
- [2025/03/08] Our inference code and Gradio demo are released.
- [2024/12/25] Our paper and project page are now live. Merry Christmas!
- Clone Repo
git clone --recurse-submodules https://github.com/Luo-Yihang/3DEnhancer
cd 3DEnhancer
- Create Conda Environment
conda create -n 3denhancer python=3.10 -y
conda activate 3denhancer
- Install Python Dependencies
Important: Install Torch and Xformers based on your CUDA version. For example, for Torch 2.1.0 + CUDA 11.8:
# Install Torch and Xformers
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install -r requirements.txt
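To sanity-check the environment before downloading weights, a quick probe (our suggestion, not part of the repo) confirms that the CUDA build of Torch sees your GPU and that xformers imports cleanly:

import torch
import xformers

# Expect e.g. "2.1.0+cu118 11.8 True" for the Torch 2.1.0 + CUDA 11.8 install above
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
print(xformers.__version__)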
Download the pretrained model from Hugging Face and place it under pretrained_models/3DEnhancer:
mkdir -p pretrained_models/3DEnhancer
wget -P pretrained_models/3DEnhancer https://huggingface.co/Luo-Yihang/3DEnhancer/resolve/main/model.safetensors
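Alternatively, if you prefer Python over wget, the official huggingface_hub client can fetch the same file:

from huggingface_hub import hf_hub_download

# Places model.safetensors under pretrained_models/3DEnhancer/
hf_hub_download(
    repo_id="Luo-Yihang/3DEnhancer",
    filename="model.safetensors",
    local_dir="pretrained_models/3DEnhancer",
)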
The code has been tested on NVIDIA A100 and V100 GPUs. An NVIDIA GPU with at least 18GB of memory is required.
We provide example inputs in assets/examples/mv_lq, where each subfolder contains four sequential multi-view images. Perform inference on multi-view images using an aligned prompt and noise_level. For example:
python inference.py \
--input_folder assets/examples/mv_lq/vase \
--output_folder results/vase \
--prompt "vase" \
--noise_level 0
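To enhance every bundled example in one pass, a small driver loop works; note our assumption (not stated by the repo) that each subfolder name doubles as a usable prompt:

import subprocess
from pathlib import Path

root = Path("assets/examples/mv_lq")
for folder in sorted(p for p in root.iterdir() if p.is_dir()):
    # Assumption: the folder name (e.g. "vase") is a reasonable text prompt
    subprocess.run(
        [
            "python", "inference.py",
            "--input_folder", str(folder),
            "--output_folder", f"results/{folder.name}",
            "--prompt", folder.name,
            "--noise_level", "0",
        ],
        check=True,
    )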
For more options, refer to inference.py.
The script app.py provides a simple web demo for generating and enhancing multi-view images, as well as reconstructing 3D models using LGM.
Install the modified Gaussian splatting rasterizer (with depth and alpha rendering) required by LGM:
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
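A one-line import check (our suggestion) verifies that the CUDA extension compiled correctly:

# The package installed by ashawkey's fork is named diff_gaussian_rasterization
import diff_gaussian_rasterization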
Download the LGM pretrained weights from Hugging Face and place them under pretrained_models/LGM:
mkdir -p pretrained_models/LGM
wget -P pretrained_models/LGM https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16_fixrot.safetensors
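As with the 3DEnhancer weights, huggingface_hub offers a Python alternative to wget:

from huggingface_hub import hf_hub_download

# Places model_fp16_fixrot.safetensors under pretrained_models/LGM/
hf_hub_download(
    repo_id="ashawkey/LGM",
    filename="model_fp16_fixrot.safetensors",
    local_dir="pretrained_models/LGM",
)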
After installing the dependencies, start the demo with:
python app.py
The web demo is also available on Hugging Face Spaces! 🎉
- [x] Release paper and project page.
- [x] Release inference code.
- [x] Release Gradio demo.
This project is licensed under NTU S-Lab License 1.0. Redistribution and use should follow this license.
If you find our code or paper helpful, please consider citing:
@article{luo20243denhancer,
  title={3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement},
  author={Yihang Luo and Shangchen Zhou and Yushi Lan and Xingang Pan and Chen Change Loy},
  journal={arXiv preprint arXiv:2412.18565},
  year={2024},
}
If you have any questions, please feel free to reach us at luo_yihang@outlook.com.