Intrinsic Image Diffusion for Single-view Material Estimation

Peter Kocsis · Vincent Sitzmann · Matthias Niessner

CVPR 2024


We utilize the strong prior of diffusion models and formulate the material estimation task probabilistically. Our approach generates multiple solutions for a single input view, with much more detail and sharper features than previous works.


Structure

Our project has the following structure:

├── configs               <- Hydra config files
├── data                  <- Datasets
├── docs                  <- Project page
├── iid                   <- Our main package for Intrinsic Image Diffusion
│   ├── geometry_prediction   <- Code for geometry prediction
│   ├── lighting_optimization <- Code for lighting optimization
│   └── material_diffusion    <- Code for material diffusion
├── logs                   <- Hydra and WandB logs
├── models                 <- Model and config folder
├── res                    <- Documentation resources
├── environment.yaml       <- Env file for creating conda environment
├── LICENSE
└── README.md

Installation

To install the dependencies, you can use the provided environment file:

conda env create -n iid -f environment.yaml
conda activate iid
pip install stable-diffusion-sdkit==2.1.5 --no-deps

(Optional) XFormers

For better performance, installing XFormers is recommended.

conda install xformers -c xformers

The code has been tested on Ubuntu 22.04.3 LTS and Ubuntu 20.04.3 LTS with RTX 3090 and A4000 GPUs.
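
After installation, a quick sanity check (independent of this repository) can confirm that PyTorch sees your GPU and that the optional XFormers install is importable:

# Environment sanity check (not part of the repository).
import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

try:
    import xformers  # optional, from `conda install xformers -c xformers`
    print("XFormers:", xformers.__version__)
except ImportError:
    print("XFormers not installed (optional)")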

Model

Material Diffusion

Download our Material Diffusion model to the models folder.

mkdir -p models/material_diffusion
wget "https://syncandshare.lrz.de/dl/fiAomi6K8g5dywJBwAxFiZ/iid_e250.pth" -O "models/material_diffusion/iid_e250.pth"

OmniData

For our full pipeline, download the OmniData model to the models folder.

mkdir -p models/geometry_prediction
wget "https://zenodo.org/records/10447888/files/omnidata_dpt_depth_v2.ckpt?download=1" -O "models/geometry_prediction/omnidata_dpt_depth_v2.pth"
wget "https://zenodo.org/records/10447888/files/omnidata_dpt_normal_v2.ckpt?download=1" -O "models/geometry_prediction/omnidata_dpt_normal_v2.pth"

Logging

The code supports logging to the console and to WandB. The default config logs to WandB, but the presented commands override this to console, so you can run them without an account. If you wish to switch to WandB logging, drop the logger=console argument from the commands and fill in configs/logger/wandb.yaml with your information.
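
If you enable WandB logging, you can verify your credentials beforehand with the standard WandB client; the project name below is a hypothetical throwaway and not part of this repository's config:

# Standalone WandB credential check (hypothetical project name).
import wandb

wandb.login()                               # verifies or prompts for your API key
run = wandb.init(project="iid-smoke-test")  # hypothetical throwaway project
run.finish()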

Training

Coming soon!

Inference

The full pipeline consists of three stages: geometry prediction, material diffusion, and lighting optimization. You can run all stages together with the following command (WandB logging is recommended). This script loads the test image from the data/test/im folder, predicts the geometry and materials, extends the dataset with the predictions, then optimizes for the lighting. For more details, you can check configs/intrinsic_image_diffusion.yaml.

python -m iid

Expected run results can be found in this WandB report.
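
If you want to try the pipeline on your own image, a minimal sketch is shown below; it assumes the pipeline simply picks up any RGB image placed in data/test/im, which is an assumption beyond what is documented here:

# Sketch: place your own image into the expected input folder.
# Assumes the pipeline consumes any RGB image found in data/test/im.
from pathlib import Path
from PIL import Image

src = Path("my_photo.jpg")          # hypothetical input file
dst_dir = Path("data/test/im")
dst_dir.mkdir(parents=True, exist_ok=True)

Image.open(src).convert("RGB").save(dst_dir / "my_photo.png")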

Stage 1 - Geometry Prediction

This script loads the test image from the res folder, predicts the depth and normals, and saves them to the output folder. To predict the geometry, you can use the following command. For more details, you can check configs/stage/geometry_prediction.yaml.

python -m iid.geometry_prediction logger=console
(Example results: Input | Depth | Normals)
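
As a rough illustration of how such outputs are typically visualized (the exact on-disk format of this script's outputs is not documented here, so the array layout below is an assumption): depth can be shown as a normalized grayscale map, and normals in [-1, 1] can be mapped to [0, 1] RGB:

# Visualization sketch; assumes depth is a float HxW array and normals an
# HxWx3 array in [-1, 1]. Placeholder arrays are used to keep it runnable.
import numpy as np
from PIL import Image

depth = np.random.rand(480, 640).astype(np.float32)                   # placeholder
normals = np.random.uniform(-1, 1, (480, 640, 3)).astype(np.float32)  # placeholder

depth_vis = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
Image.fromarray((depth_vis * 255).astype(np.uint8)).save("depth_vis.png")

normal_vis = np.clip(normals * 0.5 + 0.5, 0.0, 1.0)                   # [-1,1] -> [0,1]
Image.fromarray((normal_vis * 255).astype(np.uint8)).save("normals_vis.png")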

Stage 2 - Material Diffusion

Running the model requires at least 10GB of GPU memory. This script loads the test image from the res folder, predicts the materials, and saves them to the output folder. You can run it with the following command. For more details, you can check configs/stage/material_diffusion.yaml.

python -m iid.material_diffusion logger=console

By default, the script predicts 10 material explanations and computes the average.

(Example results: Input | Albedo | Roughness | Metallic)
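
The averaging of multiple sampled explanations can be pictured with a small sketch; the sampler below is a hypothetical stand-in and not the repository's API:

# Sketch of averaging several sampled material explanations.
# `sample_material` is a hypothetical stand-in for one diffusion sampling pass.
import numpy as np

def sample_material(image):
    # Hypothetical: returns an HxWx5 map (albedo RGB, roughness, metallic).
    h, w, _ = image.shape
    return np.random.rand(h, w, 5).astype(np.float32)

image = np.zeros((480, 640, 3), dtype=np.float32)     # placeholder input view
samples = [sample_material(image) for _ in range(10)]
material_mean = np.mean(samples, axis=0)              # averaged explanation
print(material_mean.shape)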

Stage 3 - Lighting Optimization

The lighting optimization part uses PyTorch Lightning with iterative pruning and early stopping. As more light sources are pruned, the iteration time decreases. It requires predicted geometry and materials. This script loads the dataset from the data/test folder, optimizes for the lighting (environment map and 48 spherical Gaussian point lights), and then saves the checkpoints to data/test/lighting. The easiest way to prepare the predictions is to run the full pipeline or copy the data from data/test_out. You can run it with the following command. For more details, you can check configs/stage/lighting_optimization.yaml.

python -m iid.lighting_optimization logger=console
(Example results: Input | Shading | Rerendering)
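
For intuition, a spherical Gaussian lobe in a common parameterization evaluates as amplitude * exp(sharpness * (dot(direction, axis) - 1)); this is a generic formulation and not necessarily the exact lighting model used in this repository:

# Generic spherical Gaussian evaluation (common parameterization; the
# repository's exact lighting model may differ).
import numpy as np

def eval_spherical_gaussian(direction, axis, sharpness, amplitude):
    # direction, axis: unit 3-vectors; sharpness: scalar lobe width; amplitude: RGB.
    d = np.clip(np.dot(direction, axis), -1.0, 1.0)
    return amplitude * np.exp(sharpness * (d - 1.0))

# Example: evaluate one lobe toward a surface normal direction.
normal = np.array([0.0, 0.0, 1.0])
lobe_axis = np.array([0.0, 0.7071, 0.7071])
print(eval_spherical_gaussian(normal, lobe_axis, 10.0, np.array([1.0, 0.9, 0.8])))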

Rendering

After running the full pipeline, you can render the results with the following command. This script loads the dataset from the data/test_out folder and the optimized lighting model from data/test_out/lighting/0.ckpt, then rerenders the scene using the full decomposition.

python -m iid.test logger=console

Acknowledgements

This project is built upon Latent Diffusion Models; we are grateful to the authors for open-sourcing their project. We use Hydra configuration management with PyTorch Lightning. Our model was trained on the high-quality InteriorVerse synthetic indoor dataset. Our rendering model was inspired by Zhu et al. 2022. Our full pipeline uses OmniData for geometry prediction.

Citation

If you find our code or paper useful, please cite

@article{Kocsis2024IID,
  author    = {Kocsis, Peter and Sitzmann, Vincent and Nie{\ss}ner, Matthias},
  title     = {Intrinsic Image Diffusion for Single-view Material Estimation},
  journal   = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2024},
}
