Shiyuan Shen, Zhongyun Bao, Wenju Xu, Chunxia Xiao
- Upgrade the LDM to Stable Diffusion 1.5 and condition on the input image instead of a text prompt.
- Replace Latent HDR Guidance with an equivalent ControlNet-based substitution to accelerate the fine-tuning process.
- Training epochs are changed to 40 for id_net, 50 for sg_net, 100 for asg_net, 50 for hdr_net, and 5 for controlnet.
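As a minimal sketch of the image-conditioning change, the input image could be encoded with the CLIP model shipped in pano_gen/openai/ and its patch tokens fed to the UNet's cross-attention in place of text embeddings. The wiring below is an illustrative assumption, not the repo's exact code, and it presumes the transformers package from requirements.txt:

```python
# Hedged sketch: CLIP image tokens replace the text-prompt embedding as the
# diffusion model's cross-attention conditioning. Names here are assumptions.
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModel

clip_path = "pano_gen/openai/clip-vit-base-patch32"
processor = CLIPImageProcessor.from_pretrained(clip_path)
encoder = CLIPVisionModel.from_pretrained(clip_path).eval()

image = Image.open("input.jpg").convert("RGB")  # hypothetical input file
pixels = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # (1, 50, 768) patch-token sequence, used where text embeddings would go
    cond = encoder(pixel_values=pixels).last_hidden_state

# cond would then be passed as the cross-attention context of the SD 1.5 UNet.
```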
IllumiDiff/
├── ckpts/ # Pre-trained model checkpoints
├── lighting_est/ # Stage 1 (id_net, sg_net, asg_net) and Stage 3 (hdr_net)
│ ├── asg_fitting_fixed_ldr_adam_batch.py # fit ASG ground truth
│ ├── sg_fitting_free_nadam.py # fit SG ground truth (see the sketch below the tree)
│ ├── dataset.py # datasets for Stage 1 and Stage 3
│ ├── dataset_processing.py # dataset processing scripts (still being organized)
│ ├── models.py # model definitions for Stage 1 and Stage 3
│ ├── modules.py # Lightning modules for Stage 1 and Stage 3
├── pano_gen/
│ ├── cldm # ControlNet core code
│ ├── configs # configuration files for model definitions
│ ├── ldm # LDM core code
│ ├── openai # CLIP model
│ ├── dataset.py # dataset for the LDM
│ ├── pano_tools.py # tools for panorama projection
│ ├── tool_add_control.py # ckpt initialization
│ ├── outpainting-mask.png # outpainting mask
├── inference_lighting_est.py # inference script for Stage 1 and Stage 3
├── inference_pano_gen.py # inference script for Stage 2
├── pipeline_lighting_est.py # pipeline for lighting estimation (Stage 1 + Stage 3)
├── pipeline_full.py # full pipeline for IllumiDiff (Stage 1 + Stage 2 + Stage 3)
├── train_lighting_est.py # training script for Stage 1 and Stage 3
├── train_pano_gen.py # training script for Stage 2
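As referenced in the tree above, sg_fitting_free_nadam.py fits spherical-Gaussian (SG) lighting ground truth. Below is a minimal sketch of that kind of fitting; beyond the NAdam optimizer named in the filename, the lobe count, parameterization, and MSE loss are all assumptions:

```python
# Minimal sketch of SG fitting: optimize N spherical-Gaussian lobes so their
# sum reproduces a target panorama. Parameterization and loss are assumptions.
import torch
import torch.nn.functional as F

def pano_directions(h, w):
    """Unit view directions for each pixel of an equirectangular panorama."""
    v, u = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    theta = (v + 0.5) / h * torch.pi        # polar angle
    phi = (u + 0.5) / w * 2 * torch.pi      # azimuth
    return torch.stack([torch.sin(theta) * torch.cos(phi),
                        torch.sin(theta) * torch.sin(phi),
                        torch.cos(theta)], dim=-1)  # (h, w, 3)

def render_sg(axes, sharpness, amplitude, dirs):
    """Evaluate a sum of SG lobes: A * exp(lambda * (dot(axis, d) - 1))."""
    axes = F.normalize(axes, dim=-1)                       # (n, 3)
    cos = dirs.reshape(-1, 3) @ axes.T                     # (hw, n)
    weights = torch.exp(sharpness.clamp(min=1e-2) * (cos - 1.0))
    return weights @ amplitude                             # (hw, 3)

target = torch.rand(64, 128, 3)           # stand-in for an LDR panorama
dirs = pano_directions(64, 128)
n = 12                                    # assumed lobe count
axes = torch.randn(n, 3, requires_grad=True)
sharpness = torch.full((n,), 10.0, requires_grad=True)
amplitude = torch.rand(n, 3, requires_grad=True)

opt = torch.optim.NAdam([axes, sharpness, amplitude], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    pred = render_sg(axes, sharpness, amplitude, dirs)
    loss = F.mse_loss(pred, target.reshape(-1, 3))
    loss.backward()
    opt.step()
```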
- Config files for all networks.
- Simplify the LDM code.
- Full dataset processing script.
- Train all stages together.
conda create -n illumidiff python=3.10
conda activate illumidiff
conda install pytorch==2.2.2 torchvision==0.17.2 pytorch-cuda=11.8 numpy=1.26.4 -c pytorch -c nvidia
conda install lightning -c conda-forge
pip install -r requirements.txt
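A quick sanity check that the pinned versions are active (expected numbers taken from the commands above):

```python
# Verify that the environment matches the pinned versions.
import lightning
import numpy
import torch
import torchvision

print("torch:", torch.__version__)              # expected 2.2.2
print("torchvision:", torchvision.__version__)  # expected 0.17.2
print("numpy:", numpy.__version__)              # expected 1.26.4
print("lightning:", lightning.__version__)
print("CUDA available:", torch.cuda.is_available())
```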
You can download them from OneDrive. Unzip clip-vit-base-patch32.zip to IllumiDiff/pano_gen/openai/clip-vit-base-patch32/, and put all ckpts in IllumiDiff/ckpts/. Note that control_sd15_clip_asg_sg.ckpt is required only for training from scratch.
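Before running anything, it may help to verify the layout. In the sketch below, only control_sd15_clip_asg_sg.ckpt is named by this README; extend the list with the actual checkpoint file names from the download:

```python
# Verify the checkpoint layout. Only control_sd15_clip_asg_sg.ckpt is named
# in this README; add the remaining file names from the OneDrive download.
from pathlib import Path

root = Path("IllumiDiff")
required = [
    root / "pano_gen/openai/clip-vit-base-patch32/config.json",
    root / "ckpts/control_sd15_clip_asg_sg.ckpt",  # training from scratch only
]
for path in required:
    status = "ok" if path.exists() else "MISSING"
    print(f"{status:7s} {path}")
```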
Full pipeline inference (the input is a single image):
python pipeline_full.py --input_path <path> --output_path <path>
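To process a folder of images, a thin wrapper can invoke the pipeline once per image; this sketch uses only the two flags shown above, and the folder names are hypothetical:

```python
# Run the full pipeline once per image by shelling out to pipeline_full.py,
# using only the --input_path/--output_path flags shown in this README.
import subprocess
from pathlib import Path

for img in sorted(Path("inputs").glob("*.jpg")):  # hypothetical folder
    out = Path("outputs") / img.stem
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(["python", "pipeline_full.py",
                    "--input_path", str(img),
                    "--output_path", str(out)], check=True)
```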
Stage 1 or Stage 3 only:
python pipeline_lighting_est.py --input_path <path> --input_pano_path <path> --output_path <path>
Single network only:
For id_net, sg_net, asg_net, or hdr_net:
python inference_lighting_est.py --task <network>
For pano_gen:
python inference_pano_gen.py
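To run all four Stage 1/3 networks in sequence, a small driver can shell out to the inference script; only the --task values shown above are assumed:

```python
# Run each Stage 1/3 network in sequence via the inference script above,
# using only the --task flag shown in this README.
import subprocess

for task in ["id_net", "sg_net", "asg_net", "hdr_net"]:
    subprocess.run(
        ["python", "inference_lighting_est.py", "--task", task],
        check=True,  # stop on the first failing network
    )
```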
See the paper for more details.
All networks are trained separately.
For id_net, sg_net, asg_net, or hdr_net:
python train_lighting_est.py --task <network>
For pano_gen:
python train_pano_gen.py --ckpt_path <path> --config_path <path>
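For orientation, the per-network epoch counts from the change list at the top could be wired into Lightning like this; the module and datamodule are placeholders, not the repo's actual classes (see models.py and modules.py for those):

```python
# Hedged sketch: per-task epoch counts (from the change list above) fed to a
# Lightning Trainer. The module/datamodule arguments are placeholders.
import lightning as L

EPOCHS = {"id_net": 40, "sg_net": 50, "asg_net": 100, "hdr_net": 50}

def train(task: str, module: L.LightningModule, datamodule: L.LightningDataModule):
    trainer = L.Trainer(max_epochs=EPOCHS[task], accelerator="auto")
    trainer.fit(module, datamodule=datamodule)
```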
For questions, please contact:
syshen@whu.edu.cn
@article{shen2025illumidiff,
title={IllumiDiff: Indoor Illumination Estimation from a Single Image with Diffusion Model},
author={Shen, Shiyuan and Bao, Zhongyun and Xu, Wenju and Xiao, Chunxia},
journal={IEEE Transactions on Visualization and Computer Graphics},
year={2025},
publisher={IEEE}
}