This repository provides the implementation of our stage-wise framework for zero-shot inverse problems with diffusion models.
This codebase is largely based on the following repositories:
We sincerely thank the authors for releasing their code.
You can set up the environment using the provided environment.yml.
conda env create -f environment.yml
conda activate map_rps

For text-to-image experiments based on Stable Diffusion, please additionally install diffusers, transformers, and accelerate:
pip install diffusers==0.35.2
pip install transformers==4.40.2
pip install accelerate

For nonlinear deblurring experiments, please additionally download the pretrained weights from https://drive.google.com/file/d/1vRoDpIsrTRYZKsOMPNbPcMtFDpCT6Foy and place them under:
./third_party/bkse/experiments/pretrained

We use 100 test images from FFHQ and 100 test images from MS-COCO.
The FFHQ dataset can be downloaded from the official FFHQ repository.
In our experiments, we randomly sample 100 images from it. We also provide our sampled subset at:
https://drive.google.com/file/d/1e_e_sVsuuS_PH80c_8xsw5JO3bZ95tZZ/view?usp=sharing
Please place the FFHQ test images under:
exp/datasets/ffhq/0
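If you prefer to draw your own subset instead of using the provided one, the random sampling step can be sketched as below. This is only a minimal illustration: the seed, the `sample_subset` helper, and the `ffhq_download` source folder are assumptions, and the result will not reproduce the exact subset used in our experiments.

```python
import random
import shutil
from pathlib import Path


def sample_subset(src_dir, dst_dir, n=100, seed=0):
    """Copy n randomly chosen PNG files from src_dir into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    # Sort first so the draw depends only on the seed, not on filesystem order.
    candidates = sorted(src.glob("*.png"))
    if len(candidates) < n:
        raise ValueError(f"need {n} images, found {len(candidates)} in {src}")
    chosen = random.Random(seed).sample(candidates, n)
    dst.mkdir(parents=True, exist_ok=True)
    for path in chosen:
        shutil.copy2(path, dst / path.name)
    return sorted(p.name for p in chosen)


# Example call ("ffhq_download" is a hypothetical folder holding the full dataset):
# sample_subset("ffhq_download", "exp/datasets/ffhq/0", n=100, seed=0)
```

Sorting the candidates before sampling keeps the selection reproducible across machines for a given seed.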
We provide the MS-COCO test subset used in our experiments at:
https://drive.google.com/file/d/1NntxwMuPR_BTKp11yeBykea9DOBlQ1i9/view?usp=sharing
Please place the MS-COCO test images under:
exp/datasets/coco
The expected dataset structure is:
exp
└── datasets
    ├── ffhq
    │   └── 0
    │       ├── 00017.png
    │       └── ...
    └── coco
        ├── metadata.csv
        └── images
            ├── 0000.png
            ├── 0001.png
            └── ...
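Before launching a run, it can help to confirm that the folders match the structure above. The following is a minimal sketch, not part of this repository; the `check_dataset_layout` helper is hypothetical, and it assumes the images are stored as `.png` files, as in the listing.

```python
from pathlib import Path


def check_dataset_layout(root="exp/datasets"):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(root)
    problems = []
    ffhq_dir = root / "ffhq" / "0"
    coco_dir = root / "coco"
    if not any(ffhq_dir.glob("*.png")):
        problems.append(f"no PNG images found in {ffhq_dir}")
    if not (coco_dir / "metadata.csv").is_file():
        problems.append(f"missing {coco_dir / 'metadata.csv'}")
    if not any((coco_dir / "images").glob("*.png")):
        problems.append(f"no PNG images found in {coco_dir / 'images'}")
    return problems


# Print anything that is missing relative to the repository root.
for problem in check_dataset_layout():
    print(problem)
```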
For the FFHQ pixel-space diffusion model, please follow the instructions in the DPS repository to download the pretrained checkpoint. Alternatively, the checkpoint can be downloaded from:
https://drive.google.com/drive/folders/1jElnRoFv7b31fG0v6pTSQkelbSX3xGZh
Please link or place the model under:
pretrained/ffhq_pixel
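If the checkpoint already lives elsewhere on disk, linking it avoids a second copy. A minimal sketch under stated assumptions: `link_checkpoint` is a hypothetical helper, the download location in the example call is made up, and symbolic links must be supported by your filesystem.

```python
from pathlib import Path


def link_checkpoint(downloaded, target_dir="pretrained/ffhq_pixel"):
    """Symlink a downloaded checkpoint into target_dir, keeping its file name."""
    source = Path(downloaded).resolve()
    if not source.is_file():
        raise FileNotFoundError(source)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / source.name
    if link.is_symlink() or link.exists():
        # Leave an existing file or link untouched rather than overwrite it.
        return link
    link.symlink_to(source)
    return link


# Example call (the source path is a hypothetical download location):
# link_checkpoint("/data/downloads/ffhq_10m.pt")
```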
For the FFHQ latent diffusion model, please download the pretrained models from the CompVis latent-diffusion repository, and place them under:
pretrained/ffhq_ldm
For Stable Diffusion v1.5, the model will be automatically downloaded from Hugging Face through diffusers.
The expected pretrained model structure is approximately:
pretrained
├── ffhq_pixel
│   └── ffhq_10m.pt
└── ffhq_ldm
    ├── first_stage_models
    │   └── ...
    └── ldm
        └── ...
Run the following command to perform restoration under a specified degradation setting:
python main.py \
--config <PATH_TO_CONFIG> \
--timesteps <TIMESTEPS> \
--deg <DEGRADATION_TYPE> \
--sigma_0 <NOISE_LEVEL> \
--lr <LR> \
--lam <LAMBDA> \
--optimize_iters <NUM_OPTIMIZATION_ITERS> \
--vae_lr <VAE_LR> \
--w_prior <PRIOR_WEIGHT> \
--eta_min <ETA_MIN> \
--noise_t <NOISE_TIMESTEP> \
--renoise_t <RENOISE_TIMESTEP> \
--algo <ALGORITHM> \
--ps_method <POSTERIOR_SAMPLING_METHOD> \
--ni

| Argument | Description | Choices / Suggested values |
|---|---|---|
| `--config` | Target config file. | `ffhq_ldm/ffhq-ldm-vq-4.yaml`, `ffhq_pixel/ffhq.yaml`, `sd15/sd15.yaml` |
| `--timesteps` | Number of diffusion timesteps used by the pretrained diffusion model. | Usually 1000 |
| `--deg` | Degradation operator for the inverse problem. | `denoise`, `sr4`, `inp`, `cs2`, `deblur_aniso`, `hdr`, `deblur_nonlinear` |
| `--sigma_0` | Standard deviation of the observation noise. | >= 0, task-dependent |
| `--lr` | Step size / learning rate for posterior sampling methods. | Tuned task by task |
| `--lam` | Weight for PSLD. | Task-dependent; only used for PSLD |
| `--optimize_iters` | Number of MAP optimization iterations in Stage 1. | Positive integer |
| `--vae_lr` | Initial step size for MAP optimization in Stage 1. | Usually 0.5–2.0 |
| `--w_prior` | Weight of the prior term. | >= 0 |
| `--eta_min` | Minimum learning rate for learning-rate decay. | Usually 1e-5–1e-2 |
| `--noise_t` | Timestep used for calculating the prior loss. | 50 for latent-space models; 10 for pixel-space models |
| `--renoise_t` | Timestep used for renoising. | Integer in [0, timesteps] |
| `--algo` | Algorithm to run. | `map_rps`, `lmap_rps` |
| `--ps_method` | Posterior sampling method. | e.g., `dps`, `psld` |
| `--ni` | Run without interaction and overwrite the target folder if needed. | Flag |
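When sweeping hyperparameters, it may be convenient to assemble the command from Python rather than edit it by hand. The sketch below only builds and prints the command line; `build_command` is a hypothetical helper. The flag names come from the table above, while the pairing of config, algorithm, and sampler and every numeric value marked as a placeholder are illustrative assumptions, not the settings used in our experiments (see the scripts for those).

```python
import shlex
import sys


def build_command(config, deg, algo, ps_method, **options):
    """Assemble the argv list for main.py from the documented flags."""
    argv = [sys.executable, "main.py",
            "--config", config, "--deg", deg,
            "--algo", algo, "--ps_method", ps_method]
    for name, value in options.items():
        if value is True:
            # Boolean flags such as --ni take no value.
            argv.append(f"--{name}")
        elif value is not None and value is not False:
            argv += [f"--{name}", str(value)]
    return argv


cmd = build_command(
    "ffhq_pixel/ffhq.yaml", "sr4", "map_rps", "dps",
    timesteps=1000,      # "Usually 1000" per the table
    noise_t=10,          # table value for pixel-space models
    sigma_0=0.05,        # placeholder
    lr=1.0,              # placeholder
    optimize_iters=100,  # placeholder
    vae_lr=1.0,          # placeholder inside the usual 0.5-2.0 range
    w_prior=1.0,         # placeholder
    eta_min=1e-3,        # placeholder inside the usual 1e-5 to 1e-2 range
    renoise_t=500,       # placeholder in [0, timesteps]
    ni=True,
)
# --lam is documented as PSLD-only, so it is left out here;
# add it back if main.py turns out to require it.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the placeholder values are replaced.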
We provide all scripts used in our experiments under the scripts/ directory.
bash scripts/<SCRIPT_NAME>.sh

Please check the corresponding script for the task-specific configuration, degradation type, and hyperparameters.
A simple example of computing evaluation metrics can be found in the calculate_metrics/ directory.
If you find this repository useful, please consider citing our paper:
@inproceedings{zhang2026stagewise,
title={Stage-wise Distortion--Perception Traversal in Zero-shot Inverse Problems with Diffusion Models},
author={Zhang, Jiawei and Liu, Ziyuan and Yan, Leon and Xiao, Zhenyu and Gu, Yuantao},
booktitle={International Conference on Machine Learning},
year={2026}
}