ErenGurs/ddpm

Diffusion Models

Create environment:

conda env create -f environment.yaml
conda activate ddpm

Download the Landscape Dataset from Kaggle as archive.zip, or the CelebA dataset (aligned and cropped) as img_align_celeba.zip, then unpack it. For the landscape archive:

$ mkdir -p landscape_img_folder/train
$ unzip archive.zip -d landscape_img_folder/train/

For my own convenience, I set up the buckets and download the datasets, checkpoints, etc. by running:

source ./setup.sh

Sampling

The model was also trained on the CelebA dataset. Download the three example checkpoints (epochs 30, 80, 490) from the bucket (or /ddpm/models_celeba/), then sample from them (saved as models/ckpt_epoch[30, 80, 490]_ddpm.pt):

python ddpm_accelerate.py --ckpt /mnt/task_runtime/ddpm/models/ckpt_epoch490.pt --ckpt_sampling
Epoch 80 ckpt

Epoch 300 ckpt

The GIF was generated with: ffmpeg -framerate 5 -i results/denoised/denoised_%3d.jpg ddpm_slow.gif

Training

Use the multi-GPU training script, which is built on 🤗 Accelerate:

accelerate launch ddpm_accelerate.py

Note: Running the same script as python ddpm_accelerate.py falls back to single-GPU mode (Accelerate has no effect). The same script can therefore be used directly for single-GPU tasks such as sampling/inference or debugging, so I decided to retire the single-GPU script ddpm.py.

The images below show the noising of images used in training (see Details on Notation for more on the notation):

  • Noised samples : from $q(\mathbf{x}_t |\mathbf{x}_0)$
  • Original samples : from $q(\mathbf{x}_0)$
Noised samples Original

Fig. Noising process for a batch of $B=12$ images, with per-image time steps $t=[962, 237, 38, 39, 988, 559, 299, 226, 985, 791, 859, 485]$
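To get a feel for how strongly each image in the figure is noised, the sketch below evaluates the signal and noise scales $\sqrt{\bar{\alpha}_t}$ and $\sqrt{1-\bar{\alpha}_t}$ for the listed time steps. It assumes the linear $\beta_t$ schedule with $T=1000$ described under Details on Notation, 1-based time steps, and NumPy; the variable names are illustrative and not taken from the repository.

```python
import numpy as np

T = 1000
# Assumed linear schedule, both endpoints included.
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)  # cumulative product of alpha_s = 1 - beta_s

# Time steps from the figure caption (assumed 1-based, so index with t - 1).
t = np.array([962, 237, 38, 39, 988, 559, 299, 226, 985, 791, 859, 485])

signal_scale = np.sqrt(alpha_bars[t - 1])       # weight on the original image
noise_scale = np.sqrt(1.0 - alpha_bars[t - 1])  # weight on the Gaussian noise

for step, s, n in zip(t, signal_scale, noise_scale):
    print(f"t={step:4d}  signal={s:.3f}  noise={n:.3f}")
```

Small $t$ (such as 38 or 39) leaves the image mostly intact, while $t$ close to 1000 leaves almost pure noise.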

Details on Notation

Noising (or diffusion) is defined as a Markovian process over states $\mathbf{x}_0, \ldots, \mathbf{x}_T$ with transitions drawn from Normal distributions, $q(\mathbf{x} _t|\mathbf{x} _{t-1}) = \mathcal{N}(\mathbf{x} _t; \sqrt{1-\beta _t} \mathbf{x} _{t-1}, \beta _t \mathbf{I} )$, for small values of $\beta_t$.
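A single transition of this chain can be sketched in a few lines. The example below is a minimal NumPy illustration, not the repository's code: `forward_step` is a hypothetical helper, the random array stands in for an image scaled to $[-1, 1]$, and the linear schedule with $T=1000$ is taken from later in this section.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # stand-in for a small RGB image

# Apply the transition step by step over the whole (assumed linear) schedule.
betas = np.linspace(1e-4, 0.02, 1000)
for beta_t in betas:
    x = forward_step(x, beta_t, rng)

# After all steps, x should look like a draw from a standard Normal.
print(x.mean(), x.std())
```

Iterating the chain like this is only for illustration; in practice $\mathbf{x}_t$ is usually drawn in one shot from $q(\mathbf{x}_t|\mathbf{x}_0)$.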

Fig. Markov chain of forward (reverse) diffusion process (Ref: Ho et al. 2020 with additions by Lil'log)

The noising distribution $q(\mathbf{x}_t|\mathbf{x}_0)$ at any time $t$ then follows in closed form from this Markovian process (details in Lil'log: What are Diffusion Models):

$$q(\mathbf{x}_t|\mathbf{x}_0) = \mathcal{N}(\mathbf{x}_t; \sqrt{\bar{\alpha} _t} \mathbf{x}_0, (1-\bar{\alpha}_t) \mathbf{I} )$$
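The closed form can be recovered by unrolling the one-step transition. The sketch below writes $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^t \alpha_s$, takes each $\boldsymbol{\epsilon}$ as an independent standard Normal sample, and uses the fact that a sum of independent zero-mean Gaussians is Gaussian with the variances added; $\bar{\boldsymbol{\epsilon}}_{t-2}$ denotes the merged noise term.

```latex
\begin{aligned}
\mathbf{x}_t &= \sqrt{\alpha_t}\,\mathbf{x}_{t-1} + \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_{t-1} \\
&= \sqrt{\alpha_t \alpha_{t-1}}\,\mathbf{x}_{t-2} + \sqrt{\alpha_t(1-\alpha_{t-1})}\,\boldsymbol{\epsilon}_{t-2} + \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_{t-1} \\
&= \sqrt{\alpha_t \alpha_{t-1}}\,\mathbf{x}_{t-2} + \sqrt{1-\alpha_t \alpha_{t-1}}\,\bar{\boldsymbol{\epsilon}}_{t-2} \\
&= \cdots \\
&= \sqrt{\bar{\alpha}_t}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}
\end{aligned}
```

The third line follows because the two noise variances add up to $\alpha_t(1-\alpha_{t-1}) + (1-\alpha_t) = 1-\alpha_t\alpha_{t-1}$.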

In training, the U-Net learns to estimate the noise added to the noisy images sampled from $q(\mathbf{x}_t|\mathbf{x}_0)$, the Normal distribution above. Estimating the noise is equivalent to estimating the mean of the de-noising step $q(\mathbf{x} _{t-1} | \mathbf{x} _{t}, \mathbf{x} _0)$, which cannot be evaluated at sampling time because $\mathbf{x}_0$ is unknown. A linear schedule $\beta_t \in [0.0001, 0.02]$ is used, with $\alpha_t= 1-\beta_t$ and $\bar{\alpha}_t = \prod _{s=1}^t \alpha_s$ (see Fig. below for $\bar{\alpha} _t$ and $\beta _t$).
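One training step under this setup can be sketched as follows. This is an illustrative NumPy version, not the repository's implementation: `q_sample` and `toy_model` are hypothetical names, `toy_model` is a placeholder for the U-Net that always predicts zero noise, time steps are assumed to be 1-based, and the mean-squared-error loss is the simplified objective from Ho et al. 2020, assumed here rather than confirmed by this README.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # alpha_bar_t for t = 1..T

def q_sample(x0, t, noise):
    """Draw x_t from q(x_t | x_0) for a batch; t holds one 1-based step per image."""
    a_bar = alpha_bars[t - 1].reshape(-1, 1, 1, 1)  # broadcast over C, H, W
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise

def toy_model(x_t, t):
    """Hypothetical stand-in for the U-Net: always predicts zero noise."""
    return np.zeros_like(x_t)

rng = np.random.default_rng(0)
B = 12
x0 = rng.uniform(-1.0, 1.0, size=(B, 3, 16, 16))  # stand-in training batch
t = rng.integers(1, T + 1, size=B)                # one random time step per image
noise = rng.standard_normal(x0.shape)             # the target the network should predict

x_t = q_sample(x0, t, noise)
loss = np.mean((noise - toy_model(x_t, t)) ** 2)  # MSE between true and predicted noise
print(loss)
```

With the zero-predicting placeholder the loss sits near 1, the variance of the noise; a trained network would push it lower.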

Fig. For the given $\beta_t$ schedule and large $t$, $q(\mathbf{x}_t|\mathbf{x}_0)$ approaches the zero-mean, unit-variance Normal distribution $\mathcal{N}(\mathbf{x} _T; \mathbf{0}, \mathbf{I})$ (here $T=1000$)
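The claim in the caption can be checked numerically. The sketch below uses only the Python standard library and assumes the linear schedule includes both endpoints over $T=1000$ steps; it computes $\bar{\alpha}_T$ and the resulting mean scale and variance of $q(\mathbf{x}_T|\mathbf{x}_0)$.

```python
import math

T = 1000
beta_start, beta_end = 1e-4, 0.02

# Linear schedule from beta_start to beta_end (endpoints assumed inclusive).
betas = [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]

# Running product of alpha_s = 1 - beta_s gives alpha_bar_t.
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)

mean_scale_T = math.sqrt(alpha_bars[-1])  # factor multiplying x_0 in the mean
variance_T = 1.0 - alpha_bars[-1]         # per-pixel variance at t = T

print(alpha_bars[-1], mean_scale_T, variance_T)
```

Here $\bar{\alpha}_T$ comes out at about $4 \times 10^{-5}$, so the mean is scaled to well under 1% of $\mathbf{x}_0$ and the variance is within $10^{-4}$ of 1.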


References:

[1] Started the code based on outlier's Diffusion-Models-pytorch repo.

[2] Also used his YouTube tutorial [Diffusion Models | Pytorch Implementation].

[3] Also referred to and used Phil Wang's (lucidrains) denoising-diffusion-pytorch repo.

[4] Lilian Weng's blog (Lil'Log) https://lilianweng.github.io/posts/2021-07-11-diffusion-models/

[5] Ho et al. 2020 "Denoising Diffusion Probabilistic Models" paper with original implementation and its pytorch reimplementation by Patrick Esser.

[6] Referred to CompVis' Latent Diffusion Models repo and paper Rombach et al. 2022 "High-Resolution Image Synthesis with Latent Diffusion Models"

[7] HuggingFace's Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX. For quickly testing DDPM here

[8] MMagic from OpenMMLab with nice resource on Stable Diffusion which mainly builds on HuggingFace's diffusers.
