## DDIM with similar noise

In this notebook, we run some experiments with Denoising Diffusion Implicit Models (DDIM). Unlike Denoising Diffusion Probabilistic Models (DDPM), DDIM sampling in its standard form (no noise added during the backward process) is deterministic: from one specific full noise image, it always generates the same clear image. A deterministic backward process is essential here. With a probabilistic sampler, a small amount of fresh noise is added back into the image after every backward step, so the results can vary significantly even with the same starting point.
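
To make the difference concrete, here is a minimal one-dimensional sketch of a reverse step in the form given in the DDIM paper, where a parameter `eta` scales the noise that is added back. The function names, the `tanh` stand-in for the trained network, and the short `ALPHA_BARS` schedule are illustrative assumptions and not what `diffusers` does internally.

```python
import math
import random

# Hypothetical cumulative signal levels (alpha-bar), from almost clean to almost pure noise.
ALPHA_BARS = [0.999, 0.9, 0.7, 0.5, 0.3, 0.1, 0.02]


def toy_noise_predictor(x_t):
    """Stand-in for the trained U-Net; any fixed deterministic function works here."""
    return math.tanh(x_t)


def reverse_step(x_t, alpha_bar_t, alpha_bar_prev, eta, rng):
    """One reverse step; eta=0 is deterministic, eta>0 adds fresh noise."""
    eps = toy_noise_predictor(x_t)
    # Estimate of the clean sample implied by the predicted noise.
    x0_pred = (x_t - math.sqrt(1 - alpha_bar_t) * eps) / math.sqrt(alpha_bar_t)
    # Standard deviation of the noise that is added back in this step.
    sigma = (eta
             * math.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t))
             * math.sqrt(1 - alpha_bar_t / alpha_bar_prev))
    direction = math.sqrt(max(1 - alpha_bar_prev - sigma ** 2, 0.0)) * eps
    fresh_noise = rng.gauss(0.0, 1.0) if sigma > 0 else 0.0
    return math.sqrt(alpha_bar_prev) * x0_pred + direction + sigma * fresh_noise


def sample(x_start, eta, seed):
    """Run all reverse steps from the noisiest level down to the cleanest one."""
    rng = random.Random(seed)
    x = x_start
    for i in range(len(ALPHA_BARS) - 1, 0, -1):
        x = reverse_step(x, ALPHA_BARS[i], ALPHA_BARS[i - 1], eta, rng)
    return x


# Same starting value, different random seeds.
print(sample(1.5, eta=0.0, seed=0) == sample(1.5, eta=0.0, seed=1))  # deterministic sampler
print(sample(1.5, eta=1.0, seed=0) == sample(1.5, eta=1.0, seed=1))  # stochastic sampler
```

With `eta=0` the seed has no effect, so the first comparison is true. With `eta=1` the two runs draw different noise at every step and end up at different values.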

We will do two experiments:

1. **Experiment (Patch condition):** Change a small patch in the full noise image and generate two clear images from it. You can choose the patch size and position yourself.
2. **Experiment (Full noise condition):** Modify the full noise images in such a way that the mathematical distance between the two starting noise images is very small. You can control how much the noise is changed by selecting a value for `noise_scaling_factor`.

Each experiment demonstrates different effects:

1. **Experiment (Patch condition):** Changing a small patch in the full noise image affects not only the corresponding patch in the clear image but the entire image. This shows that diffusion models capture dependencies between pixels, which they learn from the distribution of the training dataset.
2. **Experiment (Full noise condition):** Slightly changing the noise leads to only small changes in the resulting clear image.

As you will see, the Euclidean distance and a heatmap are displayed after each experiment to give you an objective and visual measure of where and how the images differ. Feel free to try out different values in both experiments. Some interesting research questions you might explore include:

* Is there a correlation between the Euclidean distance of the noise images and that of the clear images?
* Is the Euclidean distance of the noise images generally smaller or larger than that of the clear images? Try to explain the answer based on how the diffusion process works.
* Given a similar Euclidean distance of noise images in both experiments, is the result changed more strongly in the patch condition or the full noise condition?
* In Experiment 2: How much do you need to change the noise image to get substantially different results?
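
For the comparison between the two conditions, it helps to know roughly which settings give similar noise distances. The sketch below assumes that every element is independent standard normal noise (as produced by `torch.randn`) and uses the root of the expected squared distance as an approximation of the Euclidean distance. The helper names are hypothetical.

```python
import math


def expected_patch_distance(patch_size, channels=3):
    """Approximate distance when a square patch is replaced by fresh noise.

    Each replaced element is the difference of two independent N(0, 1)
    values, so its expected squared difference is 2.
    """
    return math.sqrt(2 * channels * patch_size ** 2)


def expected_full_distance(scaling_factor, image_size, channels=3):
    """Approximate distance for n1 = sqrt(1 - s^2) * n0 + s * fresh_noise."""
    n_elements = channels * image_size ** 2
    per_element = 2 * (1 - math.sqrt(1 - scaling_factor ** 2))
    return math.sqrt(n_elements * per_element)


def matching_scaling_factor(patch_size, image_size):
    """Scaling factor whose expected distance equals that of the given patch."""
    ratio = (patch_size / image_size) ** 2  # fraction of the image that is replaced
    return math.sqrt(1 - (1 - ratio) ** 2)


s = matching_scaling_factor(patch_size=32, image_size=256)
print(round(s, 4))
print(round(expected_patch_distance(32), 2))
print(round(expected_full_distance(s, 256), 2))
```

The last two printed values agree, so a 32-pixel patch in Experiment 1 and the printed scaling factor in Experiment 2 should start from noise images that are about equally far apart.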

**Note on performance issues:** If the image generation process takes too long, you can decrease the number of steps. This saves computation time but also reduces the quality of the results. To do that, pass a lower value to `ddim_scheduler.set_timesteps(50)` (see code cell 3).
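
To see what a lower value changes, the sketch below lists which of the training time steps would be visited. It assumes 1000 training steps and an evenly spaced selection starting from step 0, which we take to be the scheduler's default behavior. The helper name is hypothetical, and the values actually used can be read from `ddim_scheduler.timesteps`.

```python
def evenly_spaced_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Return a descending subset of the training time steps (assumed spacing)."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio for i in range(num_inference_steps)][::-1]


for n in (50, 20, 10):
    steps = evenly_spaced_timesteps(n)
    print(n, steps[:3], "...", steps[-1])
```

Fewer steps mean larger jumps between consecutive noise levels, which is why the image quality tends to drop.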

## Setup

In [None]:
from functions import *

In [None]:
import torch
import diffusers
from PIL import Image
from tqdm import tqdm
import os

In [None]:
# setup
model_id = "google/ddpm-bedroom-256" # "google/ddpm-celebahq-256"
model = diffusers.UNet2DModel.from_pretrained(model_id)
ddim_scheduler = diffusers.DDIMScheduler.from_pretrained(model_id)
# time steps for generation process, can be decreased to save computation time, images might be of lower quality
ddim_scheduler.set_timesteps(50)

In [None]:
# input preparation
image_size = model.config.sample_size # get image size
noise = torch.randn((1, 3, image_size, image_size)) # sample random noise

## 1. Experiment

In [None]:
# let the user choose the patch size and position
patch_size, patch_position_x, patch_position_y = user_input_patch()

# prepare input
noise = torch.randn((1, 3, image_size, image_size))  # sample random noise
noises = [noise.clone() for _ in range(2)]  # duplicate noise
# replace a small patch in the second full noise image with freshly sampled noise
y0, x0 = patch_position_y, patch_position_x
noises[1][:, :, y0:y0 + patch_size, x0:x0 + patch_size] = torch.randn((1, 3, patch_size, patch_size))
euclidean = torch.norm(noises[0] - noises[1]).item()  # Euclidean distance between the two noise images
# heatmap: absolute difference averaged over the color channels, repeated to 3 channels for display
heatmap = torch.abs(noises[0] - noises[1]).mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)
# display both noise images, the Euclidean distance, and the heatmap
show_table([[tensor_as_html(noises[0]), tensor_as_html(noises[1]), tensor_as_html(heatmap)],
            ["", f"Euclidean distance: {euclidean}", "Heatmap"]])
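
The two measures shown in the table can also be written out by hand. The following sketch uses tiny nested lists instead of tensors so that it runs without PyTorch. The function names are hypothetical, and the layout `[channel][row][column]` is an assumption that mirrors a single image of the batch.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences over all elements."""
    return math.sqrt(sum((pa - pb) ** 2
                         for ca, cb in zip(a, b)
                         for ra, rb in zip(ca, cb)
                         for pa, pb in zip(ra, rb)))


def heatmap(a, b):
    """Absolute difference per pixel, averaged over the channels."""
    channels, height, width = len(a), len(a[0]), len(a[0][0])
    return [[sum(abs(a[c][y][x] - b[c][y][x]) for c in range(channels)) / channels
             for x in range(width)]
            for y in range(height)]


# Two 3-channel images of size 2x2 that differ only in the top-left pixel.
img_a = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
img_b = [[[1.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]]]

print(euclidean_distance(img_a, img_b))  # 3.0
print(heatmap(img_a, img_b))
```

The distance is a single number for the whole image, while the heatmap keeps one value per pixel and therefore shows where the two images differ.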

In [None]:
# display the images alternately
from time import sleep
for i in range(10):
    show_images(noises[i % 2])
    sleep(0.5)

In [None]:
# output generation
images = list()
for current in noises:
    for t in tqdm(ddim_scheduler.timesteps):
        with torch.no_grad():
            predicted_noise = model(current, t).sample
            current = ddim_scheduler.step(predicted_noise, t, current).prev_sample
    images.append(current)

In [None]:
# show output
euclidean = torch.norm(images[0] - images[1]).item()  # Euclidean distance between the two clear images
# heatmap: absolute difference averaged over the color channels, repeated to 3 channels for display
heatmap = torch.abs(images[0] - images[1]).mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)
# display both images, the Euclidean distance, and the heatmap
show_table([[tensor_as_html(images[0]), tensor_as_html(images[1]), tensor_as_html(heatmap)],
            ["", f"Euclidean distance: {euclidean}", "Heatmap"]])

In [None]:
# save input and output
os.makedirs("../output", exist_ok=True)  # make sure the output folder exists
for i in range(len(noises)):
    tensor_as_image(noises[i]).save(f"../output/similar_ddim_noise_{i}.png")
    tensor_as_image(images[i]).save(f"../output/similar_ddim_image_{i}.png")

## 2. Experiment

In [None]:
# let the user choose a scaling factor
noise_scaling_factor = user_input_noise_scaling_factor()

# prepare input
noise = torch.randn((1, 3, image_size, image_size))  # sample random noise
noises = [noise.clone() for _ in range(2)]  # duplicate noise
# perturb the second full noise image: shrink it slightly and add fresh noise scaled by noise_scaling_factor;
# the two weights keep the variance at 1 (requires 0 <= noise_scaling_factor <= 1)
fresh_noise = torch.randn((1, 3, image_size, image_size))
noises[1] = ((1 - noise_scaling_factor**2) ** 0.5) * noises[1] + noise_scaling_factor * fresh_noise
euclidean = torch.norm(noises[0] - noises[1]).item()  # Euclidean distance between the two noise images
# heatmap: absolute difference averaged over the color channels, repeated to 3 channels for display
heatmap = torch.abs(noises[0] - noises[1]).mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)
# display both noise images, the Euclidean distance, and the heatmap
show_table([[tensor_as_html(noises[0]), tensor_as_html(noises[1]), tensor_as_html(heatmap)],
            ["", f"Euclidean distance: {euclidean}", "Heatmap"]])
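
The mixing rule in the cell above is chosen so that the perturbed noise is still standard normal. Writing $s$ for `noise_scaling_factor`, $n_0$ for the original noise and $n'$ for the fresh noise (both assumed independent with unit variance per element), the per-element variance and the expected squared difference work out as follows:

$$
\begin{aligned}
n_1 &= \sqrt{1 - s^2}\, n_0 + s\, n' \\
\operatorname{Var}(n_1) &= (1 - s^2) + s^2 = 1 \\
\mathbb{E}\big[(n_0 - n_1)^2\big] &= \big(1 - \sqrt{1 - s^2}\big)^2 + s^2 = 2\big(1 - \sqrt{1 - s^2}\big)
\end{aligned}
$$

For small $s$ the last expression is approximately $s^2$ per element, so the Euclidean distance between the two noise images grows roughly linearly with $s$, at about $s\sqrt{d}$ for an image with $d$ elements.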

In [None]:
# output generation
images = list()
for current in noises:
    for t in tqdm(ddim_scheduler.timesteps):
        with torch.no_grad():
            predicted_noise = model(current, t).sample
            current = ddim_scheduler.step(predicted_noise, t, current).prev_sample
    images.append(current)

In [None]:
# show output
euclidean = torch.norm(images[0] - images[1]).item()  # Euclidean distance between the two clear images
# heatmap: absolute difference averaged over the color channels, repeated to 3 channels for display
heatmap = torch.abs(images[0] - images[1]).mean(dim=1, keepdim=True).repeat(1, 3, 1, 1)
# display both images, the Euclidean distance, and the heatmap
show_table([[tensor_as_html(images[0]), tensor_as_html(images[1]), tensor_as_html(heatmap)],
            ["", f"Euclidean distance: {euclidean}", "Heatmap"]])