# S&DS 431 final project: code demo

- Title: A study on Distribution Matching Distillation Models
- Authors: Sida Chen, Jack Chen

This Jupyter notebook is intended for demonstration only. You can run our project without it by invoking each Python script from the command line. The notebook serves solely as a driver that connects the scripts, explains their inputs and outputs, and runs them in the right order.

## Dependencies

The exact package versions used to run this project are listed in `requirements.txt`. Apart from common libraries (such as `diffusers`, `transformers`, and `torch`), the most notable dependencies are `clip` and `pytorch-fid`.

## Data preparation

We start by preparing the real and generated images. The real images (used for FID analysis) are a subset of the ImageNet-64 dataset. You can download and unzip the archive with:

```bash
wget -c https://image-net.org/data/downsample/Imagenet64_train_part1_npz.zip
unzip Imagenet64_train_part1_npz.zip
```

The dataset consists of binary NumPy archives (`.npz`), so we first need to convert them to PNG images. This is implemented in `process_imagenet.py`.

In [None]:
from process_imagenet import extract_images_from_npz

npz_folder = "Imagenet64_train_part1_npz/"
output_folder = "real_images/"
extract_images_from_npz(npz_folder, output_folder)
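For reference, the core of such a conversion can be sketched as follows. This is a minimal sketch, not the actual contents of `process_imagenet.py`. It assumes each archive stores a `data` array of shape `(N, 12288)` in `uint8`, with every row laid out channel-first (the full red plane, then green, then blue). The helper name `rows_to_images` is hypothetical.

```python
import numpy as np


def rows_to_images(data: np.ndarray, size: int = 64) -> np.ndarray:
    """Convert flat rows of length 3*size*size into (N, size, size, 3) uint8 images.

    Assumes a channel-first layout within each row (R plane, G plane, B plane).
    """
    n = data.shape[0]
    # (N, 3*size*size) -> (N, 3, size, size) -> (N, size, size, 3)
    images = data.reshape(n, 3, size, size).transpose(0, 2, 3, 1)
    return np.ascontiguousarray(images, dtype=np.uint8)


# Synthetic stand-in for the "data" array loaded from one .npz archive
fake_rows = np.zeros((2, 3 * 64 * 64), dtype=np.uint8)
fake_rows[0, : 64 * 64] = 255  # first image: red plane fully on
images = rows_to_images(fake_rows)
print(images.shape)  # (2, 64, 64, 3)
```

Each resulting array can then be written to disk with Pillow, for example `Image.fromarray(images[i]).save(...)`.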

The generated images are produced by the pre-trained `tianweiy/DMD2` model. The core logic is implemented in `schedule_variation.py`. If you run this file from the command line, you need to pass the prompt, the scheduler, and the number of inference steps, for example:

```bash
python schedule_variation.py --prompt "a cat" --scheduler "linear" --num-inference-steps 20
```

Or you can invoke the function like this:

In [None]:
from schedule_variation import generate_image

# First argument: a list of jobs, each a (prompt, schedule, steps) tuple; second argument: the seed
generate_image([("a cat", "linear", 20)], 42)

The command-line interface does not support setting a fixed random seed, so its output is written to `output/None`. In our investigations we fix the random seed, so each job writes to `output/{seed}`. To reproduce all the outputs we use, run `all_jobs.py`. Here are all the jobs:

In [None]:
from all_jobs import prompts, jobs, seeds

print("Prompts:")
print("\n".join(prompts))
print("Jobs:")
print("\n".join([f"{prompt} {schedule} {steps}" for prompt, schedule, steps in jobs]))
print("Seeds:")
print(", ".join([str(seed) for seed in seeds]))
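The job list is a grid over prompts, schedulers, and step counts. A minimal sketch of how such a grid could be built is given here. The schedulers and step counts are the ones visible in our output file names, the prompt list is cut down to a single example, and the variable names are illustrative rather than the ones defined in `all_jobs.py`.

```python
from itertools import product

# Illustrative values only; the real lists live in all_jobs.py
example_prompts = ["A photo of llama"]
example_schedules = ["linear", "cosine", "exponential"]
example_steps = [3, 4, 8, 15]

# One job per (prompt, schedule, steps) combination
example_jobs = list(product(example_prompts, example_schedules, example_steps))

print(len(example_jobs))  # 12
print(example_jobs[0])    # ('A photo of llama', 'linear', 3)
```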

Send them off with:

In [None]:
for seed in seeds:
    generate_image(jobs, seed)

This produces a folder tree in the following format:

```plain
output
├── 223
│   ├── A_photo_of_llama_cosine_15.png
│   ├── A_photo_of_llama_cosine_3.png
│   ├── A_photo_of_llama_cosine_4.png
│   ├── A_photo_of_llama_cosine_8.png
│   ├── A_photo_of_llama_exponential_15.png
│   ├── A_photo_of_llama_exponential_3.png
│   ├── A_photo_of_llama_exponential_4.png
│   ├── A_photo_of_llama_exponential_8.png
│   ├── A_photo_of_llama_linear_15.png
│   ├── A_photo_of_llama_linear_3.png
│   ├── A_photo_of_llama_linear_4.png
│   ├── A_photo_of_llama_linear_8.png
│   ├── a_shiba_inu_wearing_cosine_15.png
│   ├── ...
│   └── times.csv
├── ...
└── 69
    ├── ...
    └── times.csv
```

To facilitate group-based analysis, we regroup the images into another folder, so that they are grouped by scheduler/steps and then identified by prompt/seed. This is implemented in `reorder_images.py`.

In [None]:
from reorder_images import reorder_images

reorder_images("output", "output-alt")
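The regrouping can be sketched roughly as follows. This is a sketch under stated assumptions, not the actual `reorder_images.py`: it assumes source files are named `{prompt}_{scheduler}_{steps}.png` inside `output/{seed}/`, and that the target layout is `output-alt/{scheduler}_{steps}/{prompt}_{seed}.png`. The function name `regroup` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def regroup(src: Path, dst: Path) -> None:
    """Copy src/{seed}/{prompt}_{scheduler}_{steps}.png
    to dst/{scheduler}_{steps}/{prompt}_{seed}.png."""
    for image in sorted(src.glob("*/*.png")):
        seed = image.parent.name
        # Split from the right: the prompt itself may contain underscores
        prompt, scheduler, steps = image.stem.rsplit("_", 2)
        group_dir = dst / f"{scheduler}_{steps}"
        group_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, group_dir / f"{prompt}_{seed}.png")


# Small self-contained demo on empty placeholder files
root = Path(tempfile.mkdtemp())
for seed in ("223", "69"):
    (root / "output" / seed).mkdir(parents=True)
    (root / "output" / seed / "A_photo_of_llama_cosine_15.png").touch()

regroup(root / "output", root / "output-alt")
regrouped = sorted(
    p.relative_to(root).as_posix() for p in (root / "output-alt").rglob("*.png")
)
print(regrouped)
```

Splitting the file stem from the right keeps prompts with underscores intact, since only the last two fields are treated as scheduler and step count.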

## Calculating scores

We will calculate the following metrics:

- Inference time - for evaluating runtime performance
- [CLIP score](https://arxiv.org/pdf/2104.08718) - for evaluating text-to-image alignment
- [Fréchet inception distance (FID)](https://en.wikipedia.org/wiki/Fr%C3%A9chet_inception_distance) - for evaluating image quality
- MSE against paper results - for evaluating output stability and consistency

Inference time is already collected during generation.
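As an illustration of how per-job timings could be recorded into a file like `times.csv`, here is a minimal sketch. The column names and the `fake_generate` stand-in are hypothetical; the actual layout of `times.csv` is determined by `schedule_variation.py`.

```python
import csv
import io
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_generate(prompt, schedule, steps):
    """Stand-in for the real generation call; sleeps briefly per step."""
    time.sleep(0.001 * steps)
    return f"{prompt}_{schedule}_{steps}.png"


buffer = io.StringIO()  # a real run would open output/{seed}/times.csv instead
writer = csv.writer(buffer)
writer.writerow(["prompt", "schedule", "steps", "seconds"])  # hypothetical columns
rows = []
for job in [("a cat", "linear", 3), ("a cat", "linear", 8)]:
    _, seconds = timed(fake_generate, *job)
    rows.append((*job, seconds))
    writer.writerow([*job, f"{seconds:.4f}"])
print(buffer.getvalue())
```

When timing GPU work, it is worth calling `torch.cuda.synchronize()` before reading the clock, because CUDA kernels are launched asynchronously.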

CLIP score is a bit tricky to compute because embedding all images at once causes CUDA memory errors. Our approach is to process only one seed at a time: `clip_score.py` accepts a single seed per run, and we use Bash to run the script over all seeds. The script writes an `output/{seed}/similarities.csv` file containing all CLIP scores for that seed. You can compute one seed with:

In [None]:
from clip_score import compute_clip_score

compute_clip_score(42)

Or compute all of them with the following Bash script:

```bash
for dir in output/*; do
  if [ -d "$dir" ]; then
    suffix=${dir#output/}
    python clip_score.py --seed "$suffix"
  fi
done
```
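For reference, the CLIP score defined in the paper linked above is a rescaled cosine similarity between the image and caption embeddings, clipped at zero. The sketch below applies that formula to toy vectors. Real embeddings would come from a CLIP model, and whether `clip_score.py` applies the 2.5 scaling or stores raw cosine similarities is not shown here, so treat the function as illustrative.

```python
import numpy as np


def clip_score(image_emb: np.ndarray, text_emb: np.ndarray, w: float = 2.5) -> np.ndarray:
    """Per-pair score: w * max(cos(image, text), 0)."""
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    cosine = np.sum(image_emb * text_emb, axis=1)
    return w * np.clip(cosine, 0.0, None)


# Toy 2-D "embeddings": one aligned pair, one orthogonal pair, one opposite pair
image_emb = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
text_emb = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
scores = clip_score(image_emb, text_emb)
print(scores)
```

The aligned pair receives the maximum score of 2.5, while the orthogonal and opposite pairs are both clipped to 0.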

FID scores are calculated directly from the command line. The `pytorch-fid` command-line tool accepts one folder of real images and one folder of generated images, so to run it over all groups and write the results to a CSV, we again use Bash:

```bash
echo "scheduler,steps,FID" > fid.csv
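for dir in output-alt/*; do
  # Strip the parent folder so only the group name remains
  group=$(basename "$dir")
  # pytorch-fid prints a line starting with "FID: "; keep only the number
  fid_output=$(pytorch-fid real_images/ "$dir" --batch-size 64 --device cuda | grep 'FID: ' | sed 's/FID: *//')
  scheduler=$(echo "$group" | cut -d'_' -f1)
  steps=$(echo "$group" | cut -d'_' -f2)
  echo "$scheduler,$steps,$fid_output" >> fid.csv
done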
```
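To make the metric concrete: FID is the Fréchet distance between two Gaussians fitted to Inception features of the real and generated images, i.e. the squared distance between the means plus `Tr(S1 + S2 - 2 (S1 S2)^(1/2))` for the covariances `S1` and `S2`. The sketch below computes that distance from given means and covariances with NumPy only. It is not the `pytorch-fid` implementation, and it skips feature extraction entirely.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2) -> float:
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr(sqrt(sigma1 @ sigma2)) equals the sum of square roots of the product's
    # eigenvalues, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


# Toy check with identity covariances in 4 dimensions
dim = 4
mu = np.zeros(dim)
sigma = np.eye(dim)
shifted = mu + np.array([2.0, 0.0, 0.0, 0.0])
same = frechet_distance(mu, sigma, mu, sigma)
moved = frechet_distance(mu, sigma, shifted, sigma)
print(same, moved)
```

Identical distributions give a distance of 0, and shifting the mean by 2 along one axis with equal covariances gives the squared shift, 4.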

## Visualization

We first create collages for some sample images. This is implemented in `plot_output.py`.

In [None]:
from plot_output import make_demos

make_demos(431, "plots/output_demos")

Plot for the scheduled timesteps (`plot_schedule.py`):

In [None]:
from plot_schedule import plot_schedule

plot_schedule("plots/schedule.png")

Plot for inference time (`plot_times.py`):

In [None]:
from plot_times import plot_times

plot_times("plots/time.png")

Plot for CLIP scores (`plot_similarities.py`):

In [None]:
from plot_similarities import plot_similarities

plot_similarities("plots/similarity.png")

Plot for FID scores (`plot_fid.py`):

In [None]:
from plot_fid import plot_fid

plot_fid("plots/fid_score.png")

Plot for MSE against paper results (`mse_score.py`):

In [None]:
from mse_score import plot_mse

plot_mse("plots/mse_score.png")
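The per-image MSE behind such a plot can be sketched as follows. This is a minimal sketch, assuming both images are `uint8` arrays of the same shape and that pixel values are scaled to `[0, 1]` before comparison. The helper `image_mse` is hypothetical, and `mse_score.py` may normalize or align images differently.

```python
import numpy as np


def image_mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two same-shaped images, with pixels scaled to [0, 1]."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Cast to float first so the subtraction cannot wrap around in uint8
    a = a.astype(np.float64) / 255.0
    b = b.astype(np.float64) / 255.0
    return float(np.mean((a - b) ** 2))


ours = np.zeros((64, 64, 3), dtype=np.uint8)
reference = np.full((64, 64, 3), 255, dtype=np.uint8)
print(image_mse(ours, ours))       # 0.0
print(image_mse(ours, reference))  # 1.0
```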