# Diffusion 1

## Inference 1

Adapted from the [fast.ai repo](https://github.com/fastai/diffusion-nbs).

## Workflow

#### Drive

If you need to load from or save to your Google Drive (on Colab):

```python
import os
import sys

if 'google.colab' in sys.modules:
    from google.colab import drive
    drive.mount('/content/drive/')
    # change to your working directory (absolute path, so the cell can be re-run safely)
    os.chdir('/content/drive/My Drive/IS53055B-DMLCP/DMLCP/python')
```

#### Huggingface login

Some models and datasets require you to be logged into your Huggingface (HF) account, and so does pushing your own model to HF (same as GitHub, but for models).

For that, you need to create an account [here](https://huggingface.co/), then go to ['/settings/tokens'](https://huggingface.co/settings/tokens) to create an access token.

```python
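from pathlib import Path
from huggingface_hub import notebook_login

# Older huggingface_hub versions store the token in ~/.huggingface, newer ones in ~/.cache/huggingface.
token_dirs = [Path.home()/'.huggingface', Path.home()/'.cache'/'huggingface']
if not any((d/'token').exists() for d in token_dirs):
    notebook_login()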
```

#### Install

1. On Colab, just use `pip` to install Huggingface libraries (see below).

2. Locally, the install is the same as the one used for Language models, see [`setup.md`](https://github.com/jchwenger/DMLCP/blob/main/setup.md#pytorch--huggingfacegradio).

```python
import sys

if 'google.colab' in sys.modules:
    !pip install --upgrade transformers diffusers accelerate
```

```python
from pathlib import Path

import matplotlib.pyplot as plt
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline

# Pick the best available device: CUDA GPU, Apple MPS, or CPU.
# See: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html#creating-models
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
```

```python
torch.manual_seed(1)
```

```python
MODEL_ID = "CompVis/stable-diffusion-v1-4"
# Check out other models by CompVis, different flavours: https://huggingface.co/CompVis
# Runway also has a few: https://huggingface.co/runwayml, for instance "runwayml/stable-diffusion-v1-5"
# or Stability AI (scroll down to models): https://huggingface.co/stabilityai?search_models=stable-diffusion

pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID,
    variant="fp16",
    torch_dtype=torch.float16,
    safety_checker=None  # remove NSFW filter
).to(device)

# Note: removing the filter is no licence to do harm, it is to give *you* the responsibility
# of your use. (Also, the HF safety_checker is very, very conservative, and rejects
# a lot of abstract images.)
# (you can also do it later btw: pipe.safety_checker = None)
```

```python
prompt = "a photograph of an astronaut riding a horse"
```

```python
pipe(prompt).images[0] # you can generate rectangular images by passing
                       # a height and a width argument
```
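
For rectangular images, note that the Stable Diffusion pipelines in `diffusers` expect `height` and `width` to be divisible by 8. The small helper below (`snap_to_multiple` is a hypothetical name, not part of any library) is a minimal sketch of how arbitrary sizes could be rounded down before being passed to `pipe`:

```python
def snap_to_multiple(value, multiple=8):
    """Round `value` down to the nearest multiple (never below one multiple)."""
    return max(multiple, (value // multiple) * multiple)

width, height = snap_to_multiple(770), snap_to_multiple(515)
print(width, height)  # 768 512

# In the notebook (assumes `pipe` and `prompt` from the cells above):
# pipe(prompt, height=height, width=width).images[0]
```

Sizes far from the resolution the model was trained on (512x512 for Stable Diffusion v1) tend to give less coherent images, so it may be worth starting close to that.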

Setting the seed lets you control the randomness and reproduce your outputs.

```python
torch.manual_seed(1024)
pipe(prompt).images[0]
```
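
Why does the seed matter? Generation starts from pseudo-random noise, and a seeded generator always produces the same sequence of numbers. The toy sketch below uses Python's standard `random` module as a stand-in (it is not the generator the pipeline uses, and `toy_noise` is a made-up helper) to show the principle:

```python
import random

def toy_noise(seed, n=4):
    """Return `n` pseudo-random Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

same_a = toy_noise(1024)
same_b = toy_noise(1024)
other = toy_noise(1)

print(same_a == same_b)  # True: same seed, same noise
print(same_a == other)   # False: different seed, different noise

# In the notebook, a dedicated generator can also be passed to the pipeline
# instead of seeding globally (assumes `pipe` and `prompt` from above):
# pipe(prompt, generator=torch.Generator().manual_seed(1024)).images[0]
```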

To see the generation process at work, compare the results after only a few denoising steps (`num_inference_steps`):

```python
torch.manual_seed(1024)
pipe(prompt, num_inference_steps=3).images[0]
```

```python
torch.manual_seed(1024)
pipe(prompt, num_inference_steps=16).images[0]
```
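
Rather than copying cells by hand, the settings to compare could be built programmatically. The sketch below (`make_grid` is a hypothetical helper, and the values are only examples) creates one dictionary of keyword arguments per combination of options:

```python
from itertools import product

def make_grid(**options):
    """Return one dict of keyword arguments per combination of the given option lists."""
    keys = list(options)
    return [dict(zip(keys, combo)) for combo in product(*options.values())]

settings = make_grid(num_inference_steps=[3, 16, 50], guidance_scale=[3.0, 7.5])
for kwargs in settings:
    print(kwargs)

# In the notebook, each dict could then be unpacked into the pipeline call
# (assumes `pipe` and `prompt` from the cells above):
# images = []
# for kwargs in settings:
#     torch.manual_seed(1024)  # same seed each time, so only the settings change
#     images.append(pipe(prompt, **kwargs).images[0])
```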

## The Classifier-Free Guidance Parameter

The `guidance_scale` parameter controls how strongly the prompt steers the generation: the higher it is, the more closely the model will try to stick to the prompt. A lower number produces more random (creative?) results.

Default: 7.5
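
With classifier-free guidance, the model makes two noise predictions at each step, one with the prompt and one without, and the guidance scale pushes the result away from the unconditional prediction towards the prompt-conditioned one. A minimal sketch of that combination, using toy numbers instead of real tensors (`apply_guidance` is a hypothetical name):

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Combine unconditional and prompt-conditioned predictions, element by element."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

for g in [1.0, 7.5]:
    print(g, apply_guidance(uncond, cond, g))
# 1.0 [1.0, 3.0]
# 7.5 [7.5, 16.0]
```

At a scale of 1 the result is just the prompt-conditioned prediction; larger values exaggerate the difference between the two.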

```python
num_rows, num_cols = 4, 4
prompts = [prompt] * num_cols
```

```python
guidances = [1.1, 3, 7, 14]
result = [pipe(prompts, guidance_scale=g).images for g in guidances]
```

```python
result
```

```python
result[-1][0] # Colab is smart, it displays PIL images automatically
```

A convenience function to display a batch of images.

```python
# https://matplotlib.org/stable/gallery/axes_grid1/simple_axesgrid.html
from mpl_toolkits.axes_grid1 import ImageGrid

def plot_images(imgs, rows=1, cols=None, figsize=(12,8), title=None):
    fig = plt.figure(figsize=figsize)   # control figure size
    grid = ImageGrid(
        fig, 111,                       # similar to subplot(111) | see: https://stackoverflow.com/a/11404223
        nrows_ncols=(rows, cols if cols is not None else len(imgs)),  # creates one row of images
        axes_pad=0.1,                   # pad between axes in inch
    )
    if title is not None:
        # https://matplotlib.org/3.2.1/gallery/subplots_axes_and_figures/figure_title.html
        fig.suptitle(title, x=0, y=0.5)

    # Iterating over the grid returns the Axes.
    for ax, im in zip(grid, imgs):
        # no x/y ticks: https://stackoverflow.com/a/45149018, https://stackoverflow.com/a/58535290
        ax.set_xticks([])
        ax.set_yticks([])
        ax.imshow(im)
```

```python
for imgs, g in zip(result, guidances):
    plot_images(imgs, title=g)
```
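
The loop above draws one figure per guidance value. To show everything in a single grid instead, the nested list of batches could first be flattened. A small sketch, using strings as stand-ins for images (`flatten` is a hypothetical helper):

```python
def flatten(batches):
    """Turn a list of image batches into one flat list, keeping the order."""
    return [img for batch in batches for img in batch]

# Stand-in strings instead of PIL images, just to show the ordering.
toy_result = [["g1_a", "g1_b"], ["g2_a", "g2_b"]]
print(flatten(toy_result))  # ['g1_a', 'g1_b', 'g2_a', 'g2_b']

# In the notebook (assumes `result`, `num_rows`, `num_cols` and `plot_images` from above):
# plot_images(flatten(result), rows=num_rows, cols=num_cols)
```

Assuming the grid is filled row by row, each row would then correspond to one guidance value.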

## Negative Prompts

```python
torch.manual_seed(1000)
prompt = "Labrador in the style of Vermeer"
pipe(prompt).images[0]
```

A negative prompt tells the model what to steer away from:

```python
torch.manual_seed(1000)
pipe(prompt, negative_prompt="blue").images[0]
```

## Deeper

For a deeper dive that unpacks what is going on under the hood when invoking `pipe`, check out [the official Huggingface inference notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb#scrollTo=yW14FA-tDQ5n). One thing this deeper dive can be used for is saving the intermediate steps of the denoising process (to make gifs/videos).

---

## Experiments

1. Test everything!
  - Come up with your own prompts!
  - Set yourself a specific theme, and test various styles (photo, oil on canvas, vaporwave, names of artists, etc.)
  - Search for different [models](https://huggingface.co/models?other=stable-diffusion-diffusers)! (Some of them may exceed your GPU capacity, beware).
2. Make sure you understand and develop an intuition of:
  - the `torch.manual_seed()` function: make sure you try reproducing your own results!
  - the `num_inference_steps` parameter (a lot of possibilities open to you exploring *very few steps*!)
  - the `plot_images` function
  - the use of Python loops to test an array of things (seeds, guidances, prompts, etc.)
3. Research prompting tricks. Here are some resources:
  - [Lexica](https://lexica.art/)
  - [PromptHero](https://prompthero.com/)
  - [PromptBook, by OpenArt](https://openart.ai/promptbook)
  - [Reddit](https://www.reddit.com/r/StableDiffusion/)
  - After a while, searching online and making your own tests, you may notice that the style you obtain really feels a bit repetitive: is there a way you can push through that and find strange, unexpected edge cases?
4. Test negative prompting: when does it work, when not?
5. Can you think of a way to introduce computational thinking into this? Ideas:
  - Take a piece of text, slice it into parts, and use each part as a prompt, translating the text into a series of images? Perhaps you could extract *tiles* from a text ('abcdef...' → 'abc', 'bcd', 'cde', ...)?
  - You could imagine trying to build a random prompt generator (using a list of things, styles, etc.), that will construct a prompt programmatically from bits of texts and generate images using that.
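
The two ideas in the last point could be sketched as follows. Everything here is illustrative: `text_tiles` and `random_prompt` are hypothetical helpers, and the word lists are placeholders to replace with your own material.

```python
import random

def text_tiles(items, size=3, step=1):
    """Slice a string (or list of words) into overlapping windows of `size` items."""
    return [items[i:i + size] for i in range(0, len(items) - size + 1, step)]

def random_prompt(subjects, styles, rng):
    """Build a prompt by picking one subject and one style at random."""
    return f"{rng.choice(subjects)}, {rng.choice(styles)}"

print(text_tiles("abcdef"))  # ['abc', 'bcd', 'cde', 'def']

# Tiling over words rather than characters gives more usable prompts.
words = "the quick brown fox jumps".split()
word_prompts = [" ".join(tile) for tile in text_tiles(words, size=3, step=2)]
print(word_prompts)  # ['the quick brown', 'brown fox jumps']

subjects = ["a lighthouse", "a robot", "a forest"]
styles = ["oil on canvas", "vaporwave", "photograph"]
rng = random.Random(0)  # seeded, so the same prompts come back on each run
generated = [random_prompt(subjects, styles, rng) for _ in range(3)]
print(generated)

# In the notebook (assumes `pipe` from above):
# images = [pipe(p).images[0] for p in word_prompts + generated]
```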
  
There are many models available, including:
  - the recent and higher quality [Stable Diffusion XL](https://huggingface.co/docs/diffusers/using-diffusers/sdxl);
  - the other high quality model [IF by DeepFloyd](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/deepfloyd_if_free_tier_google_colab.ipynb);
  - Music generators: [AudioDiffusion](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audio_diffusion), [AudioLDM](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audioldm), [AudioLDM 2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audioldm2), [MusicLDM](https://huggingface.co/docs/diffusers/main/en/api/pipelines/musicldm).
  
Each of these pages on Huggingface has some starter code that should be relatively straightforward to set up!
