<a href="https://colab.research.google.com/github/JoaoLages/diffusers-interpret/blob/main/notebooks/stable_diffusion_example_colab.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stable Diffusion 🎨

This notebook shows an example of how to run `diffusers_interpret.StableDiffusionPipelineExplainer` to explain `diffusers.StableDiffusionPipeline`.

Before going through it, we recommend having a look at [🤗 Hugging Face's notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb#scrollTo=-xMJ6LaET6dT).

In [1]:
# make sure you are running on a GPU
!nvidia-smi

Thu Dec  5 15:57:19 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P8              10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
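
The same check can also be done programmatically. The snippet below is a minimal sketch, assuming the `nvidia-smi` tool is on the `PATH` whenever a GPU runtime is active; the helper name `gpu_names` is hypothetical and not part of any library used in this notebook.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by `nvidia-smi`, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # tool not installed, so most likely no NVIDIA driver
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # tool present but it could not talk to a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = gpu_names()
print(names if names else "No GPU detected; switch the runtime type to GPU.")
```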
                                                                    

In [8]:
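!pip install diffusers-interpret
!pip install matplotlib
# NOTE: diffusers-interpret 0.5.0 requires diffusers~=0.3.0, so this upgrade
# triggers the dependency conflict reported in the output below
!pip install --upgrade diffusers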

Collecting diffusers
  Downloading diffusers-0.31.0-py3-none-any.whl.metadata (18 kB)
Downloading diffusers-0.31.0-py3-none-any.whl (2.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 36.9 MB/s eta 0:00:00
Installing collected packages: diffusers
  Attempting uninstall: diffusers
    Found existing installation: diffusers 0.3.0
    Uninstalling diffusers-0.3.0:
      Successfully uninstalled diffusers-0.3.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
diffusers-interpret 0.5.0 requires diffusers~=0.3.0, but you have diffusers 0.31.0 which is incompatible.
Successfully installed diffusers-0.31.0


### 0 - Log in to the Hugging Face Hub

In [1]:
from google.colab import output
output.enable_custom_widget_manager()

from huggingface_hub import notebook_login
notebook_login()

VBox(children=(HTML(value='<center> <img\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

### 1 - Initialize StableDiffusionPipeline normally

In [2]:
# make sure you're logged in by running the previous cell or `huggingface-cli login`
import torch
from diffusers import StableDiffusionPipeline
from contextlib import nullcontext

device = 'cuda' if torch.cuda.is_available() else 'cpu'

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to(device)
pipe.enable_attention_slicing()  # comment out this line to disable attention slicing

The cache for model files in Diffusers v0.14.0 has moved to a new location. Moving your existing cached models. This is a one-time operation, you can interrupt it or run it later by calling `diffusers.utils.hub_utils.move_cache()`.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Fetching 16 files:   0%|          | 0/16 [00:00<?, ?it/s]

Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]

### 2 - Pass `StableDiffusionPipeline` to `StableDiffusionPipelineExplainer`

In [3]:
from diffusers_interpret import StableDiffusionPipelineExplainer

explainer = StableDiffusionPipelineExplainer(
    pipe,

    # Passing `True` here lets us use a higher `n_last_diffusion_steps_to_consider_for_attributions` in the cell below
    gradient_checkpointing=True
)

ImportError: cannot import name 'preprocess_mask' from 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint' (/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py)
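
This `ImportError` is most likely a consequence of the dependency conflict reported by `pip` during installation: `diffusers-interpret 0.5.0` requires `diffusers~=0.3.0`, but `diffusers 0.31.0` is installed. The sketch below illustrates what the `~=` (compatible release) specifier means for plain numeric versions. The helper names `parse_version` and `satisfies_compatible_release` are hypothetical and written only for this illustration; they do not handle pre-releases or other non-numeric version parts.

```python
def parse_version(text):
    """Turn '0.31.0' into (0, 31, 0). Only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_compatible_release(installed, required):
    """Check `installed ~= required` for simple numeric versions.

    `~=0.3.0` means `>=0.3.0` and `==0.3.*`: all but the last component of the
    requirement must match, and the installed version must not be older.
    """
    have, need = parse_version(installed), parse_version(required)
    prefix = need[:-1]
    return have >= need and have[: len(prefix)] == prefix


print(satisfies_compatible_release("0.3.0", "0.3.0"))   # True
print(satisfies_compatible_release("0.31.0", "0.3.0"))  # False
```

Under this reading, `0.31.0` is outside the `0.3.*` series, so the cells that follow may only run with a `diffusers` version that satisfies the pin.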

### 3 - Generate an image with the `StableDiffusionPipelineExplainer` object

Note that the `explainer()` method accepts all the arguments that `pipe()` accepts.

We also pass a `generator` argument so that we get a deterministic output.
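
To illustrate why a seeded generator makes the output deterministic, here is a minimal sketch that uses Python's standard `random` module as a stand-in for `torch.Generator`. The function `sample_noise` is hypothetical; the point is only that the same seed reproduces the same sequence of random numbers, which is what makes the initial noise (and hence the image) repeatable.

```python
import random


def sample_noise(seed, n=4):
    """Draw `n` Gaussian samples from a generator seeded with `seed`."""
    # Plays the role of torch.Generator(device).manual_seed(seed) in the notebook
    generator = random.Random(seed)
    return [generator.gauss(0.0, 1.0) for _ in range(n)]


same_seed_matches = sample_noise(2023) == sample_noise(2023)
different_seed_matches = sample_noise(2023) == sample_noise(2024)
print(same_seed_matches, different_seed_matches)
```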

In [None]:
prompt = "A cute corgi with the Eiffel Tower in the background"

generator = torch.Generator(device).manual_seed(2023)
with torch.autocast('cuda') if device == 'cuda' else nullcontext():
    output = explainer(
        prompt,
        num_inference_steps=50,
        generator=generator,
        height=448,
        width=448,

        # for this model, the GPU VRAM usage will rise drastically if we increase this argument; feel free to experiment with it
        # if you are not interested in checking the token attributions, you can pass 0 here
        n_last_diffusion_steps_to_consider_for_attributions=5
    )
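
The `with ... if ... else nullcontext()` line in the cell above chooses a context manager at runtime: mixed-precision autocasting on GPU, and a no-op context on CPU. The following sketch shows the same pattern with a hypothetical stand-in, `mixed_precision`, that only records when it is entered and exited; it is not the real `torch.autocast`.

```python
from contextlib import contextmanager, nullcontext

events = []


@contextmanager
def mixed_precision():
    """Hypothetical stand-in for `torch.autocast('cuda')`; it only records enter/exit."""
    events.append("enter")
    try:
        yield
    finally:
        events.append("exit")


def run(device):
    # Same pattern as in the notebook: choose the context manager at runtime
    with mixed_precision() if device == "cuda" else nullcontext():
        events.append(f"generate on {device}")


run("cuda")
run("cpu")
print(events)
```

On the `"cpu"` path nothing is entered or exited, so the body runs exactly as it would without a `with` statement.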

#### 3.1 - Final generated image

In [None]:
# Final image
output.image

#### 3.2 - Check all the generated images during the diffusion process

In [None]:
# Google Colab does not render the IFrame produced by the line below; it only works in a local Jupyter Notebook.
#output.all_images_during_generation.show(width="100%", height="400px")

# but we can see the GIF!
output.all_images_during_generation.gif(file_name="diffusion_process.gif", duration=400)

You can also check the images individually:

In [None]:
# Image at the first generation step
output.all_images_during_generation[0]

In [None]:
# Image at index 33 (0-based)
output.all_images_during_generation[33]

#### 3.3 - Normalized and unnormalized token attributions

We can now see how important each token in the input text was for generating the image.

The token `corgi` was the most important feature according to our explainability method.

In [None]:
# (token, attribution)
output.token_attributions

We can also see the normalized version

In [None]:
# (token, attribution_percentage)
output.token_attributions.normalized
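
As a rough illustration of how unnormalized attributions relate to percentages, the sketch below rescales `(token, attribution)` pairs so that they sum to 100. This is an assumption about the normalization, not a confirmed description of what `diffusers-interpret` does internally, and the attribution values are made up for illustration.

```python
def normalize_attributions(token_attributions):
    """Convert (token, attribution) pairs into (token, percentage) pairs."""
    total = sum(value for _, value in token_attributions)
    if total == 0:
        raise ValueError("attributions sum to zero; cannot normalize")
    return [(token, 100.0 * value / total) for token, value in token_attributions]


# Hypothetical attribution values, made up for illustration only
example = [("a", 1.0), ("cute", 3.0), ("corgi", 6.0)]
normalized = normalize_attributions(example)
print(normalized)
```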

or plot them!

In [None]:
output.token_attributions.plot(normalize=True)

### 4 - Get explanations for a specific part of the image

`diffusers-interpret` also computes the token importances for generating a particular part of the output image.

In the current implementation, we only need to re-run the `explainer`, passing the bounding box of interest through the `explanation_2d_bounding_box` argument.
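
Before re-running the `explainer`, it can help to sanity-check a bounding box against the image size. The sketch below assumes the box is given as `((x1, y1), (x2, y2))` in pixel coordinates, with the upper left corner first and the bottom right corner second; the helpers `box_size` and `area_fraction` are hypothetical and not part of `diffusers-interpret`.

```python
def box_size(box, image_width, image_height):
    """Return (width, height) of a ((x1, y1), (x2, y2)) box after validating it."""
    (x1, y1), (x2, y2) = box
    if not (0 <= x1 < x2 <= image_width and 0 <= y1 < y2 <= image_height):
        raise ValueError(f"box {box} does not fit a {image_width}x{image_height} image")
    return x2 - x1, y2 - y1


def area_fraction(box, image_width, image_height):
    """Fraction of the image area covered by the box."""
    w, h = box_size(box, image_width, image_height)
    return (w * h) / (image_width * image_height)


# The first box used in this notebook, on a 448x448 image
box = ((305, 180), (448, 448))
print(box_size(box, 448, 448))
print(round(area_fraction(box, 448, 448), 3))
```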

In [None]:
prompt = "A cute corgi with the Eiffel Tower in the background"

generator = torch.Generator(device).manual_seed(2023)
with torch.autocast('cuda') if device == 'cuda' else nullcontext():
    output = explainer(
        prompt,
        num_inference_steps=50,
        generator=generator,
        height=448,
        width=448,

        # for this model, the GPU VRAM usage will rise drastically if we increase this argument; feel free to experiment with it
        # if you are not interested in checking the token attributions, you can pass 0 here
        n_last_diffusion_steps_to_consider_for_attributions=5,

        explanation_2d_bounding_box=((305, 180), (448, 448)), # (upper left corner, bottom right corner)
    )


#### 4.1 - Check generated image

A red bounding box is now visible in the picture, indicating the area that `explainer` considers when calculating the token attributions.

In [None]:
output.image

#### 4.2 - Check token attributions for bounding box

In [None]:
# (token, attribution)
output.token_attributions

In [None]:
output.token_attributions.plot(normalize=True)

### 5 - Same generation, but with a different `explanation_2d_bounding_box`

In [None]:
prompt = "A cute corgi with the Eiffel Tower in the background"

generator = torch.Generator(device).manual_seed(2023)
with torch.autocast('cuda') if device == 'cuda' else nullcontext():
    output = explainer(
        prompt,
        num_inference_steps=50,
        generator=generator,
        height=448,
        width=448,

        # for this model, the GPU VRAM usage will rise drastically if we increase this argument; feel free to experiment with it
        # if you are not interested in checking the token attributions, you can pass 0 here
        n_last_diffusion_steps_to_consider_for_attributions=5,

        explanation_2d_bounding_box=((140, 0), (270, 190)), # (upper left corner, bottom right corner)
    )

In [None]:
output.image

In [None]:
# (token, attribution_percentage)
output.token_attributions.normalized

In [None]:
output.token_attributions.plot(normalize=True)