<a href="https://colab.research.google.com/github/sihwapark/SDXL_Turbo_Demo_Notebook/blob/main/Stable_Diffusion_XL_Turbo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stable Diffusion XL Turbo Image-to-Image Demo Notebook
By Sihwa Park (shpark@yorku.ca) <br/>
December 2, 2023

References:<br/>
- [SDXL Turbo Online Demo](https://stablediffusionweb.com/SDXL-Turbo)
- [SDXL Turbo Model Card](https://huggingface.co/stabilityai/sdxl-turbo)
- [SDXL Turbo Documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/sdxl_turbo)
- [Jupyter widgets](https://github.com/jupyter-widgets/ipywidgets)
- [redromnon's notebook](https://colab.research.google.com/github/redromnon/stable-diffusion-interactive-notebook/blob/main/stable_diffusion_interactive_notebook.ipynb) for the use of Jupyter widgets


## Install Prerequisites

In [1]:
!pip -q install torch diffusers transformers accelerate scipy safetensors xformers mediapy ipywidgets==7.7.1

## Import Libraries

In [2]:
import torch
import ipywidgets as widgets, mediapy, random
import importlib
from pathlib import Path

# Enable third-party widget support in Colab
from google.colab import output
output.enable_custom_widget_manager()

from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

# Importing diffusers may print the xFormers warning below when the installed xformers build
# does not match the runtime's PyTorch/CUDA/Python versions.
# See https://github.com/tin2tin/Pallaidium/issues/72
# WARNING:xformers:WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
#     PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cu118)
#     Python  3.10.13 (you have 3.10.12)
#   Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
#   Memory-efficient attention, SwiGLU, sparse and more won't be available.
#   Set XFORMERS_MORE_DETAILS=1 for more details
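
The xFormers warning above says that the package was built for a different PyTorch/CUDA build than the one installed. As a rough illustration, the mismatch can be spotted by comparing the local build tag (the part after `+`) of the two version strings. The helper name `build_tag` is hypothetical, and the version strings are copied from the warning text; this is a sketch, not part of xformers or PyTorch.

```python
def build_tag(version: str) -> str:
    """Return the local build tag of a version string, e.g. 'cu121' for '2.1.0+cu121'."""
    _, _, tag = version.partition("+")
    return tag


# Version strings copied from the warning message.
built_for = "2.1.0+cu121"
installed = "2.1.0+cu118"

if build_tag(built_for) != build_tag(installed):
    print(
        f"CUDA build mismatch: xformers expects {build_tag(built_for)}, "
        f"PyTorch provides {build_tag(installed)}"
    )
```

In a live session, `torch.__version__` supplies the installed version string.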


## Load Pipeline and Model

In [3]:
pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")

# Move the pipeline to the GPU. Loading the model in float16 (Half) and running it on the CPU may fail with:
#   RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
# Use a GPU for float16; to run on the CPU, load the model in float32 instead.
# https://stackoverflow.com/questions/75641074/i-run-stable-diffusion-its-wrong-runtimeerror-layernormkernelimpl-not-implem
pipe = pipe.to("cuda")

# Alternatively, call .enable_model_cpu_offload() instead of .to("cuda"). It uses accelerate to offload all models to the CPU,
# reducing memory usage with a low impact on performance.
# https://huggingface.co/docs/diffusers/v0.24.0/en/api/pipelines/stable_diffusion/gligen#diffusers.StableDiffusionGLIGENTextImagePipeline.enable_model_cpu_offload
# pipe.enable_model_cpu_offload()
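
The comments above describe a rule of thumb: float16 on the GPU, float32 on the CPU. A minimal sketch of that choice follows. The function name `pick_device_and_dtype` is hypothetical, and it returns plain strings so that it runs without PyTorch installed.

```python
def pick_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Return (device, dtype name): float16 on the GPU, float32 on the CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


device, dtype_name = pick_device_and_dtype(cuda_available=False)
print(device, dtype_name)  # prints: cpu float32
```

In the notebook, this could be called as `pick_device_and_dtype(torch.cuda.is_available())`, passing `getattr(torch, dtype_name)` as `torch_dtype` and finishing with `pipe.to(device)`.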


## Image-to-Image Generation with GUI
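
The cell in this section exposes Steps and Strength sliders, and its comments note that the pipeline runs for `int(num_inference_steps * strength)` steps, which must be at least 1. The sketch below works through that arithmetic for a few slider combinations. The helper names `effective_steps` and `is_valid_setting` are hypothetical and are not part of diffusers.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Denoising steps actually run, following the int(num_inference_steps * strength) rule."""
    return int(num_inference_steps * strength)


def is_valid_setting(num_inference_steps: int, strength: float) -> bool:
    """True when at least one denoising step would run."""
    return effective_steps(num_inference_steps, strength) >= 1


print(effective_steps(5, 0.2))   # 1
print(effective_steps(2, 0.5))   # 1
print(is_valid_setting(4, 0.2))  # False
```

A check like `is_valid_setting(steps.value, strength.value)` could be run before calling the pipeline to warn about slider combinations that would produce zero steps.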

In [22]:
# @title

init_image = None

image_file_path = widgets.Text(
    value="/content/1.jpeg",
    description="Image:",
    layout=widgets.Layout(width="80%")
)

button_select = widgets.Button(
    description="Select",
    button_style="success",
    layout=widgets.Layout(width="20%")
)

reference_image = widgets.Output()

def preload_image(e):
  reference_image.clear_output()

  global init_image
  with reference_image:
    init_image = load_image(image_file_path.value).resize((512, 512))
    mediapy.show_image(init_image)

button_select.on_click(preload_image)

# When using SDXL Turbo for image-to-image generation, make sure that num_inference_steps * strength is greater than or equal to 1.
# The image-to-image pipeline runs for int(num_inference_steps * strength) steps, e.g. 5 * 0.2 = 1 step with the default slider values below.
steps = widgets.IntSlider(
    value=5,
    min=1,
    max=10,
    step=1,
    orientation='horizontal',
    readout_format='d',
    description="Steps:",
    layout=widgets.Layout(width="99%")
)

# strength (float, optional, defaults to 0.3): conceptually, how much to transform the reference image.
# Must be between 0 and 1. The reference image is used as a starting point, and more noise is added to it the larger the strength.
# The number of denoising steps depends on the amount of noise initially added.
# When strength is 1, the added noise is at its maximum and denoising runs for the full number of iterations specified in num_inference_steps.
strength = widgets.FloatSlider(
    value=0.2,
    min=0,
    max=1.0,
    step=0.001,
    orientation='horizontal',
    readout_format='.3f',
    description="Strength:",
    layout=widgets.Layout(width="99%")
)


# guidance_scale (float, optional, defaults to 7.5): guidance scale as defined in Classifier-Free Diffusion Guidance. It corresponds to w in equation 2 of the Imagen paper.
# Guidance is enabled by setting guidance_scale > 1.
# A higher guidance scale encourages images that are closely linked to the text prompt, usually at the expense of image quality.
# NOTE: SDXL Turbo should disable guidance by setting guidance_scale=0.0, so this slider is left commented out.
# CFG = widgets.FloatSlider(
#     value=7.5,
#     min=0,
#     max=10.0,
#     step=0.001,
#     orientation='horizontal',
#     readout_format='.3f',
#     description="CFG:",
#     layout=widgets.Layout(width="99%")
# )

# Set the seed to -1 to draw a new random seed on every generation.
random_seed = widgets.IntSlider(
    value=random.randint(0, 12013012031030),
    min=-1,
    max=12013012031030,
    step=1,
    orientation='horizontal',
    readout_format='d',
    description="Seed:",
    layout=widgets.Layout(width="99%")
)

prompt = widgets.Textarea(
    value="",
    placeholder="Enter prompt",
    rows=1,
    layout=widgets.Layout(width="80%")
)

# neg_prompt = widgets.Textarea(
#     value="",
#     placeholder="Enter negative prompt",
#     rows=5,
#     layout=widgets.Layout(width="600px")
# )

generate = widgets.Button(
    description="Generate",
    button_style="primary",
    layout=widgets.Layout(width="20%")
)

display_imgs = widgets.Output()

process_info = widgets.Output() #widgets.HTML(value="")

generated_image = None

def generate_img(i):
  global generated_image
  # Clear previous output and disable the button while generating
  process_info.clear_output()
  display_imgs.clear_output()
  generate.disabled = True
  generated_image = None

  # Use the slider value as the seed; a value of -1 requests a random seed
  seed = random.randint(0, 12013012031030) if random_seed.value == -1 else random_seed.value
  # Load the reference image if it has not been selected yet
  if init_image is None:
    preload_image(None)

  with process_info:
    print("Running...")
    # process_info.value = "Running..."
    # https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/sdxl_turbo
    images = pipe(
        prompt.value,
        image=init_image,
        num_inference_steps=steps.value,
        strength=strength.value,
        # num_images_per_prompt = 1,
        # guidance_scale=CFG.value,
        guidance_scale=0.0,
        generator=torch.Generator("cuda").manual_seed(seed)
      ).images

    generated_image = images[0]
    print("Done!")

  with display_imgs:
    mediapy.show_images(images)

  generate.disabled = False

generate.on_click(generate_img)

save = widgets.Button(
    description="Save",
    button_style="primary",
    layout=widgets.Layout(width="20%")
)

def save_image(b):
  # Nothing to save until an image has been generated
  if generated_image is None:
    return
  generated_image.save("result.png")

save.on_click(save_image)


widgets.VBox(
  [
    widgets.AppLayout(
      header=None,
      # widgets.HTML(
      #     value="<h2>Stable Diffusion XL Turbo</h2>",
      # ),
      left_sidebar=widgets.VBox(
          [
            widgets.HBox([image_file_path, button_select]),
            steps, strength, random_seed
          ], layout=widgets.Layout(width="95%")
      ),
      center=None,
      right_sidebar=widgets.VBox(
          [
            widgets.HBox([prompt, generate]),
            process_info
          ],
          layout=widgets.Layout(width="80%")
      ),
      footer=None,
      pane_widths=[1, 0, 2],
    ),
    widgets.AppLayout(
      header=None,
      left_sidebar=reference_image,
      center=None,
      right_sidebar=display_imgs,
      footer=None,
      pane_widths=[1, 0, 2],
    ),
    # save
  ]
)
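
The seed handling in `generate_img` treats -1 as a request for a random seed and otherwise uses the slider value as-is. Pulled out into a standalone function, that logic might look like the sketch below. `resolve_seed` and `MAX_SEED` are hypothetical names introduced for illustration; the bound is the one the Seed slider uses.

```python
import random

MAX_SEED = 12013012031030  # upper bound used by the Seed slider


def resolve_seed(slider_value: int) -> int:
    """Return slider_value, or a random seed in [0, MAX_SEED] when it is -1."""
    if slider_value == -1:
        return random.randint(0, MAX_SEED)
    return slider_value


print(resolve_seed(42))  # 42
```

Printing the resolved seed next to each result makes it possible to re-enter it in the slider later and regenerate with the same seed.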


## Upscale (WIP)

In [None]:
from diffusers import StableDiffusionUpscalePipeline

# load model and scheduler
model_id = "stabilityai/stable-diffusion-x4-upscaler"
upscale_pipeline = StableDiffusionUpscalePipeline.from_pretrained(
    model_id, revision="fp16", torch_dtype=torch.float16
)
upscale_pipeline = upscale_pipeline.to("cuda")

In [None]:
upscaled_image = upscale_pipeline(prompt=prompt.value, image=generated_image).images[0]
display(upscaled_image)
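
The x4 upscaler is meant to return an image four times larger on each side, so a 512x512 result from the GUI cell would come out at 2048x2048. The sketch below does that arithmetic with a hypothetical helper, `upscaled_size`. Whether an input of that size fits in the GPU memory of a given Colab runtime is an open assumption and has not been measured here.

```python
def upscaled_size(size: tuple[int, int], factor: int = 4) -> tuple[int, int]:
    """Return the (width, height) after scaling both sides by factor."""
    width, height = size
    return width * factor, height * factor


print(upscaled_size((512, 512)))  # (2048, 2048)
```

If memory becomes a problem, resizing `generated_image` to a smaller size before passing it to `upscale_pipeline` is one option to try.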