#### Stable Diffusion is a text-to-image latent diffusion model created by researchers from CompVis, Stability AI and LAION. The paper is available here: https://arxiv.org/abs/2112.10752. The model is trained on 512x512 images from a subset of the LAION-5B database and uses a frozen CLIP ViT-L/14 text encoder to condition generation on text prompts. With its 860M-parameter UNet and 123M-parameter text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.

#### Please make sure you are using a GPU runtime in order to run this notebook.
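
#### One quick way to confirm the runtime is to ask torch whether a CUDA device is visible. The sketch below is a minimal check that also tolerates torch not being installed yet; pick_device is a hypothetical helper name, not part of torch or diffusers.

```python
import importlib.util


def pick_device():
    """Return "cuda" if torch is installed and sees a GPU, else "cpu"."""
    # torch may not be installed yet at this point in the notebook.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

#### If this prints cpu, switch the runtime type to GPU before continuing.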

#### We need to install diffusers==0.4.0, transformers and ipywidgets. The cell below assumes torch is already available in the runtime, since it imports torch without installing it.

In [1]:
!pip install diffusers==0.4.0
!pip install transformers 
!pip install "ipywidgets>=7,<8"
import torch
from diffusers import StableDiffusionPipeline

Collecting diffusers==0.4.0
  Downloading diffusers-0.4.0-py3-none-any.whl (229 kB)
Installing collected packages: diffusers
  Attempting uninstall: diffusers
    Found existing installation: diffusers 0.7.2
    Uninstalling diffusers-0.7.2:
      Successfully uninstalled diffusers-0.7.2
Successfully installed diffusers-0.4.0
Collecting ipywidgets<8,>=7
  Downloading ipywidgets-7.7.2-py2.py3-none-any.whl (123 kB)
Collecting widgetsnbextension~=3.6.0
  Downloading widgetsnbextension-3.6.1-py2.py3-none-any.whl (1.6 MB)
Collecting jupyterlab-widgets<3,>=1.0.0
  Downloading jupyterl

#### We will use a pre-trained model from Hugging Face, so we need a Hugging Face authentication token. You can create one by following the instructions at: https://huggingface.co/docs/hub/security-tokens/ .

In [2]:
auth_token = 'Put your Hugging Face authentication token here'
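
#### Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read it from an environment variable; the variable name HF_TOKEN and the helper load_auth_token are assumptions for illustration, not something diffusers requires.

```python
import os


def load_auth_token(env_var="HF_TOKEN"):
    """Read the token from an environment variable (the name is an assumption)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token."
        )
    return token
```

#### With the variable set, auth_token = load_auth_token() could replace the hard-coded assignment.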

#### Now we load the pretrained model from Hugging Face. To make sure Stable Diffusion fits on a free Google Colab GPU, we load the weights from the half-precision branch fp16 and tell diffusers to expect the weights in float16 precision by passing torch_dtype=torch.float16.

In [None]:
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=auth_token)
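
#### To see why half precision helps, a rough back-of-the-envelope estimate of the weight memory is sketched below. It uses only the two parameter counts quoted in the introduction and assumes 4 bytes per float32 parameter and 2 bytes per float16 parameter. It ignores the image autoencoder, activations and framework overhead, so real usage is higher.

```python
# Parameter counts quoted in the introduction of this notebook.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000

# Storage size of one parameter for each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params, dtype):
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


total_params = UNET_PARAMS + TEXT_ENCODER_PARAMS
for dtype in ("float32", "float16"):
    print(f"{dtype}: {weight_memory_gib(total_params, dtype):.2f} GiB")
```

#### Under these assumptions the weights take roughly 3.7 GiB in float32 and half of that in float16, which leaves more of the GPU memory for activations.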

#### Now we move the pipeline to the GPU, where inference runs much faster than on the CPU.

In [None]:
pipe = pipe.to("cuda")

#### Now we can run inference. The input is a text prompt, and the model generates an image from it. Because sampling starts from random noise, each run produces a different image.

In [None]:
prompt = "A girl is playing chess"
image = pipe(prompt).images[0]  

# To view the image, you can either save it to disk:
image.save("tmp_image.png")

# or, in a notebook such as Google Colab, display it directly by ending the cell with:
image

#### If we want a deterministic output, we need to pass a seeded random generator to the pipeline. The same seed generates the same image for the same prompt and settings.

In [None]:
prompt = "A girl is playing chess"
generator = torch.Generator("cuda").manual_seed(1024)

image = pipe(prompt, generator=generator).images[0]

image
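
#### A fixed seed gives a repeatable image because a seeded pseudo-random generator always produces the same sequence of numbers, so the sampler starts from the same noise. The sketch below illustrates that principle with Python's standard random module as a stand-in for torch.Generator; it does not call the pipeline, and sample_noise is a hypothetical helper.

```python
import random


def sample_noise(seed, n=4):
    """Draw n Gaussian samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


first = sample_noise(1024)
second = sample_noise(1024)
other = sample_noise(2048)

print(first == second)  # same seed, same sequence
print(first == other)   # different seed, different sequence
```

#### A generator's state advances as it is used, which is why the seeded cell creates a fresh generator with manual_seed(1024) before calling the pipeline.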