<a href="https://colab.research.google.com/github/RainbowPowerr/ML-thesis/blob/main/Stable_Diffusion%2C_testfil.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Stable Diffusion** 🎨 
*...using `🧨diffusers`*

Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/) and [LAION](https://laion.ai/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM.
See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information.

This Colab notebook shows how to use Stable Diffusion with the 🤗 Hugging Face [🧨 Diffusers library](https://github.com/huggingface/diffusers). 

Let's get started!

### Setup

First, please make sure you are using a GPU runtime to run this notebook, so inference is much faster. If the following command fails, use the `Runtime` menu above and select `Change runtime type`.


Next, you should install `diffusers==0.4.0` as well as `scipy`, `ftfy` and `transformers`.

In [None]:
!nvidia-smi

!pip install diffusers==0.4.0
!pip install transformers scipy ftfy
!pip install "ipywidgets>=7,<8"

from google.colab import output
output.enable_custom_widget_manager()

from huggingface_hub import notebook_login

notebook_login()
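
The GPU check done with `!nvidia-smi` can also be scripted in plain Python. The following is a minimal sketch, not part of the original notebook; the helper name `gpu_runtime_available` is hypothetical, and it only assumes that a working GPU runtime exposes the `nvidia-smi` executable on the `PATH`.

```python
import shutil
import subprocess


def gpu_runtime_available() -> bool:
    """Return True if `nvidia-smi` is on PATH and exits successfully.

    Hypothetical helper: it mirrors the `!nvidia-smi` shell check,
    it does not query PyTorch or CUDA directly.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if not gpu_runtime_available():
    print("No GPU detected: use Runtime > Change runtime type and select a GPU.")
```

If the message is printed, switch the runtime before running the remaining cells.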

You also need to accept the model license before downloading or using the weights. This text refers to model version `v1-4` ([its card](https://huggingface.co/CompVis/stable-diffusion-v1-4)), but the code below loads `runwayml/stable-diffusion-v1-5`, so visit the card of the model you actually load, read the license and tick the checkbox if you agree.

You have to be a registered user in 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

In [None]:
from diffusers import StableDiffusionPipeline
import torch

model_id = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, revision="fp16")
pipe = pipe.to("cuda")

from PIL import Image

def image_grid(imgs, rows, cols):
    """Paste equally sized PIL images into a rows x cols grid, filled row by row."""
    assert len(imgs) == rows * cols

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Column index is i % cols, row index is i // cols.
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid

all_images = []

# New section

In [None]:
# Seed the generator and pass it to the pipeline so the result is reproducible.
generator = torch.Generator("cuda").manual_seed(123)
prompt = "tiny red house surrounded by gold coin stacks, board game, 4k, hd"
# `.images[0]` is a PIL image (https://pillow.readthedocs.io/en/stable/).
image = pipe(prompt, num_inference_steps=50, guidance_scale=8, generator=generator, height=480, width=800).images[0]

# In Google Colab, a bare expression at the end of a cell displays the image.
image
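
Because Stable Diffusion denoises in a compressed latent space, the `height` and `width` passed to the pipeline must map cleanly onto the latent grid. The sketch below is an illustration rather than code from the library: the function `latent_shape` and the two constants are hypothetical names, and it assumes the Stable Diffusion v1 autoencoder, which downsamples each side by a factor of 8 and uses 4 latent channels.

```python
# Assumptions for Stable Diffusion v1 (not read from the loaded pipeline):
VAE_SCALE_FACTOR = 8   # each latent pixel covers an 8x8 image patch
LATENT_CHANNELS = 4    # number of channels in the latent tensor


def latent_shape(height, width, batch_size=1):
    """Return the (batch, channels, height, width) shape of the latents."""
    if height % VAE_SCALE_FACTOR or width % VAE_SCALE_FACTOR:
        raise ValueError(
            f"height and width must be divisible by {VAE_SCALE_FACTOR}, "
            f"got {height}x{width}"
        )
    return (
        batch_size,
        LATENT_CHANNELS,
        height // VAE_SCALE_FACTOR,
        width // VAE_SCALE_FACTOR,
    )


print(latent_shape(480, 800))  # (1, 4, 60, 100)
print(latent_shape(400, 720))  # (1, 4, 50, 90)
```

Both resolutions used in this notebook (480x800 and 400x720) are multiples of 8, which is why the pipeline accepts them.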

In [None]:
all_images = []

num_cols = 3
num_rows = 4

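# Note: StableDiffusionPipeline is text-to-image only, so the leading
# "/content/coinstack.jpg" is treated as plain prompt text; the file is not read.
prompt = ["/content/coinstack.jpg, straight view of a tiny red house next to a stack of golden coins, 4k, hd"] * num_cols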
generator = torch.Generator("cuda").manual_seed(984799)
#negative_prompt = ["text on boxes"] * num_cols

for i in range(num_rows):
    images = pipe(prompt, num_inference_steps=50, guidance_scale=8, generator=generator).images
    all_images.extend(images)

grid = image_grid(all_images, rows=num_rows, cols=num_cols)
grid
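
The paste positions that `image_grid` computes can be checked without Pillow. This is a small sketch of the same index arithmetic; the helper name `grid_offsets` is hypothetical, and the 512x512 tile size is an assumption matching the pipeline's default output size.

```python
def grid_offsets(num_images, cols, w, h):
    """Return the top-left (x, y) paste offset of each image, row by row."""
    return [((i % cols) * w, (i // cols) * h) for i in range(num_images)]


# 4 rows x 3 columns of 512x512 tiles, as in the cell above.
offsets = grid_offsets(12, cols=3, w=512, h=512)
print(offsets[:4])  # [(0, 0), (512, 0), (1024, 0), (0, 512)]
```

The fourth image wraps to the start of the second row, which is the behavior `i % cols` and `i // cols` produce in `image_grid`.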


In [None]:
generator = torch.Generator("cuda").manual_seed(1234)
prompt = "/content/coinstack.jpg, straight view of a tiny red house next to a stack of golden coins, 4k, hd"
image = pipe(prompt, num_inference_steps=50, guidance_scale = 20, generator=generator, height=400 , width=720 ).images[0]

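# Create the output folder first; image.save() fails if it does not exist.
import os
os.makedirs("/content/Test_images", exist_ok=True)
image.save("/content/Test_images/Housing_market.png")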

In [None]:
generator = torch.Generator("cuda").manual_seed(1234)
prompt = "/content/forest.jpg, a forest in Sweden by a lake, autumn, sun is shining,  hyperrealism, 4k, photo realistic, hd"
image = pipe(prompt, num_inference_steps=50, guidance_scale = 20, generator=generator, height=400 , width=720 ).images[0]

image.save("/content/Test_images/forest.png")

In [None]:
generator = torch.Generator("cuda").manual_seed(12345)
prompt = "/content/boxes.jpg, cardboard boxes and plants, on a table, in an office, folders, sideview, 4k, hd"
image = pipe(prompt, num_inference_steps=50, guidance_scale = 20, generator=generator, height=400 , width=720 ).images[0]

image.save("/content/Test_images/bankruptcy.png")

In [None]:
# Loop to save several images in the Test_images folder

num_images = 2

prompt = ["/content/coinstack.jpg, straight view of a tiny red house next to a stack of golden coins, 4k, hd"]
generator = torch.Generator("cuda").manual_seed(984799)

for i in range(num_images):
  images = pipe(prompt, num_inference_steps=50, guidance_scale = 8, generator=generator).images
  images[0].save(f'/content/Test_images/{i}.png')
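
The loop above reuses one seeded generator across iterations, so each call draws fresh noise while the run as a whole stays reproducible. The sketch below illustrates that behavior with Python's `random.Random` standing in for `torch.Generator`; it is an analogy, not the pipeline's sampling code, and `draw_batches` is a hypothetical helper.

```python
import random


def draw_batches(seed, num_batches, batch_size=3):
    """Draw several batches of numbers from ONE seeded generator."""
    rng = random.Random(seed)  # stand-in for torch.Generator(...).manual_seed(seed)
    return [[rng.random() for _ in range(batch_size)] for _ in range(num_batches)]


first_run = draw_batches(984799, num_batches=2)
second_run = draw_batches(984799, num_batches=2)

# Same seed, same order of calls: the whole sequence repeats.
assert first_run == second_run
# Within a run the generator state advances, so the batches differ.
assert first_run[0] != first_run[1]
```

To get the same image on every iteration instead, the generator would have to be re-seeded inside the loop.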

In [None]:
!zip -r /content/Test_images.zip /content/Test_images

In [None]:
#generator = torch.Generator("cuda").manual_seed(1234)
prompt = "/content/forest.jpg, a forest in Sweden by a lake, autumn, sun is shining,  hyperrealism, 4k, photo realistic, hd"
image = pipe(prompt, num_inference_steps=50, guidance_scale = 20, height=400 , width=720 ).images[0]

image

In [None]:
# Prompt keywords to try (kept as comments so the cell runs):
# cinematic, colorful background, concept art, dramatic lighting, high detail,
# highly detailed, hyper realistic, intricate, intricate sharp details,
# octane render, smooth, studio lighting, trending on artstation