# IMAGE VARIATION USING STABLE DIFFUSION

<hr></hr>

This is an image-to-image Stable Diffusion notebook that takes an input image and modifies it according to a few parameters that you can adjust. Hope you like it!

## **INSTALLING DEPENDENCIES**

In [1]:
!pip install diffusers==0.3.0 transformers ftfy
!pip install  "ipywidgets>=7,<8"
!pip install gradio 


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting diffusers==0.3.0
  Downloading diffusers-0.3.0-py3-none-any.whl (153 kB)
Collecting transformers
  Downloading transformers-4.23.1-py3-none-any.whl (5.3 MB)
Collecting ftfy
  Downloading ftfy-6.1.1-py3-none-any.whl (53 kB)
Collecting huggingface-hub>=0.8.1
  Downloading huggingface_hub-0.10.1-py3-none-any.whl (163 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Installing collected packages: tokenizers, huggingface-hub, transformers, ftfy, diffusers
Successfully installe

## **IMPORTING DEPENDENCIES**

In [2]:
import gradio as gr
import inspect
import warnings
import numpy as np
from typing import List, Optional, Union
import requests
from io import BytesIO
from PIL import Image
import torch
from torch import autocast
from tqdm.auto import tqdm
from diffusers import StableDiffusionImg2ImgPipeline


## **GENERATE USER ACCESS TOKEN**
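
The pipeline cell in this notebook passes a Hugging Face access token through `use_auth_token`. A token can be created in the account settings on huggingface.co. Rather than pasting it into the notebook, one option is to read it from an environment variable. The sketch below is only an illustration: the helper name `read_hf_token` and the variable name `HF_TOKEN` are assumptions, not something this notebook defines.

```python
import os
from typing import Mapping, Optional


def read_hf_token(env: Optional[Mapping[str, str]] = None,
                  var_name: str = "HF_TOKEN") -> str:
    """Return the access token stored in an environment variable.

    Assumption: the token was exported beforehand under `var_name`;
    the variable name is an illustrative choice.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your access token."
        )
    return token
```

The returned string could then be assigned to `access_token` before the pipeline is created.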

## **INITIALIZE AND DOWNLOAD THE MODEL PIPELINE**

In [3]:
device = "cuda"
model_path = "CompVis/stable-diffusion-v1-4"
# Replace with your own Hugging Face access token; do not publish real tokens.
access_token = "<YOUR_HF_ACCESS_TOKEN>"

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_path,
    revision="fp16", 
    torch_dtype=torch.float16,
    use_auth_token=access_token
)
pipe = pipe.to(device)

{'trained_betas'} was not found in config. Values will be initialized to default values.


## **DEFINE THE FUNCTION FOR GRADIO INTEGRATION**



*   The `img` parameter is the input image.
*   The `strength` parameter is a floating-point number between 0 and 1 that specifies how much variation we want in the image. The closer `strength` is to 1, the more the input image changes. A rule of thumb is to keep it between 0.2 and 0.5 for the best variations.
*   The `seed` parameter initializes the random number generator. If you generated an image and want a completely different one, just change the seed.
*   The `prompt` parameter takes a text prompt that specifies what change we want in the image. If you do not want any change and just want to add a jittery effect to your original image, leave it blank and play around with the `strength` parameter.
*   The `width` and `height` parameters set the size to which the input image is resized before generation.
*   The `inference_steps` parameter (passed to the pipeline as `num_inference_steps`) sets the number of denoising steps. More steps generally give a higher-quality image, but generation takes longer.
*   The `guide_scale` parameter (passed to the pipeline as `guidance_scale`) determines how closely the model should stick to the provided text prompt.

*    This [Medium article](https://fahimfarook.medium.com/stable-diffusion-parameter-variations-6d4895a135a3) gives a good explanation of how the guidance scale and `num_inference_steps` work.
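
To make the strength and guidance scale bullets more concrete, here is a minimal sketch in plain Python. It rests on two assumptions: that an img2img pipeline runs roughly `int(num_inference_steps * strength)` denoising steps (scheduler offsets are ignored), and that guidance uses the standard classifier-free guidance combination. The function names `approx_denoising_steps` and `apply_guidance` are hypothetical and are not part of diffusers.

```python
def approx_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate how many denoising steps run for a given strength.

    Assumption: the early part of the schedule is skipped and about
    int(num_inference_steps * strength) steps are actually executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


def apply_guidance(noise_uncond: float, noise_text: float,
                   guidance_scale: float) -> float:
    """Classifier-free guidance on a single value.

    The unconditional prediction is pushed towards the text-conditioned
    prediction by a factor of guidance_scale.
    """
    return noise_uncond + guidance_scale * (noise_text - noise_uncond)


print(approx_denoising_steps(50, 0.5))  # half of the schedule
print(apply_guidance(0.0, 1.0, 7.5))    # scaled towards the text prediction
```

Under these assumptions, a low strength leaves only a handful of denoising steps, which is why the output stays close to the input, and a guidance scale of 1 simply returns the text-conditioned prediction.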



<hr></hr>
<br></br>


Now let's see the workflow of the function!

The function first converts `height`, `width`, `inference_steps`, and `seed` to integers, because the Gradio inputs can return floats while the resize call and the random generator expect integers. It then converts the input `img` to a NumPy array, wraps it in a PIL image, and resizes it to the requested `width` x `height` (768 x 512 is the smallest size the sliders offer). Next, it creates a seeded `torch.Generator`, so that the same seed reproduces the same result, and passes the parameters to the pipeline named `pipe`. The first generated image is assigned to the variable `image`, which the function returns.
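
The integer conversion of the seed described above can be made a little more defensive, for example when the seed box is left empty. The helper below is a minimal sketch; the name `coerce_seed` and the fallback value are assumptions and do not appear in the notebook.

```python
from typing import Union


def coerce_seed(value: Union[int, float, str, None], default: int = 0) -> int:
    """Turn a value coming from the UI into an integer seed.

    Empty input (None or "") falls back to `default`; floats and numeric
    strings are truncated to an integer.
    """
    if value is None or value == "":
        return default
    if isinstance(value, int):
        return value
    return int(float(value))
```

Inside `generate`, a call such as `seed = coerce_seed(seed)` could stand in for the plain `int(seed)` conversion.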





In [13]:
def generate(img, strength, seed, prompt, width, height, guide_scale, inference_steps):
  # Gradio sliders and number boxes may return floats; cast to int where needed.
  height = int(height)
  width = int(width)
  inference_steps = int(inference_steps)
  seed = int(seed)

  # By default Gradio passes the image as a NumPy array; the pipeline expects a PIL image.
  init_image = Image.fromarray(np.asarray(img)).resize((width, height))

  # A seeded generator makes the output reproducible for a given seed.
  generator = torch.Generator(device=device).manual_seed(seed)
  with autocast("cuda"):
    image = pipe(
      prompt=prompt,
      init_image=init_image,
      strength=strength,
      num_inference_steps=inference_steps,
      guidance_scale=guide_scale,
      generator=generator,
    ).images[0]

  return image


## **DEFINE GRADIO INTERFACE**

1. Here, we initialize the Gradio interface using the `Interface` class, pass in our function `generate`, and specify the inputs to the interface.
2. The first input is an image upload component for `img`.
3. The second input is a slider for the `strength` parameter, ranging from 0 to 1.
4. The third input is a number box for the `seed` parameter.
5. The fourth input is a textbox for the `prompt` parameter.
6. The remaining inputs are sliders for `width`, `height`, `guide_scale`, and `inference_steps`.
7. After that, we define the output of the interface, which is an image, using Gradio's `gr.Image` component.

We then launch the interface!
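
The width and height sliders in the interface ask for multiples of 64. If a value might not satisfy that, a small helper can round it before the image is resized. This is only a sketch: `snap_to_multiple` is a hypothetical name and is not part of Gradio or diffusers.

```python
def snap_to_multiple(value: float, base: int = 64) -> int:
    """Round `value` to the nearest multiple of `base`, never below `base`."""
    snapped = int(round(value / base)) * base
    return max(base, snapped)
```

For example, `width = snap_to_multiple(width)` could replace the plain `int(width)` cast in `generate`.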






In [9]:
import gradio as gr

In [14]:
gr.Interface(
    generate,
    title="Image to Image using Diffusers",
    inputs=[
        gr.Image(elem_id="input-image"),
        gr.Slider(0, 1, value=0.05, label="Strength (keep close to 0 for minimal changes)"),
        # A default seed avoids passing an empty value to int() in generate.
        gr.Number(value=0, label="Seed"),
        gr.Textbox(label="Prompt (leave blank if you want minimal changes)"),
        # Slider arguments are (minimum, maximum); the default must lie inside that range.
        gr.Slider(768, 2768, value=768, step=64, label="Width (make sure width is a multiple of 64)"),
        gr.Slider(512, 2512, value=512, step=64, label="Height (make sure height is a multiple of 64)"),
        gr.Slider(0, 10, value=7.5, label="Guidance Scale (generally between 6-8)"),
        gr.Slider(30, 500, value=50, label="Number of inference steps (generally kept at 50)"),
    ],
    outputs=[
        gr.Image(elem_id="output-image")
    ],
    css="#output-image, #input-image, #image-preview {border-radius: 40px !important; background-color : gray !important;} "
).launch()



Colab notebook detected. To show errors in colab notebook, set `debug=True` in `launch()`
Running on public URL: https://2d28e38f9097b2f7.gradio.app

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces


(<gradio.routes.App at 0x7fee2e0d9fd0>,
 'http://127.0.0.1:7864/',
 'https://2d28e38f9097b2f7.gradio.app')

## **AAAND THAT'S ALL, THANK YOU FOR LETTING ME PROVIDE MY SERVICES!**