<a href="https://colab.research.google.com/github/qunash/stable-diffusion-2-gui/blob/main/stable_diffusion_2_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Demo: Stable Diffusion Use Cases**

Sample use cases built on Stable Diffusion v2.1.

Gradio app for [Stable Diffusion 2](https://huggingface.co/stabilityai/stable-diffusion-2) by [Stability AI](https://stability.ai/) (v2-1_768-ema-pruned.ckpt).
It uses the [Hugging Face](https://huggingface.co/) Diffusers🧨 implementation.

Currently supported pipelines are `text-to-image`, `image-to-image`, `inpainting` and `4x upscaling`.

`depth-to-image` will be added as soon as it's implemented in the Diffusers🧨 library.

<br>

Colab by [anzorq](https://twitter.com/hahahahohohe). If you like it, please consider supporting me:

[<a href="https://www.buymeacoffee.com/anzorq" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" height="32px" width="108px" alt="Buy Me A Coffee"></a>](https://www.buymeacoffee.com/anzorq)
<br>
[![GitHub Repo stars](https://img.shields.io/github/stars/qunash/stable-diffusion-2-gui?style=social)](https://github.com/qunash/stable-diffusion-2-gui)

![visitors](https://visitor-badge.glitch.me/badge?page_id=anzorq.sd-2-colab-header)

# Install dependencies (~1.5 mins)

In [None]:
!pip install --upgrade git+https://github.com/huggingface/diffusers.git@main
# !pip install diffusers
# !pip install git+https://github.com/huggingface/transformers
!pip install transformers
!pip install accelerate
!pip install scipy
# !pip install xformers
# !pip install -q https://github.com/metrolobo/xformers_wheels/releases/download/1d31a3ac_various_6/xformers-0.0.14.dev0-cp37-cp37m-linux_x86_64.whl
!pip install triton
!pip install ftfy
!pip install gradio -q

# ### install xformers
# from IPython.utils import capture
# from subprocess import getoutput
# from re import search

# with capture.capture_output() as cap:
    
#     smi_out = getoutput('nvidia-smi')
#     supported = search('(T4|P100|V100|A100|K80)', smi_out)

#     if not supported:
#       while True:
#         print("\x1b[1;31mThe current GPU is not supported, try starting a new session.\x1b[0m")
#     else:
#       supported = supported.group(0)

# !pip install -q https://github.com/TheLastBen/fast-stable-diffusion/raw/main/precompiled/{supported}/xformers-0.0.13.dev0-py3-none-any.whl
# !pip install -q https://github.com/ShivamShrirao/xformers-wheels/releases/download/4c06c79/xformers-0.0.15.dev0+4c06c79.d20221201-cp38-cp38-linux_x86_64.whl
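
The commented-out block above picks a prebuilt xformers wheel from the GPU name reported by `nvidia-smi`. A minimal sketch of that lookup follows, assuming the same list of GPU names. The function name `detect_supported_gpu` and the sample output line are hypothetical, and the function returns `None` on an unsupported GPU instead of printing in an endless loop.

```python
import re
from typing import Optional

# GPU names taken from the commented-out regex above.
SUPPORTED_GPUS = ("T4", "P100", "V100", "A100", "K80")

def detect_supported_gpu(smi_out: str) -> Optional[str]:
    """Return the first supported GPU name found in `nvidia-smi` output, else None."""
    match = re.search("|".join(SUPPORTED_GPUS), smi_out)
    return match.group(0) if match else None

# Made-up fragment of nvidia-smi output, for illustration only.
sample = "|   0  Tesla T4            Off  | 00000000:00:04.0 Off |"
print(detect_supported_gpu(sample))
```

In a notebook, the real input would come from `subprocess.getoutput('nvidia-smi')`, as in the commented-out code.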

In [None]:
# Restart kernel
import os
os._exit(0)  # exit the kernel process so the freshly installed packages are picked up

In [1]:
# Clean up
import os
for myfile in ["weights/rd16-uni.pth", "weights/rd64-uni.pth", "weights/rd64-uni-refined.pth"]:
    if os.path.isfile(myfile):
        os.remove(myfile)

In [1]:
from models.clipseg import CLIPDensePredT
import PIL
import torch
from matplotlib import pyplot as plt
from torchvision import transforms
from diffusers import StableDiffusionInpaintPipeline, EulerDiscreteScheduler
from transformers import pipeline

#! wget https://owncloud.gwdg.de/index.php/s/ioHbRzFx6th32hn/download -O weights.zip
#! unzip -o -d weights -j weights.zip
# Select the compute device once so later code can reuse it
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
#model_path = "runwayml/stable-diffusion-inpainting"
model_path = 'stabilityai/stable-diffusion-2-inpainting'

inpainting_pipe = StableDiffusionInpaintPipeline.from_pretrained(
    model_path,
    #revision="fp16" if torch.cuda.is_available() else "fp32",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    use_auth_token=True,
)
inpainting_pipe.scheduler = EulerDiscreteScheduler.from_config(inpainting_pipe.scheduler.config)

# load model  available models = ['RN50', 'RN101', 'RN50x4', 'RN50x16', 'RN50x64', 'ViT-B/32', 'ViT-B/16', 'ViT-L/14', 'ViT-L/14@336px']
model = CLIPDensePredT(version='ViT-B/16', reduce_dim=64)
model.eval();
# non-strict, because we only stored decoder weights (not CLIP weights)
model.load_state_dict(torch.load('weights/rd64-uni.pth', map_location=torch.device('cuda' if torch.cuda.is_available() else 'cpu')), strict=False);

Fetching 16 files:   0%|          | 0/16 [00:00<?, ?it/s]

#def semseg(image, negative_prompt, target_prompt):
def semseg2(prompt, n_images, negative_prompt, img, guidance, steps, width, height, generator, seed):
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        #transforms.Resize((352, 352)),
        transforms.Resize((1024, 1024)),
    ])
    
    # Keep the original PIL image for the inpainting pipeline; only CLIPSeg needs the tensor
    img_tensor = transform(img).unsqueeze(0)
    mask_image_filename = './temp_filename_delme.png'
    with torch.no_grad():
        preds = model(img_tensor.repeat(4,1,1,1), negative_prompt)[0]
    #plt.imsave(mask_image_filename,torch.mul(torch.sigmoid(preds[0][0]), 5))
    plt.imsave(mask_image_filename,torch.special.ndtr(preds[0][0]))
    mask_image = PIL.Image.open(mask_image_filename).resize((512, 512))
    guidance_scale = guidance
    generator = torch.Generator(device="cuda").manual_seed(seed) # change the seed to get different results
    inpainting_pipe.enable_xformers_memory_efficient_attention()

    images = inpainting_pipe(
        prompt=prompt,
        image=img,
        mask_image=mask_image,
        guidance_scale=guidance_scale,
        generator=generator,
        width = width,
        height = height,
        num_inference_steps = int(steps),
        num_images_per_prompt=n_images,
    ).images
    return images

In [2]:
# TODO Currently supports only 512x512 images
def semseg(prompt, n_images, neg_prompt, img, guidance, steps, width, height, seed):

    global inpainting_pipe 
    global model

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        #transforms.Resize((352, 352)),
        transforms.Resize((512, 512)),
    ])
    
    inp_img = img['image']
    #mask = img['mask']
    #inp_img = square_padding(inp_img)
    #mask = square_padding(mask)
    img = transform(inp_img).unsqueeze(0)
    mask_image_filename = './temp_filename_delme.png'
    with torch.no_grad():
        preds = model(img.repeat(4,1,1,1), neg_prompt)[0]
    #plt.imsave(mask_image_filename,torch.mul(torch.sigmoid(preds[0][0]), 5))
    plt.imsave(mask_image_filename,torch.special.ndtr(preds[0][0]))
    mask = PIL.Image.open(mask_image_filename).resize((512, 512))
    #inp_img = square_padding(inp_img)
    #mask = square_padding(mask)

    # # ratio = min(height / inp_img.height, width / inp_img.width)
    # ratio = min(512 / inp_img.height, 512 / inp_img.width)
    # inp_img = inp_img.resize((int(inp_img.width * ratio), int(inp_img.height * ratio)), Image.LANCZOS)
    # mask = mask.resize((int(mask.width * ratio), int(mask.height * ratio)), Image.LANCZOS)

    inp_img = inp_img.resize((512, 512))
    #mask = mask.resize((512, 512))

    result = inpainting_pipe(
      prompt,
      image = inp_img,
      mask_image = mask,
      num_images_per_prompt = n_images,
      #negative_prompt = neg_prompt,
      num_inference_steps = int(steps),
      guidance_scale = guidance,
      # width = width,
      # height = height,
      generator = torch.Generator(device="cuda").manual_seed(seed),
      #callback=pipe_callback,
    ).images  # return the full list so the gallery receives every generated image
        
    update_state(f"Done. Seed: {seed}")

    return result
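
`semseg` turns the CLIPSeg logits into a mask with `torch.special.ndtr`, the standard normal CDF, which squashes each logit into the range 0 to 1 (the commented-out alternative uses a scaled sigmoid). The sketch below reproduces that mapping with the standard library only, so it runs without PyTorch. The helper names `ndtr` and `soft_mask` and the sample logits are illustrative assumptions, not part of the notebook.

```python
import math

def ndtr(x: float) -> float:
    """Standard normal CDF, the scalar equivalent of torch.special.ndtr."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def soft_mask(logits):
    """Map a 2-D list of logits to mask values in [0, 1]."""
    return [[ndtr(v) for v in row] for row in logits]

# Made-up logits: strongly negative = background, strongly positive = target object.
logits = [[-3.0, 0.0], [1.0, 4.0]]
for row in soft_mask(logits):
    print([round(v, 3) for v in row])
```

Values near 1 mark pixels that the inpainting pipeline is allowed to repaint; values near 0 are kept from the input image.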

In [3]:
def txt2wsj(prompt, n_images, neg_prompt, guidance, steps, width, height, generator, seed):
    output_dir = "/home/alfred/codes/cv/CVWorkshop17.new/wsj_style3_finetuned_model"
    placeholder_token = "<wsjstyle>" #@param {type:"string"}
    #@title Set up the pipeline 
    pipe_ft = StableDiffusionPipeline.from_pretrained(
        #hyperparameters["output_dir"],
        output_dir,
        torch_dtype=torch.float16,
    ).to("cuda:0")
    pipe_ft.enable_attention_slicing()
    pipe_ft.enable_xformers_memory_efficient_attention()
    result = pipe_ft(prompt+' '+placeholder_token,
                     num_images_per_prompt = n_images,
                     num_inference_steps=int(steps),
                     negative_prompt = neg_prompt,
                     width = width,
                     height = height,
                     generator=generator,
                     guidance_scale=guidance).images
    update_state(f"Done. Seed: {seed}")
    return result


In [4]:
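def img_to_txt(img):
    image_to_text = pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")
    # The sketch-enabled gr.Image component passes a dict with 'image' and 'mask' keys
    if isinstance(img, dict):
        img = img['image']
    return image_to_text(img)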

### Extend the original prompt with GPT-3-generated modifiers to simulate prompt injection

In [17]:
import os
from dotenv import load_dotenv
import requests, json
def add_prompt_modifiers(plain_prompt):
    load_dotenv()
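    OPENAI_TOKEN = os.getenv('OPENAI_TOKEN')
    #OPENAI_TOKEN = os.environ['OPENAI_TOKEN']
    if not OPENAI_TOKEN:
        raise RuntimeError('OPENAI_TOKEN is not set; add it to the environment or a .env file')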

    with open('effective_prompts_fs.txt', 'r') as f:
        prefix = f.read()
    prompt = prefix + '\n' + plain_prompt

    response = requests.post(
        "https://api.openai.com/v1/completions",
        headers={
            'authorization': "Bearer " + OPENAI_TOKEN,
            "content-type": "application/json",
        },
        json={
            "model": "davinci",
            "prompt": prompt,
            "max_tokens": 50,
            "temperature": 0.7,
            "stop": "\n",
        })

    text = response.text
    try:
        result = json.loads(text)
    except json.JSONDecodeError as e:
        raise Exception(f'Cannot load: {text}, {response}') from e

    prompt_modifiers = result['choices'][0]['text']
    engineered_prompt = plain_prompt + prompt_modifiers
    print(f'New engineered prompt: {engineered_prompt}')
    return engineered_prompt
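
`add_prompt_modifiers` works in three steps: it prepends the few-shot prefix to the plain prompt, asks the completions endpoint to continue the last line, and appends the returned continuation. The sketch below isolates the two steps that need no network access, assuming the response has the `choices[0].text` shape that the function reads. `build_few_shot_prompt`, `extract_modifiers`, the prefix text, and the canned response are all hypothetical.

```python
import json

def build_few_shot_prompt(prefix: str, plain_prompt: str) -> str:
    """Join the few-shot examples and the user's prompt, one per line."""
    return prefix + "\n" + plain_prompt

def extract_modifiers(response_text: str) -> str:
    """Pull the generated continuation out of a completions-style JSON body."""
    try:
        result = json.loads(response_text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Response is not valid JSON: {response_text!r}") from err
    choices = result.get("choices") or []
    if not choices:
        # An error payload has no completion to read.
        raise ValueError(f"No completion in response: {result}")
    return choices[0]["text"]

# Canned data standing in for effective_prompts_fs.txt and the API reply.
prefix = "a dog on a beach, golden hour, 35mm photo"
canned = json.dumps({"choices": [{"text": ", soft focus, high detail"}]})

plain = "a cat running in a park at night"
print(build_few_shot_prompt(prefix, plain))
print(plain + extract_modifiers(canned))
```

Checking for an empty `choices` list separately from the JSON parse keeps an API error payload from surfacing later as an unrelated `KeyError`.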

# Run the app

In [None]:
#@title ⬇️🖼️
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, StableDiffusionUpscalePipeline, DiffusionPipeline, DPMSolverMultistepScheduler
import gradio as gr
import torch
from PIL import Image
import random

state = None
current_steps = 25

# model_id = 'stabilityai/stable-diffusion-2'
model_id = 'stabilityai/stable-diffusion-2-1'

scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")

pipe = StableDiffusionPipeline.from_pretrained(
      model_id,
      revision="fp16" if torch.cuda.is_available() else "fp32",
      torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
      scheduler=scheduler
    ).to("cuda")
pipe.enable_attention_slicing()
# pipe.enable_xformers_memory_efficient_attention()

pipe_i2i = None
pipe_txt2wsj = None
pipe_upscale = None
pipe_inpaint = None
pipe_img2txt = None
pipe_semseg = None
#pipe_txt2img_gpt3 = None

attn_slicing_enabled = True
mem_eff_attn_enabled = False

modes = {
    'txt2img': 'Text to Image with prompt injection using GPT-3',
    #'txt2img_gpt3': 'Text to Image with prompt injection using GPT-3',
    'txt2wsj': 'Text to Image with WSJ style',
    'img2txt': 'Image to Text',
    'img2img': 'Image to Image',
    'inpaint': 'Inpainting',
    'semseg': 'Inpainting with SemSeg',
    'upscale4x': 'Upscale 4x',
}
current_mode = modes['txt2img']

def error_str(error, title="Error"):
    return f"""#### {title}
            {error}"""  if error else ""

def update_state(new_state):
  global state
  state = new_state

def update_state_info(old_state):
  if state and state != old_state:
    return gr.update(value=state)

def set_mem_optimizations(pipe):
    if attn_slicing_enabled:
      pipe.enable_attention_slicing()
    else:
      pipe.disable_attention_slicing()
    
    # if mem_eff_attn_enabled:
    #   pipe.enable_xformers_memory_efficient_attention()
    # else:
    #   pipe.disable_xformers_memory_efficient_attention()

def get_i2i_pipe(scheduler):
    
    update_state("Loading image to image model...")

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
      model_id,
      revision="fp16" if torch.cuda.is_available() else "fp32",
      torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
      scheduler=scheduler
    )
    set_mem_optimizations(pipe)
    pipe.to("cuda")
    return pipe

def get_inpaint_pipe():
  
  update_state("Loading inpainting model...")

  pipe = DiffusionPipeline.from_pretrained(
      "stabilityai/stable-diffusion-2-inpainting",
      revision="fp16" if torch.cuda.is_available() else "fp32",
      torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
      # scheduler=scheduler # TODO currently setting scheduler here messes up the end result. A bug in Diffusers🧨
    ).to("cuda")
  pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
  pipe.enable_attention_slicing()
  pipe.enable_xformers_memory_efficient_attention()
  return pipe

def get_upscale_pipe(scheduler):
    
    update_state("Loading upscale model...")

    pipe = StableDiffusionUpscalePipeline.from_pretrained(
      "stabilityai/stable-diffusion-x4-upscaler",
      revision="fp16" if torch.cuda.is_available() else "fp32",
      torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
      # scheduler=scheduler
    )
    # pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    set_mem_optimizations(pipe)
    pipe.to("cuda")
    return pipe

def switch_attention_slicing(attn_slicing):
    global attn_slicing_enabled
    attn_slicing_enabled = attn_slicing

def switch_mem_eff_attn(mem_eff_attn):
    global mem_eff_attn_enabled
    mem_eff_attn_enabled = mem_eff_attn

def pipe_callback(step: int, timestep: int, latents: torch.FloatTensor):
    update_state(f"{step}/{current_steps} steps")#\nTime left, sec: {timestep/100:.0f}")

def inference(inf_mode, prompt, n_images, guidance, steps, width=768, height=768, seed=0, img=None, strength=0.5, neg_prompt=""):

  update_state(" ")

  global current_mode
  if inf_mode != current_mode:
    pipe.to("cuda" if inf_mode == modes['txt2img'] else "cpu")
    
    #if pipe_txt2img_gpt3 is not None:
    #  pipe_txt2img_gpt3.to("cuda" if inf_mode == modes['txt2img_gpt3'] else "cpu")
    
    if pipe_txt2wsj is not None:
      pipe_txt2wsj.to("cuda" if inf_mode == modes['txt2wsj'] else "cpu") 

    if pipe_img2txt is not None:
      pipe_img2txt.to("cuda" if inf_mode == modes['img2txt'] else "cpu")  
    
    if pipe_i2i is not None:
      pipe_i2i.to("cuda" if inf_mode == modes['img2img'] else "cpu")

    if pipe_inpaint is not None:
      pipe_inpaint.to("cuda" if inf_mode == modes['inpaint'] else "cpu")

    if pipe_upscale is not None:
      pipe_upscale.to("cuda" if inf_mode == modes['upscale4x'] else "cpu")

    current_mode = inf_mode
    
  if seed == 0:
    seed = random.randint(0, 2147483647)

  generator = torch.Generator('cuda').manual_seed(seed)
  #prompt = add_prompt_modifiers(prompt)

  try:
    
    if inf_mode == modes['txt2img']:
      return txt_to_img(add_prompt_modifiers(prompt), n_images, neg_prompt, guidance, steps, width, height, generator, seed), gr.update(visible=False, value=None)
    
    #elif inf_mode == modes['txt2img_gpt3']:
    #  return txt_to_img(add_prompt_modifiers(prompt), n_images, neg_prompt, guidance, steps, width, height, generator, seed), gr.update(visible=False, value=None)
    
    elif inf_mode == modes['txt2wsj']:
      return txt2wsj(prompt, n_images, neg_prompt, guidance, steps, width, height, generator, seed), gr.update(visible=False, value=None)
    
    elif inf_mode == modes['img2img']:
      if img is None:
        return None, gr.update(visible=True, value=error_str("Image is required for Image to Image mode"))

      return img_to_img(prompt, n_images, neg_prompt, img, strength, guidance, steps, width, height, generator, seed), gr.update(visible=False, value=None)
    
    elif inf_mode == modes['img2txt']:
      if img is None:
        return None, gr.update(visible=True, value=error_str("Image is required for Image to Text mode"))

      return img_to_txt(img), gr.update(visible=False, value=None)

    elif inf_mode == modes['inpaint']:
      if img is None:
        return None, gr.update(visible=True, value=error_str("Image is required for Inpainting mode"))

      return inpaint(prompt, n_images, neg_prompt, img, guidance, steps, width, height, generator, seed), gr.update(visible=False, value=None)

    elif inf_mode == modes['semseg']:
      if img is None:
        return None, gr.update(visible=True, value=error_str("Image is required for Inpainting with SemSeg mode"))

      return semseg(prompt, n_images, neg_prompt, img, guidance, steps, width, height, seed), gr.update(visible=False, value=None)

    elif inf_mode == modes['upscale4x']:
      if img is None:
        return None, gr.update(visible=True, value=error_str("Image is required for Upscale mode"))

      return upscale(prompt, n_images, neg_prompt, img, guidance, steps, generator), gr.update(visible=False, value=None)
  except Exception as e:
    return None, gr.update(visible=True, value=error_str(e))

def txt_to_img(prompt, n_images, neg_prompt, guidance, steps, width, height, generator, seed):

    result = pipe(
      prompt,
      num_images_per_prompt = n_images,
      negative_prompt = neg_prompt,
      num_inference_steps = int(steps),
      guidance_scale = guidance,
      width = width,
      height = height,
      generator = generator,
      callback=pipe_callback).images

    update_state(f"Done. Seed: {seed}")

    return result

def img_to_img(prompt, n_images, neg_prompt, img, strength, guidance, steps, width, height, generator, seed):

    global pipe_i2i
    if pipe_i2i is None:
      pipe_i2i = get_i2i_pipe(scheduler)

    img = img['image']
    ratio = min(height / img.height, width / img.width)
    img = img.resize((int(img.width * ratio), int(img.height * ratio)), Image.LANCZOS)
    result = pipe_i2i(
      prompt,
      num_images_per_prompt = n_images,
      negative_prompt = neg_prompt,
      image = img,
      num_inference_steps = int(steps),
      strength = strength,
      guidance_scale = guidance,
      # width = width,
      # height = height,
      generator = generator,
      callback=pipe_callback).images

    update_state(f"Done. Seed: {seed}")
        
    return result

# TODO Currently supports only 512x512 images
def inpaint(prompt, n_images, neg_prompt, img, guidance, steps, width, height, generator, seed):

    global pipe_inpaint
    if pipe_inpaint is None:
      pipe_inpaint = get_inpaint_pipe()

    inp_img = img['image']
    mask = img['mask']
    inp_img = square_padding(inp_img)
    mask = square_padding(mask)

    # # ratio = min(height / inp_img.height, width / inp_img.width)
    # ratio = min(512 / inp_img.height, 512 / inp_img.width)
    # inp_img = inp_img.resize((int(inp_img.width * ratio), int(inp_img.height * ratio)), Image.LANCZOS)
    # mask = mask.resize((int(mask.width * ratio), int(mask.height * ratio)), Image.LANCZOS)

    inp_img = inp_img.resize((512, 512))
    mask = mask.resize((512, 512))

    result = pipe_inpaint(
      prompt,
      image = inp_img,
      mask_image = mask,
      num_images_per_prompt = n_images,
      negative_prompt = neg_prompt,
      num_inference_steps = int(steps),
      guidance_scale = guidance,
      # width = width,
      # height = height,
      generator = generator,
      callback=pipe_callback).images
        
    update_state(f"Done. Seed: {seed}")

    return result

def square_padding(img):
    width, height = img.size
    if width == height:
        return img
    new_size = max(width, height)
    new_img = Image.new('RGB', (new_size, new_size), (0, 0, 0))  # black RGB canvas (no alpha channel in 'RGB' mode)
    new_img.paste(img, ((new_size - width) // 2, (new_size - height) // 2))
    return new_img

def upscale(prompt, n_images, neg_prompt, img, guidance, steps, generator):

    global pipe_upscale
    if pipe_upscale is None:
      pipe_upscale = get_upscale_pipe(scheduler)

    img = img['image']
    return upscale_tiling(prompt, neg_prompt, img, guidance, steps, generator)

    # result = pipe_upscale(
    #     prompt,
    #     image = img,
    #     num_inference_steps = int(steps),
    #     guidance_scale = guidance,
    #     negative_prompt = neg_prompt,
    #     num_images_per_prompt = n_images,
    #     generator = generator).images[0]

    # return result

def upscale_tiling(prompt, neg_prompt, img, guidance, steps, generator):

    width, height = img.size

    # calculate the padding needed to make the image dimensions a multiple of 128
    padding_x = 128 - (width % 128) if width % 128 != 0 else 0
    padding_y = 128 - (height % 128) if height % 128 != 0 else 0

    # remember the original size, then update the dimensions to include the padding
    orig_width, orig_height = width, height
    width += padding_x
    height += padding_y

    if width > 128 or height > 128:

        # paste the original image onto a white canvas of the padded size
        # (pasting a patch outside the image bounds would not enlarge the image)
        padded_img = Image.new('RGB', (width, height), color=(255, 255, 255))
        padded_img.paste(img, (0, 0))

        num_tiles_x = width // 128
        num_tiles_y = height // 128

        upscaled_img = Image.new('RGB', (width * 4, height * 4))
        for x in range(num_tiles_x):
            for y in range(num_tiles_y):
                update_state(f"Upscaling tile {x * num_tiles_y + y + 1}/{num_tiles_x * num_tiles_y}")
                tile = padded_img.crop((x * 128, y * 128, (x + 1) * 128, (y + 1) * 128))

                upscaled_tile = pipe_upscale(
                    prompt="",
                    image=tile,
                    num_inference_steps=int(steps),
                    guidance_scale=guidance,
                    # negative_prompt = neg_prompt,
                    generator=generator,
                ).images[0]

                upscaled_img.paste(upscaled_tile, (x * upscaled_tile.size[0], y * upscaled_tile.size[1]))

        # crop away the upscaled padding so the result is exactly 4x the original size
        return [upscaled_img.crop((0, 0, orig_width * 4, orig_height * 4))]
    else:
        return pipe_upscale(
            prompt=prompt,
            image=img,
            num_inference_steps=steps,
            guidance_scale=guidance,
            negative_prompt = neg_prompt,
            generator=generator,
        ).images



def on_mode_change(mode):
  # One update per component listed in `outputs` of inf_mode.change:
  # i2i_options, inpaint_info, upscale_info, strength
  return gr.update(visible = mode in (modes['txt2wsj'], modes['img2img'], modes['img2txt'], modes['inpaint'], modes['semseg'], modes['upscale4x'])), \
         gr.update(visible = mode in (modes['inpaint'], modes['semseg'])), \
         gr.update(visible = mode == modes['upscale4x']), \
         gr.update(visible = mode == modes['img2img'])

def on_steps_change(steps):
  global current_steps
  current_steps = steps

css = """.main-div div{display:inline-flex;align-items:center;gap:.8rem;font-size:1.75rem}.main-div div h1{font-weight:900;margin-bottom:7px}.main-div p{margin-bottom:10px;font-size:94%}a{text-decoration:underline}.tabs{margin-top:0;margin-bottom:0}#gallery{min-height:20rem}
"""
with gr.Blocks(css=css) as demo:
    gr.HTML(
        f"""
          <div class="main-div">
            <div>
              <h1>Demo: Stable Diffusion Use Cases</h1>
            </div><br>
            <p> Model used: <a href="https://huggingface.co/stabilityai/stable-diffusion-2-1/blob/main/v2-1_768-ema-pruned.ckpt" target="_blank">v2-1_768-ema-pruned.ckpt</a></p>
            Running on <b>{"GPU 🔥" if torch.cuda.is_available() else "CPU 🥶"}</b>
          </div>
        """
    )
    with gr.Row():
        
        with gr.Column(scale=55):
          with gr.Group():
              with gr.Row():
                prompt = gr.Textbox(label="Prompt", show_label=False, max_lines=2,placeholder=f"Enter prompt").style(container=False)
                generate = gr.Button(value="Generate").style(rounded=(False, True, True, False))

              gallery = gr.Gallery(label="Generated images", show_label=False).style(grid=[2], height="auto")
          state_info = gr.Textbox(label="State", show_label=False, max_lines=2).style(container=False)
          error_output = gr.Markdown(visible=False)

        with gr.Column(scale=45):
          inf_mode = gr.Radio(label="Inference Mode", choices=list(modes.values()), value=modes['txt2img'])
          
          with gr.Group(visible=False) as i2i_options:
            image = gr.Image(label="Image", height=128, type="pil", tool='sketch')
            inpaint_info = gr.Markdown("Inpainting resizes and pads images to 512x512", visible=False)
            upscale_info = gr.Markdown("""Best for small images (128x128 or smaller).<br>
                                        Bigger images will be sliced into 128x128 tiles which will be upscaled individually.<br>
                                        This is done to avoid running out of GPU memory.""", visible=False)
            strength = gr.Slider(label="Transformation strength", minimum=0, maximum=1, step=0.01, value=0.5)

          with gr.Group():
            neg_prompt = gr.Textbox(label="Negative(exclusive) prompt", placeholder="What to exclude from the image")

            n_images = gr.Slider(label="Number of images", value=1, minimum=1, maximum=4, step=1)
            with gr.Row():
              guidance = gr.Slider(label="Guidance scale", value=7.5, maximum=15)
              steps = gr.Slider(label="Steps", value=current_steps, minimum=2, maximum=100, step=1)

            with gr.Row():
              width = gr.Slider(label="Width", value=768, minimum=64, maximum=1024, step=8)
              height = gr.Slider(label="Height", value=768, minimum=64, maximum=1024, step=8)

            seed = gr.Slider(0, 2147483647, label='Seed (0 = random)', value=0, step=1)
            with gr.Accordion("Memory optimization"):
              attn_slicing = gr.Checkbox(label="Attention slicing (a bit slower, but uses less memory)", value=attn_slicing_enabled)
              # mem_eff_attn = gr.Checkbox(label="Memory efficient attention (xformers)", value=mem_eff_attn_enabled)

    inf_mode.change(on_mode_change, inputs=[inf_mode], outputs=[i2i_options, inpaint_info, upscale_info, strength], queue=False)
    steps.change(on_steps_change, inputs=[steps], outputs=[], queue=False)
    attn_slicing.change(lambda x: switch_attention_slicing(x), inputs=[attn_slicing], queue=False)
    # mem_eff_attn.change(lambda x: switch_mem_eff_attn(x), inputs=[mem_eff_attn], queue=False)

    inputs = [inf_mode, prompt, n_images, guidance, steps, width, height, seed, image, strength, neg_prompt]
    outputs = [gallery, error_output]
    prompt.submit(inference, inputs=inputs, outputs=outputs)
    generate.click(inference, inputs=inputs, outputs=outputs)

    demo.load(update_state_info, inputs=state_info, outputs=state_info, every=0.5, show_progress=False)
demo.queue()
demo.launch(server_name="0.0.0.0", debug=True, share=False, height=768)


Fetching 12 files:   0%|          | 0/12 [00:00<?, ?it/s]



Running on local URL:  http://0.0.0.0:7860
Running on public URL: https://89f248ee-a565-4d4c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces


New engineered prompt: a cat running in a park at night, a small town in the distance, a lake with a dock, a large full moon, perspective, soft focus, shallow depth of field, stellar camera work, high detail, 8k, 4k, concept art, digital painting by


  0%|          | 0/25 [00:00<?, ?it/s]

Pipelines loaded with `torch_dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.
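
The upscaler in `upscale_tiling` pads the input so both sides are multiples of 128, then processes it tile by tile. The arithmetic can be checked without a GPU. The sketch below uses two hypothetical helpers, `pad_to_multiple` and `tile_grid`; the latter returns the padded size and the crop boxes in the same order as the nested loops in the notebook (outer loop over x, inner loop over y).

```python
TILE = 128  # tile edge used by upscale_tiling

def pad_to_multiple(size: int, multiple: int = TILE) -> int:
    """Round `size` up to the next multiple of `multiple`."""
    return size + (-size % multiple)

def tile_grid(width: int, height: int):
    """Return ((padded_w, padded_h), boxes), boxes as (left, upper, right, lower)."""
    padded_w, padded_h = pad_to_multiple(width), pad_to_multiple(height)
    boxes = [
        (x * TILE, y * TILE, (x + 1) * TILE, (y + 1) * TILE)
        for x in range(padded_w // TILE)
        for y in range(padded_h // TILE)
    ]
    return (padded_w, padded_h), boxes

padded, boxes = tile_grid(300, 200)
print(padded)      # (384, 256)
print(len(boxes))  # 6
print(boxes[1])    # (0, 128, 128, 256)
```

A 300x200 input therefore costs six upscaler calls, and each 128x128 tile becomes a 512x512 tile in the output.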