# Versatile Diffusion
The [Versatile Diffusion](https://arxiv.org/abs/2211.08332) paper expands the existing single-flow diffusion pipeline into a multi-flow network that handles diverse generation tasks in one unified model. This notebook evaluates three of these tasks: text-to-image, image variation and dual-guided generation. The tasks are run through Hugging Face's [Diffusers](https://github.com/huggingface/diffusers) library.  
![The Versatile Diffusion structure](https://github.com/SHI-Labs/Versatile-Diffusion/raw/master/assets/figures/vd_combined.png)  
A hardware-accelerated runtime (GPU) is required to execute the code in this notebook.  
The three tasks do not need to be executed in sequence: once the code cells in the *Settings* section have run successfully, you can jump directly to the section(s) of interest.  

## Settings

Install any missing requirements in the Colab VM. Only *diffusers* (for PyTorch) and *transformers* need to be installed explicitly; installing them also pulls in *huggingface-hub*, *accelerate* and *tokenizers* automatically.

In [None]:
!pip install diffusers[torch]
!pip install transformers

Define a function to upload images to the Colab VM.

In [None]:
from google.colab import files

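def upload_files():
    # Open the Colab upload dialog; the result maps each filename to its bytes.
    uploaded = files.upload()
    for name, data in uploaded.items():
        # Write each file to the working directory and close the handle promptly.
        with open(name, 'wb') as f:
            f.write(data)
    return list(uploaded.keys())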

Import the dependencies shared by all tasks.

In [None]:
import torch
from PIL import Image
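
Since every pipeline below is moved to `"cuda"`, it can help to confirm that a GPU is actually visible before loading any weights. The following is a minimal sketch; the helper name `accelerator_status` is hypothetical and not part of the original notebook.

```python
import importlib.util


def accelerator_status():
    """Describe the accelerator visible to this runtime as a short string."""
    # Hypothetical helper: guard the import so the check also works
    # before the installation cell has been executed.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA GPU is visible"
    return f"CUDA GPU available: {torch.cuda.get_device_name(0)}"


print(accelerator_status())
```

If the message reports that no CUDA GPU is visible, switching the Colab runtime type to a GPU runtime and re-running the *Settings* cells is the likely fix.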

## Text to Image

Create the Versatile Diffusion pipeline for this task. All the pipelines in this notebook load the pre-trained models in float16 (`torch_dtype=torch.float16`), which halves their size compared with float32.

In [None]:
from diffusers import VersatileDiffusionTextToImagePipeline

pipe = VersatileDiffusionTextToImagePipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe.remove_unused_weights()
pipe = pipe.to("cuda")
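
The size saving from float16 follows directly from the storage cost per parameter: 4 bytes for float32 versus 2 bytes for float16. The sketch below works through that arithmetic. The parameter count is a hypothetical round number chosen only for illustration, not the actual size of the Versatile Diffusion checkpoint, and `weights_size_gib` is an illustrative helper.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weights_size_gib(num_params, dtype):
    """Approximate size of the raw weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Hypothetical parameter count, chosen only to make the arithmetic concrete.
num_params = 1_000_000_000

fp32 = weights_size_gib(num_params, "float32")
fp16 = weights_size_gib(num_params, "float16")
print(f"float32: {fp32:.2f} GiB, float16: {fp16:.2f} GiB")
# prints "float32: 3.73 GiB, float16: 1.86 GiB"
```

The estimate covers the weights only; activations and intermediate buffers during inference add to the actual GPU memory use.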

Set the prompt, seed and strength for the text-to-image task.

In [None]:
text2img_prompt = "Sticker of a cute spider, white border, die cut, head, cute, trending on artstation" #@param {type: "string"}
text2img_seed = 0 #@param {type: "number"}
text2img_strength = 0.75 #@param {type:"slider", min:0, max:1, step:0.05}

Run text-to-image generation.

In [None]:
generator = torch.Generator(device="cuda").manual_seed(text2img_seed)
image = pipe(text2img_prompt, 
             generator=generator,
             strength=text2img_strength).images[0]
display(image)
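
The cell above only displays the result, so the image is lost once the variable `image` is reassigned. One possible way to keep it is sketched below. The helper `save_result` and the `outputs` directory are assumptions introduced for illustration; the only library call relied on is the `save` method of a PIL image.

```python
from pathlib import Path


def save_result(image, task, seed, out_dir="outputs"):
    """Save `image` as <out_dir>/<task>_seed<seed>.png and return the path."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    # Encoding the seed in the filename makes the result easy to reproduce.
    target = out_path / f"{task}_seed{seed}.png"
    image.save(target)  # PIL images infer the format from the extension
    return target


# Usage with the names defined in this notebook:
# save_result(image, "text2img", text2img_seed)
```

The same helper could be reused in the other sections by changing the task label and passing the matching seed variable.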

## Image Variation

Create the Versatile Diffusion pipeline for this task.

In [None]:
from diffusers import VersatileDiffusionImageVariationPipeline

pipe = VersatileDiffusionImageVariationPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

Upload source image(s).

In [None]:
uploaded_image_variation_list = upload_files()
uploaded_image_variation_list

Select a source image and set the seed and strength. Because Colab form dropdowns don't support variables, the only way to specify the source image filename is to copy and paste it from the output of the previous code cell.

In [None]:
image_variation_source = "pexels-pixabay-48785.jpg" #@param {type: "string"}
image_variation_seed = 222222 #@param {type: "number"}
image_variation_strength = 0.75 #@param {type:"slider", min:0, max:1, step:0.05}
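
Because the filename is pasted by hand, a typo or a stray space only surfaces later as a failure in `Image.open`. A small check against the list returned by `upload_files()` can catch this earlier. This is a sketch; `resolve_uploaded` is a hypothetical helper, not part of the original notebook.

```python
def resolve_uploaded(name, uploaded_names):
    """Return `name` if it was uploaded, otherwise raise a descriptive error."""
    cleaned = name.strip()  # tolerate whitespace picked up while pasting
    if cleaned in uploaded_names:
        return cleaned
    raise FileNotFoundError(
        f"{cleaned!r} is not among the uploaded files: {sorted(uploaded_names)}"
    )


# Usage with the names defined in this notebook:
# image_variation_source = resolve_uploaded(
#     image_variation_source, uploaded_image_variation_list)
```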

Run image variation.

In [None]:
image = Image.open(image_variation_source)
display(image)

generator = torch.Generator(device="cuda").manual_seed(image_variation_seed)
image = pipe(image, 
             generator=generator,
             strength=image_variation_strength).images[0]
display(image)

## Dual-guided Generation

Create the Versatile Diffusion pipeline for this task.

In [None]:
from diffusers import VersatileDiffusionDualGuidedPipeline

pipe = VersatileDiffusionDualGuidedPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe.remove_unused_weights()
pipe = pipe.to("cuda")

Upload source image(s).

In [None]:
uploaded_dual_guided_list = upload_files()
uploaded_dual_guided_list

Select the source image and set the prompt, the seed and the strength value. Again, the only way to specify the source image filename is to copy and paste it from the previous code cell output.

In [None]:
dual_guided_source = "marvel-spiderman-i15585.jpg" #@param {type: "string"}
dual_guided_prompt = "Spider man. blade runner 2049 concept painting.  painting with vivid color." #@param {type: "string"}
dual_guided_seed = 555557 #@param {type: "number"}
dual_guided_strength = 0.5 #@param {type:"slider", min:0, max:1, step:0.05}

Run dual-guided generation.

In [None]:
image = Image.open(dual_guided_source)
display(image)

generator = torch.Generator(device="cuda").manual_seed(dual_guided_seed)

image = pipe(prompt=dual_guided_prompt, 
             image=image, 
             text_to_image_strength=dual_guided_strength, 
             generator=generator).images[0]
display(image)
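
To get a feel for how `text_to_image_strength` shifts the result between the prompt and the source image, the same call can be repeated over several values with a fixed seed. The sketch below assumes the pipeline is called exactly as in the cell above; `strength_sweep` and its `make_generator` argument are hypothetical names introduced here.

```python
def strength_sweep(pipe, prompt, image, strengths, make_generator):
    """Run the dual-guided pipeline once per strength and collect the results."""
    results = {}
    for strength in strengths:
        # A fresh generator per run keeps the seed identical across strengths,
        # so differences in the output come from the strength value alone.
        output = pipe(prompt=prompt,
                      image=image,
                      text_to_image_strength=strength,
                      generator=make_generator())
        results[strength] = output.images[0]
    return results


# Usage with the names defined in this notebook (requires the GPU pipeline):
# source = Image.open(dual_guided_source)
# results = strength_sweep(
#     pipe, dual_guided_prompt, source, [0.25, 0.5, 0.75],
#     lambda: torch.Generator(device="cuda").manual_seed(dual_guided_seed))
# for strength, result in results.items():
#     print(strength)
#     display(result)
```

Note that the usage example reloads the source into a separate variable, since the cell above overwrites `image` with the generated output.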