# Mobile Version - Latent Diffusion with HuggingFace Diffusers 

NSFW filter turned off, to stop it from censoring Greek statues.

# GPU Check

In [None]:
!nvidia-smi

# Setup

Next, install `diffusers==0.2.4` as well as `scipy`, `ftfy` and `transformers`.

In [None]:
!pip install diffusers==0.2.4
!pip install transformers scipy ftfy
!pip install "ipywidgets>=7,<8"

You also need to accept the model license before downloading or using the weights. In this post we'll use model version `v1-4`, so you'll need to visit [its card](https://huggingface.co/CompVis/stable-diffusion-v1-4), read the license and tick the checkbox if you agree.

You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need to use an access token for the code to work. For more information on access tokens, please refer to [this section of the documentation](https://huggingface.co/docs/hub/security-tokens).

Because Google Colab has disabled external widgets, we need to enable them explicitly. Run the following cell to be able to use `notebook_login`.

In [None]:
from google.colab import output
output.enable_custom_widget_manager()

## Import Libs

In [None]:
import torch
from torch import autocast
from PIL import Image
from diffusers import StableDiffusionPipeline
from tqdm.autonotebook import tqdm

## Token Login
Now you can log in with your user token.

In [None]:
from huggingface_hub import notebook_login

notebook_login()

## Stable Diffusion Pipeline

`StableDiffusionPipeline` is an end-to-end inference pipeline that you can use to generate images from text with just a few lines of code.

First, we load the pre-trained weights of all components of the model.

In addition to the model id [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4), we're also passing a specific `revision`, `torch_dtype` and `use_auth_token` to the `from_pretrained` method.
`use_auth_token` is necessary to verify that you have indeed accepted the model's license.

We want to ensure that every free Google Colab can run Stable Diffusion, hence we're loading the weights from the half-precision branch [`fp16`](https://huggingface.co/CompVis/stable-diffusion-v1-4/tree/fp16) and also telling `diffusers` to expect the weights in float16 precision by passing `torch_dtype=torch.float16`.

For the highest possible precision, remove `revision="fp16"` and `torch_dtype=torch.float16`, at the cost of higher memory usage.
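
As a rough illustration of that memory trade-off: float16 stores each weight in 2 bytes and float32 in 4, so the half-precision weights take about half the space. The parameter count below is a made-up round number chosen to keep the arithmetic simple, not the actual size of the model, and the estimate ignores activations and other runtime overhead.

```python
# Bytes used to store one weight in each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

# Hypothetical parameter count, chosen only to make the arithmetic concrete.
num_params = 1_000_000_000

fp32 = weight_memory_gib(num_params, "float32")
fp16 = weight_memory_gib(num_params, "float16")
print(f"float32: {fp32:.2f} GiB, float16: {fp16:.2f} GiB")
```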

In [None]:
# make sure you're logged in first (`notebook_login()` above, or `huggingface-cli login`)
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=True)

Next, let's move the pipeline to the GPU for faster inference.

In [None]:
pipe = pipe.to("cuda")
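# Disable the NSFW filter: swap the safety checker for a pass-through that
# returns the images unchanged and reports that nothing was flagged.
pipe.safety_checker = (lambda images, clip_input: (images, False))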

### code examples (all commented)

Using `autocast` will run inference faster because it uses half-precision.

In [None]:
# from torch import autocast

# prompt = "a photograph of an astronaut riding a horse"
# with autocast("cuda"):
#   image = pipe(prompt)["sample"][0]  # a PIL image (https://pillow.readthedocs.io/en/stable/)

# # To view the image, either save it to a file:
# image.save("astronaut_rides_horse.png")

# # or, in Google Colab, display it directly:
# image

Running the above cell multiple times will give you a different image every time. If you want deterministic output, you can pass a seeded random generator to the pipeline. Every time you use the same seed you'll get the same image.

In [None]:
# import torch

# generator = torch.Generator("cuda").manual_seed(1024)

# with autocast("cuda"):
#   image = pipe(prompt, generator=generator)["sample"][0]

# image

You can change the number of inference steps using the `num_inference_steps` argument. In general, results are better the more steps you use. Stable Diffusion, being one of the latest models, works great with a relatively small number of steps, so we recommend using the default of `50`. If you want faster results you can use a smaller number.

The following cell uses the same seed as before, but with fewer steps. Note how some details, such as the horse's head or the helmet, are less realistic and less defined than in the previous image:

In [None]:
# import torch

# generator = torch.Generator("cuda").manual_seed(1024)

# with autocast("cuda"):
#   image = pipe(prompt, num_inference_steps=15, generator=generator)["sample"][0]

# image

The other parameter in the pipeline call is `guidance_scale`. It increases adherence to the conditioning signal (in this case, the text prompt) as well as overall sample quality. In simple terms, classifier-free guidance forces the generation to match the prompt more closely. Values like `7` or `8.5` give good results; with a very large value the images might look good, but they will be less diverse.

You can learn about the technical details of this parameter in [the last section](https://colab.research.google.com/drive/1ALXuCM5iNnJDNW5vqBm5lCtUQtZJHN2f?authuser=1#scrollTo=UZp-ynZLrS-S) of this notebook.
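
In short, classifier-free guidance uses two noise predictions per step, one conditioned on the text prompt and one on an empty prompt, and extrapolates from the unconditional prediction toward the text-conditioned one. The sketch below shows only that combination step on plain Python lists; the function name `apply_guidance` and the toy numbers are illustrative and not part of `diffusers`.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale):
    """Combine two noise predictions with classifier-free guidance.

    guidance_scale = 1 returns the text-conditioned prediction unchanged;
    larger values push further away from the unconditional prediction.
    """
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]

# Toy predictions for a three-element "latent" (made-up values).
noise_uncond = [0.0, 1.0, -1.0]
noise_text = [1.0, 1.0, 0.0]

print(apply_guidance(noise_uncond, noise_text, 1.0))  # same as noise_text
print(apply_guidance(noise_uncond, noise_text, 7.5))
```

Where the two predictions agree (the middle element), the scale has no effect; where they differ, a larger scale amplifies the difference.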

To generate multiple images for the same prompt, we simply pass a list containing the prompt repeated several times, instead of the single string we used before.

Let's first write a helper function to display a grid of images. Run the following cell to create the `image_grid` function, or expand the code if you are interested in how it's done.

In [None]:
# def image_grid(imgs, rows, cols):
#     assert len(imgs) == rows*cols

#     w, h = imgs[0].size
#     grid = Image.new('RGB', size=(cols*w, rows*h))
#     grid_w, grid_h = grid.size
    
#     for i, img in enumerate(imgs):
#         grid.paste(img, box=(i%cols*w, i//cols*h))
#     return grid
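
The placement logic in `image_grid` is plain integer arithmetic: image `i` goes in column `i % cols` and row `i // cols`, and those indices are scaled by the tile size to get the top-left pixel of the paste box. A dependency-free sketch of that mapping (the helper name `grid_boxes` is made up for this illustration):

```python
def grid_boxes(num_images, cols, w, h):
    """Return the top-left (x, y) paste position for each image index."""
    return [(i % cols * w, i // cols * h) for i in range(num_images)]

# Six 512x512 tiles laid out in 3 columns: 2 rows, filled left to right.
boxes = grid_boxes(6, cols=3, w=512, h=512)
for i, box in enumerate(boxes):
    print(i, box)
```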

Now we can run the pipeline with a list of 3 prompts and build a grid image from the results.

In [None]:
# num_images = 3
# prompt = ["a photograph of an astronaut riding a horse"] * num_images

# with autocast("cuda"):
#   images = pipe(prompt)["sample"]

# grid = image_grid(images, rows=1, cols=3)
# grid

And here's how to generate a grid of `n × m` images.
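
The idea is to run the pipeline once per row, each time with a batch of `m` copies of the prompt, and to collect all `n × m` results in row order. The sketch below shows just that loop; `fake_pipe` is a stand-in that returns text labels instead of images so the example runs without a GPU, and `generate_batches` is a hypothetical name used only here.

```python
def fake_pipe(prompts):
    """Stand-in for the real pipeline: one placeholder 'image' per prompt."""
    return {"sample": [f"image for: {p}" for p in prompts]}

def generate_batches(prompt, num_rows, num_cols):
    """Collect num_rows * num_cols results, one batch of num_cols per row."""
    batch = [prompt] * num_cols  # a new list with the prompt repeated
    all_images = []
    for _ in range(num_rows):
        all_images.extend(fake_pipe(batch)["sample"])
    return all_images

images = generate_batches(
    "a photograph of an astronaut riding a horse", num_rows=2, num_cols=3
)
print(len(images))  # one entry per grid cell
```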

# Define PATH variables (& Gdrive save auth)

In [None]:
stable_diffusion_path = '/content/drive/MyDrive/AI/stable_diffusion/stable_diffusion_output/'

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Define functions

In [None]:
def image_grid(imgs, rows, cols):
    assert len(imgs) == rows*cols

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    grid_w, grid_h = grid.size
    
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

def generate_image_grid(prompt, iteration_steps, num_cols, num_rows):
  # `prompt` is a one-element list; repeat it to get one batch of num_cols prompts.
  # Build a new list so the caller's list is not modified in place.
  prompts = prompt * num_cols

  all_images = []
  print('generating...')
  for _ in tqdm(range(num_rows)):
    # each pipeline call yields one row of num_cols images
    with autocast("cuda"):
      images = pipe(prompts, num_inference_steps=iteration_steps)["sample"]
    all_images.extend(images)

  grid = image_grid(all_images, rows=num_rows, cols=num_cols)

  # save section (images + grid); the prompt text is used as the file name prefix
  print('saving...')
  for cpt, image in enumerate(tqdm(all_images)):
    image.save(stable_diffusion_path+prompt[0]+'_img_'+str(cpt)+'.png')
  grid.save(stable_diffusion_path+prompt[0]+'_grid.png')
  return grid

# feature test
# generate_image_grid(["a photograph of an astronaut riding a horse"], 50, 1, 1)
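
The save step above uses the prompt text directly in the file name, which can fail for prompts containing characters such as `/` and for very long prompts. One possible guard is to reduce the prompt to a short, file-system-friendly slug first. The `prompt_to_filename` helper below is a hypothetical addition, not part of the original notebook:

```python
import re

def prompt_to_filename(prompt, max_length=100):
    """Turn a prompt into a lowercase slug that is safe to use in a file name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", prompt).strip("_").lower()
    # Truncate, then drop any underscore left dangling at the cut.
    return slug[:max_length].rstrip("_")

name = prompt_to_filename("Los Angeles skyline at sunset. Detailed ink wash.")
print(name)
```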

# Run prompts

### previous prompts

In [None]:
  # ["painting in the style of Salvador Dali expressing the explosion of the senses when art is introduced"],
  # ["painting of the highest mountains in the style of Caspar David Friedrich"],
  # ["painting of a gothic cathedral built on a giant waterfall"]
  # ['Exploding Raphaelesque Head, Dali painting from 1951']
  # ['painting of a multitude of 3d shapes forming a human face in the style of Salvador Dali']
  # ['spatial composition of 3d shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['geometric composition of 3d shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['geometric spatial 3d shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['geometric volumetric composition of 3d shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['volumetric spatial composition of 3d shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['geometric spatial and volumetric shapes giving the illusion of a human head, painting by Salvador Dali'],
  # ['volumetric 3d shapes giving the illusion of a human head, painting by Salvador Dali']
  # ['3d spheres floating in space giving the illusion of a human head, painting by Salvador Dali'],
  # ['3d spheres floating in space giving an illusion of a human head, painting by Salvador Dali'],
  # ['3d spheres floating in space arranged to look like a human head, painting by Salvador Dali'],
  # ['3d spheres floating in space forming a human head, painting by Salvador Dali'],
  # ['3d spheres floating in space forming a face, painting by Salvador Dali'],
  # ['3d spheres floating in space forming a smiling face, painting by Salvador Dali']
  # ['painting of a castle in the Gothic Revival architecture style, trending on Artstation'],
  # ['painting of a giant submarine surfacing the rough sea, by greg rutkowski and thomas kinkade, trending on Artstation']
  # ['painting of an important woman, by Leonardo da Vinci, trending on Artstation'],
  # ['painting of a business woman, by Leonardo da Vinci, trending on Artstation'],
  # ['painting of a group of business women, by Leonardo da Vinci, trending on Artstation'],
  # ['realistic oil on canvas of a female emperor on a throne, trending on Artstation'],
  # ['render of walle, featured on zbrushcentral'],
  # ['render of pixar walle, featured on zbrushcentral'],
  # ['render of pixar walle robot, featured on zbrushcentral'],
  # ['render of wall-e robot from the pixar movie, featured on zbrushcentral'],
  # ['render of pixar wall-e robot, featured on zbrushcentral'],
  # ['render of venus de milo statue with tentacle instead of arms, featured on zbrushcentral'],
  # ['render of venus de milo statue with octopus tentacles, featured on zbrushcentral'],
  # ['render of venus de milo statue with octopus tentacles instead of arms, featured on zbrushcentral'],
  # ['complete medieval style alphabet'],
  # ['complete medieval alphabet'],
  # ['medieval alphabet'],
  # ['complete alphabet'],
  # ['complete alphabet in illumination style'],
  # ['complete alphabet in mediavel illumination style']
  # ['painting of a medieval relic of a human skull with gold jewelry and large precious stones'],


  # ['painting of the relic skull of a saint covered in medieval jewelry']
  # ['mural of the skull relic of a saint covered in gold mesh set with gemstones and crowned with fractal gilded crown']
  
  
  # ['painting of a medieval skull with gold jewelry and large precious stones'],
  # ['painting of a relic of a human skull with medieval jewelry and large precious stones'],
  # ['painting of a medieval relic of a human skull with gold jewelry']

  # ["render asian zen modern design of a tropical villa garden and pool"]
  # ['picture of a luxury mahogany master bedroom ceiling beams']
  # ['picture of a German Rococo grandiose staircase in the center, the ceiling is covered in a fresco by a Venetian artist, large royal palace windows, black and white marble flooring']

  # ["Los Angeles skyline at sunset. Detailed ink wash."],
  # ["Donald Trump dressed as willy wonka in the style of Quentin Blake illustrations from Roald Dahl books"],

## option: archive old pictures

In [None]:
!dest=/content/drive/MyDrive/AI/stable_diffusion/$(date +%Y%m%d_%H%M%S) && mkdir "$dest" && mv /content/drive/MyDrive/AI/stable_diffusion/stable_diffusion_output/*.png "$dest"
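
The same archive step can also be written in Python, which makes it easy to compute the timestamp once and reuse it for both the directory name and the move. This is a sketch under assumptions: the function name `archive_pngs` is made up, and the demonstration runs against a temporary directory rather than Google Drive.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def archive_pngs(output_dir, archive_root):
    """Move every .png in output_dir into a new timestamped folder."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # computed once
    target = Path(archive_root) / stamp
    target.mkdir(parents=True, exist_ok=False)
    for png in sorted(Path(output_dir).glob("*.png")):
        shutil.move(str(png), str(target / png.name))
    return target

# Demonstration in a throwaway directory (stand-in for the Drive folders).
with tempfile.TemporaryDirectory() as root:
    out = Path(root) / "stable_diffusion_output"
    out.mkdir()
    (out / "a_grid.png").touch()
    (out / "notes.txt").touch()
    target = archive_pngs(out, root)
    moved = sorted(p.name for p in target.iterdir())
    left = sorted(p.name for p in out.iterdir())
    print(moved)
    print(left)
```

Only the `.png` files are moved; anything else in the output folder stays where it is.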

## run the actual generation

In [None]:
prompt_master_list = [
  # ["ornate gilded wood frame in the baroque style with gemstones in the frames, photo from Christies catalog"]
  # ['Neo-Baroque frame of a 19th-century painting'],
  # ['a painting by Hubert Robert in a gilded baroque frame on the wall of a museum'],
  # ['a painting by Edouard Manet in a gilded baroque frame on the wall of a museum'],
  # ['a painting by Thomas Cole in a gilded baroque frame on the wall of a museum'],
  # ['a painting by Van Gogh in a gilded baroque frame on the wall of a museum'],
  # ['Argent, an eagle displayed gules armed and wings charged with trefoils Or. Arms of Brandenburg.']
  # ['Gules illuminations, a Griffin with dragon wings tail and tongue rampant, codex page scan'],
  # ['blazon descrption Azure, a Bend Or'],
  #['very detailed painting of a complex and intricate medieval city inside of a giant tree, by Thomas Cole']
  #['retro travel poster showing the lushious green shire from the lord of the rings in the background and a hobbit house in the foreground, artdeco style poster']
 ['baroque square frame, gilded wood, ornate engravings, embossed gemstones, hd picture']
]

In [None]:
for current_prompt in prompt_master_list:
  grid = generate_image_grid(current_prompt, iteration_steps=50, num_cols=4, num_rows=5)
  display(grid)

# post run