# 03. Fast Image Generation

## 02. Latent Consistency Models (LCM)
#### Content

1. [Basic LCM Pipeline](#lcm)
2. [Key Findings](#keyfindings)

## Description + Links

Standard latent consistency distillation has to be repeated for every model: each fine-tuned checkpoint needs its own distilled version. LCM-LoRA avoids this by training only a small set of adapter weights, known as LoRA layers, instead of the full model. The resulting LoRA can then be applied to any fine-tuned version of the same base model without distilling it again.
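
To make the adapter idea concrete: a LoRA layer leaves a weight matrix `W` frozen and learns a low-rank update `B @ A`, so the effective weight is `W + scale * (B @ A)`. The following is a minimal sketch in plain Python, not the way `diffusers` or `peft` implement it. The helper names (`matmul`, `merge_lora`, `param_counts`), the toy matrices, and the layer size and rank (1280 and 64) are assumptions chosen for illustration and are not read from the SDXL or LCM-LoRA checkpoints.

```python
# Illustrative only: shapes, rank, and values are assumed, not taken from a real checkpoint.

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, scale=1.0):
    """Return W + scale * (B @ A), i.e. the frozen weight with the adapter folded in."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def param_counts(d_out, d_in, rank):
    """Trainable parameters of a full d_out x d_in matrix vs. a rank-r adapter."""
    return d_out * d_in, rank * (d_out + d_in)


# A frozen 2x3 weight with a rank-1 adapter (B is 2x1, A is 1x3).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0, 1.0]]
merged = merge_lora(W, B, A)

# Hypothetical layer size and rank, to show the difference in scale.
full, adapter = param_counts(d_out=1280, d_in=1280, rank=64)
print(merged)
print(full, adapter)
```

Because the update is stored separately from `W`, the same pair `A`, `B` can in principle be added to a different fine-tune of the same base model, which is what makes the distilled adapter reusable.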

---
**Documentation**

https://huggingface.co/docs/diffusers/api/pipelines/latent_consistency_models

https://huggingface.co/blog/lcm_lora

**Paper**

[Luo, S., et al. (2023): Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference](https://arxiv.org/abs/2310.04378)

## Setup

In [None]:
%env HF_HOME=/cluster/user/ehoemmen/.cache
%env HF_DATASETS_CACHE=/cluster/user/ehoemmen/.cache
%env TRANSFORMERS_CACHE=/cluster/user/ehoemmen/.cache

In [None]:
%pip install -U invisible_watermark transformers accelerate safetensors peft diffusers

In [None]:
import torch
from diffusers import DiffusionPipeline, LCMScheduler
from PIL import Image


def image_grid(imgs, rows, cols):
    """Paste rows*cols equally sized PIL images into a single grid image."""
    assert len(imgs) == rows * cols

    w, h = imgs[0].size
    grid = Image.new("RGB", size=(cols * w, rows * h))

    # Fill the grid row by row: column index is i % cols, row index is i // cols
    for i, img in enumerate(imgs):
        grid.paste(img, box=(i % cols * w, i // cols * h))
    return grid
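
The `box` argument in `image_grid` places image `i` in row-major order. As a small sketch of that index arithmetic on its own, the hypothetical helper `grid_positions` below (not part of the notebook) lists the top-left paste offsets for eight tiles in a 2x4 layout, assuming 1024x1024 tiles:

```python
def grid_positions(n, cols, w, h):
    """Top-left (x, y) offset of each of n tiles in a row-major grid with `cols` columns."""
    return [(i % cols * w, i // cols * h) for i in range(n)]


# Assumed tile size of 1024x1024 pixels; eight tiles in four columns gives two rows.
positions = grid_positions(8, cols=4, w=1024, h=1024)
for index, offset in enumerate(positions):
    print(index, offset)
```

The fifth tile (index 4) wraps to the start of the second row, which is why the step-count grid later in this notebook reads left to right, top to bottom.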

In [None]:
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
lcm_lora_id = "latent-consistency/lcm-lora-sdxl"

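# Note: without torch_dtype=torch.float16 the fp16 weight files are loaded as float32
pipe = DiffusionPipeline.from_pretrained(
    model_id,
    variant="fp16",
    cache_dir="/cluster/user/ehoemmen/.cache",
)

# Load the LCM-LoRA adapter and switch to the matching LCM scheduler
pipe.load_lora_weights(lcm_lora_id)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
# Sequential CPU offload saves VRAM at the cost of speed.
# Alternative without offloading: pipe.to(device="cuda", dtype=torch.float16)
pipe.enable_sequential_cpu_offload()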

<a id="lcm"></a>

## 1. Basic LCM Pipeline

In [None]:
prompt = "close-up photography of old man standing in the rain at night, in a street lit by lamps, leica 35mm summilux"
# guidance_scale=1 disables classifier-free guidance in the pipeline
image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    guidance_scale=1,
).images[0]

image

#### Step Count Comparison Grid (1 to 8 inference steps)

In [None]:
images = []
for steps in range(1, 9):
    # Re-seed on every run so that only the step count differs between images
    generator = torch.Generator().manual_seed(1337)
    image = pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=1,
        generator=generator,
    ).images[0]
    images.append(image)

grid = image_grid(images, rows=2, cols=4)

grid
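
Since the point of LCM is to cut the number of inference steps, it can be useful to record wall-clock time next to the visual comparison. The sketch below is one way to do that. `time_per_step_count` is a hypothetical helper, and `fake_generate` is a stand-in that only sleeps. In the notebook it would be replaced by a function that calls `pipe(prompt=prompt, num_inference_steps=steps, guidance_scale=1)`.

```python
import time


def time_per_step_count(generate, step_counts):
    """Call generate(steps) once per step count and return {steps: elapsed seconds}."""
    timings = {}
    for steps in step_counts:
        start = time.perf_counter()
        generate(steps)
        timings[steps] = time.perf_counter() - start
    return timings


def fake_generate(steps):
    # Stand-in for the real pipeline call; sleeps briefly in proportion to the step count.
    time.sleep(0.001 * steps)


timings = time_per_step_count(fake_generate, range(1, 9))
for steps, seconds in timings.items():
    print(f"{steps} steps: {seconds:.4f} s")
```

With the real pipeline, the first call usually includes one-off warm-up cost, so a single measurement per step count is only a rough indication.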

In [None]:
prompt = "cereal package, bees on the package, honey flavoured, food mockup"
image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    guidance_scale=1,
).images[0]

image

In [None]:
prompt = "Croissant on a wooden table, awesome food photography, studio lighting, warm colors and sunlight in background"
image = pipe(
    prompt=prompt,
    num_inference_steps=4,
    guidance_scale=1,
).images[0]

image

<a id="keyfindings"></a>

## Key Findings

Overall image quality is very good, in some cases after a single inference step. Once the prompt asked for packaging or made other more specific requirements, the results became imprecise and the quality deteriorated (e.g., the cereal package was difficult).

Good results are nevertheless achieved, especially with photorealistic images.