# Latent Consistency Model using torch.compile with OpenVINO backend
This notebook shows how to run a Latent Consistency Model (LCM). It sets up a standard Hugging Face diffusers pipeline and an Optimum Intel pipeline optimized for Intel hardware, including CPU and GPU. Running inference on both CPU and GPU makes it easy to compare performance and the time required to generate an image for a given prompt. The notebook can also be used on other Intel hardware with minimal or no modifications.

Optimum Intel is an interface from Hugging Face between the diffusers and transformers libraries and the tools Intel provides to accelerate pipelines on Intel hardware. It also supports quantization of models hosted on Hugging Face.
In this notebook, OpenVINO is used as the AI-inference acceleration backend for Optimum Intel.

For more details, please refer to the Optimum Intel repository:
https://github.com/huggingface/optimum-intel

<img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/105707993/a668529a-e1bd-46c6-9be4-1e6ca705c939"/>


LCMs are the next generation of generative models after Latent Diffusion Models (LDMs). They were proposed to overcome the slow iterative sampling process of LDMs, enabling fast inference in very few steps (2 to 4) on any pre-trained LDM (e.g. Stable Diffusion). To read more about LCMs, please refer to https://latent-consistency-models.github.io/

#### Table of contents:
- [Prerequisites](#Prerequisites)
- [Full precision model on the CPU](#Using-full-precision-model-in-CPU-with-LatentConsistencyModelPipeline)


### Installation Instructions

This is a self-contained example that relies solely on its own code.



### Prerequisites
[back to top ⬆️](#Table-of-contents:)

Install the required packages.

In [1]:
import sys
!{sys.executable} -m pip install -q "openvino>=2023.3.0"
!{sys.executable} -m pip install -q "accelerate" "diffusers" "ipywidgets" "torch>=2.1.1" "transformers>=4.33.0" --extra-index-url https://download.pytorch.org/whl/cpu
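Once the packages are installed, it can help to confirm that the minimum versions requested above are actually present in the active environment. The following is a minimal sketch that uses only the standard library; the helper names (`release_tuple`, `meets_minimum`, `check_environment`) are hypothetical and not part of any of the installed packages, and the comparison deliberately looks only at the leading numeric part of each version string.

```python
import re
from importlib import metadata

# Minimum versions taken from the pip commands above.
MINIMUM_VERSIONS = {
    "openvino": "2023.3.0",
    "torch": "2.1.1",
    "transformers": "4.33.0",
}


def release_tuple(version):
    """Return the leading numeric release segment, e.g. '2.1.1+cpu' -> (2, 1, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings by their numeric release segments only."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '2.1' and '2.1.0' compare equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUM_VERSIONS):
    """Map each package name to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_environment()` in a cell returns a dictionary such as `{"openvino": True, ...}`; a `False` or `None` entry suggests rerunning the installation cell.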

In [2]:
import os
import time

import torch
from diffusers import StableDiffusionPipeline

# Importing this module registers the "openvino" backend for torch.compile.
import openvino.frontend.pytorch.torchdynamo.backend

In [3]:
import warnings

warnings.filterwarnings("ignore")

### Using full precision model in CPU with `LatentConsistencyModelPipeline`
[back to top ⬆️](#Table-of-contents:)

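The standard pipeline for the Latent Consistency Model (LCM) from the Diffusers library is documented at https://huggingface.co/docs/diffusers/en/api/pipelines/latent_consistency_models. The cells below load a `StableDiffusionPipeline` in full precision (`torch.float32`) and compile its UNet and VAE decoder with `torch.compile` using the `openvino` backend.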


In [None]:
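# Fixed seed so that the generated image is reproducible.
generator = torch.Generator("cpu").manual_seed(1024)

model_id = "runwayml/stable-diffusion-v1-5"
# Load the pipeline in full precision (FP32).
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float32)

# Compile the UNet with the OpenVINO backend, targeting the first GPU device.
pipe.unet = torch.compile(pipe.unet, backend="openvino", options={"device": "GPU.0"})
# Compile the VAE decoder as well; no device is given, so the backend's default is used.
pipe.vae.decode = torch.compile(pipe.vae.decode, backend="openvino")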
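The compile step hard-codes `"GPU.0"` as the target device, which fails on machines without an Intel GPU. A minimal sketch of a fallback is shown here, assuming that `openvino.Core().available_devices` reports the usable devices. The helper names `available_openvino_devices` and `pick_device` are hypothetical, and matching on the device family (the part before the dot) is a simplifying assumption rather than an OpenVINO rule.

```python
def available_openvino_devices():
    """Return the device names OpenVINO reports, e.g. ['CPU', 'GPU']."""
    import openvino as ov  # imported lazily so pick_device works without OpenVINO

    return list(ov.Core().available_devices)


def pick_device(available, preferred="GPU.0", fallback="CPU"):
    """Return `preferred` if its device family is available, else `fallback`.

    OpenVINO may list a single GPU as 'GPU' and several as 'GPU.0', 'GPU.1',
    so the comparison is made on the family name before the dot (an assumption
    of this sketch; it does not check that the exact index exists).
    """
    family = preferred.split(".")[0]
    families = {name.split(".")[0] for name in available}
    return preferred if family in families else fallback


# Possible use in the compile step (commented out: needs OpenVINO and the loaded pipeline):
# device = pick_device(available_openvino_devices())
# pipe.unet = torch.compile(pipe.unet, backend="openvino", options={"device": device})
```

Keeping the selection logic in a pure function makes it easy to check without any hardware attached.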
    

In [None]:
prompt = "A cute squirrel in the forest, portrait, 8k"

# The first call also triggers compilation, so this timing includes compile time.
start_time = time.time()
image = pipe(prompt, num_inference_steps=50, generator=generator).images[0]
end_time = time.time()

print("Time taken: ", end_time - start_time)

image
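Because `torch.compile` compiles lazily, a single timed call mixes compilation and inference time. For a fairer comparison between devices, one option is to separate warm-up calls from measured calls. The sketch below uses only the standard library; `benchmark` and `summarize` are hypothetical helper names, and the default of one warm-up call and three measured runs is an arbitrary assumption.

```python
import statistics
import time


def benchmark(fn, warmup=1, runs=3):
    """Time `fn()`; return (warm-up durations, measured durations) in seconds."""

    def timed_call():
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

    # Warm-up calls absorb one-time costs such as compilation.
    warmup_times = [timed_call() for _ in range(warmup)]
    run_times = [timed_call() for _ in range(runs)]
    return warmup_times, run_times


def summarize(run_times):
    """Return the mean and median of the measured durations."""
    return {"mean": statistics.mean(run_times), "median": statistics.median(run_times)}


# Possible use with the pipeline from this notebook (commented out: needs the loaded model):
# warmup_times, run_times = benchmark(
#     lambda: pipe(prompt, num_inference_steps=50, generator=generator)
# )
# print(summarize(run_times))
```

Comparing the warm-up durations with the measured ones gives a rough idea of how much of the first call was spent on compilation.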