# Latent consistency model using Optimum-Intel OpenVINO on AI PC
This notebook shows how to run a Latent Consistency Model (LCM) on an AI PC. It sets up both the standard Hugging Face Diffusers pipeline and the Optimum Intel pipeline, which is optimized for Intel hardware including the CPU and integrated GPU (iGPU). Running inference on both the CPU and the iGPU makes it easy to compare performance and the time required to generate an image for a given prompt.

![](https://github.com/openvinotoolkit/openvino_notebooks/assets/10940214/1858dae4-72fd-401e-b055-66d503d82446)

Optimum Intel is an interface from Hugging Face between the Diffusers and Transformers libraries and the tools Intel provides to accelerate pipelines on Intel hardware. It also supports quantization of models hosted on Hugging Face.
In this notebook, OpenVINO and NNCF serve as the Optimum Intel backend for inference acceleration and quantization.

For more details, please refer to the Optimum Intel repository on GitHub:
https://github.com/huggingface/optimum-intel

<img src="lcm.png"/>

LCMs are the next generation of generative models after Latent Diffusion Models (LDMs). They were proposed to overcome the slow iterative sampling process of LDMs, enabling fast inference in very few steps (2 to 4) on any pre-trained LDM (e.g. Stable Diffusion). To read more about LCMs, please refer to https://latent-consistency-models.github.io/

#### Table of contents:
- [Prerequisites](#Prerequisites)
- [Full precision model on the CPU](#Full-precision-model-on-the-CPU)
- [Full precision model on the CPU with OpenVINO acceleration](#Full-precision-model-on-the-CPU-OV)
- [Running AI inference on the GPU with OpenVINO acceleration](#Full-precision-model-on-the-GPU-OV)


### Prerequisites
[back to top ⬆️](#Table-of-contents:)

Install the required packages.

In [None]:
%pip install -q "optimum-intel[openvino,diffusers]@git+https://github.com/huggingface/optimum-intel.git" "ipywidgets" "transformers>=4.33.0" --extra-index-url https://download.pytorch.org/whl/cpu

In [None]:
import warnings
warnings.filterwarnings('ignore')

### Showing available devices
[back to top ⬆️](#Table-of-contents:)

The `available_devices` property lists the devices available on your system. Passing "FULL_DEVICE_NAME" to `ie.get_property()` returns the full name of a device. Check which ID is assigned to each GPU: if you have both an integrated GPU (iGPU) and a discrete GPU (dGPU), the iGPU appears as `device_name="GPU.0"` and the dGPU as `device_name="GPU.1"`. If you have only an iGPU or only a dGPU, it is assigned to `"GPU"`.

Note: For more details about using a GPU with OpenVINO, visit this [link](https://docs.openvino.ai/nightly/openvino_docs_install_guides_configurations_for_intel_gpu.html). If you run into issues on Ubuntu 20.04 or Windows 11, read this [blog](https://blog.openvino.ai/blog-posts/install-gpu-drivers-windows-ubuntu).

In [None]:
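import openvino as ov

# Core is the entry point to the OpenVINO runtime
ie = ov.Core()
devices = ie.available_devices

# Print each device ID together with its full hardware name
for device in devices:
    device_name = ie.get_property(device, "FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")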
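
The device naming described above can be turned into a small selection helper. This is a minimal sketch, not part of OpenVINO or Optimum Intel: `pick_device` is a hypothetical name, and it assumes the list it receives has the same form as `ie.available_devices` (for example `["CPU", "GPU.0", "GPU.1"]`).

```python
def pick_device(available, preferred="GPU"):
    """Return the first device ID that starts with `preferred`, else "CPU".

    `available` is assumed to look like the list returned by
    `ie.available_devices`; this helper is illustrative only.
    """
    for name in available:
        if name.startswith(preferred):
            return name
    return "CPU"


# Example inputs mirroring the cases described in the text
print(pick_device(["CPU", "GPU.0", "GPU.1"]))  # iGPU and dGPU both present
print(pick_device(["CPU", "GPU"]))             # a single GPU
print(pick_device(["CPU"]))                    # no GPU, fall back to CPU
```

In the notebook itself, the call would be `pick_device(ie.available_devices)`, and the returned string could be passed to `ov_pipeline.to(...)` in the GPU section.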

### Using full precision model on CPU with `LatentConsistencyModelPipeline`
[back to top ⬆️](#Table-of-contents:)


In [None]:
from diffusers import LatentConsistencyModelPipeline
import gc

pipeline = LatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
pipeline.save_pretrained("./lcm_cpu")

prompt = "green wood dragon in the sky 8k"

image = pipeline(
    prompt=prompt, num_inference_steps=4, guidance_scale=8.0
).images[0]
image.save("image_cpu.png")
image

In [None]:
del pipeline
gc.collect()

### Using full precision model on CPU with `OVLatentConsistencyModelPipeline`
[back to top ⬆️](#Table-of-contents:)


In [None]:
from optimum.intel import OVLatentConsistencyModelPipeline

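# Export the model to OpenVINO IR; postpone compilation until the input shapes are fixed
ov_pipeline = OVLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True, compile=False)
# Switch to static shapes: one 512x512 image per prompt
ov_pipeline.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)
# Save the converted model so the export does not have to be repeated
ov_pipeline.save_pretrained("./openvino_ir")
# Compile the model for the currently selected device
ov_pipeline.compile()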
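
Because the converted model is saved to `./openvino_ir`, a later session can presumably load it from disk instead of exporting it again. The following is a hedged sketch: `load_saved_pipeline` is a hypothetical helper, and it assumes the directory written by `save_pretrained` above is still present. It uses only calls that already appear in this notebook (`from_pretrained`, `to`, `compile`).

```python
def load_saved_pipeline(model_dir="./openvino_ir", device="CPU"):
    """Load a previously exported OpenVINO pipeline and compile it for `device`.

    Hypothetical helper for illustration; `model_dir` is assumed to be the
    directory produced by `ov_pipeline.save_pretrained(...)`.
    """
    # Imported lazily so that defining the helper does not require the package
    from optimum.intel import OVLatentConsistencyModelPipeline

    # No `export=True` here: the directory already contains the OpenVINO IR
    pipe = OVLatentConsistencyModelPipeline.from_pretrained(model_dir, compile=False)
    pipe.to(device)
    pipe.compile()
    return pipe
```

A call such as `ov_pipeline = load_saved_pipeline("./openvino_ir", "CPU")` would then stand in for the export cell. The saved model keeps the static 512x512 shapes set by `reshape`.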

In [None]:
prompt = "green wood dragon in the sky 8k"
image_ov_cpu = ov_pipeline(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images[0]
image_ov_cpu.save("image_opt_cpu.png")
image_ov_cpu

### Running inference on iGPU with `OVLatentConsistencyModelPipeline`
[back to top ⬆️](#Table-of-contents:)

The model in this notebook uses FP32 precision. On the GPU, however, accelerated inference with XMX is supported for the FP16 data type, and FP32 may produce a high memory footprint and latency. The default precision for the GPU in OpenVINO is therefore FP16. The OpenVINO GPU plugin converts FP32 to FP16 on the fly, so there is no need to do it manually.
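
To make the memory argument concrete, the sketch below compares the storage size and rounding error of FP32 and FP16 values using only the Python standard library. It does not touch OpenVINO; the parameter count `n_weights` is a made-up number chosen for illustration, not the size of the model used here.

```python
import struct


def roundtrip(value, fmt):
    """Store `value` in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


n_weights = 1_000_000  # hypothetical parameter count, for illustration only

fp32_bytes = n_weights * struct.calcsize("<f")  # "<f" is IEEE 754 single precision
fp16_bytes = n_weights * struct.calcsize("<e")  # "<e" is IEEE 754 half precision

print(f"FP32: {fp32_bytes / 1e6:.1f} MB, FP16: {fp16_bytes / 1e6:.1f} MB")
print("0.1 stored as FP32:", roundtrip(0.1, "<f"))
print("0.1 stored as FP16:", roundtrip(0.1, "<e"))
```

FP16 needs half the bytes per value at the cost of a larger rounding error, which is the trade-off behind the GPU default described above.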

In [None]:
ov_pipeline.to("GPU")
ov_pipeline.compile()

In [None]:
image_ov_gpu = ov_pipeline(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images[0]
image_ov_gpu.save("image_opt_gpu.png")
image_ov_gpu
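
To compare the time each pipeline needs to generate an image, a small timing helper can wrap any of the calls above. This is a minimal sketch: `time_call` is a hypothetical helper built on `time.perf_counter`, and the workload below is a stand-in so the snippet runs on its own.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times.

    Returns (result_of_last_call, best_elapsed_seconds). Taking the best of
    several runs reduces the influence of one-time warm-up costs.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


# Stand-in workload; replace with a pipeline call in the notebook
result, seconds = time_call(sum, range(100_000))
print(f"best of 3 runs: {seconds:.4f} s")
```

In this notebook the helper could be used as `output, seconds = time_call(ov_pipeline, prompt=prompt, num_inference_steps=4, guidance_scale=8.0)`, once after compiling for the CPU and once after compiling for the GPU.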

In [None]:
del ov_pipeline
gc.collect()