<img src='https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/banner_ikomia.png?raw=true'>




# Create your image with Kandinsky 2.2


**Kandinsky 2.2** is a text-conditional diffusion model based on unCLIP and latent diffusion. This model series, developed by a team from Russia, has evolved through several iterations, each bringing new features and improvements in image synthesis from text descriptions.


![illustration kandinsky](https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/kandinskyv22/%20blue%20eyes.png)

## Setup

Please use a GPU for this tutorial.

In the menu, select "Runtime", then "Change runtime type", and choose GPU under "Hardware accelerator".

Check your GPU with the following command:

In [None]:
!nvidia-smi
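
The same check can be done from Python, which is handy if you want the notebook to warn you before loading the model. This is a minimal sketch: the helper name `list_gpus` is hypothetical, and it assumes that the presence of the `nvidia-smi` executable is a good enough signal that an NVIDIA GPU is available.

```python
import shutil
import subprocess


def list_gpus():
    """Return the GPU names reported by nvidia-smi, or an empty list."""
    # No executable on the PATH: assume no NVIDIA driver is installed.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        # The tool exists but failed (for example, no device attached).
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_gpus()
print(gpus if gpus else "No NVIDIA GPU detected; switch the runtime to GPU.")
```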

Install the Ikomia Python API with pip:


In [None]:
!pip install ikomia

---

**-Google Colab ONLY- Restart runtime**

Click the "RESTART RUNTIME" button at the end of the previous cell's output.

---

## Run Kandinsky 2.2 text2img

In [1]:
from ikomia.dataprocess.workflow import Workflow


# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_kandinsky_2", auto_connect=False)

# Edit parameters
algo.set_parameters({
    'model_name': 'kandinsky-community/kandinsky-2-2-decoder',
    'prompt': 'A Woman Jedi fighter performs a beautiful move with one lightsabre, full body, dark galaxy background, look at camera, Ancient Chinese style, cinematic, 4K.',
    'negative_prompt': 'low quality, bad quality',
    'prior_num_inference_steps': '25',
    'prior_guidance_scale': '4.0',
    'num_inference_steps': '100',
    'guidance_scale': '1.0',
    'seed': '-1',
    'width': '1280',
    'height': '768',
    })


# Generate your image
wf.run()

In [None]:
from ikomia.utils.displayIO import display

from PIL import ImageShow
ImageShow.register(ImageShow.IPythonViewer(), 0)

# Display the image
display(algo.get_output(0).get_image())

### List of parameters

- **model_name** (str) - default 'kandinsky-community/kandinsky-2-2-decoder': Name of the latent diffusion model.
- **prompt** (str) - default 'portrait of a young women, blue eyes, cinematic': Text prompt to guide the image generation.
- **negative_prompt** (str, *optional*) - default 'low quality, bad quality': Text describing what the generated image should not contain. Ignored when not using guidance (i.e., ignored if `guidance_scale` is less than `1`).
- **prior_num_inference_steps** (int) - default '25': Number of denoising steps of the prior model (CLIP).
- **prior_guidance_scale** (float) - default '4.0': Guidance scale of the prior model. A higher value pushes the model to generate images closely linked to the text prompt, usually at the expense of image quality (minimum: 1; maximum: 20).
- **num_inference_steps** (int) - default '100': Number of denoising steps of the decoder model. More steps usually lead to a higher-quality image at the expense of slower inference.
- **guidance_scale** (float) - default '1.0': Guidance scale of the decoder model. A higher value pushes the model to generate images closely linked to the text prompt, usually at the expense of image quality (minimum: 1; maximum: 20).
- **height** (int) - default '768': The height in pixels of the generated image.
- **width** (int) - default '768': The width in pixels of the generated image.
- **seed** (int) - default '-1': Seed value. '-1' generates a random number between 0 and 191965535.
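
The workflow cell above passes every parameter as a string, so it can be convenient to work with typed values and convert them in one place. The sketch below is illustrative only: `build_parameters`, `resolve_seed` and `MAX_SEED` are hypothetical names, the range check simply mirrors the minimum and maximum quoted in the list, and it assumes that passing an explicit seed makes a run repeatable.

```python
import random

MAX_SEED = 191965535  # upper bound quoted in the parameter list


def resolve_seed(seed):
    """Mirror the documented rule: -1 means 'draw a random seed'."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    return seed


def build_parameters(
    prompt,
    negative_prompt="low quality, bad quality",
    prior_num_inference_steps=25,
    prior_guidance_scale=4.0,
    num_inference_steps=100,
    guidance_scale=1.0,
    width=768,
    height=768,
    seed=-1,
):
    """Validate typed values and return the string dictionary for set_parameters."""
    scales = {
        "prior_guidance_scale": prior_guidance_scale,
        "guidance_scale": guidance_scale,
    }
    for name, value in scales.items():
        if not 1 <= value <= 20:
            raise ValueError(f"{name} must be between 1 and 20, got {value}")

    typed = {
        "model_name": "kandinsky-community/kandinsky-2-2-decoder",
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "prior_num_inference_steps": prior_num_inference_steps,
        "prior_guidance_scale": prior_guidance_scale,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "width": width,
        "height": height,
        # Resolve the seed here so the chosen value can be logged and reused.
        "seed": resolve_seed(seed),
    }
    return {key: str(value) for key, value in typed.items()}


params = build_parameters("portrait of a young women, blue eyes, cinematic")
print("Seed used:", params["seed"])
```

The result can then be handed to the task with `algo.set_parameters(params)`; keeping the printed seed lets you request the same value again later.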


*Note: the "prior model" interprets and encodes the input text to capture the desired image content, while the "decoder model" translates this encoded information into the actual image, generating it from the text description.*
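
The two stages described in the note can be sketched with the Hugging Face `diffusers` library, which hosts the decoder checkpoint named above. This is a sketch under assumptions, not necessarily how the `infer_kandinsky_2` algorithm is implemented: the prior checkpoint name `kandinsky-community/kandinsky-2-2-prior`, the `float16` precision, the CUDA device and the helper name `generate_image` are choices made for illustration.

```python
def generate_image(
    prompt,
    negative_prompt="low quality, bad quality",
    prior_num_inference_steps=25,
    prior_guidance_scale=4.0,
    num_inference_steps=100,
    guidance_scale=1.0,
    width=768,
    height=768,
):
    """Hypothetical helper: run the prior, then the decoder; return a PIL image."""
    # Imported lazily so the function can be defined without the heavy packages.
    import torch
    from diffusers import KandinskyV22Pipeline, KandinskyV22PriorPipeline

    # Stage 1 (prior): encode the prompts into image embeddings.
    prior = KandinskyV22PriorPipeline.from_pretrained(
        "kandinsky-community/kandinsky-2-2-prior",  # assumed prior checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")
    image_embeds, negative_image_embeds = prior(
        prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=prior_num_inference_steps,
        guidance_scale=prior_guidance_scale,
    ).to_tuple()

    # Stage 2 (decoder): denoise latents conditioned on those embeddings.
    decoder = KandinskyV22Pipeline.from_pretrained(
        "kandinsky-community/kandinsky-2-2-decoder",
        torch_dtype=torch.float16,
    ).to("cuda")
    result = decoder(
        image_embeds=image_embeds,
        negative_image_embeds=negative_image_embeds,
        height=height,
        width=width,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

Calling `generate_image("portrait of a young women, blue eyes, cinematic")` requires a GPU and downloads both checkpoints on first use.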