# Use CLIP-guided VQGAN to generate images

This notebook enables you to generate images from text.

Using AI as a design tool will become increasingly relevant. An easy way to use AI as a tool is through so-called text-to-image models, which generate images from a text prompt that you write.

In this notebook, you are going to use a combination of two models to generate images from a text prompt.
One model is called CLIP, the other VQGAN. VQGAN generates images, while CLIP measures how well an image matches a text. In our case, CLIP's task is to understand the text prompt we give it. CLIP then tells VQGAN how closely each intermediate image matches the prompt and guides it step by step toward a good result.
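
The interplay of the two models can be illustrated with a deliberately tiny sketch. It does not use the real models: `toy_generator` stands in for VQGAN, a fixed list of numbers stands in for CLIP's encoding of the prompt, and all names are hypothetical. It only shows the general idea of repeatedly nudging the generator's input so that the result scores higher against the text.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def toy_generator(latent):
    """Stand-in for VQGAN (assumption): maps a latent vector to an 'image' vector."""
    return [math.tanh(v) for v in latent]


def guided_generation(text_embedding, steps=300, lr=0.1, eps=1e-4, seed=0):
    """Adjust the latent step by step so the 'image' matches the text better."""
    rng = random.Random(seed)
    latent = [rng.uniform(-1.0, 1.0) for _ in text_embedding]  # start from noise
    history = []
    for _ in range(steps):
        base = cosine_similarity(toy_generator(latent), text_embedding)
        history.append(base)
        gradient = []
        for i in range(len(latent)):
            shifted = list(latent)
            shifted[i] += eps
            score = cosine_similarity(toy_generator(shifted), text_embedding)
            gradient.append((score - base) / eps)  # finite-difference estimate
        # Move the latent in the direction that raises the score
        latent = [v + lr * g for v, g in zip(latent, gradient)]
    history.append(cosine_similarity(toy_generator(latent), text_embedding))
    return latent, history


# Hypothetical stand-in for CLIP's encoding of a text prompt
text_embedding = [0.8, -0.3, 0.5, 0.1]
final_latent, history = guided_generation(text_embedding)
print(f"score at start: {history[0]:.3f}, score at end: {history[-1]:.3f}")
```

The real pipeline works on images and uses the gradients provided by the neural networks instead of finite differences, but the loop has the same shape: generate, score against the text, adjust, repeat.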

Do not worry, using the AI models is far easier than you probably think. Continue with the next cell to get the process started. To run it, click the play button in the upper-left corner.

## 1. Set requirements

First, you need to set up the environment and install some libraries.  
Click the play button to execute the next cell.

Be aware that the setup can take some time.

In [1]:
# ------------------
# IMPORTS
# ------------------
import os
from IPython.display import display, clear_output
import ipywidgets as widgets
import io_widget

# -----------
# SETUP WIDGET
# -----------
Setup = io_widget.Setup()
display(Setup)

# -----------
# VARIABLES
# -----------
HiddenOutput = widgets.Output()

# ------------------
# CREATE DIRECTORY
# ------------------
with HiddenOutput:
    # Create the target directory (and its parent) only if it does not exist yet
    os.makedirs('/home/jovyan/utilities/VQGAN-CLIP', exist_ok=True)

# ------------------
# CLONE REPOSITORY
# ------------------
with HiddenOutput:
    !git clone 'https://github.com/Francesco-Sch/VQGAN-CLIP' /home/jovyan/utilities/VQGAN-CLIP
    %cd /home/jovyan/utilities/VQGAN-CLIP
    !git clone 'https://github.com/openai/CLIP'
    !git clone 'https://github.com/CompVis/taming-transformers'

# ------------------
# INSTALL PACKAGES
# ------------------
with HiddenOutput:
    !pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    !pip install ftfy regex tqdm omegaconf pytorch-lightning IPython kornia imageio imageio-ffmpeg einops torch_optimizer

# ------------------
# INSTALL PRE-TRAINED MODELS
# ------------------
with HiddenOutput:
    !mkdir checkpoints

    !curl -L -o checkpoints/vqgan_imagenet_f16_16384.yaml -C - 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fconfigs%2Fmodel.yaml&dl=1' #ImageNet 16384
    !curl -L -o checkpoints/vqgan_imagenet_f16_16384.ckpt -C - 'https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fckpts%2Flast.ckpt&dl=1' #ImageNet 16384

# Re-render widget after all processes are done
Setup.SetupProcessing = False
clear_output()
display(Setup)

Setup(SetupProcessing=False)
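
Because the setup output is hidden, a failed download is easy to miss. The following is a minimal sketch of a check you could run afterwards. The file names are taken from the `curl` commands in the setup cell; the helper name `missing_checkpoint_files` is hypothetical, and the demonstration uses a temporary directory instead of the real `checkpoints` folder.

```python
import os
import tempfile

# File names as used by the curl commands in the setup cell
REQUIRED_CHECKPOINT_FILES = (
    "vqgan_imagenet_f16_16384.yaml",
    "vqgan_imagenet_f16_16384.ckpt",
)


def missing_checkpoint_files(checkpoint_dir):
    """Return the required files that are absent or empty in checkpoint_dir."""
    missing = []
    for name in REQUIRED_CHECKPOINT_FILES:
        path = os.path.join(checkpoint_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


# Demonstration in a throwaway directory that contains only the config file
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "vqgan_imagenet_f16_16384.yaml"), "w") as handle:
        handle.write("model: {}\n")
    demo_missing = missing_checkpoint_files(demo_dir)

print(demo_missing)
```

In the notebook itself you would pass the path of the `checkpoints` folder inside the cloned repository.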

## 2. Enter text prompt

In the next section, you can define a text prompt. The AI models will then try to generate an image that matches the text prompt as closely as possible. Try to write creative and very precise text prompts to achieve interesting results. For example:

```
A raccoon wearing formal clothes, wearing a top hat and holding a cane. The raccoon is holding a garbage bag. Oil painting in the style of Rembrandt.
```


```
A cyberpunk moth in front of a city skyline with the moon rising above.
```
&nbsp;  

Apart from the text prompt, you can also set attributes or adjectives to influence the creation process. For example, if you want a more surrealistic look, you can define `surreal` as an attribute, and the AI models will then try to generate a more surrealistic image. Some interesting attributes to play with are:
- surreal
- weird
- abstract
- high-resolution
- detailed
- naturalistic
&nbsp; 

In addition to defining attributes, you can also set a weight for each attribute. These weights are set in percent. For example, you could give the `surreal` attribute a weight of 50%. This attribute then has only a 50% impact on the generation process.

Setting attributes and weights is not required, but they are a handy tool for gaining more influence over the AI's creation process.
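
To make this concrete, here is a minimal sketch of how a prompt and its weighted attributes can be combined into one string of the form `prompt | attribute:weight`, which is the shape the generation cell later in this notebook builds. The helper name `build_prompt` is hypothetical, the exact spacing may differ from what the notebook produces, and the weight value `50` is an assumption about how the widget stores a 50% weight.

```python
def build_prompt(prompt, attributes):
    """Join a prompt and its weighted attributes with ' | ' separators."""
    parts = [prompt.strip()]
    for entry in attributes:
        # Each entry is a dict with an 'attribute' name and a 'weight' value
        parts.append(f"{entry['attribute']}:{entry['weight']}")
    return " | ".join(parts)


example = build_prompt(
    "A cyberpunk moth in front of a city skyline",
    [
        {"attribute": "surreal", "weight": 50},
        {"attribute": "detailed", "weight": 100},
    ],
)
print(example)
```

With an empty attribute list, the function simply returns the prompt on its own.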

By experimenting with different prompts, attributes, and weights, you can generate a wide variety of results.  
Execute the next cell to set a text prompt and add some attributes.



In [2]:
TextToImagePrompt = io_widget.TextToImagePrompt()
TextToImagePrompt

TextToImagePrompt()

## 3. Optionally set initial image

CAUTION: This is optional and not required to get the model running. If you do not want to use an initial image, just leave it empty.

```
The initial image widget is still under development and not available yet.
```

## 4. Set generation options

With the following widget, you can set some general options to get the creation process started.

First, give the output folder a name. Your generated results will be saved in this folder.  
The folder you name is created inside the `output` folder, which you can find in the file explorer. To open the file explorer, click on the file icon on the left side.
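
As a rough illustration of what happens with the folder name, the sketch below creates a named subfolder inside an output root and is safe to run more than once. The helper name `prepare_output_folder` and the folder name `my-first-run` are hypothetical, and a temporary directory stands in for the notebook's real output root (`/home/jovyan/output/`).

```python
import tempfile
from pathlib import Path


def prepare_output_folder(output_root, folder_name):
    """Create output_root/folder_name if needed and return its path."""
    target = Path(output_root) / folder_name
    # parents=True also creates the output root; exist_ok=True allows re-runs
    target.mkdir(parents=True, exist_ok=True)
    return target


# Demonstration with a temporary directory as the output root
with tempfile.TemporaryDirectory() as demo_root:
    created = prepare_output_folder(demo_root, "my-first-run")
    created_again = prepare_output_folder(demo_root, "my-first-run")
    folder_exists = created.is_dir()

print(created.name, folder_exists)
```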

Second, set the width and height of the generated image. VQGAN and CLIP work well with square images, so I recommend keeping a square aspect ratio.  
Keep in mind that a very high resolution makes the creation process slow. A maximum resolution of 1024 x 1024 is normally sufficient. If the creation process takes too long, try lowering the resolution.
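
A quick calculation shows why resolution matters so much. The sketch below compares pixel counts against a 512 x 512 reference. Treating the pixel count as a proxy for the work per iteration is an assumption for illustration, not a measurement of this notebook, and the function names are hypothetical.

```python
def pixel_count(width, height):
    """Number of pixels in an image of the given size."""
    return width * height


def relative_cost(width, height, reference=(512, 512)):
    """Pixel count relative to a reference size (rough proxy for effort)."""
    return pixel_count(width, height) / pixel_count(*reference)


cost_1024 = relative_cost(1024, 1024)
cost_256 = relative_cost(256, 256)
print(cost_1024, cost_256)
```

Doubling both width and height quadruples the number of pixels, so halving the side length is usually the most effective way to shorten a slow run.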

Third, set the number of iterations the AI models should run for one text prompt. This value is also worth experimenting with, because a higher value does not necessarily give a better result. In my experience, a value between 300 and 600 works best, but feel free to experiment with this value on your own.

In [3]:
TextToImageOptions = io_widget.TextToImageOptions()
TextToImageOptions

TextToImageOptions()

## 5. Validate your settings

Executing the next cell checks whether all your settings and inputs are correct.
If everything is correct, you can continue with the next cell. If something is wrong, you will get an error message that tells you what is missing and how to fix it.
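
The idea behind such a check can be sketched as a loop over the required settings. This is not the notebook's validation cell: that cell tests whether the widgets were created at all, whereas this sketch looks for missing or empty values. A `SimpleNamespace` stands in for the options widget, and the helper name `find_missing_settings` is hypothetical.

```python
from types import SimpleNamespace

# Attribute names as used by the options widget, with readable labels
REQUIRED_SETTINGS = {
    "TextToImageOptionFolder": "output folder",
    "TextToImageOptionWidth": "image width",
    "TextToImageOptionHeight": "image height",
    "TextToImageOptionIterations": "number of iterations",
}


def find_missing_settings(options):
    """Return the labels of all settings that are absent or empty."""
    missing = []
    for attribute, label in REQUIRED_SETTINGS.items():
        value = getattr(options, attribute, None)
        if value is None or value == "":
            missing.append(label)
    return missing


# Stand-in for the widget: the iterations were never set
demo_options = SimpleNamespace(
    TextToImageOptionFolder="my-first-run",
    TextToImageOptionWidth=512,
    TextToImageOptionHeight=512,
    TextToImageOptionIterations=None,
)
demo_missing = find_missing_settings(demo_options)
print(demo_missing)
```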

In [4]:
Validation = io_widget.Validation()

try:
    TextToImagePrompt.TextToImagePrompt
except NameError:
    Validation.ValidationStatus = "error"
    Validation.ValidationMessage = "There is no text prompt set. Execute the cell under '2. Enter text prompt' again."

    display(Validation)
    raise

try:
    TextToImageOptions.TextToImageOptionFolder
except NameError:
    Validation.ValidationStatus = "error"
    Validation.ValidationMessage = "There is no output folder defined. Execute the cell under '4. Set generation options' again." 

    display(Validation)
    raise

try:
    TextToImageOptions.TextToImageOptionWidth
except NameError:
    Validation.ValidationStatus = "error"
    Validation.ValidationMessage = "Width for the generated image is not defined. Execute the cell under '4. Set generation options' again."

    display(Validation)
    raise

try:
    TextToImageOptions.TextToImageOptionHeight
except NameError:
    Validation.ValidationStatus = "error"
    Validation.ValidationMessage = "Height for the generated image is not defined. Execute the cell under '4. Set generation options' again."

    display(Validation)
    raise

try:
    TextToImageOptions.TextToImageOptionIterations
except NameError:
    Validation.ValidationStatus = "error"
    Validation.ValidationMessage = "Iterations for the generation process are not set. Execute the cell under '4. Set generation options' again."

    display(Validation)
    raise

Validation.ValidationStatus = "success"
Validation.ValidationMessage = "Everything seems good. Continue with the next cell."

display(Validation)

Validation(ValidationMessage='Everything seems good. Continue with the next cell.', ValidationStatus='success')

## 6. Start the generation process

Now that all settings are set, you are ready to start the creation process! :D 

Execute the next cell and press the button "Start the generation process".  
You will then see an image slowly emerge from pure noise.

In [5]:
TextToImageInit = io_widget.TextToImageInit()
TextToImageInitOutput = widgets.Output()

TextToImageInit.TextToImageInitClick = False
TextToImageInit.TextToImageInitFinished = False

with TextToImageInitOutput:
    display(TextToImageInit)

# Create the output directory (and its parent) if it does not exist yet
os.makedirs(f'/home/jovyan/output/{TextToImageOptions.TextToImageOptionFolder}/', exist_ok=True)

# Output folder to read from
TextToImageInit.TextToImageInitFolder = TextToImageOptions.TextToImageOptionFolder

# Compute attributes
computedAttributes = ''

for attribute in TextToImagePrompt.TextToImageAttributes:
    computedAttributes += ' |' + ' ' + attribute['attribute'] + ':' + str(attribute['weight'])


def generate(change):
    if(change.new is True):
        with TextToImageInitOutput:
            # Re-render widget to show process running
            clear_output()
            display(TextToImageInit)
        with HiddenOutput:
            !python generate.py -p '{TextToImagePrompt.TextToImagePrompt} {computedAttributes}' -i '{TextToImageOptions.TextToImageOptionIterations}' -o '/home/jovyan/output/{TextToImageOptions.TextToImageOptionFolder}/image' -s '{TextToImageOptions.TextToImageOptionWidth}' '{TextToImageOptions.TextToImageOptionHeight}' -cuts '5'

        with TextToImageInitOutput:
            # Re-render widget to show finished process
            TextToImageInit.TextToImageInitFinished = True
            clear_output()
            display(TextToImageInit)


TextToImageInit.observe(generate, names='TextToImageInitClick')

display(TextToImageInitOutput)

Output()
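
The shell command in the cell above wraps the prompt in single quotes, so a prompt that itself contains an apostrophe could break the call. One possible alternative, sketched here under assumptions, is to assemble the arguments as a list. The flags (`-p`, `-i`, `-o`, `-s`, `-cuts`) are copied from that cell; the helper name `build_generate_command` and the example values are hypothetical, and the sketch only builds the command without running it.

```python
import shlex


def build_generate_command(prompt, iterations, output_path, width, height, cuts=5):
    """Build the argument list for generate.py without any manual quoting."""
    return [
        "python", "generate.py",
        "-p", prompt,
        "-i", str(iterations),
        "-o", output_path,
        "-s", str(width), str(height),
        "-cuts", str(cuts),
    ]


command = build_generate_command(
    prompt="A raccoon's portrait | surreal:50",
    iterations=300,
    output_path="/home/jovyan/output/demo/image",
    width=512,
    height=512,
)

# shlex.join produces a correctly quoted string for display or logging
printable = shlex.join(command)
print(printable)
```

A list like this could be passed to `subprocess.run`, which hands each argument to the script unchanged, apostrophes included.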

## 7. Output

After the creation process is complete, you can execute the next cell to see the final results. At the bottom, you will find a button to download the generated image.

In [None]:
TextToImageShow = io_widget.TextToImageShow()
TextToImageShow

TextToImageShow.TextToImageShowFolder = TextToImageOptions.TextToImageOptionFolder

display(TextToImageShow)

## 8. How to continue

Well done, you created an image via a text prompt! :)  

Another interesting use case for designers is the curation and creation of datasets.
You can find out more in this notebook:

- [Scraping data with Pinterest](http://localhost:8888/lab/workspaces/pinterest/tree/notebooks/Scraping-Data-with-Pinterest.ipynb)

After you have created a dataset, you can use it directly to fine-tune StyleGAN3 and create your own first AI model.  
You can find out more in this notebook:

- [Fine-tune StyleGAN3](#)

If you want to use this notebook again, click the reload button in the upper-left corner of this editor window. After reloading, you can start at the top of this document and execute it cell by cell.