# Generative AI Challenge: Image Generation with WebUI

This notebook demonstrates the process of generating images and creating a simple storyboard using the [WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui) framework. It provides a basic workflow for converting textual scene descriptions into visual illustrations using a text-to-image model. The focus of this notebook is to familiarize you with WebUI's capabilities for image generation, without implementing any methods to maintain character consistency across different scenes.

In this notebook, you will:
- Load scene descriptions and convert them into images using WebUI.
- Generate visual representations of consecutive scenes for a storyboard.
- Understand the limitations of existing models in maintaining consistent character appearances across multiple generated images.

### 1. Setting Up a Python Environment for the Challenge

To get started, you need to set up a Python environment for this challenge and install the necessary libraries. Follow these steps:

1. **Creating a Python Environment (skip this step if you already created the environment following the instructions of the previous notebook)**: 

    First, you will create a dedicated Python environment for the challenge. This will allow you to install the required packages without affecting other projects. To do this:
    - Open a terminal (or the Anaconda Prompt on Windows).
    - Run the following command to create a new environment. You can name it "genai_challenge" or choose your own name:

        ``conda create --name genai_challenge``

    - After creating the environment, activate it by running the command:
        
        ``conda activate genai_challenge``
        
2. **Installing Required Libraries**: 

    Most libraries needed for this challenge are standard Python libraries and come pre-installed. However, you will need to install the ``requests`` library manually to access the WebUI's API. To install it:
    - In the terminal with your activated environment, run the following command:

        ``conda install requests``

    - If you are not using a Conda environment, but another virtual environment tool, you can use the following command:
        
        ``pip install requests``

3. **Selecting the Environment and Verifying the Installation**

    After setting up the environment, you need to ensure that Visual Studio Code uses the correct Python environment for this notebook:

    - At the top right corner of the window, click on "Select Kernel".
    - Select your newly created environment ("genai_challenge") from the list of available environments.

    To verify that the ``requests`` library is installed correctly, you can run the following code cell:

In [None]:
import requests
print("Requests library installed correctly.")

If the code above runs without any errors, you are all set and ready to continue.
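
The same check can be extended to every module this notebook imports. The following is a minimal sketch; `missing_modules` is a hypothetical helper introduced here for illustration, and the list of names is taken from the import cells of this notebook.

```python
import importlib.util

def missing_modules(module_names):
    """Return the names from module_names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

# Top-level modules imported in this notebook ("requests" and "IPython" are third-party).
required = ["os", "base64", "requests", "IPython"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If a module is reported as missing, install it in the active environment before continuing.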

### 2. Basic Image Generation Using WebUI

Let's start by importing the libraries that we will use:

In [None]:
import base64
import os

import requests
from IPython.display import Image, display

The images in this example are generated by a Stable Diffusion model running on the ADS&AI server. There are four instances of the WebUI framework available on the server:
- buas5.edirlei.com
- buas6.edirlei.com
- buas7.edirlei.com
- buas8.edirlei.com

Since all participants of the Generative AI Challenge will be sharing these instances, they may become overloaded at times. If you notice that image generation is taking longer than expected or encounter access errors while trying to run this notebook, try switching to another instance.

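Next, we'll specify which WebUI instance to use by assigning its address to the variable `webui_server_address`. The instances require HTTP Basic authentication, so the same cell also encodes the shared username and password into the `Authorization` header value that is sent with each request.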

In [None]:
webui_server_address = "buas5.edirlei.com"
webui_username = "adsai"
webui_password = "uuUP4whjX29cF3cxwrX3SY5Mm9TmtASkdgpaNCTHZ9"
webui_credentials = base64.b64encode(f"{webui_username}:{webui_password}".encode('utf-8')).decode('utf-8')
webui_auth_header = f"Basic {webui_credentials}"
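
Switching instances by hand when one is overloaded can also be automated. The sketch below tries each address in turn and keeps the first one that responds. The helpers `basic_auth_header`, `http_probe`, and `pick_server` are hypothetical names introduced for illustration, and the probe assumes that the instances expose the model-listing endpoint `/sdapi/v1/sd-models`, which this notebook does not confirm.

```python
import base64

# Instance addresses listed in this notebook.
CANDIDATE_SERVERS = [
    "buas5.edirlei.com",
    "buas6.edirlei.com",
    "buas7.edirlei.com",
    "buas8.edirlei.com",
]

def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("utf-8")
    return f"Basic {token}"

def http_probe(address, auth_header, timeout=10):
    """Return True if the instance answers an authenticated API request in time.

    Assumption: the instance exposes the endpoint /sdapi/v1/sd-models.
    """
    import requests  # imported here so pick_server can be tried without network access
    try:
        response = requests.get(
            f"http://{address}/sdapi/v1/sd-models",
            headers={"Authorization": auth_header},
            timeout=timeout,
        )
        return response.ok
    except requests.RequestException:
        return False

def pick_server(addresses, probe):
    """Return the first address for which probe(address) is True, or None."""
    for address in addresses:
        if probe(address):
            return address
    return None
```

One possible use is `webui_server_address = pick_server(CANDIDATE_SERVERS, lambda a: http_probe(a, webui_auth_header))`. An instance that answers the probe can still be busy generating images for other participants, so this only filters out instances that are unreachable.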

To generate images in WebUI from Python, we need to interact with an API endpoint provided by WebUI. The function defined below will handle this process for you, so you don't need to worry about the technical details of accessing this endpoint.

In [None]:
def webui_generate_image(model_name, prompt, negative_prompt, seed, width, height, output_filename):
    randseed = (seed == -1)  # randomize only when no fixed seed was requested
    payload = {
        "prompt": prompt,
        "steps": 6,
        "cfg_scale": 2,
        "randomize_seed": randseed,
        "seed": seed,
        "enable_hr": False,
        "denoising_strength": 0,
        "firstphase_width": 0,
        "firstphase_height": 0,
        "hr_scale": 2,
        "hr_upscaler": "Hr Upscaler",
        "hr_second_pass_steps": 0,
        "hr_resize_x": 0,
        "hr_resize_y": 0,
        "styles": [ ],
        "subseed": -1,
        "subseed_strength": 0,
        "seed_resize_from_h": -1,
        "seed_resize_from_w": -1,
        "sampler_name": "DPM++ SDE",
        "batch_size": 1,
        "n_iter": 1,
        "width": width,
        "height": height,
        "restore_faces": False,
        "tiling": False,
        "negative_prompt": negative_prompt,
        "eta": 0,
        "s_churn": 0,
        "s_tmax": 0,
        "s_tmin": 0,
        "s_noise": 1,
        "override_settings": {
            "sd_model_checkpoint": model_name
        },
        "override_settings_restore_afterwards": True,
        "script_args": []
    }  
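    # Send the request to the txt2img endpoint; the timeout (an assumed value, in seconds)
    # prevents the call from hanging indefinitely on an overloaded instance
    response = requests.post(url=f'http://{webui_server_address}/sdapi/v1/txt2img', json=payload, headers={"Authorization": webui_auth_header}, timeout=300)
    response.raise_for_status()  # fail with a clear error on access or server problems
    result = response.json()
    # The API returns the generated images as base64-encoded strings; save the first one
    with open(output_filename, 'wb') as f:
        f.write(base64.b64decode(result['images'][0]))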
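
The function above saves only the first entry of `result['images']`. If you raise `batch_size`, you may want to keep every returned image. The sketch below shows one way to do that, assuming the response has the same shape the function above relies on (a list of base64-encoded strings under `images`). `save_response_images` is a hypothetical helper, and the demonstration uses placeholder bytes instead of real image data.

```python
import base64
import os
import tempfile

def save_response_images(result, output_prefix):
    """Decode every entry of result['images'] and save it as <output_prefix>_<n>.png.

    Returns the list of file paths that were written.
    """
    paths = []
    for n, encoded in enumerate(result.get("images", []), start=1):
        path = f"{output_prefix}_{n}.png"
        with open(path, "wb") as f:
            f.write(base64.b64decode(encoded))
        paths.append(path)
    return paths

# Offline demonstration with a stand-in response: the payloads are placeholder
# bytes, not real PNG data.
fake_result = {
    "images": [
        base64.b64encode(b"first").decode("ascii"),
        base64.b64encode(b"second").decode("ascii"),
    ]
}
demo_dir = tempfile.mkdtemp()
saved = save_response_images(fake_result, os.path.join(demo_dir, "demo"))
print(saved)
```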

To generate an image, we will use the function `webui_generate_image`. Below are the parameters that this function accepts:
- `model_name`: The name of the model to use for image generation (e.g., `juggernautXL_v9Rdphoto2Lightning`). This allows you to specify which model the WebUI should use for generating the image. The name used for this parameter is the same as the one displayed in the top left corner of the WebUI interface, without the file extension.
- `prompt`: The main prompt used for the image generation, describing what the model should depict.
- `negative_prompt`: A prompt that specifies what the model should avoid including in the image. This helps refine the output by filtering unwanted elements.
- `seed`: The seed used for image generation. Using the same seed with the same model, prompts, and settings should reproduce the same image. To get a different random seed for each generation, set this value to -1.
- `width`: The width (in pixels) of the generated image.
- `height`: The height (in pixels) of the generated image.
- `output_filename`: The file path where the generated image will be saved.
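
When `seed` is -1, the seed that was actually used is not recorded by this notebook, which makes a good result hard to reproduce. A possible workaround is to draw the random seed yourself and pass it explicitly. The sketch below uses a hypothetical helper, `resolve_seed`, and assumes the server accepts any non-negative 32-bit integer as a seed.

```python
import random

def resolve_seed(seed, rng=random):
    """Return seed unchanged, or a freshly drawn non-negative seed if seed is -1."""
    if seed == -1:
        # Assumption: seeds in the range [0, 2**32) are accepted by the server.
        return rng.randrange(2**32)
    return seed

seed = resolve_seed(-1)
print(f"Using seed {seed}")  # keep this value to regenerate the same image later
```

The resulting value can then be passed as the `seed` argument of `webui_generate_image`.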

Additionally, several parameters are hard-coded inside `webui_generate_image`, such as the number of sampling steps (`steps`), the classifier-free guidance scale (`cfg_scale`), and the sampler (`sampler_name`). These values are preconfigured for typical image generation, but you are encouraged to experiment with them to see how they affect the quality and style of the generated images.
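
One way to experiment without editing the function for every run is to keep the defaults in a dictionary and override individual values. This is only a sketch: `DEFAULT_SETTINGS` and `build_settings` are hypothetical names, the defaults are copied from the function above, and the alternative values are arbitrary examples rather than recommendations.

```python
# Defaults copied from the payload in webui_generate_image.
DEFAULT_SETTINGS = {
    "steps": 6,
    "cfg_scale": 2,
    "sampler_name": "DPM++ SDE",
}

def build_settings(**overrides):
    """Return a copy of DEFAULT_SETTINGS with the given values replaced."""
    unknown = set(overrides) - set(DEFAULT_SETTINGS)
    if unknown:
        raise ValueError(f"Unknown setting(s): {sorted(unknown)}")
    return {**DEFAULT_SETTINGS, **overrides}

# A small grid of settings to compare (example values only).
variants = [
    build_settings(steps=steps, cfg_scale=cfg)
    for steps in (4, 6, 8)
    for cfg in (1.5, 2)
]
for variant in variants:
    print(variant)
```

To apply a variant, you could add a `settings` argument to your own copy of `webui_generate_image` and call `payload.update(settings)` before sending the request.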

Let's generate an image using the function `webui_generate_image`:

In [None]:
webui_generate_image("juggernautXL_v9Rdphoto2Lightning", "A serene lakeside at sunset, reflecting the vibrant colors of the sky", "", -1, 1024, 1024, "my_first_test_webui.png")


After generating the image, we can visualize it:

In [None]:
display(Image(filename="my_first_test_webui.png", width=500))

### 3. Generating a Storyboard with WebUI

Now that you know how to generate individual images, let's explore how to create an image sequence within a storyboard context.

A storyboard represents a sequence of events that together form a story. We can store this sequence of events in a list, where each item describes a scene in the story:

In [None]:
story_2 = [ "Jake sprints through an abandoned street, gripping a bloodied baseball bat. His torn jacket flaps as he glances back at the horde of zombies shuffling behind him.",
            "Jake barricades a door inside an empty diner, pushing tables and chairs against it. His bat rests beside him as he catches his breath, sweat dripping down his face.",
            "Inside the dark diner, Jake peers out the window. His jacket is now tied around his waist, and he grips the bat tightly as zombies gather outside, their groans echoing in the distance.",
            "Jake moves cautiously through the kitchen, picking up a flashlight. His bat is slung over his shoulder, and the flickering light casts shadows on the rusted metal walls.",
            "Suddenly, a zombie lunges at Jake from the shadows. He swings his bat, smashing the creature aside. His face is determined, but fear flickers in his eyes.",
            "Jake runs through the back alley, the bat still in hand. His jacket is gone now, and his shirt is ripped as he sprints toward a rusty truck parked at the end of the alley.",
            "Jake jumps into the truck, slamming the door behind him. As the engine roars to life, he looks back at the zombies closing in, his grip still tight on the bat. He drives off into the night, the undead disappearing in the rearview mirror."]

Once we have defined the sequence of events, we can call the function `webui_generate_image` for each story event to generate the corresponding scene illustrations:

In [None]:
os.makedirs("images", exist_ok=True)  # create a subfolder to store the generated images

for i, scene in enumerate(story_2, start=1):
    webui_generate_image("juggernautXL_v9Rdphoto2Lightning", scene, "", -1, 1024, 1024, os.path.join("images", f"story_2_scene_{i}.png"))
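
Because the instances are shared, it can help to avoid regenerating scenes that already exist on disk when you rerun the notebook. The sketch below wraps the loop in a hypothetical helper, `generate_storyboard`, which takes the generation step as a function argument so that the loop logic can be tried without contacting the server.

```python
import os

def generate_storyboard(story, generate, output_dir, prefix, skip_existing=True):
    """Call generate(scene_text, output_path) once per scene and return all paths.

    Scenes whose output file already exists are skipped when skip_existing is True.
    """
    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for i, scene in enumerate(story, start=1):
        path = os.path.join(output_dir, f"{prefix}_scene_{i}.png")
        if not (skip_existing and os.path.exists(path)):
            generate(scene, path)
        paths.append(path)
    return paths

# Example call with the function from this notebook (not executed here):
# generate_storyboard(
#     story_2,
#     lambda scene, path: webui_generate_image(
#         "juggernautXL_v9Rdphoto2Lightning", scene, "", -1, 1024, 1024, path),
#     "images",
#     "story_2",
# )
```

Pass `skip_existing=False` when you want fresh images for every scene, for example after changing the prompts.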

Finally, we can display the generated images along with the corresponding text for each story event to create a storyboard representation:

In [None]:
for i, scene in enumerate(story_2, start=1):
    display(Image(filename=os.path.join("images", f"story_2_scene_{i}.png"), width=400))
    print(scene)
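
If you want to share the storyboard outside the notebook, the images and captions can also be combined into a single HTML page. The following is a minimal sketch; `storyboard_html` is a hypothetical helper, and the example path simply follows the file naming used in this notebook.

```python
import html

def storyboard_html(scenes, image_paths, width=400):
    """Return an HTML string with one figure (image plus caption) per scene."""
    if len(scenes) != len(image_paths):
        raise ValueError("scenes and image_paths must have the same length")
    figures = []
    for text, path in zip(scenes, image_paths):
        figures.append(
            f'<figure><img src="{html.escape(path, quote=True)}" width="{width}">'
            f"<figcaption>{html.escape(text)}</figcaption></figure>"
        )
    return "\n".join(figures)

# Small offline example; the caption is escaped so special characters display correctly.
page = storyboard_html(["Scene one & two"], ["images/story_2_scene_1.png"])
print(page)
```

Writing the returned string to a file such as `storyboard.html` next to the `images` folder lets you open it in a browser. Whether relative image paths resolve when the same HTML is rendered inline in a notebook depends on the frontend.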

Although the generated storyboard looks visually appealing, it is clear that the main character, Jake, is represented inconsistently across the scenes. He appears with different physical attributes and completely different clothing, even in situations where he should be wearing the same outfit.

**This brings us to your challenge: how can you improve character consistency in image sequences like the one illustrated in this notebook?**