# Generative AI Challenge: Improving Scene Descriptions

In this notebook, you will learn how to use a Llama model through the Ollama API to enrich scene descriptions with details about the characters in a story. Richer descriptions are useful as input for other generative AI models, for example when generating images or story contexts.

### 1. Setting Up the Environment

To start, you need to install the necessary package to interact with the Ollama server. This package cannot be installed via Conda, so you'll need to use pip:

`pip install ollama`

Once installed, you can proceed to interact with the Llama model via the Ollama API.

### 2. Basic Text Generation Using the Ollama API

We begin by importing the libraries needed for this task: `base64` to encode the credentials and the Ollama `Client`:

In [1]:
import base64
from ollama import Client

Next, we define the server address, the authentication details, and the Llama model we will use for the task:
- `ollama_server_address`: The address of the Ollama server where the Llama model is running.
- `ollama_username` and `ollama_password`: The credentials for the server. They are Base64-encoded and sent as an HTTP Basic `Authorization` header (`ollama_auth_header`).
- `ollama_llm_model`: The model to use. In this example, we are using the Llama 3.1 8B model.

In [2]:
ollama_server_address = "buas9.edirlei.com"
ollama_username = "adsai"
ollama_password = "uuUP4whjX29cF3cxwrX3SY5Mm9TmtASkdgpaNCTHZ9"
ollama_credentials = base64.b64encode(f"{ollama_username}:{ollama_password}".encode('utf-8')).decode('utf-8')
ollama_auth_header = f"Basic {ollama_credentials}"

ollama_llm_model = "llama3.1:8b"
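The header construction above can be wrapped in a small helper, shown here as a minimal sketch. The function name `build_basic_auth_header` is hypothetical and not part of the Ollama package; it only reproduces the Base64 encoding used in the cell above.

```python
import base64


def build_basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    # Basic authentication encodes "username:password" in Base64.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("utf-8")
    return f"Basic {token}"


# Example with placeholder credentials (not the real server credentials).
print(build_basic_auth_header("user", "pass"))  # prints: Basic dXNlcjpwYXNz
```

The returned string can be passed as the `Authorization` value in the `headers` dictionary of the client.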

Now we can initialize the Ollama client with the server address and the authorization header:

In [3]:
ollama_client = Client(host=f"http://{ollama_server_address}", headers = {"Authorization": ollama_auth_header})

To see how the Ollama API works, we'll start with a simple example by asking the model a common question: "Why is the sky blue?" The following code sends the request to the model and prints the response:

In [4]:
response = ollama_client.chat(model=ollama_llm_model, messages=[
  {
    "role": "user",
    "content": "Why is the sky blue?",
  },
])
print(response["message"]["content"])

The sky appears blue to us during the daytime due to a phenomenon called scattering, which is the interaction between light and the tiny molecules of gases in the atmosphere. This is known as Rayleigh scattering.

Here's what happens:

1.  **Sunlight**: The sun emits all colors of the visible spectrum (red, orange, yellow, green, blue, indigo, violet) as white light.
2.  **Entry into Earth's Atmosphere**: When sunlight enters Earth's atmosphere, it encounters various particles and gases like nitrogen and oxygen molecules.
3.  **Scattering**: These tiny gas molecules scatter the light in all directions. However, shorter wavelengths (like blue light) are scattered more than longer wavelengths (like red light). This is because the smaller molecules interact more strongly with the shorter wavelengths of light.
4.  **Our View**: When we look up at the sky, we see the scattered blue light from all directions. Our atmosphere acts as a filter, allowing mainly the blue part of the spectrum to r

### 3. Improving Scene Descriptions Using the Ollama API

Now, let's use the Llama model to enhance scene descriptions by describing the physical characteristics of the characters in the scenes. Let's first define a sequence of story events:

In [5]:
story_1 = ["Elara stares at an old map spread out on a wooden table, her hand resting on a lantern. She wears a worn leather jacket, and a satchel hangs at her side. The dim light flickers as she studies the map, deep in thought.", 
           "Elara hikes through a dense jungle, using a machete to clear her path. Sweat drips down her forehead, and her satchel bounces with each step. The sun filters through the thick canopy above her.",
           "Elara stands at the entrance of an ancient stone temple, the wind whipping her hair. She adjusts her satchel and steps forward, determination in her eyes.",
           "Inside the temple, Elara examines carvings on the wall, her lantern illuminating the worn stone. Her jacket is now unzipped, revealing a plain shirt underneath, as she runs her fingers over the markings.",
           "Elara crouches beside a hidden chest, carefully prying it open. Inside, a glowing artifact rests on a bed of cloth. She glances around warily before gently lifting the artifact into her hands.",
           "Elara races through the temple corridors, the artifact glowing brightly in her satchel. Her jacket flutters behind her as she dodges crumbling debris, her expression a mix of fear and urgency.",
           "Outside the temple, Elara catches her breath, standing on a cliff with the jungle stretching out below. She looks at the artifact in her hand, a relieved smile spreading across her face as the sun sets behind her."]

Next, we define the prompt that will instruct the model to rewrite the scene descriptions by adding physical characteristics:

In [6]:
prompt_1 = "Rewrite the provided scene description by adding the physical characteristics of characters mentioned in the scene. Write only the final scene description in a concise way without adding any commentary or header. Here is the scene: "

We then loop through the story scenes and send each one to the Llama model, storing the improved versions in a new list:

In [7]:
story_1_improved = []

# Send each scene to the model together with the rewriting prompt
for scene in story_1:
    response = ollama_client.chat(model=ollama_llm_model, messages=[
        {
            "role": "user",
            "content": prompt_1 + scene,
        },
    ])
    story_1_improved.append(response["message"]["content"])
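To try several prompt variations without repeating the loop, it can be wrapped in a function. The following is a minimal sketch: `improve_scenes` is a hypothetical helper name, and it assumes only that the client has a `chat` method returning a response indexable as `response["message"]["content"]`, as in the cells above.

```python
def improve_scenes(client, model, prompt, scenes):
    """Send each scene to the model with the given prompt and collect the replies."""
    improved = []
    for scene in scenes:
        response = client.chat(model=model, messages=[
            {"role": "user", "content": prompt + scene},
        ])
        improved.append(response["message"]["content"])
    return improved
```

With the variables defined in this notebook, a call would look like `improve_scenes(ollama_client, ollama_llm_model, prompt_1, story_1)`.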

Finally, we print the improved scene descriptions:

In [8]:
for i, scene in enumerate(story_1_improved, start=1):
    print(f"IMPROVED SCENE {i}: {scene}")

IMPROVED SCENE 1: Elara, a petite woman with short, dark hair and piercing green eyes, stares at an old map spread out on a wooden table, her hand resting on a lantern. She wears a worn leather jacket that fits snugly over her slender frame, and a battered satchel hangs at her side, revealing the curves of her hips beneath the faded fabric of her jeans. The dim light flickers as she studies the map, deep in thought, her brow furrowed in concentration.
IMPROVED SCENE 2: Elara, a tall and athletic woman in her mid-twenties with long, curly brown hair tied back in a ponytail, hikes through a dense jungle, using a machete to clear her path. Her tanned skin glistens with sweat as it drips down her forehead and onto her nose ring. A worn satchel bounces against her lean legs as she takes each step, the straps digging slightly into her shoulders. The sun filters through the thick canopy above, casting dappled shadows on the forest floor.
IMPROVED SCENE 3: A petite, raven-haired woman with pie

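Although the enhanced scene descriptions produced by the Llama model add useful character details, there are noticeable inconsistencies across the scenes. Because each scene is sent to the model in a separate request, the model has no memory of what it wrote before and invents new attributes every time. For instance, Scene 1 describes Elara as a petite woman with short, dark hair and green eyes, while Scene 2 describes her as tall and athletic with long, curly brown hair. Such conflicting physical characteristics for the same character disrupt continuity.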

If you choose to continue with a prompt-based approach, consider the following improvements:
- Investigate ways to standardize prompts to ensure that character attributes, such as physical features and clothing, remain consistent across all scenes.
- Explore how to provide more specific instructions in the prompt to maintain continuity, particularly in terms of character appearance and behavior.
- Test different variations of prompts to see how minor adjustments affect the model's ability to generate coherent and consistent outputs.
- Find ways to connect the improved scene descriptions with the way text-to-image models generate characters to ensure consistent images.
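
One possible starting point for the first two suggestions is to give the model a fixed character sheet in every request. The sketch below is only an illustration under stated assumptions: `build_consistent_messages` is a hypothetical helper, the character sheet reuses attributes from the Scene 1 output above, and the wording of the instructions is one of many options to test. It relies on the `system` role supported by the chat message format.

```python
def build_consistent_messages(character_sheet, scene):
    """Build chat messages that pin character attributes via a system message."""
    system_text = (
        "You rewrite scene descriptions by adding the physical characteristics "
        "of the characters. Use only the attributes listed in this character "
        "sheet and never contradict them:\n" + character_sheet
    )
    user_text = (
        "Write only the final scene description in a concise way without "
        "adding any commentary or header. Here is the scene: " + scene
    )
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


# Hypothetical character sheet, based on the attributes generated for Scene 1.
elara_sheet = (
    "Elara: petite woman, short, dark hair, green eyes, "
    "worn leather jacket, satchel."
)

messages = build_consistent_messages(elara_sheet, "Elara stares at an old map.")
print(messages[0]["content"])
```

The resulting list can be passed as the `messages` argument of `ollama_client.chat`. Whether the model actually respects the sheet in every scene still needs to be checked on the generated outputs.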