# Setup

In [20]:
%pip install -r requirements.txt

Collecting openinference-instrumentation-openai (from -r requirements.txt (line 3))
  Downloading openinference_instrumentation_openai-0.1.21-py3-none-any.whl.metadata (4.7 kB)
Collecting opentelemetry-instrumentation (from openinference-instrumentation-openai->-r requirements.txt (line 3))
  Using cached opentelemetry_instrumentation-0.51b0-py3-none-any.whl.metadata (6.3 kB)
Downloading openinference_instrumentation_openai-0.1.21-py3-none-any.whl (24 kB)
Using cached opentelemetry_instrumentation-0.51b0-py3-none-any.whl (30 kB)
Installing collected packages: opentelemetry-instrumentation, openinference-instrumentation-openai
Successfully installed openinference-instrumentation-openai-0.1.21 opentelemetry-instrumentation-0.51b0
Note: you may need to restart the kernel to use updated packages.


In [21]:
import requests
import os
import re
from openai import OpenAI
from dotenv import load_dotenv
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
import phoenix as px

In [22]:
# Register a tracer provider for the local Phoenix collector.
# The endpoint defaults to "http://localhost:4317".
tracer_provider = register(
    project_name="fourth_wall_app",  # Default is 'default'
    endpoint="http://localhost:4317",  # Sends traces using gRPC
    set_global_tracer_provider=False,
)

OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# Alternative: send traces over HTTP instead of gRPC
# tracer_provider = register(endpoint="http://127.0.0.1:6006/v1/traces")

🔭 OpenTelemetry Tracing Details 🔭
|  Phoenix Project: fourth_wall_app
|  Span Processor: SimpleSpanProcessor
|  Collector Endpoint: localhost:4317
|  Transport: gRPC
|  Transport Headers: {'user-agent': '****'}
|  
|  Using a default SpanProcessor. `add_span_processor` will overwrite this default.



In [26]:
# Launch phoenix in notebook
# px.launch_app().view()

In [4]:
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(
    api_key=api_key
)
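
`os.getenv` returns `None` when the variable is missing, so a missing key only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; `require_env` and `EXAMPLE_API_KEY` are hypothetical names used for illustration, not part of the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return value


# Demonstration with a throwaway variable (hypothetical name), not a real key.
os.environ["EXAMPLE_API_KEY"] = "sk-example"
print(require_env("EXAMPLE_API_KEY"))
```

In the notebook this could be called as `require_env("OPENAI_API_KEY")` after `load_dotenv()`.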

# Gather Story Summary

## Wikipedia

In [6]:
def get_wikipedia_summary(title):
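    """
    Fetch the plain-text extract of a Wikipedia article using the public MediaWiki API.
    Because `exintro` is not set, the extract includes sections beyond the lead.
    Returns the extract text, or an empty string if the page is not found
    or the request fails.
    """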
    base_url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": True,
        "format": "json",
        "titles": title
    }
    try:
        response = requests.get(base_url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()
        pages = data.get("query", {}).get("pages", {})
        # The page ID is not needed, so iterate over the page objects directly.
        for page_content in pages.values():
            if "extract" in page_content:
                # This is a raw textual extract from Wikipedia
                return page_content["extract"]
    except requests.RequestException as e:
        print(f"[Wikipedia] Error fetching summary: {e}")

    return ""
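
The extract returned above contains the whole article, with section headings rendered as lines such as `== Setting ==`. If only the lead section is wanted as the prompt input, one option is to cut the text at the first heading. The sketch below assumes that heading format; `extract_lead_section` is a hypothetical helper, and the sample string is a shortened stand-in for real API output. Setting the TextExtracts `exintro` parameter in the request is another way to get a similar result.

```python
import re


def extract_lead_section(extract: str) -> str:
    """Return the text that precedes the first '== Heading ==' line."""
    # Plain-text extracts mark section headings as lines like "== Setting ==".
    match = re.search(r"^==+ .*? ==+\s*$", extract, flags=re.MULTILINE)
    lead = extract[: match.start()] if match else extract
    return lead.strip()


# Shortened stand-in for the text returned by get_wikipedia_summary.
sample = (
    "The Windup Girl is a biopunk science fiction novel.\n"
    "It won the 2010 Nebula Award.\n\n\n"
    "== Setting ==\n"
    "The Windup Girl is set in 23rd-century Thailand."
)
print(extract_lead_section(sample))
```

Text without any heading is returned unchanged apart from surrounding whitespace.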

In [7]:
# Test with my favorite biopunk novel
book = "The Windup Girl"
summary = get_wikipedia_summary(book)
print(summary)

The Windup Girl is a biopunk science fiction novel by American writer Paolo Bacigalupi. It was his debut novel and was published by Night Shade Books on September 1, 2009. The novel is set in a future Thailand and covers a number of contemporary issues such as global warming and biotechnology.
The Windup Girl was named as the ninth best fiction book of 2009 by Time magazine. It won the 2010 Nebula Award and the 2010 Hugo Award (tied with The City & the City by China Miéville), both for best novel. The book also won the 2010 Campbell Memorial Award, the 2010 Compton Crook Award and the 2010 Locus Award for best first novel.


== Setting ==
The Windup Girl is set in 23rd-century Thailand. Global warming has raised the levels of world's oceans, carbon fuel sources have become depleted, and manually wound springs are used as energy storage devices. Biotechnology is dominant and megacorporations (called calorie companies) like AgriGen, PurCal and RedStar control food production through "gen

# Pass Book Summary Through the First Agent

In [8]:
def query_openai(system_prompt, user_prompt):
    """Send a prompt to OpenAI and return the response."""
    
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    
    return completion.choices[0].message.content

In [24]:
# Load the system prompt from system_prompts/story_analysis_prompt.md
with open("system_prompts/story_analysis_prompt.md", "r") as file:
    system_prompt = file.read()

# Generate a response using the OpenAI API
trailer_description = query_openai(system_prompt, summary)


# Create Specific Clip/Audio Prompts from the Long Description

In [13]:
print(trailer_description)

**Series of Clips for "The Windup Girl" Trailer**

**Background Music:** 
A haunting, ethereal blend of Thai traditional instruments with a modern, electronic beat. The music starts with a slow, steady rhythm, gradually building in intensity, reflecting the tension and underlying chaos of a biopunk future. The soundtrack swells and crescendos as the trailer progresses, underscoring the mounting drama and action.

---

**Clip 1: A Dystopian Bangkok (0:00-0:15)**

**Visual:** The camera pans over a sprawling, futuristic Bangkok. Elevated levees and shimmering floodwaters contrast with neon-lit skyscrapers and tangled, bustling streets below. Sunbeams cut through the smog, catching the distant silhouette of the child queen's palace.
  
**Audio:** The sound of churning water from the levees mingles with the buzzing of market crowds and distant thunder. Voice-over (Anderson Lake): "In a world drowning from its own excess, survival requires loyalty to secrets and shadows."

---

**Clip 2: An

In [26]:
import re

def time_to_seconds(timestr: str) -> int:
    """
    Convert a time string in MM:SS format to an integer number of seconds.
    E.g. "0:10" -> 10, "1:05" -> 65, etc.
    """
    minutes_str, seconds_str = timestr.split(':')
    return int(minutes_str) * 60 + int(seconds_str)

def parse_trailer_script(script_text: str):
    """
    Parse the LLM-generated trailer script text into a structured list of
    dictionaries. Each dict contains:
      clip_number, clip_title, start_time, end_time,
      visual_description, audio_description, length_seconds.
    """

    # Explanation of this regex:
    #
    # 1. \*\*Clip (\d+): (.*?)\*\*\s*
    #    - Matches "**Clip 1: Title**"
    #    - group(1) = "1", group(2) = "Establishing the World"
    #
    # 2. \s*-\s*\*\*Visual \((\d+:\d+)-(\d+:\d+)\)\:\*\*\s*(.*?)
    #    - Matches "- **Visual (0:00-0:10):**"
    #      group(3) = "0:00", group(4) = "0:10"
    #      group(5) = everything up until we hit the next "- **Audio:**" line
    #
    # 3. \s*-\s*\*\*Audio:\*\*\s*(.*?)\s*(?=\*\*Clip|\Z)
    #    - Matches "- **Audio:** " line, capturing everything up to the next "**Clip"
    #      or the end of text (\Z).
    #
    # We use DOTALL so that (.*?) can include newlines.
    #
    pattern = re.compile(
        r"\*\*Clip (\d+): (.*?)\*\*\s*"                # e.g. **Clip 1: Establishing the World**
        r"-\s*\*\*Visual \((\d+:\d+)-(\d+:\d+)\)\:\*\*\s*(.*?)\s*"  
        r"-\s*\*\*Audio:\*\*\s*(.*?)\s*(?=\*\*Clip|\Z)",
        re.DOTALL
    )

    clips = []
    matches = pattern.findall(script_text)
    for match in matches:
        clip_num_str, clip_title, start_str, end_str, visual_desc, audio_desc = match

        start_seconds = time_to_seconds(start_str)
        end_seconds = time_to_seconds(end_str)
        length_seconds = end_seconds - start_seconds

        clip_info = {
            "clip_number": int(clip_num_str),
            "clip_title": clip_title.strip(),
            "start_time": start_str,
            "end_time": end_str,
            "visual_description": visual_desc.strip(),
            "audio_description": audio_desc.strip(),
            "length_seconds": length_seconds,
        }
        clips.append(clip_info)

    return clips

# Parse the trailer description into a list of structured clips
trailer_clips = parse_trailer_script(trailer_description)
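
The regex above expects `- **Visual (0:00-0:10):**` bullets, but the script printed earlier puts the timestamps in the clip heading (`**Clip 1: A Dystopian Bangkok (0:00-0:15)**`) and uses plain `**Visual:**` and `**Audio:**` labels. For text in that layout, `parse_trailer_script` returns an empty list. The sketch below shows one way to handle that second layout. It assumes the layout is exactly as printed, with optional `---` separators; `parse_heading_timed_script` and the second sample clip are hypothetical.

```python
import re

# Timestamps sit in the heading; Visual/Audio are bold labels, not bullets.
HEADING_TIMED_PATTERN = re.compile(
    r"\*\*Clip (\d+): (.*?) \((\d+:\d+)-(\d+:\d+)\)\*\*\s*"
    r"\*\*Visual:\*\*\s*(.*?)\s*"
    r"\*\*Audio:\*\*\s*(.*?)\s*(?=-{3,}|\*\*Clip|\Z)",
    re.DOTALL,
)


def to_seconds(timestr: str) -> int:
    """Convert an 'M:SS' string to seconds."""
    minutes, seconds = timestr.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_heading_timed_script(script_text: str) -> list:
    """Parse clips whose time range appears in the clip heading."""
    clips = []
    for num, title, start, end, visual, audio in HEADING_TIMED_PATTERN.findall(script_text):
        clips.append({
            "clip_number": int(num),
            "clip_title": title.strip(),
            "start_time": start,
            "end_time": end,
            "visual_description": visual.strip(),
            "audio_description": audio.strip(),
            "length_seconds": to_seconds(end) - to_seconds(start),
        })
    return clips


# The first clip follows the printed script; the second is made up for the demo.
sample = """**Clip 1: A Dystopian Bangkok (0:00-0:15)**

**Visual:** The camera pans over a sprawling, futuristic Bangkok.

**Audio:** The sound of churning water from the levees.

---

**Clip 2: Hypothetical Second Clip (0:15-0:25)**

**Visual:** Placeholder visual text.

**Audio:** Placeholder audio text.
"""

clips = parse_heading_timed_script(sample)
for clip in clips:
    print(clip["clip_number"], clip["clip_title"], clip["length_seconds"])
```

Because the model's formatting can change between runs, a caller might try `parse_trailer_script` first and fall back to this variant when the result is empty.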

In [27]:
trailer_clips

[{'clip_number': 1,
  'clip_title': 'Establishing the World',
  'start_time': '0:00',
  'end_time': '0:10',
  'visual_description': 'A sweeping aerial shot over the flooded, labyrinthine streets of 23rd-century Bangkok. Vibrant yet dystopian, the city is encased by towering levees as faint sunlight pierces through a thick haze.',
  'audio_description': 'Background chatter in Thai and the sound of flowing water, underscored by a haunting synth swell.',
  'length_seconds': 10},
 {'clip_number': 2,
  'clip_title': "Anderson Lake's Ambition",
  'start_time': '0:11',
  'end_time': '0:21',
  'visual_description': "Close-up on Anderson Lake's intense eyes as he examines a torn blueprint of a kink-spring. His factory looms in the background, casting long shadows.",
  'audio_description': 'Anderson whispers, "We need the seeds, the real ones... the key to power."',
  'length_seconds': 10},
 {'clip_number': 3,
  'clip_title': "Emiko's Vulnerability",
  'start_time': '0:22',
  'end_time': '0:32',