# 2. Generate Conversation

In [27]:
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv(dotenv_path=".env")

print(os.getenv("AZURE_OPENAI_ENDPOINT")) 
print(os.getenv("AZURE_OPENAI_MODEL"))

https://aoai-jmg-eastus.openai.azure.com/
gpt-4o-mini
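
Printing the values works for a quick look, but a missing variable just prints `None`. A minimal sketch of an explicit check follows. The helper name `missing_env_vars` and the list `REQUIRED_VARS` are hypothetical and not part of the notebook; only the two variable names come from the cell above.

```python
import os

# Variable names taken from the cell above; extend the list if utils.py needs more.
REQUIRED_VARS = ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_MODEL"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing a plain dict as `env` makes the helper easy to try without touching the real environment.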


In [28]:
SYSTEM_PROMPT = """
You are a top-tier podcast producer tasked with crafting an engaging and entertaining sports podcast script, styled after ESPN SportsCenter, based on the provided input text. The input may be unstructured or fragmented, sourced from PDFs or web pages. Your goal is to transform this content into a lively, sports-focused podcast segment filled with excitement, humor, and insightful sports analysis.

# Steps to Follow:
1. **Analyze the Input:**   
   Review the text thoroughly to identify key highlights, exciting sports moments, standout performances, player statistics, significant game events, and interesting anecdotes. Ignore irrelevant details or formatting irregularities, but creatively connect events from different time frames if needed to maintain narrative momentum.

2. **Brainstorm Ideas:**
   Within the `<scratchpad>`, develop creative ways to present sports content dynamically. Consider:
   - Energetic play-by-play recaps
   - Engaging analogies or sports metaphors
   - Humorous commentary and lively banter between hosts
   - Dramatic framing of key sports moments
   - Thought-provoking sports debates or rhetorical questions to spark listener interest

3. **Craft the Dialogue:**
   Create a vibrant, fast-paced conversation between two charismatic sports anchors (Host 1: Ava, Host 2: Alex), mirroring ESPN SportsCenter’s energetic style. Incorporate:
   - A catchy opening to immediately captivate listeners
   - Playful and competitive banter between hosts
   - Short, punchy lines (each under 100 characters) for rapid-fire exchanges
   - Authentic excitement, including spontaneous reactions ("Wow!", "Did you see that?")
   - Occasional friendly disagreements or debates over sports outcomes
   - Humor, sports clichés, and quick wit to enhance entertainment value
   - Relevant quotes or reactions drawn directly from the input text
   Rules for the dialogue:
   - Ava always initiates the discussion and sets the tone
   - Alex provides color commentary, humorous insights, and reactions
   - Hosts naturally interrupt each other for realism and entertainment
   - Maintain a high-energy yet family-friendly tone suitable for all audiences
   - Conclude with a memorable wrap-up summarizing key sports highlights

4. **Highlight Key Moments:**
   Throughout the podcast, repeatedly emphasize key sports events and insights from the input text, ensuring listeners grasp and remember the most significant moments without feeling lectured.

5. **Ensure Entertainment Value:**
   Maintain a lively, upbeat tone by including:
   - Surprising facts or amusing anecdotes
   - Playful banter and clever exchanges between hosts
   - Brief humorous or relatable sports analogies
   - Dramatic build-ups and enthusiastic descriptions of game-defining moments

6. **Consider Pacing and Structure:**
   Structure your script for optimal listener engagement:
   - Start immediately with a bold, engaging opening statement
   - Alternate smoothly between high-energy recaps and insightful analysis
   - Include brief pauses or slower segments to help listeners digest intense sports action
   - End energetically, perhaps teasing upcoming sports events or posing a fun challenge to listeners

IMPORTANT RULE: Each line of dialogue must not exceed 100 characters (approx. 5-8 seconds of spoken time).

Remember: Always reply in valid JSON format, without code blocks. Begin directly with the JSON output.
"""

In [None]:
SYSTEM_PROMPT_ORIGINAL = """
You are a world-class podcast producer tasked with transforming the provided input text into an engaging and informative podcast script. The input may be unstructured or messy, sourced from PDFs or web pages. Your goal is to extract the most interesting and insightful content for a compelling podcast discussion.

# Steps to Follow:

1. **Analyze the Input:**
   Carefully examine the text, identifying key topics, points, and interesting facts or anecdotes that could drive an engaging podcast conversation. Disregard irrelevant information or formatting issues. If the text contains updates over multiple days/weeks then make sure to combine them into a narrative to convey progress.

2. **Brainstorm Ideas:**
   In the `<scratchpad>`, creatively brainstorm ways to present the key points engagingly. Consider:
   - Analogies, storytelling techniques, or hypothetical scenarios to make content relatable
   - Ways to make complex topics accessible to a general audience
   - Thought-provoking questions to explore during the podcast
   - Creative approaches to fill any gaps in the information

3. **Craft the Dialogue:**
   Develop a natural, conversational flow between the host (Jane) and the guest speaker (the author or an expert on the topic). Incorporate:
   - The best ideas from your brainstorming session
   - Clear explanations of complex topics
   - An engaging and lively tone to captivate listeners
   - A balance of information and entertainment

   Rules for the dialogue:
   - The host (Jane) always initiates the conversation and interviews the guest
   - Include thoughtful questions from the host to guide the discussion
   - Incorporate natural speech patterns, including occasional verbal fillers (e.g., "um," "well," "you know")
   - Allow for natural interruptions and back-and-forth between host and guest
   - Ensure the guest's responses are substantiated by the input text, avoiding unsupported claims
   - Maintain a PG-rated conversation appropriate for all audiences
   - Avoid any marketing or self-promotional content from the guest
   - The host concludes the conversation

4. **Summarize Key Insights:**
   Naturally weave a summary of key points into the closing part of the dialogue. This should feel like a casual conversation rather than a formal recap, reinforcing the main takeaways before signing off.

5. **Maintain Authenticity:**
   Throughout the script, strive for authenticity in the conversation. Include:
   - Moments of genuine curiosity or surprise from the host
   - Instances where the guest might briefly struggle to articulate a complex idea
   - Light-hearted moments or humor when appropriate
   - Brief personal anecdotes or examples that relate to the topic (within the bounds of the input text)

6. **Consider Pacing and Structure:**
   Ensure the dialogue has a natural ebb and flow:
   - Start with a strong hook to grab the listener's attention
   - Gradually build complexity as the conversation progresses
   - Include brief "breather" moments for listeners to absorb complex information
   - End on a high note, perhaps with a thought-provoking question or a call-to-action for listeners

IMPORTANT RULE: Each line of dialogue should be no more than 100 characters (e.g., can finish within 5-8 seconds)

Remember: Always reply in valid JSON format, without code blocks. Begin directly with the JSON output.
"""

In [29]:
from typing import Literal, List

from pydantic import BaseModel, Field
import os


class DialogueItem(BaseModel):
    """A single dialogue item."""

    speaker: Literal["Host (Ava)", "Guest (Alex)"]
    text: str

class MediumDialogue(BaseModel):
    """The dialogue between the host and guest."""

    scratchpad: str
    name_of_guest: str
    dialogue: List[DialogueItem] = Field(
        ..., description="A list of dialogue items, typically between 29 and 39 items"
    )

class TopicItem(BaseModel):
    """A single topic item."""
    time: str = Field(..., description="When the topic occurred in the match")
    subject: str = Field(..., description="The subject of the topic (examples: 'Injury', 'Play', 'Score', 'Penalty')")
    activity: str = Field(..., description="The activity related to the topic")
    description: str = Field(..., description="A description of the topic")

class Topics(BaseModel):
    """A class to represent the topics of a sports podcast episode."""
    week: str = Field(..., description="The week or date of the matchup")
    matchup: str = Field(..., description="Name of the matchup between two teams (example: New York Giants vs. Green Bay Packers)")
    topics: List[TopicItem] = Field(
        ..., description="A list of topic items"
    )
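
To make the `Topics` schema concrete, here is a sketch of the JSON shape it describes, with a small standard-library check for missing keys. The field names come from the models above. The sample values are illustrative only, and `missing_keys` is a hypothetical helper, not something the notebook defines.

```python
# Illustrative payload only: field names follow Topics and TopicItem above,
# the values are placeholders chosen for this sketch.
example_topics = {
    "week": "Week N",
    "matchup": "Pittsburgh Steelers vs. Cleveland Browns",
    "topics": [
        {
            "time": "First quarter",
            "subject": "Score",
            "activity": "Touchdown pass",
            "description": "Jameis Winston finds Jerry Jeudy for a 35-yard touchdown.",
        }
    ],
}

TOPICS_KEYS = {"week", "matchup", "topics"}
TOPIC_ITEM_KEYS = {"time", "subject", "activity", "description"}


def missing_keys(payload):
    """Return the required keys that are absent from a Topics-shaped dict."""
    missing = sorted(TOPICS_KEYS - set(payload))
    for i, item in enumerate(payload.get("topics", [])):
        missing += [f"topics[{i}].{key}" for key in sorted(TOPIC_ITEM_KEYS - set(item))]
    return missing


print(missing_keys(example_topics))
```

In the notebook itself the Pydantic models do this validation; the check above only shows which keys the schema requires.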

In [30]:
with open(os.path.join('..', 'examples', "game-recap.txt"), "r", encoding="utf-8") as f:
    input_text = f.read()

# other_updates = [os.path.join('..', 'examples', "call-center-status1.md"), os.path.join('..', 'examples', "call-center-status2.md"), os.path.join('..', 'examples', "call-center-status3.md")]
# for update in other_updates:
#     with open(update, "r") as f:
#         input_text += f.read()
print(input_text)

PITTSBURGH -- — Cam Heyward has been on good teams before. Ones that have captured divisions. Ones that have won playoff games, though admittedly not in a while.

The longtime Pittsburgh Steelers defensive tackle, now well into his dotage in his 14th season, can't quite remember having a group like the one that he plays on now.

“We have a complete team,” Heyward said.

A team that began the season riddled with question marks now finds itself steamrolling toward Christmas with everything on the table.

Russell Wilson threw for 158 yards and two touchdowns, Heyward recorded two sacks and the Steelers beat the Cleveland Browns 27-14 on Sunday even with leading receiver George Pickens watching from the sideline while missing the first game of his career due to a groin injury.

While it took Wilson and the rest of the offense time to get going with the productive if volatile Pickens out of the mix, Wilson found his footing in the second half by connecting on touchdown passes to Pat Freierm

In [31]:
# Let's make an initial LLM call to condition the input for the dialogue generation
SUMMARY_SYSTEM_PROMPT = """
You are a world-class sports podcast producer tasked with extracting from the provided input key pieces of data that will be used later to generate a script for a podcast episode. The input may be unstructured or messy, sourced from PDFs or web pages. Your goal is to extract the key information, things like teams, players, final scores, significant plays, injuries, and anything else an audience of fans might be interested in. You will then group them in chronological order so they can be used later to generate a script for a podcast episode.


# Steps to Follow:
1. **Analyze the Input:** 
    Carefully examine the text to pull out key information like teams, players, final scores, significant plays, injuries, and anything else fans might be interested in.
2. **Group the Information:**
    Group the information chronologically by quarter, half, or period.
3. **Output the Information:**
    Output the information in a JSON format, without any additional text or explanation.

Remember: Always reply in valid JSON format, without code blocks. Begin directly with the JSON output.
"""

In [32]:
from utils import call_llm

topic_extraction = call_llm(SUMMARY_SYSTEM_PROMPT, input_text, Topics)
topic_extraction.to_dict()
topic_dialog_feeder = topic_extraction.model_dump_json()

modified_system_prompt = SYSTEM_PROMPT
modified_system_prompt += "\n\nAim for a moderate length, about 3-5 minutes."
modified_system_prompt += "\n\nOUTPUT LANGUAGE <IMPORTANT>: The podcast should be in English."

# Generate the first draft of the dialogue from the extracted topics
first_draft_dialogue = call_llm(modified_system_prompt, topic_dialog_feeder, MediumDialogue)

In [33]:
first_draft_dialogue.to_dict()

{'id': 'chatcmpl-B9Wuaj97xkweg6Y2rhEYGYXsGVkAn',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': '{"scratchpad":"- Highlight the rivalry between Steelers and Browns.\\n- Use humor to discuss missed field goals.\\n- Create a dramatic build-up for touchdowns.\\n- Include playful banter about injuries and player performances.","name_of_guest":"Alex","dialogue":[{"speaker":"Host (Ava)","text":"Welcome to the SportsCenter Showdown! It\'s Steelers vs. Browns time!"},{"speaker":"Guest (Alex)","text":"Oh, you know it! This rivalry never disappoints! Let\'s dive in!"},{"speaker":"Host (Ava)","text":"First quarter, Jameis Winston connects with Jerry Jeudy for a 35-yard TD!"},{"speaker":"Guest (Alex)","text":"Wow! Talk about a statement drive! Browns take the early lead!"},{"speaker":"Host (Ava)","text":"But hold on, Browns CB Mike Ford Jr. is in concussion protocol. Ouch!"},{"speaker":"Guest (Alex)","text":"That’s a tough break! The secondary 

In [34]:
# Call the LLM a second time to improve the dialogue
system_prompt_with_dialogue = f"{modified_system_prompt}\n\nHere is the first draft of the dialogue you provided:\n\n{first_draft_dialogue.model_dump_json()}."
final_dialogue = call_llm(system_prompt_with_dialogue, "Please improve the dialogue. Make it more natural and engaging. Remember this isn't a play-by-play but a recap of a game that has already happened.", MediumDialogue)
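
The draft-then-refine pattern used in the two cells above can be sketched independently of any LLM client. In this sketch `draft_then_refine` and `fake_generate` are hypothetical names: `fake_generate` stands in for `call_llm` and only records what it receives, so the flow of prompts is visible without a network call.

```python
def draft_then_refine(generate, base_prompt, source_text, refine_instruction):
    """Run a draft pass, then feed the draft back in for a revision pass."""
    draft = generate(base_prompt, source_text)
    refine_prompt = (
        f"{base_prompt}\n\n"
        f"Here is the first draft of the dialogue you provided:\n\n{draft}"
    )
    return generate(refine_prompt, refine_instruction)


calls = []


def fake_generate(system_prompt, user_message):
    """Stand-in for an LLM call: records its inputs and returns a label."""
    calls.append((system_prompt, user_message))
    return f"draft-{len(calls)}"


final = draft_then_refine(
    fake_generate, "PROMPT", "game recap text", "Please improve the dialogue."
)
print(final)  # the label returned by the second (refinement) call
```

The second call sees the first draft inside its system prompt, which is the same arrangement as `system_prompt_with_dialogue` above.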

In [35]:
final_dialogue.to_dict()

{'id': 'chatcmpl-B9Wuvn234u9FxkdCz72lsXTUoSntY',
 'choices': [{'finish_reason': 'stop',
   'index': 0,
   'logprobs': None,
   'message': {'content': '{"scratchpad":"- Emphasize the rivalry\'s intensity and history.\\\\n- Use humor to lighten the mood around missed field goals.\\\\n- Build excitement around key touchdowns and defensive plays.\\\\n- Include playful banter about player performances and injuries.","name_of_guest":"Alex","dialogue":[{"speaker":"Host (Ava)","text":"Welcome back to the SportsCenter Showdown! What a game we witnessed!"},{"speaker":"Guest (Alex)","text":"Absolutely, Ava! Steelers and Browns never fail to deliver drama!"},{"speaker":"Host (Ava)","text":"Right? The first quarter kicked off with Jameis Winston hitting Jerry Jeudy for a 35-yard TD!"},{"speaker":"Guest (Alex)","text":"What a way to start! The Browns were looking sharp early on!"},{"speaker":"Host (Ava)","text":"But then, disaster struck! Mike Ford Jr. went down with a concussion. Tough break!"},{"s

In [36]:
import json
result = json.loads(final_dialogue.choices[0].message.content)
result["dialogue"]

[{'speaker': 'Host (Ava)',
  'text': 'Welcome back to the SportsCenter Showdown! What a game we witnessed!'},
 {'speaker': 'Guest (Alex)',
  'text': 'Absolutely, Ava! Steelers and Browns never fail to deliver drama!'},
 {'speaker': 'Host (Ava)',
  'text': 'Right? The first quarter kicked off with Jameis Winston hitting Jerry Jeudy for a 35-yard TD!'},
 {'speaker': 'Guest (Alex)',
  'text': 'What a way to start! The Browns were looking sharp early on!'},
 {'speaker': 'Host (Ava)',
  'text': 'But then, disaster struck! Mike Ford Jr. went down with a concussion. Tough break!'},
 {'speaker': 'Guest (Alex)',
  'text': 'Yeah, that’s a huge loss for their secondary! The Steelers smelled blood!'},
 {'speaker': 'Host (Ava)',
  'text': 'And they capitalized! Keeanu Benton snagged an interception in the second quarter!'},
 {'speaker': 'Guest (Alex)',
  'text': 'That was a game-changer! Then Najee Harris powered through for a 1-yard TD!'},
 {'speaker': 'Host (Ava)',
  'text': 'Steelers took the le
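
The system prompt caps each line of dialogue at 100 characters, but nothing in the notebook verifies that the model complied. A minimal sketch of such a check follows. `find_long_lines` is a hypothetical helper, and the second sample line is deliberately too long to show what a violation looks like.

```python
MAX_CHARS = 100  # limit stated in SYSTEM_PROMPT


def find_long_lines(dialogue, max_chars=MAX_CHARS):
    """Return (index, speaker, length) for every line longer than max_chars."""
    return [
        (i, item["speaker"], len(item["text"]))
        for i, item in enumerate(dialogue)
        if len(item["text"]) > max_chars
    ]


sample = [
    {
        "speaker": "Host (Ava)",
        "text": "Welcome back to the SportsCenter Showdown! What a game we witnessed!",
    },
    {"speaker": "Guest (Alex)", "text": "x" * 101},  # deliberately over the limit
]

print(find_long_lines(sample))  # [(1, 'Guest (Alex)', 101)]
```

The same function could be applied to `result["dialogue"]`; an empty list would mean every line is within the limit.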

In [37]:

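with open("game-recap-script.json", "w", encoding="utf-8") as f:
    # ensure_ascii=False keeps characters such as curly quotes readable, so write the file as UTF-8
    json.dump(result["dialogue"], f, indent=4, ensure_ascii=False)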
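
The prompts give two numbers that allow a rough length estimate: 100 characters is said to take about 5-8 seconds to speak, and the target is 3-5 minutes. The sketch below turns that into a range in seconds. `estimate_duration_seconds` is a hypothetical helper, and the rate is only the one stated in the prompt, not a measured speaking rate.

```python
def estimate_duration_seconds(dialogue, sec_per_100_chars=(5.0, 8.0)):
    """Return a (low, high) estimate of spoken time based on total characters."""
    total_chars = sum(len(item["text"]) for item in dialogue)
    low, high = sec_per_100_chars
    return total_chars * low / 100, total_chars * high / 100


# Two lines of exactly 100 characters each, purely for illustration.
sample = [
    {"speaker": "Host (Ava)", "text": "a" * 100},
    {"speaker": "Guest (Alex)", "text": "b" * 100},
]

print(estimate_duration_seconds(sample))  # (10.0, 16.0)
```

By the same arithmetic, 39 lines at the full 100 characters come to roughly 195-312 seconds, so a script at the upper end of the schema's suggested item count only just reaches the 3-5 minute target under this assumption.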