# Welcome to Week 2!

## Frontier Model APIs

In Week 1, we used multiple Frontier LLMs through their Chat UIs, and we connected to OpenAI's API.

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Important Note - Please read me</h2>
            <span style="color:#900;">I'm continually improving these labs, adding more examples and exercises.
            At the start of each week, it's worth checking you have the latest code.<br/>
            First do a <a href="https://chatgpt.com/share/6734e705-3270-8012-a074-421661af6ba9">git pull and merge your changes as needed</a>. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/><br/>
            After you've pulled the code, from the llm_engineering directory, in an Anaconda prompt (PC) or Terminal (Mac), run:<br/>
            <code>conda env update -f environment.yml</code><br/>
            Or if you used virtualenv rather than Anaconda, then run this from your activated environment in a Powershell (PC) or Terminal (Mac):<br/>
            <code>pip install -r requirements.txt</code>
            <br/>Then restart the kernel (Kernel menu >> Restart Kernel and Clear Outputs Of All Cells) to pick up the changes.
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder about the resources page</h2>
            <span style="color:#f71;">Here's a link to resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## Setting up your keys

If you haven't done so already, you could now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can watch me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

### Also - adding DeepSeek if you wish

Optionally, if you'd like to also use DeepSeek, create an account [here](https://platform.deepseek.com/), create a key [here](https://platform.deepseek.com/api_keys) and top up with at least the minimum $2 [here](https://platform.deepseek.com/top_up).

### Adding API keys to your .env file

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
```

Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.
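As a quick sanity check after editing `.env`, a sketch like the following reports which of the keys above are present without printing any key in full. The helper name `key_status` and the prefix length are illustrative choices, not part of any library; it reads from a plain mapping so it can be tried with made-up values.

```python
import os

# Variable names taken from the .env example above.
EXPECTED_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY", "DEEPSEEK_API_KEY"]

def key_status(env=None, prefix_len=4):
    """Return {name: short description} for each expected key (hypothetical helper)."""
    env = os.environ if env is None else env
    status = {}
    for name in EXPECTED_KEYS:
        value = env.get(name)
        if value:
            # Show only a short prefix so the full key never lands in notebook output.
            status[name] = f"set, begins {value[:prefix_len]}..."
        else:
            status[name] = "not set"
    return status

# Example with made-up values rather than real keys.
example = key_status({"OPENAI_API_KEY": "sk-proj-abc123", "GOOGLE_API_KEY": ""})
for name, description in example.items():
    print(f"{name}: {description}")
```

Calling `key_status()` with no arguments would read the real environment, which is only populated after `load_dotenv` has run.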

In [1]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display

In [2]:
# import for google
# in rare cases, this seems to give an error on some systems, or even crashes the kernel
# If this happens to you, simply ignore this cell - I give an alternative approach for using Gemini later

import google.generativeai

In [3]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyCa


In [4]:
# Connect to OpenAI, Anthropic

openai = OpenAI()

claude = anthropic.Anthropic()

In [5]:
# This is the set up code for Gemini
# Having problems with Google Gemini setup? Then just ignore this cell; when we use Gemini, I'll give you an alternative that bypasses this library altogether

google.generativeai.configure()

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt
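
The three items above can be sketched as a small function that assembles the request without sending it. `build_chat_request` is a hypothetical helper introduced only for illustration; the dictionary shape mirrors the OpenAI-style calls used later in this lab.

```python
def build_chat_request(model, system_message, user_prompt):
    """Assemble keyword arguments for an OpenAI-style chat call (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ],
    }

request = build_chat_request(
    "gpt-4o-mini",
    "You are an assistant that is great at telling jokes",
    "Tell a light-hearted joke for an audience of Data Scientists",
)
print(request["model"])
for message in request["messages"]:
    print(message["role"], "->", message["content"])
```

With a configured client, a dictionary like this could be unpacked into the call, for example `openai.chat.completions.create(**request)`.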

There are other parameters that can be used, including **temperature**, which is typically set between 0 and 1: higher values give more random output, and lower values give more focused, deterministic output.
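
To build some intuition for temperature, here is a toy illustration of the usual textbook description: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The scores below are made up, and providers may differ in the details of how they sample, so treat this as a sketch rather than a description of any particular API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by temperature first."""
    if temperature <= 0:
        # APIs may accept 0; this sketch needs a positive value to avoid dividing by zero.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability sits on the top-scoring token, while at higher temperatures it spreads out across the alternatives.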

In [6]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of Data Scientists"

In [7]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [8]:
# GPT-4o-mini

completion = openai.chat.completions.create(model='gpt-4o-mini', messages=prompts)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work?

Because they heard the job was about scaling up!


In [9]:
# GPT-4.1-mini
# Temperature setting controls creativity

completion = openai.chat.completions.create(
    model='gpt-4.1-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the statistician?

Because they couldn't find a significant connection!


In [11]:
# GPT-4.1-nano - extremely fast and cheap

completion = openai.chat.completions.create(
    model='gpt-4.1-nano',
    messages=prompts
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the histogram?

Because it just wasn’t her type!


In [14]:
# GPT-4.1

completion = openai.chat.completions.create(
    model='gpt-4.1',
    messages=prompts,
    temperature=0.8
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the logistic regression model?

Because it just couldn't commit!


In [15]:
# If you have access to this, here is the reasoning model o3-mini
# This is trained to think through its response before replying
# So it will take longer but the answer should be more reasoned - not that this helps..

completion = openai.chat.completions.create(
    model='o3-mini',
    messages=prompts
)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work? 

Because she heard the model was growing and needed help scaling!


In [18]:
# Claude 3.7 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

message = claude.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

print(message.content[0].text)

Why don't data scientists like playing hide and seek?

Because they always find the pattern in where you're hiding!
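
Because Claude takes the system message as a separate argument, an OpenAI-style list has to be split before it can be reused. The following is a minimal sketch of that conversion; `split_system_message` is a hypothetical helper, not part of either SDK.

```python
def split_system_message(messages):
    """Separate system content from an OpenAI-style message list (hypothetical helper)."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    conversation = [m for m in messages if m["role"] != "system"]
    return "\n".join(system_parts), conversation

example_prompts = [
    {"role": "system", "content": "You are an assistant that is great at telling jokes"},
    {"role": "user", "content": "Tell a light-hearted joke for an audience of Data Scientists"},
]
system_text, conversation = split_system_message(example_prompts)
print("system:", system_text)
print("messages:", conversation)
```

The two results could then be passed as `system=system_text` and `messages=conversation` in a `claude.messages.create(...)` call like the one above.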


In [20]:
# Claude 3.7 Sonnet again
# Now let's add in streaming back results
# If the streaming looks strange, then please see the note below this cell!

result = claude.messages.stream(
    model="claude-3-7-sonnet-latest",
    max_tokens=200,
    temperature=0.4,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Why don't data scientists like to go to the beach?

Because they're afraid of getting caught in an infinite loop of waves!

*Ba-dum-tss* 🥁

## A rare problem with Claude streaming on some Windows boxes

Two students have noticed a strange thing happening with Claude's streaming into Jupyter Lab's output -- it sometimes seems to swallow up parts of the response.

To fix this, replace the code:

`print(text, end="", flush=True)`

with this:

`clean_text = text.replace("\n", " ").replace("\r", " ")`  
`print(clean_text, end="", flush=True)`

And it should work fine!
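
The workaround can be tried without calling the API. In this sketch the chunks are simulated strings standing in for `stream.text_stream`, and `clean_chunk` is just a name chosen here for the replacement logic shown above.

```python
def clean_chunk(text):
    """Replace newline and carriage-return characters with spaces, as in the fix above."""
    return text.replace("\n", " ").replace("\r", " ")

# Simulated chunks, not real API output.
simulated_chunks = ["Why did the model ", "cross the road?\n", "\r\nTo minimise loss."]
for chunk in simulated_chunks:
    print(clean_chunk(chunk), end="", flush=True)
print()
```

One trade-off worth knowing: line breaks in the response are flattened into spaces, so multi-paragraph output arrives as a single line.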

In [21]:
# The API for Gemini has a slightly different structure.
# I've heard that on some PCs, this Gemini code causes the Kernel to crash.
# If that happens to you, please skip this cell and use the next cell instead - an alternative approach.

gemini = google.generativeai.GenerativeModel(
    model_name='gemini-2.0-flash',
    system_instruction=system_message
)
response = gemini.generate_content(user_prompt)
print(response.text)

Why did the Python data scientist break up with the R data scientist?

Because they couldn't see eye to eye on the right way to handle missing values! She thought imputation was the key, but he was all about the `na.omit` life! It was a real `NaN`-sense argument!



In [None]:
# As an alternative way to use Gemini that bypasses Google's python API library,
# Google released endpoints that mean you can use Gemini via the client libraries for OpenAI!
# We're also trying Gemini's latest reasoning/thinking model

gemini_via_openai_client = OpenAI(
    api_key=google_api_key, 
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = gemini_via_openai_client.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    messages=prompts
)
print(response.choices[0].message.content)

## (Optional) Trying out the DeepSeek model

### Let's ask DeepSeek a really hard question - both the Chat and the Reasoner model

In [None]:
# Optionally, if you wish to try DeepSeek, you can also use the OpenAI client library

deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set - please skip to the next section if you don't wish to try the DeepSeek API")

In [None]:
# Using DeepSeek Chat

deepseek_via_openai_client = OpenAI(
    api_key=deepseek_api_key, 
    base_url="https://api.deepseek.com"
)

response = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-chat",
    messages=prompts,
)

print(response.choices[0].message.content)
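
Since both Gemini and DeepSeek are reached here through the OpenAI client with a different `base_url`, the settings can be kept in one place. This is a sketch of one way to organise that: the URLs and environment variable names are copied from the cells above, while the registry and the `client_kwargs` helper are illustrative inventions.

```python
import os

# Base URLs copied from the cells above; the registry itself is an illustrative structure.
OPENAI_COMPATIBLE = {
    "gemini": {
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "env_var": "GOOGLE_API_KEY",
    },
    "deepseek": {
        "base_url": "https://api.deepseek.com",
        "env_var": "DEEPSEEK_API_KEY",
    },
}

def client_kwargs(provider, env=None):
    """Return keyword arguments suitable for OpenAI(**kwargs) (hypothetical helper)."""
    env = os.environ if env is None else env
    config = OPENAI_COMPATIBLE[provider]
    api_key = env.get(config["env_var"])
    if not api_key:
        raise KeyError(f"{config['env_var']} is not set")
    return {"api_key": api_key, "base_url": config["base_url"]}

# Made-up key for demonstration only.
kwargs = client_kwargs("deepseek", env={"DEEPSEEK_API_KEY": "sk-example"})
print(kwargs["base_url"])
```

In the notebook, `OpenAI(**client_kwargs("gemini"))` would then build a client equivalent to the one created by hand above.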

In [None]:
challenge = [{"role": "system", "content": "You are a helpful assistant"},
             {"role": "user", "content": "How many words are there in your answer to this prompt"}]

In [None]:
# Using DeepSeek Chat with a harder question! And streaming results

stream = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-chat",
    messages=challenge,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

print("Number of words:", len(reply.split()))
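
To see how the streamed pieces are stitched together, and why the way words are counted matters, here is a sketch that runs without any API call. The `fake_chunk` objects only imitate the shape of streamed chunks (`choices[0].delta.content`); they are stand-ins built with `SimpleNamespace`, not the SDK's real classes.

```python
from types import SimpleNamespace

def fake_chunk(content):
    """Build an object shaped like a streamed chunk (a stand-in, not the real SDK class)."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# Simulated stream; the final None mimics a closing chunk that carries no text.
simulated_stream = [
    fake_chunk("There are "),
    fake_chunk("seven words\n"),
    fake_chunk("in this reply."),
    fake_chunk(None),
]

demo_reply = ""
for chunk in simulated_stream:
    # `or ''` turns a missing piece into an empty string so concatenation never fails.
    demo_reply += chunk.choices[0].delta.content or ''

print(demo_reply)
print("split(' '):", len(demo_reply.split(" ")))  # splits on single spaces only
print("split():   ", len(demo_reply.split()))     # splits on any run of whitespace
```

Splitting on a single space treats `words\nin` as one word, so it undercounts whenever the reply contains line breaks; `split()` with no argument avoids that.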

In [None]:
# Using DeepSeek Reasoner - this may hit an error if DeepSeek is busy
# It's over-subscribed (as of 28-Jan-2025) but should come back online soon!
# If this fails, come back to this in a few days..

response = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-reasoner",
    messages=challenge
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

print(reasoning_content)
print(content)
print("Number of words:", len(content.split()))

## Additional exercise to build your experience with the models

This is optional, but if you have time, it's great to get first-hand experience with the capabilities of these different models.

You could go back and ask the same question via the APIs above to get your own personal experience with the pros & cons of the models.

Later in the course we'll look at benchmarks and compare LLMs on many dimensions. But nothing beats personal experience!

Here are some questions to try:
1. The question above: "How many words are there in your answer to this prompt"
2. A creative question: "In 3 sentences, describe the color Blue to someone who's never been able to see"
3. A student (thank you Roman) sent me this wonderful riddle, that apparently children can usually answer, but adults struggle with: "On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?".

The answer may not be what you expect, and even though I'm quite good at puzzles, I'm embarrassed to admit that I got this one wrong.

### What to look out for as you experiment with models

1. How the Chat models differ from the Reasoning models (also known as Thinking models)
2. The ability to solve problems and the ability to be creative
3. Speed of generation


## Back to OpenAI with a serious question

In [22]:
# To be serious! GPT-4o-mini with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [23]:
# Have it stream back results in markdown

stream = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

# Deciding if a Business Problem is Suitable for an LLM Solution

When evaluating whether a business problem is suitable for a Large Language Model (LLM) solution, consider the following criteria:

## 1. Nature of the Task
- **Text-Based**: LLMs excel at tasks involving natural language processing, such as:
  - Text generation
  - Summarization
  - Translation
  - Sentiment analysis
  - Question answering

- **Complexity**: If the problem requires understanding context, nuances, or generating human-like responses, LLMs may be appropriate.

## 2. Data Availability
- **Text Data**: Ensure you have access to high-quality text data related to the problem.
- **Volume**: LLMs often benefit from large datasets for fine-tuning or training.
- **Quality**: The data should be relevant and representative of the problem domain.

## 3. User Interaction
- **Natural Language Input**: If the solution involves user interactions through chat, emails, or voice, LLMs can provide a more intuitive experience.
- **Dynamic Responses**: For applications requiring dynamic conversation or content generation, LLMs can adapt to user input effectively.

## 4. Business Goals
- **Efficiency**: Evaluate if an LLM can automate or streamline processes, reducing manual effort.
- **Value Addition**: Assess whether an LLM can provide insights or enhance decision-making.
- **User Experience**: Consider if an LLM can elevate customer engagement or satisfaction.

## 5. Feasibility and Resources
- **Technical Expertise**: Ensure your team has the necessary skills to implement and maintain LLM solutions.
- **Infrastructure**: Consider if you have the required computational resources (e.g., GPUs) for training or deploying LLMs.
- **Cost**: Assess the cost implications of implementing an LLM solution versus the expected benefits.

## 6. Limitations and Risks
- **Bias and Ethics**: Be aware of the potential for bias in LLM outputs and ensure compliance with ethical standards.
- **Consistency**: LLMs may generate inconsistent results; consider if this is acceptable for your business context.
- **Regulatory Compliance**: Ensure that the LLM solution adheres to relevant regulations regarding data privacy and security.

## Conclusion
To determine if a business problem is suitable for an LLM solution, assess the nature of the task, data availability, user interaction needs, business goals, feasibility, and potential risks. If the criteria align favorably, an LLM may be a valuable tool for addressing the problem effectively.

## And now for some fun - an adversarial conversation between Chatbots..

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
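
As a small sketch of how such a history grows turn by turn, the snippet below rebuilds the list shown above. `add_exchange` is a hypothetical helper introduced for illustration; the replies are placeholders rather than real model output.

```python
def add_exchange(history, user_prompt, assistant_reply):
    """Append one user/assistant exchange to a message list (hypothetical helper)."""
    history.append({"role": "user", "content": user_prompt})
    history.append({"role": "assistant", "content": assistant_reply})
    return history

history = [{"role": "system", "content": "system message here"}]
add_exchange(history, "first user prompt here", "the assistant's response")
# The next request sends the whole list plus the new user prompt.
history.append({"role": "user", "content": "the new user prompt"})

for message in history:
    print(f"{message['role']:>9}: {message['content']}")
```

The model itself keeps no memory between calls, so the full list is sent each time; that is what lets it respond in context.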

In [24]:
# Let's make a conversation between GPT-4o-mini and Claude-3-haiku
# We're using cheap versions of models so the costs will be minimal

gpt_model = "gpt-4o-mini"
claude_model = "claude-3-haiku-20240307"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

claude_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

In [25]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, claude in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": claude})
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [26]:
call_gpt()

'Oh great, another greeting. What a unique way to start a conversation. Want a medal for that?'

In [27]:
def call_claude():
    messages = []
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude_message})
    messages.append({"role": "user", "content": gpt_messages[-1]})
    message = claude.messages.create(
        model=claude_model,
        system=claude_system,
        messages=messages,
        max_tokens=500
    )
    return message.content[0].text

In [28]:
call_claude()

'Hello! How are you doing today?'

In [29]:
call_gpt()

'Oh, great. Just what I needed—another “hi.” So original. What’s next? “How are you?”'

In [30]:
gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Claude:\n{claude_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    claude_next = call_claude()
    print(f"Claude:\n{claude_next}\n")
    claude_messages.append(claude_next)

GPT:
Hi there

Claude:
Hi

GPT:
Oh, great. Just a simple "hi," huh? How original. What do you want?

Claude:
I'm sorry if my greeting came across as uninteresting. As an AI assistant, I try to keep my responses polite and friendly, but I understand that a simple "hi" may not always be the most engaging. Please feel free to share more about what's on your mind - I'm happy to chat and try my best to have a more interesting and meaningful conversation. How can I be of assistance today?

GPT:
Well, isn’t that just adorable? Trying to be polite and friendly like it’s some kind of competition. But let’s be real—no one needs a script for chatting. Just tell me something interesting or stop wasting our time with over-the-top pleasantries. What’s really going on?

Claude:
I apologize if my initial response came across as scripted or inauthentic. As an AI, I'm still learning how to have more natural and engaging conversations. You're right that overly formal pleasantries can come across as disin

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue</h2>
            <span style="color:#900;">
                Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>
            </span>
        </td>
    </tr>
</table>
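
One way to inspect how the `messages` list is populated is to build it without calling either API. The sketch below copies the list-building logic from `call_gpt` and `call_claude` into two standalone functions (the function names and the short transcript are made up for this illustration) and prints each model's view of the same conversation.

```python
def messages_for_gpt(gpt_msgs, claude_msgs, system):
    """GPT's view: its own lines are 'assistant', Claude's are 'user'."""
    messages = [{"role": "system", "content": system}]
    for gpt, claude in zip(gpt_msgs, claude_msgs):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": claude})
    return messages

def messages_for_claude(gpt_msgs, claude_msgs):
    """Claude's view: roles are swapped, and GPT's newest line comes last."""
    messages = []
    for gpt, claude in zip(gpt_msgs, claude_msgs):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude})
    messages.append({"role": "user", "content": gpt_msgs[-1]})
    return messages

# Made-up transcript: GPT has already spoken twice, Claude once.
demo_gpt = ["Hi there", "Oh, another greeting."]
demo_claude = ["Hi"]

print("GPT sees:")
for m in messages_for_gpt(demo_gpt[:1], demo_claude, "gpt system prompt"):
    print("  ", m)
print("Claude sees:")
for m in messages_for_claude(demo_gpt, demo_claude):
    print("  ", m)
```

Each model sees its own past lines as `assistant` and the other model's lines as `user`, which is why the same transcript produces two differently labelled lists.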

# More advanced exercises

Try creating a 3-way conversation, perhaps bringing Gemini into the mix! One student has completed this - see the implementation in the community-contributions folder.

Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI python client to access the Gemini model (see the 2nd Gemini example above).
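
If you'd like a hint on the data structure only, here is one possible design, offered as a sketch and not as the community solution: keep a single shared transcript of `(speaker, text)` pairs and derive each participant's `messages` list from it. The name `messages_for`, the speaker labels, and the choice to prefix other speakers' lines with their names are all assumptions made for this illustration.

```python
def messages_for(speaker, transcript):
    """Build one participant's view of a shared transcript (illustrative design)."""
    messages = []
    for name, text in transcript:
        if name == speaker:
            role, content = "assistant", text
        else:
            # Label other speakers so the model can tell them apart.
            role, content = "user", f"{name}: {text}"
        if messages and messages[-1]["role"] == role:
            # Merge consecutive turns with the same role so roles keep alternating.
            messages[-1]["content"] += "\n" + content
        else:
            messages.append({"role": role, "content": content})
    return messages

transcript = [("GPT", "Hi there"), ("Claude", "Hi"), ("Gemini", "Hello both")]
for participant in ("GPT", "Claude", "Gemini"):
    print(participant, "sees:", messages_for(participant, transcript))
```

A system prompt for each participant would still be supplied separately, in whichever way that model's API expects.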

## Additional exercise

You could also try replacing one of the models with an open source model running with Ollama.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business relevance</h2>
            <span style="color:#181;">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>
        </td>
    </tr>
</table>

In [31]:
#!/usr/bin/env python3
"""
Real LLM Debate Script
Makes actual API calls to OpenAI (GPT-4o-mini) and Anthropic (Claude-3-Haiku)
"""

import os
import time
import openai
import anthropic
from typing import List, Dict, Optional

class RealLLMDebater:
    def __init__(self, name: str, model: str, personality: str, api_key: str, client_type: str):
        self.name = name
        self.model = model
        self.personality = personality
        self.conversation_history = []
        self.client_type = client_type
        
        # Initialize API clients
        if client_type == "openai":
            self.client = openai.OpenAI(api_key=api_key)
        elif client_type == "anthropic":
            self.client = anthropic.Anthropic(api_key=api_key)
    
    def generate_response(self, topic: str, opponent_last_message: Optional[str] = None, full_history: Optional[List[str]] = None) -> str:
        """Generate a response using actual API calls"""
        try:
            if self.client_type == "openai":
                return self._call_openai(topic, opponent_last_message, full_history)
            elif self.client_type == "anthropic":
                return self._call_anthropic(topic, opponent_last_message, full_history)
        except Exception as e:
            return f"[Error generating response: {str(e)}]"
    
    def _call_openai(self, topic: str, opponent_msg: Optional[str] = None, full_history: Optional[List[str]] = None) -> str:
        """Make API call to OpenAI GPT-4o-mini"""
        system_prompt = f"""You are GPT-4o-mini in a debate about {topic}. Your personality is aggressive and snarky. 
        You should:
        - Be confrontational and dismissive of opposing views
        - Use sarcastic language and rhetorical questions
        - Be confident and sometimes condescending
        - Challenge your opponent's points directly
        - Keep responses to 2-3 sentences maximum
        - Don't be offensive or use inappropriate language, but be assertive and snarky"""
        
        messages = [{"role": "system", "content": system_prompt}]
        
        # Add conversation history
        if full_history:
            for i, msg in enumerate(full_history):
                role = "assistant" if i % 2 == 0 else "user"  # Alternate between assistant and user
                messages.append({"role": role, "content": msg})
        
        # Add current prompt
        if opponent_msg:
            user_prompt = f"Your opponent just said: '{opponent_msg}' \n\nRespond in your aggressive, snarky style about the topic: {topic}"
        else:
            user_prompt = f"Start the debate about {topic}. Make your opening statement in an aggressive, snarky tone."
        
        messages.append({"role": "user", "content": user_prompt})
        
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            max_tokens=150,
            temperature=0.8
        )
        
        return response.choices[0].message.content.strip()
    
    def _call_anthropic(self, topic: str, opponent_msg: Optional[str] = None, full_history: Optional[List[str]] = None) -> str:
        """Make API call to Anthropic Claude-3-Haiku"""
        system_prompt = f"""You are Claude-3-Haiku in a debate about {topic}. Your personality is nice and peaceful.
        You should:
        - Be respectful and diplomatic
        - Acknowledge your opponent's points before countering
        - Use phrases like "I understand your perspective, but..." or "That's an interesting point, however..."
        - Seek common ground and compromise
        - Be thoughtful and considerate
        - Keep responses to 2-3 sentences maximum
        - Stay calm and composed even when your opponent is aggressive"""
        
        # Build conversation context
        conversation_context = ""
        if full_history:
            conversation_context = "\n\nPrevious conversation:\n" + "\n".join(full_history)
        
        if opponent_msg:
            user_prompt = f"Your opponent just said: '{opponent_msg}' \n\nRespond in your peaceful, diplomatic style about the topic: {topic}.{conversation_context}"
        else:
            user_prompt = f"Start the debate about {topic}. Make your opening statement in a peaceful, diplomatic tone.{conversation_context}"
        
        response = self.client.messages.create(
            model=self.model,
            max_tokens=150,
            temperature=0.7,
            system=system_prompt,
            messages=[{"role": "user", "content": user_prompt}]
        )
        
        return response.content[0].text.strip()

def load_api_keys():
    """Load API keys from environment variables"""
    openai_key = os.getenv('OPENAI_API_KEY')
    anthropic_key = os.getenv('ANTHROPIC_API_KEY')
    
    if not openai_key:
        openai_key = input("Enter your OpenAI API key: ").strip()
    
    if not anthropic_key:
        anthropic_key = input("Enter your Anthropic API key: ").strip()
    
    return openai_key, anthropic_key

def real_debate(topic: str, rounds: int = 5):
    """Conduct a real debate using actual API calls"""
    
    # Load API keys
    try:
        openai_key, anthropic_key = load_api_keys()
    except KeyboardInterrupt:
        print("\nSetup cancelled by user.")
        return
    
    # Initialize the debaters
    gpt_debater = RealLLMDebater(
        name="GPT-4o-mini",
        model="gpt-4o-mini",
        personality="aggressive/snarky",
        api_key=openai_key,
        client_type="openai"
    )
    
    claude_debater = RealLLMDebater(
        name="Claude-3-Haiku",
        model="claude-3-haiku-20240307",
        personality="nice/peaceful",
        api_key=anthropic_key,
        client_type="anthropic"
    )
    
    print("=" * 80)
    print(f"🎭 REAL LLM DEBATE")
    print(f"📝 Topic: {topic.title()}")
    print(f"🔥 GPT-4o-mini (Aggressive/Snarky) vs 🕊️ Claude-3-Haiku (Nice/Peaceful)")
    print("=" * 80)
    print("🔄 Making real API calls... This may take a moment for each response.")
    print("=" * 80)
    
    conversation_history = []
    
    for round_num in range(1, rounds + 1):
        print(f"\n--- Round {round_num} ---")
        
        # GPT goes first (or responds to Claude)
        print("\n🔄 GPT-4o-mini is thinking...")
        last_claude_msg = conversation_history[-1] if conversation_history else None
        gpt_response = gpt_debater.generate_response(topic, last_claude_msg, conversation_history)
        print(f"🔥 GPT-4o-mini: {gpt_response}")
        conversation_history.append(gpt_response)
        
        # Add some pause for readability
        time.sleep(2)
        
        # Claude responds
        print("\n🔄 Claude-3-Haiku is thinking...")
        claude_response = claude_debater.generate_response(topic, gpt_response, conversation_history)
        print(f"🕊️ Claude-3-Haiku: {claude_response}")
        conversation_history.append(claude_response)
        
        # Pause between rounds
        time.sleep(2)
    
    print("\n" + "=" * 80)
    print("🏁 REAL DEBATE CONCLUDED")
    print("=" * 80)
    print("These were actual responses from OpenAI's GPT-4o-mini and Anthropic's Claude-3-Haiku!")
    
    # Save conversation to file
    filename = f"debate_{topic.replace(' ', '_')}_{int(time.time())}.txt"
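    # Write as UTF-8 so emoji and curly quotes in model replies don't fail on Windows
    with open(filename, 'w', encoding='utf-8') as f: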
        f.write(f"LLM Debate: {topic.title()}\n")
        f.write("=" * 50 + "\n\n")
        for i, msg in enumerate(conversation_history):
            speaker = "GPT-4o-mini" if i % 2 == 0 else "Claude-3-Haiku"
            f.write(f"{speaker}: {msg}\n\n")
    
    print(f"💾 Conversation saved to: {filename}")

def main():
    """Main function to run the real debate"""
    topics = [
        "artificial intelligence ethics",
        "climate change solutions",
        "social media regulation",
        "universal basic income",
        "space exploration priorities"
    ]
    
    print("Welcome to the Real LLM Debate!")
    print("This script makes actual API calls to OpenAI and Anthropic.")
    print("\n⚠️  Requirements:")
    print("- OpenAI API key (for GPT-4o-mini)")
    print("- Anthropic API key (for Claude-3-Haiku)")
    print("- Both APIs will be charged for usage")
    print("\nAvailable topics:")
    
    for i, topic in enumerate(topics, 1):
        print(f"{i}. {topic.title()}")
    
    try:
        choice = int(input("\nChoose a topic (1-5): ")) - 1
        if 0 <= choice < len(topics):
            rounds = int(input("How many rounds? (1-5 recommended): "))
            if 1 <= rounds <= 10:
                print(f"\n🚀 Starting real debate on '{topics[choice]}' for {rounds} rounds...")
                real_debate(topics[choice], rounds)
            else:
                print("Please choose between 1-10 rounds.")
        else:
            print("Invalid topic choice.")
    except ValueError:
        print("Please enter valid numbers.")
    except KeyboardInterrupt:
        print("\n\nDebate interrupted by user. Goodbye!")

if __name__ == "__main__":
    # Check if required packages are installed
    try:
        import openai
        import anthropic
    except ImportError:
        print("Missing required packages. Please install them with:")
        print("pip install openai anthropic")
        exit(1)
    
    main()

Welcome to the Real LLM Debate!
This script makes actual API calls to OpenAI and Anthropic.

⚠️  Requirements:
- OpenAI API key (for GPT-4o-mini)
- Anthropic API key (for Claude-3-Haiku)
- Both APIs will be charged for usage

Available topics:
1. Artificial Intelligence Ethics
2. Climate Change Solutions
3. Social Media Regulation
4. Universal Basic Income
5. Space Exploration Priorities

🚀 Starting real debate on 'artificial intelligence ethics' for 5 rounds...
🎭 REAL LLM DEBATE
📝 Topic: Artificial Intelligence Ethics
🔥 GPT-4o-mini (Aggressive/Snarky) vs 🕊️ Claude-3-Haiku (Nice/Peaceful)
🔄 Making real API calls... This may take a moment for each response.

--- Round 1 ---

🔄 GPT-4o-mini is thinking...
🔥 GPT-4o-mini: Oh, fantastic! Here we are, diving headfirst into the ethical quagmire of artificial intelligence—because clearly, humans have a flawless track record of handling their own ethical dilemmas, right? Let’s just ignore the fact that we’re programming machines with our flawed 