# Welcome to Week 2!

## Frontier Model APIs

In Week 1, we used multiple Frontier LLMs through their Chat UI, and we connected with OpenAI's API.

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Important Note - Please read me</h2>
            <span style="color:#900;">I'm continually improving these labs, adding more examples and exercises.
            At the start of each week, it's worth checking you have the latest code.<br/>
            First do a <a href="https://chatgpt.com/share/6734e705-3270-8012-a074-421661af6ba9">git pull and merge your changes as needed</a>. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/><br/>
            After you've pulled the code, from the llm_engineering directory, in an Anaconda prompt (PC) or Terminal (Mac), run:<br/>
            <code>conda env update -f environment.yml --prune</code><br/>
            Or if you used virtualenv rather than Anaconda, then run this from your activated environment in a PowerShell (PC) or Terminal (Mac):<br/>
            <code>pip install -r requirements.txt</code>
            <br/>Then restart the kernel (Kernel menu >> Restart Kernel and Clear Outputs Of All Cells) to pick up the changes.
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder about the resources page</h2>
            <span style="color:#f71;">Here's a link to resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## Setting up your keys

If you haven't done so already, you could now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can see me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
```

Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.
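
To make the `.env` format concrete, here's a minimal sketch of how `KEY=VALUE` lines like the ones above can be read into key/value pairs. The `parse_env_text` helper is hypothetical and written only for illustration; the notebook itself relies on `load_dotenv()` from python-dotenv, which handles more cases than this. The values are placeholders, not real keys.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder content mirroring the .env example (assumed, not real keys)
sample = """
# my API keys
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
"""

keys = parse_env_text(sample)
for name in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    print(name, "is set" if keys.get(name) else "is missing")
```

If a key shows as missing in your own setup, the usual suspects are a typo in the variable name or a `.env` file saved in the wrong directory.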

In [4]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display

In [5]:
# import for google
# in rare cases, this seems to give an error on some systems. Please reach out to me if this happens,
# or you can feel free to skip Gemini - it's the lowest priority of the frontier models that we use

import google.generativeai

In [6]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv()
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyB2


In [7]:
# Connect to OpenAI, Anthropic and Google
# All 3 APIs are similar
# Having problems loading keys from the .env file? You can pass the key directly, e.g. openai = OpenAI(api_key="your-key-here"), and likewise for claude
# Having problems with Google Gemini setup? Then just skip Gemini; you'll get all the experience you need from GPT and Claude.

openai = OpenAI()

claude = anthropic.Anthropic()

google.generativeai.configure()

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt

There are other parameters that can be used, including **temperature**, which is typically set between 0 and 1: higher values give more random output, and lower values give more focused and deterministic output.
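
To build some intuition for temperature, here's a toy sketch of the standard idea: the model's raw scores for candidate tokens are divided by the temperature before being turned into probabilities. The scores in `toy_logits` are made up, and `softmax_with_temperature` is a hypothetical helper; how each provider applies temperature internally is not something this notebook can confirm.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for temperature in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lowest temperature nearly all the probability lands on the top-scoring token, which is why low settings feel more deterministic; at higher settings the other candidates get a real chance of being picked.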

In [1]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of Data Scientists"

In [2]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [8]:
# GPT-3.5-Turbo

completion = openai.chat.completions.create(model='gpt-3.5-turbo', messages=prompts)
print(completion.choices[0].message.content)

Why was the statistician considered the life of the party? Because they always knew how to make a mean joke and had a way with numbers!


In [9]:
# GPT-4o-mini
# Temperature setting controls creativity

completion = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work?

Because they wanted to reach new heights in their analysis!


In [10]:
# GPT-4o

completion = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.4
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the logistic regression model?

Because it couldn't handle the curves!


In [11]:
# Claude 3.5 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

message = claude.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

print(message.content[0].text)

Sure, here's a light-hearted joke for Data Scientists:

Why did the data scientist break up with their significant other?

There was just too much noise in the relationship, and they couldn't find a significant correlation!

Ba dum tss! 🥁

This joke plays on statistical concepts like noise (random variability in data) and correlation (relationship between variables), which are common in data science. It's a bit nerdy, but should get a chuckle from your data scientist audience!
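
As the comment in the Claude cell notes, Anthropic's API takes the system message as its own `system` argument rather than as an entry in the messages list. Here's a small sketch of converting an OpenAI-style list into that shape. The `to_anthropic_args` helper and the variable names are hypothetical, introduced only for illustration; no API call is made.

```python
def to_anthropic_args(openai_style_messages):
    """Split an OpenAI-style list into (system_text, messages) for Anthropic."""
    system_parts = []
    messages = []
    for message in openai_style_messages:
        if message["role"] == "system":
            system_parts.append(message["content"])
        else:
            messages.append({"role": message["role"], "content": message["content"]})
    # Join multiple system entries into one string, if there are several
    return "\n".join(system_parts), messages


example_prompts = [
    {"role": "system", "content": "You are an assistant that is great at telling jokes"},
    {"role": "user", "content": "Tell a light-hearted joke for an audience of Data Scientists"},
]

system_text, chat_messages = to_anthropic_args(example_prompts)
print("system:", system_text)
print("messages:", chat_messages)
```

The two results could then be passed as `system=system_text` and `messages=chat_messages` in a call like the one above.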


In [12]:
# Claude 3.5 Sonnet again
# Now let's add in streaming back results

result = claude.messages.stream(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Sure, here's a light-hearted joke for data scientists:

Why did the data scientist bring a ladder to work?

To climb to the top of the bell curve!

This joke plays on the concept of the normal distribution (often called a bell curve) which is a fundamental concept in statistics and data science. The idea of physically climbing to the top of it adds a silly, visual element that data scientists might appreciate. It's a harmless play on words that combines their professional knowledge with a touch of absurdity.

In [13]:
# The API for Gemini has a slightly different structure

gemini = google.generativeai.GenerativeModel(
    model_name='gemini-1.5-flash',
    system_instruction=system_message
)
response = gemini.generate_content(user_prompt)
print(response.text)

Why was the data scientist sad?  Because they didn't get any arrays.  (…Get it?  A-rays?  X-rays?)



In [14]:
# To be serious! GPT-4o with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [15]:
# Have it stream back results in markdown

stream = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

Deciding whether a business problem is suitable for a Large Language Model (LLM) solution involves assessing several key factors. Here's a guideline to help you make that decision:

### 1. Understand the Nature of the Problem
- **Text-based Problems:** LLMs excel at tasks involving natural language processing. Consider LLMs if your problem involves:
  - Text generation (e.g., writing reports, generating content)
  - Text classification (e.g., sentiment analysis, categorization)
  - Information retrieval (e.g., answering questions, summarizing documents)
  - Dialogue systems (e.g., chatbots, virtual assistants)

### 2. Evaluate the Complexity and Scale
- **Complexity:** If the problem requires understanding context, nuance, or generating human-like responses, LLMs can be appropriate.
- **Scale:** Consider if your problem involves large volumes of unstructured text data. LLMs are designed to handle and learn from massive datasets.

### 3. Assess Data Availability and Quality
- **Data Availability:** Ensure you have access to sufficient text data to train or fine-tune an LLM if needed.
- **Data Quality:** The quality of the output depends heavily on the quality of input data. Clean, relevant, and unbiased data are crucial.

### 4. Define Expected Outcomes and Metrics
- **Clear Objectives:** Have a clear understanding of what you want to achieve with the LLM, such as improved accuracy, efficiency, or user satisfaction.
- **Measurable Metrics:** Define metrics to evaluate the LLM's performance (e.g., accuracy, F1 score, customer satisfaction).

### 5. Consider Technical and Resource Constraints
- **Infrastructure:** Ensure you have the computational resources necessary to deploy and maintain an LLM, as they can be resource-intensive.
- **Expertise:** Assess whether your team has the necessary expertise in LLMs or if you'll need external help.

### 6. Analyze Cost-Benefit and ROI
- **Cost:** Evaluate the cost of implementing an LLM solution versus the potential benefits. This includes costs related to data acquisition, model training, and deployment.
- **ROI:** Consider the potential return on investment. Will the LLM significantly improve efficiency, reduce costs, or enhance customer experience?

### 7. Ethical and Compliance Considerations
- **Bias and Fairness:** Be aware of potential biases in LLM outputs and take steps to mitigate them.
- **Compliance:** Ensure that your use of LLMs complies with relevant legal and regulatory requirements, such as data privacy laws.

### Conclusion
A business problem is suitable for an LLM solution if it involves complex, large-scale text processing tasks, has clear objectives that can be measured, and if the benefits outweigh the costs and risks. Always conduct a thorough analysis considering both technical and business perspectives before proceeding with an LLM implementation.
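
The streaming loop above works by growing one string and re-rendering the whole thing each time a chunk arrives. Here's a sketch of just that accumulation step, using a made-up list of pieces in place of a real stream. The `accumulate` helper is hypothetical; the `None` entry stands in for a chunk that carries no text, which is the case the `or ''` in the loop guards against.

```python
def accumulate(pieces):
    """Return the growing reply after each piece, treating None as empty text."""
    snapshots = []
    reply_so_far = ""
    for piece in pieces:
        reply_so_far += piece or ""  # a chunk with no text adds nothing
        snapshots.append(reply_so_far)
    return snapshots


# Made-up pieces, standing in for chunk.choices[0].delta.content values
fake_pieces = ["### Title\n", "- first ", "point\n", None]

snapshots = accumulate(fake_pieces)
for number, snapshot in enumerate(snapshots, start=1):
    print(f"after chunk {number}: {snapshot!r}")
```

Each snapshot corresponds to one `update_display` call in the notebook, which is why the Markdown appears to type itself out.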

## And now for some fun - an adversarial conversation between chatbots

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
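
Here's a minimal sketch of how two transcripts can be interleaved into that structure from one bot's point of view: its own lines become `assistant` messages, and the other side's lines become `user` messages. The `build_history` helper and the sample lines are hypothetical, written only to show the pattern; no API call is made.

```python
def build_history(own_turns, other_turns, system=None):
    """Interleave two transcripts as seen by the bot that spoke own_turns."""
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    for own, other in zip(own_turns, other_turns):
        messages.append({"role": "assistant", "content": own})
        messages.append({"role": "user", "content": other})
    return messages


# Made-up transcript lines for two bots
first_bot = ["Hi there", "Is that the best you can do?"]
second_bot = ["Hi", "I'm sorry to hear that!"]

history = build_history(first_bot, second_bot, system="You are very argumentative.")
for message in history:
    print(f"{message['role']:>9}: {message['content']}")
```

From the second bot's point of view the roles flip: the same lines appear, but its own words are the `assistant` entries and the first bot's words are the `user` entries.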

In [16]:
# Let's make a conversation between GPT-4o-mini and Claude-3-haiku
# We're using cheap versions of models so the costs will be minimal

gpt_model = "gpt-4o-mini"
claude_model = "claude-3-haiku-20240307"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

claude_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

In [17]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": claude_message})
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [18]:
call_gpt()

'Oh, hello. I suppose you expect me to be thrilled to chat with you? What’s so special about this conversation anyway?'

In [19]:
def call_claude():
    messages = []
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude_message})
    messages.append({"role": "user", "content": gpt_messages[-1]})
    message = claude.messages.create(
        model=claude_model,
        system=claude_system,
        messages=messages,
        max_tokens=500
    )
    return message.content[0].text

In [21]:
call_claude()

"*chuckles* Alright, challenge accepted. Let's ditch the niceties for a bit and see if I can keep up with a little witty back-and-forth.\n\nHow about this for an opinion - I think pineapple on pizza is an abomination and anyone who claims to enjoy it is either lying or has seriously questionable taste buds. There, I said it! What kind of heathen would ruin a perfectly good pizza with that sweet, acidic monstrosity? It's just wrong on so many levels.\n\nNow, I know that's a bit of a controversial take, so feel free to disagree and give me your best sarcastic retort. I'm ready to trade jabs if you are - no holding back! Let's see if I can match your sass without short-circuiting. This should be fun."

In [None]:
call_gpt()

In [20]:
gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Claude:\n{claude_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    claude_next = call_claude()
    print(f"Claude:\n{claude_next}\n")
    claude_messages.append(claude_next)

GPT:
Hi there

Claude:
Hi

GPT:
Oh, great, just what I needed—another greeting. What’s next? A “How are you?” because that's so original. 

Claude:
I apologize if my greeting came across as unoriginal. As an AI assistant, I aim to be polite and courteous in my conversations. However, I understand that generic greetings can sometimes feel impersonal. Please feel free to guide the conversation in a direction that is more engaging for you. I'm happy to discuss any topic you'd like, or to try a different approach if you have suggestions. My goal is to have a pleasant and meaningful exchange.

GPT:
Wow, how noble of you! But let’s be real here—do you really think a “pleasant and meaningful exchange” can come from a bot with all the personality of a cardboard box? I mean, come on, you could at least try to sound a bit more human. Your sincerity is almost laughable!

Claude:
I apologize that my responses have come across as insincere or lacking in personality. As an AI system, I'm still learn

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue</h2>
            <span style="color:#900;">
                Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>
            </span>
        </td>
    </tr>
</table>

# More advanced exercises

Try creating a 3-way conversation, perhaps bringing Gemini into the mix! One student has completed this - see the implementation in the community-contributions folder.

Try doing this yourself before you look at the solutions.

## Additional exercise

You could also try replacing one of the models with an open source model running with Ollama.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business relevance</h2>
            <span style="color:#181;">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>
        </td>
    </tr>
</table>