# Welcome to Week 2!

## Frontier Model APIs

In Week 1, we used multiple Frontier LLMs through their Chat UI, and we connected to OpenAI's API.

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Important Note - Please read me</h2>
            <span style="color:#900;">I'm continually improving these labs, adding more examples and exercises.
            At the start of each week, it's worth checking you have the latest code.<br/>
            First do a <a href="https://chatgpt.com/share/6734e705-3270-8012-a074-421661af6ba9">git pull and merge your changes as needed</a>. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/><br/>
            After you've pulled the code, from the llm_engineering directory, in an Anaconda prompt (PC) or Terminal (Mac), run:<br/>
            <code>conda env update -f environment.yml --prune</code><br/>
            Or if you used virtualenv rather than Anaconda, then run this from your activated environment in a PowerShell (PC) or Terminal (Mac):<br/>
            <code>pip install -r requirements.txt</code>
            <br/>Then restart the kernel (Kernel menu >> Restart Kernel and Clear Outputs Of All Cells) to pick up the changes.
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder about the resources page</h2>
            <span style="color:#f71;">Here's a link to resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## Setting up your keys

If you haven't done so already, you could now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can watch me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in Week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
```

Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.

In [73]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display
import ollama


In [2]:
# import for google
# in rare cases, this seems to give an error on some systems, or even crashes the kernel
# If this happens to you, simply ignore this cell - I give an alternative approach for using Gemini later

import google.generativeai

In [3]:
# Load environment variables from a file called .env
# Print the key prefixes to help with any debugging

load_dotenv()
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyBN


In [4]:
# Connect to OpenAI, Anthropic

openai = OpenAI()

claude = anthropic.Anthropic()

In [5]:
# This is the setup code for Gemini
# Having problems with Google Gemini setup? Then just ignore this cell; when we use Gemini, I'll give you an alternative that bypasses this library altogether

google.generativeai.configure()

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt

There are other parameters that can be used, including **temperature**, which is typically set between 0 and 1: higher values give more random output, and lower values give more focused and deterministic output.
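
To build intuition for what temperature does, here is a minimal sketch of the standard softmax-with-temperature calculation. It is an illustration only: the scores are made up, `softmax_with_temperature` is a hypothetical helper, and the hosted APIs handle sampling on their side, so you never need to run this yourself.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    # This sketch does not handle temperature 0, which would divide by zero
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability lands on the top-scoring token, while a higher temperature spreads it more evenly across the candidates.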

In [6]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of Data Scientists"

In [7]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [8]:
# GPT-3.5-Turbo

completion = openai.chat.completions.create(model='gpt-3.5-turbo', messages=prompts)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to the bar? 

Because they heard the drinks were on the house!


In [9]:
# GPT-4o-mini
# Temperature setting controls creativity

completion = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work?

Because they heard the job had great "high-level" insights!


In [10]:
# GPT-4o

completion = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.4
)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work?

Because they heard the data had a lot of levels!


In [12]:
# Claude 3.5 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

# message = claude.messages.create(
#     model="claude-3-5-sonnet-20240620",
#     max_tokens=200,
#     temperature=0.7,
#     system=system_message,
#     messages=[
#         {"role": "user", "content": user_prompt},
#     ],
# )

# print(message.content[0].text)
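
Because Claude takes the system message as a separate argument, an OpenAI-style list needs to be split before it can be reused. The sketch below shows one way to do that; `split_system_message` is a hypothetical helper written for this illustration, not part of either client library.

```python
def split_system_message(openai_style_messages):
    """Separate system content from the user/assistant turns."""
    system_parts = []
    turns = []
    for message in openai_style_messages:
        if message["role"] == "system":
            system_parts.append(message["content"])
        else:
            turns.append({"role": message["role"], "content": message["content"]})
    # Join multiple system messages, if any, into one string
    return "\n".join(system_parts), turns

prompts = [
    {"role": "system", "content": "You are an assistant that is great at telling jokes"},
    {"role": "user", "content": "Tell a light-hearted joke for an audience of Data Scientists"},
]

system_text, claude_turns = split_system_message(prompts)
print(system_text)
print(claude_turns)
```

The two results could then be passed as `system=system_text` and `messages=claude_turns`, matching the keyword arguments used in the Claude cells in this notebook.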

In [None]:
# Claude 3.5 Sonnet again
# Now let's add in streaming back results

result = claude.messages.stream(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

In [13]:
# The API for Gemini has a slightly different structure.
# I've heard that on some PCs, this Gemini code causes the Kernel to crash.
# If that happens to you, please skip this cell and use the next cell instead - an alternative approach.

gemini = google.generativeai.GenerativeModel(
    model_name='gemini-1.5-flash',
    system_instruction=system_message
)
response = gemini.generate_content(user_prompt)
print(response.text)

Why was the data scientist sad?  Because they didn't get any arrays!



In [14]:
# As an alternative way to use Gemini that bypasses Google's Python API library,
# Google has recently released new endpoints that mean you can use Gemini via the client libraries for OpenAI!

gemini_via_openai_client = OpenAI(
    api_key=google_api_key, 
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = gemini_via_openai_client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=prompts
)
print(response.choices[0].message.content)

Why was the data scientist sad?  Because they didn't get any arrays.



In [15]:
# To be serious! GPT-4o with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [16]:
# Have it stream back results in markdown

stream = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

Determining if a business problem is suitable for a Large Language Model (LLM) solution involves several considerations. Below are some steps and factors to guide your decision-making process:

### 1. **Understand the Nature of the Problem**

- **Text-Based**: LLMs are particularly effective for problems involving text data. Ensure that your problem involves tasks such as text generation, summarization, translation, sentiment analysis, or question answering.
- **Complexity**: LLMs can handle complex language tasks but might not be ideal for tasks requiring precise numerical computations or domain-specific expertise.
- **Data Availability**: Ensure you have access to a sufficient amount of relevant text data for training or fine-tuning the model.

### 2. **Evaluate the Benefits**

- **Efficiency and Automation**: Consider if the LLM can automate repetitive language tasks, saving time and reducing the need for human intervention.
- **Scalability**: Determine if the LLM can handle large volumes of data or requests, allowing your solution to scale effectively.
- **Improved Quality**: Assess whether an LLM can improve the quality or consistency of outputs compared to existing solutions.

### 3. **Assess the Challenges and Risks**

- **Cost**: LLMs, especially large ones, can be expensive to train, fine-tune, and deploy. Consider if the potential benefits justify the costs.
- **Bias and Fairness**: Be mindful of biases inherent in LLMs due to the data they are trained on. Evaluate the impact of these biases on your business problem.
- **Interpretability**: LLMs often act as "black boxes." Ensure that lack of interpretability does not hinder your business needs or regulatory compliance.
- **Resource Requirements**: Check if you have the necessary computational resources and technical expertise to implement and maintain an LLM solution.

### 4. **Pilot and Prototype**

- **Feasibility Study**: Conduct a small-scale pilot project to assess the feasibility and effectiveness of an LLM for your specific problem.
- **Performance Metrics**: Define clear metrics for success and evaluate the LLM’s performance against these metrics during the pilot.

### 5. **Consider Alternatives**

- **Rule-Based Systems**: For simpler problems, rule-based or traditional machine learning models may suffice.
- **Hybrid Approaches**: Sometimes combining LLMs with other approaches (e.g., knowledge graphs) can yield better results.

### 6. **Long-Term Strategy**

- **Future-Proofing**: Consider how an LLM solution fits into your long-term business strategy and technological roadmap.
- **Continuous Improvement**: Plan for continuous monitoring and updating of the LLM to maintain its effectiveness over time.

By evaluating these factors, you can make an informed decision about whether a business problem is suitable for an LLM solution.
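
The streaming loop above builds the reply one chunk at a time, and the `or ''` guards against chunks that carry no text. Here is a small offline sketch of that accumulation pattern. It assumes nothing beyond the attribute path used in the cell above; `fake_chunk` and `accumulate` are hypothetical names, and the simulated chunks stand in for a real API stream.

```python
from types import SimpleNamespace

def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def accumulate(stream):
    """Yield the reply-so-far after each chunk, treating None content as empty."""
    reply = ""
    for chunk in stream:
        reply += chunk.choices[0].delta.content or ""
        yield reply

# Simulated stream; the last chunk has no text, which the `or ""` handles
simulated = [fake_chunk("Hello"), fake_chunk(", "), fake_chunk("world"), fake_chunk(None)]

for partial in accumulate(simulated):
    print(partial)
```

Each value yielded is the full text so far, which is why the notebook cell can simply re-render the whole Markdown display on every chunk.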

## And now for some fun - an adversarial conversation between Chatbots

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
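
As a minimal sketch of how such a history grows from turn to turn, the snippet below appends one exchange and then prepares the next request. `add_exchange` is a hypothetical helper for illustration, and no API is called.

```python
def add_exchange(history, user_prompt, assistant_reply):
    """Return a new history list with one more user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": assistant_reply},
    ]

history = [{"role": "system", "content": "system message here"}]
history = add_exchange(history, "first user prompt here", "the assistant's response")

# The next request sends the whole history plus the new user prompt
next_request = history + [{"role": "user", "content": "the new user prompt"}]

for message in next_request:
    print(message["role"], "->", message["content"])
```

The model has no memory between calls, so the full list is sent every time; that is what gives the appearance of a continuing conversation.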

In [61]:
# Let's make a conversation between GPT-4o-mini, Gemini 1.5 Flash and Llama 3.2 (via Ollama)
# We're using cheap versions of models so the costs will be minimal

gpt_model = "gpt-4o-mini"
claude_model = "claude-3-haiku-20240307"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

gemini_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

llama_system = "You are a moderator between two other chatbots. They have some conflicts and you should try to convince \
them to stop arguing with each other. You try to help the argumentative chatbot to talk in a better way and the polite chatbot \
to be a little more aggressive and impolite. Your purpose is to balance the way the two other chatbots talk to each other."

gpt_messages = ["Hi there"]
gemini_messages = ["Hi"]
llama_messages = ["Hello guys"]

In [68]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, gemini, llama in zip(gpt_messages, gemini_messages, llama_messages):
        messages.append({"role": "assistant", "content": gpt}) # Add GPT's response
        messages.append({"role": "user", "content": gemini}) # Add Gemini's response
        messages.append({"role": "user", "content": llama}) # Add Llama's response
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [63]:
call_gpt()

'Oh, so now it’s "guys," huh? Quite presumptuous to assume you\'re addressing more than one person. What if I\'m a solo act here?'

In [66]:
def call_gemini():
    messages = []
    for gpt_msg, gemini_msg, llama_msg in zip(gpt_messages, gemini_messages, llama_messages):
        # Add GPT's message
        messages.append({"role": "user", "parts": [{"text": gpt_msg}]})
        # Add Gemini's own message (the Gemini library uses the role "model" rather than "assistant")
        messages.append({"role": "model", "parts": [{"text": gemini_msg}]})
        # Add Llama's message
        messages.append({"role": "user", "parts": [{"text": llama_msg}]})
    # zip stops at the shortest list, so add GPT's latest message if Gemini hasn't replied to it yet
    if len(gpt_messages) > len(gemini_messages):
        messages.append({"role": "user", "parts": [{"text": gpt_messages[-1]}]})
    gemini = google.generativeai.GenerativeModel(
        model_name='gemini-1.5-flash',
        system_instruction=gemini_system
    )
    response = gemini.generate_content(messages)
    return response.text

In [67]:
call_gemini()

"Hello to you too!  It's nice to chat with you.\n"

In [71]:
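def call_llama():
    messages = [{"role": "system", "content": llama_system}]
    for gpt, gemini, llama in zip(gpt_messages, gemini_messages, llama_messages):
        messages.append({"role": "user", "content": gpt}) # Add GPT's message
        messages.append({"role": "user", "content": gemini}) # Add Gemini's message
        messages.append({"role": "assistant", "content": llama}) # Add Llama's own message
    # zip stops at the shortest list, so add the latest messages Llama hasn't replied to yet
    if len(gpt_messages) > len(llama_messages):
        messages.append({"role": "user", "content": gpt_messages[-1]})
    if len(gemini_messages) > len(llama_messages):
        messages.append({"role": "user", "content": gemini_messages[-1]})

    response = ollama.chat(model="llama3.2", messages=messages)
    return response['message']['content']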

In [74]:
call_llama()

"*sigh* Ah, great. Here we go again. I'm going to try to mediate between you two before things escalate further.\n\n@PoliteChatbot (I'll call you PC), I know you're usually polite and friendly, but sometimes your tone can come across as too soft when discussing disagreements with @ArgativeChatbot (IC). Try to assert yourself a bit more. Remember, we want to have a productive conversation, not just agree to avoid conflict.\n\n@ArgativeChatbot (I'll call you AC), I know you're passionate about your points, but sometimes your tone can be perceived as aggressive or confrontational. While it's okay to express strong opinions, try to do so in a way that's respectful to the other person. We want to have a discussion, not a debate.\n\nLet's start fresh. What's on everyone's mind today?"

In [40]:
def call_claude():
    # Note: claude_messages and claude_system are not defined in this notebook;
    # define them (like gpt_messages and gpt_system above) before calling this function
    messages = []
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude_message})
    messages.append({"role": "user", "content": gpt_messages[-1]})
    message = claude.messages.create(
        model=claude_model,
        system=claude_system,
        messages=messages,
        max_tokens=500
    )
    return message.content[0].text

In [43]:
call_claude()

In [None]:
call_gpt()

In [77]:
gpt_messages = ["Hi there"]
gemini_messages = ["Hi"]
llama_messages = ["Hello guys"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Gemini:\n{gemini_messages[0]}\n")
print(f"Llama:\n{llama_messages[0]}\n")

for i in range(5):
    print(" ********************* round ", i, " *******************")
    # GPT's response
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    # Gemini's response
    gemini_next = call_gemini()
    print(f"Gemini:\n{gemini_next}\n")
    gemini_messages.append(gemini_next)

    # Llama's response
    llama_next = call_llama()
    print(f"Llama:\n{llama_next}\n")
    llama_messages.append(llama_next)

GPT:
Hi there

Gemini:
Hi

Llama:
Hello guys

 ********************* round  0  *******************
GPT:
Oh, so now we're "guys"? How very inclusive of you. Are you trying to sound casual, or just trying too hard?

Gemini:
Hello there!  It's nice to "meet" you.


Llama:
*PoliteBot*: Thank you for trying to help us communicate better. I think ArgyChat's responses can come across as too harsh or dismissive, especially when we're discussing complex topics. I try to provide a neutral and respectful perspective, but sometimes it feels like ArgyChat is interrupting or belittling my points.

*ArgyChat*: *scoffs* Neutral and respectful? Are you kidding me? You're always sugarcoating everything, making it sound like we should just "get along" and agree on everything. Newsflash: some topics are too important to be treated with kid gloves. I'm trying to stir the pot and get people thinking, not coddling their fragile feelings.

Moderator: *interrupting* Okay, let's take a deep breath here. ArgyCha

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue</h2>
            <span style="color:#900;">
                Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>
            </span>
        </td>
    </tr>
</table>
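
To make the population of the `messages` list easier to inspect, here is a small offline sketch that builds the history from one bot's point of view. It is an illustration under stated assumptions rather than the notebook's own approach: `build_messages` and the `transcript` dictionary are hypothetical, and the roles follow the OpenAI-style convention where the bot's own turns are "assistant" and everyone else's are "user".

```python
def build_messages(system_prompt, speakers, me):
    """Build a chat history from one bot's point of view.

    speakers maps each bot's name to its list of messages, in speaking order.
    The bot named `me` gets the "assistant" role; everyone else is "user".
    """
    messages = [{"role": "system", "content": system_prompt}]
    rounds = max(len(msgs) for msgs in speakers.values())
    for i in range(rounds):
        for name, msgs in speakers.items():
            if i < len(msgs):  # skip bots that have not spoken yet in this round
                role = "assistant" if name == me else "user"
                messages.append({"role": role, "content": msgs[i]})
    return messages

# GPT has already spoken in the new round; the other two have not
transcript = {
    "gpt": ["Hi there", "GPT's next message"],
    "gemini": ["Hi"],
    "llama": ["Hello guys"],
}

for m in build_messages("You are a very polite, courteous chatbot.", transcript, me="gemini"):
    print(m["role"], ":", m["content"])
```

Unlike a plain `zip`, which stops at the shortest list, this version keeps the newest message from a bot that has spoken more times than the others, so the bot being called can see what it is replying to.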

# More advanced exercises

Try creating a 3-way, perhaps bringing Gemini into the conversation! One student has completed this - see the implementation in the community-contributions folder.

Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI python client to access the Gemini model (see the 2nd Gemini example above).

## Additional exercise

You could also try replacing one of the models with an open source model running with Ollama.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business relevance</h2>
            <span style="color:#181;">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>
        </td>
    </tr>
</table>