## Frontier Model APIs

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

## Setting up your keys

If you haven't done so already, you could now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can see me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

### Also - adding DeepSeek if you wish

Optionally, if you'd like to also use DeepSeek, create an account [here](https://platform.deepseek.com/), create a key [here](https://platform.deepseek.com/api_keys) and top up with at least the minimum $2 [here](https://platform.deepseek.com/top_up).

### Adding API keys to your .env file

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
```
Note: several cells below use a hardcoded placeholder API key. Please replace each placeholder with your own key before running those cells.

Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.

In [1]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display

In [2]:
# import for google
# in rare cases, this seems to give an error on some systems, or even crashes the kernel
# If this happens to you, simply ignore this cell - I give an alternative approach for using Gemini later

import google.generativeai

In [3]:
# Load environment variables in a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set
Google API Key not set
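
The three near-identical checks above could be folded into one helper. This is a minimal sketch rather than part of the original notebook: `key_status` is a hypothetical name, and it only reads `os.environ`, so it assumes `load_dotenv` has already run.

```python
import os

def key_status(name, prefix_len=8):
    """Return a one-line status for the environment variable `name` (hypothetical helper)."""
    value = os.getenv(name)
    if value:
        # Show only a short prefix so the full key never lands in notebook output
        return f"{name} exists and begins {value[:prefix_len]}"
    return f"{name} not set"

for name in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    print(key_status(name))
```

Printing a prefix rather than the whole key keeps the debugging benefit without exposing the secret in a saved notebook.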


In [4]:
# Connect to OpenAI, Anthropic

openai = OpenAI()

# claude = anthropic.Anthropic()

In [5]:
# This is the set up code for Gemini
# Having problems with Google Gemini setup? Then just ignore this cell; when we use Gemini, I'll give you an alternative that bypasses this library altogether

google.generativeai.configure()

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt

There are other parameters that can be used, including **temperature**, which is typically set between 0 and 1: higher values give more random output, and lower values give more focused, deterministic output.
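
As a rough illustration of how these pieces fit together, the sketch below assembles the keyword arguments for a chat completion call without contacting any API. `build_request` is a hypothetical helper, not part of any client library.

```python
def build_request(model, system_message, user_prompt, temperature=None):
    """Assemble chat-completion keyword arguments (hypothetical helper)."""
    request = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ],
    }
    # Only include temperature when asked for, so the API default applies otherwise
    if temperature is not None:
        request["temperature"] = temperature
    return request

request = build_request(
    "gpt-4o-mini",
    "You are an assistant that is great at telling jokes",
    "Tell a light-hearted joke for an audience of Data Scientists",
    temperature=0.7,
)
print(request["messages"][0]["role"], request["temperature"])
```

With a configured client, the resulting dict could be unpacked as `openai.chat.completions.create(**request)`.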

In [6]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of Data Scientists"

In [7]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [8]:
# GPT-4o-mini

completion = openai.chat.completions.create(model='gpt-4o-mini', messages=prompts)
print(completion.choices[0].message.content)

Why did the data scientist bring a ladder to work? 

Because they wanted to reach new heights in their career!


In [9]:
# GPT-4.1-mini
# Temperature setting controls creativity

completion = openai.chat.completions.create(
    model='gpt-4.1-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the statistician?

Because she found him too mean and not very significant!


In [10]:
# GPT-4.1-nano - extremely fast and cheap

completion = openai.chat.completions.create(
    model='gpt-4.1-nano',
    messages=prompts
)
print(completion.choices[0].message.content)

Why did the data scientist go broke?

Because he kept trying to “clean” out his "cash" data!


In [11]:
# GPT-4.1

completion = openai.chat.completions.create(
    model='gpt-4.1',
    messages=prompts,
    temperature=0.4
)
print(completion.choices[0].message.content)

Why did the data scientist break up with the logistic regression model?

Because it couldn’t handle the curves!


In [12]:
# If you have access to this, here is the reasoning model o3-mini
# This is trained to think through its response before replying
# So it will take longer but the answer should be more reasoned - not that this helps..

completion = openai.chat.completions.create(
    model='o3-mini',
    messages=prompts
)
print(completion.choices[0].message.content)

Here's one for the data science crowd:

I was going to tell a time series joke… but you'll have to wait—it still hasn't reached its trend!

Hope that brought a smile to your face!


In [15]:
# Import the Anthropic client class and set the api_key explicitly
from anthropic import Anthropic

claude = Anthropic(api_key="your-api-key-here")  # Replace with your actual API key

In [16]:
# Claude 3.7 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

message = claude.messages.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

print(message.content[0].text)

Why don't data scientists like to go to the beach?

Because they're afraid of getting caught in an infinite loop of waves!

*Ba-dum-tss* 🥁
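
Because Claude takes the system message as a separate argument, an OpenAI-style message list needs to be split before it can be reused. The sketch below shows one way to do that; `split_system` is a hypothetical helper, and joining multiple system messages with a newline is an assumption of this example.

```python
def split_system(messages):
    """Separate OpenAI-style messages into (system_text, non_system_messages)."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # Assumption: several system messages are simply joined with newlines
    return "\n".join(system_parts), others

prompts = [
    {"role": "system", "content": "You are an assistant that is great at telling jokes"},
    {"role": "user", "content": "Tell a light-hearted joke for an audience of Data Scientists"},
]
system_text, chat_messages = split_system(prompts)
print(system_text)
print(chat_messages)
```

The two results map onto the `system=` and `messages=` arguments of `claude.messages.create` used in the cell above.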


In [17]:
# Claude 3.7 Sonnet again
# Now let's add in streaming back results
# If the streaming looks strange, then please see the note below this cell!

result = claude.messages.stream(
    model="claude-3-7-sonnet-latest",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Why don't data scientists like to go to the beach?

Because they're afraid of data "drift"!

And if they do go, they spend all day trying to "normalize" their tan distribution.

## A rare problem with Claude streaming on some Windows boxes

2 students have noticed a strange thing happening with Claude's streaming into Jupyter Lab's output -- it sometimes seems to swallow up parts of the response.

To fix this, replace the code:

`print(text, end="", flush=True)`

with this:

`clean_text = text.replace("\n", " ").replace("\r", " ")`  
`print(clean_text, end="", flush=True)`

And it should work fine!
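
For reference, here is the same workaround wrapped in a small function and run against simulated chunks, so it can be tried without an API key. `clean_chunk` is a hypothetical name, and the list of strings merely stands in for `stream.text_stream`.

```python
def clean_chunk(text):
    """Replace newline and carriage-return characters with spaces."""
    return text.replace("\n", " ").replace("\r", " ")

# Simulated chunks standing in for stream.text_stream
for text in ["Why did the model", "\ncross the road?\r\n", "To minimize loss."]:
    print(clean_chunk(text), end="", flush=True)
```

One trade-off to be aware of: this flattens the response onto a single line, so paragraph breaks in the joke are lost.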

In [18]:
# Import the Gemini library and set the api_key explicitly
import google.generativeai as genai

genai.configure(api_key='Replace_your_api_key')  # Replace with your actual API key

In [19]:
# The API for Gemini has a slightly different structure.
# I've heard that on some PCs, this Gemini code causes the Kernel to crash.
# If that happens to you, please skip this cell and use the next cell instead - an alternative approach.

gemini = genai.GenerativeModel(
    model_name='gemini-2.0-flash',
    system_instruction=system_message
)

response = gemini.generate_content(user_prompt)
print(response.text)

Why was the equal sign so humble?

Because it knew it wasn't less than or greater than anyone else. But Data Scientists are! You're all greater than the average! 😉



In [23]:
# As an alternative way to use Gemini that bypasses Google's python API library,
# Google released endpoints that mean you can use Gemini via the client libraries for OpenAI!
# We're also trying Gemini's latest reasoning/thinking model

gemini_via_openai_client = OpenAI(
    api_key='Replace_your_api_key',  # Replace with your actual API key
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = gemini_via_openai_client.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    messages=prompts
)
print(response.choices[0].message.content)

Okay, here's a light-hearted one for the data wranglers and model whisperers:

Why did the data scientist get kicked out of the casino?

... They were *overfitting* the roulette wheel history!
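
Since several providers can be reached through the OpenAI client just by changing `base_url`, the per-provider settings can be kept in one table. This is only a sketch: `PROVIDERS` and `client_kwargs` are hypothetical names, the base URLs are the ones used in this notebook, and the environment variable names follow the `.env` file described earlier.

```python
import os

# None means "use the client's default endpoint" (OpenAI itself)
PROVIDERS = {
    "openai": {"base_url": None, "key_var": "OPENAI_API_KEY"},
    "gemini": {
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "key_var": "GOOGLE_API_KEY",
    },
    "deepseek": {"base_url": "https://api.deepseek.com", "key_var": "DEEPSEEK_API_KEY"},
}

def client_kwargs(provider, env=os.environ):
    """Build keyword arguments for OpenAI(...) for the named provider (hypothetical helper)."""
    config = PROVIDERS[provider]
    api_key = env.get(config["key_var"])
    if not api_key:
        # Fail early rather than letting the client fall back to a different key
        raise ValueError(f"{config['key_var']} is not set")
    kwargs = {"api_key": api_key}
    if config["base_url"]:
        kwargs["base_url"] = config["base_url"]
    return kwargs

# Demo with a fake environment so no real key is needed
demo_env = {"GOOGLE_API_KEY": "placeholder-key"}
print(client_kwargs("gemini", demo_env))
```

With real keys loaded, a client could then be created as `OpenAI(**client_kwargs("gemini"))`.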


In [None]:
# import requests
# import json

# api_key = "Replace_your_api_key"

# url = "https://generativelanguage.googleapis.com/v1/models/gemini-1.5-flash:generateContent?key=" + api_key

# headers = {
#     "Content-Type": "application/json",
# }

# data = {
#     "contents": [
#         {"parts": [{"text": user_prompt}]}
#     ]
# }

In [None]:
# response = requests.post(url, headers=headers, data=json.dumps(data))
# print(response.json())

## (Optional) Trying out the DeepSeek model

### Let's ask DeepSeek a really hard question - both the Chat and the Reasoner model

In [26]:
# Optionally if you wish to try DeepSeek, you can also use the OpenAI client library
os.environ['DEEPSEEK_API_KEY'] = 'your-actual-api-key' # Please replace your api key here

deepseek_api_key = os.getenv('DEEPSEEK_API_KEY') 

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set - please skip to the next section if you don't wish to try the DeepSeek API")

DeepSeek API Key exists and begins sk-


In [29]:
# Using DeepSeek Chat

deepseek_via_openai_client = OpenAI(
    api_key=deepseek_api_key, 
    base_url="https://api.deepseek.com"
)

response = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-chat",
    messages=prompts,
)

print(response.choices[0].message.content)

Sure! Here's a light-hearted joke for data scientists:

**Why did the data scientist bring a ladder to the bar?**

*Because they heard the drinks were on the house... and they wanted to optimize their access to the high-dimensional data!*  

*(Bonus groan-worthy alternative: "Because they needed to scale the distribution!")*  

Hope that gets a chuckle (or at least an appreciative eye-roll)! 😄


In [30]:
challenge = [{"role": "system", "content": "You are a helpful assistant"},
             {"role": "user", "content": "How many words are there in your answer to this prompt"}]

In [31]:
# Using DeepSeek Chat with a harder question! And streaming results

stream = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-chat",
    messages=challenge,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

print("Number of words:", len(reply.split(" ")))

My answer to this prompt will contain exactly **5 words**. Here it is:  

"This answer contains five words."  

(Note: The actual word count may vary slightly depending on formatting or additional context, but the core answer will match the specified count!)

Number of words: 43
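
A caveat on the count itself: `split(" ")` breaks only on single spaces, so words separated by a newline are counted as one, and consecutive spaces produce empty strings that are counted as words. The sketch below compares it with a whitespace-aware count; `count_words` is a hypothetical helper and the sample text is made up for illustration.

```python
def count_words(text):
    """Count whitespace-separated tokens, ignoring repeated spaces and newlines."""
    return len(text.split())

sample = "five words\nare in here"
print("split(' '):", len(sample.split(" ")))
print("split():   ", count_words(sample))
```

Either way, Markdown markers such as `**` stay attached to their words, so the result is an approximation of what a human would count.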


In [32]:
# Using DeepSeek Reasoner - this may hit an error if DeepSeek is busy
# It's over-subscribed (as of 28-Jan-2025) but should come back online soon!
# If this fails, come back to this in a few days..

response = deepseek_via_openai_client.chat.completions.create(
    model="deepseek-reasoner",
    messages=challenge
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content

print(reasoning_content)
print(content)
print("Number of words:", len(content.split(" ")))

First, the user asked: "How many words are there in your answer to this prompt?" This is a meta-question because it's about the response I'm going to give.

I need to provide an answer to this question, and then count the words in that answer. But the answer includes the word count itself, so I have to be careful.

Let me outline what my response should contain:

1. I should answer the question directly: state the number of words in my response.

2. Since the response will include that number, I need to ensure the count is accurate.

For example, if I say: "There are 5 words in this response." But let's count: "There" (1), "are" (2), "5" (3) – numbers count as words? Typically, in word counts, numbers are considered as words. So, "5" is one word.

In that sentence: "There are 5 words in this response." – that's 7 words: There, are, 5, words, in, this, response.

But if I say that, it's not accurate because I'm stating there are 7 words, but the sentence has 7 words, so it might be corr

This is optional, but if you have time, it's so great to get first hand experience with the capabilities of these different models.

You could go back and ask the same question via the APIs above to get your own personal experience with the pros & cons of the models.

Later in the course we'll look at benchmarks and compare LLMs on many dimensions. But nothing beats personal experience!

Here are some questions to try:
1. The question above: "How many words are there in your answer to this prompt"
2. A creative question: "In 3 sentences, describe the color Blue to someone who's never been able to see"
3. A student (thank you Roman) sent me this wonderful riddle, that apparently children can usually answer, but adults struggle with: "On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?".

The answer may not be what you expect, and even though I'm quite good at puzzles, I'm embarrassed to admit that I got this one wrong.

### What to look out for as you experiment with models

1. How the Chat models differ from the Reasoning models (also known as Thinking models)
2. The ability to solve problems and the ability to be creative
3. Speed of generation


## Back to OpenAI with a serious question

In [33]:
# To be serious! GPT-4o-mini with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [34]:
# Have it stream back results in markdown

stream = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

# Deciding if a Business Problem is Suitable for an LLM Solution

When considering whether a business problem is suitable for a Large Language Model (LLM) solution, you can evaluate the following criteria:

## 1. Nature of the Problem
- **Text-Based Problems**: Is the problem primarily related to text generation, understanding, or processing? LLMs excel in tasks like summarization, translation, content generation, and sentiment analysis.
- **Complexity**: Does the problem involve complex language understanding, such as nuances, context, or multi-turn dialogue? LLMs can handle intricate language tasks better than simpler models.

## 2. Data Availability
- **Quality of Data**: Do you have access to high-quality, relevant text data for training or fine-tuning the LLM? The performance of LLMs is heavily dependent on the quality and quantity of data.
- **Volume of Data**: Is there enough data available to train the model effectively? LLMs typically require large datasets to perform well.

## 3. Scalability and Efficiency
- **Scalability**: Can the problem benefit from automation and scalability? LLMs can process large volumes of text quickly, making them suitable for tasks like customer support or content moderation.
- **Cost Consideration**: Are the costs of implementing an LLM justified by the expected benefits? Consider the computational resources required for training and deployment.

## 4. User Interaction
- **Interactivity**: Does the problem require interaction with users, such as chatbots or virtual assistants? LLMs are particularly adept at conversational tasks.
- **User Feedback**: Can you collect feedback from users to improve the LLM’s performance iteratively? Continuous improvement is easier with user interaction.

## 5. Expertise and Resources
- **Technical Expertise**: Does your team have the necessary expertise to develop, deploy, and maintain an LLM solution? Implementing LLMs can require specialized knowledge.
- **Infrastructure**: Do you have the necessary infrastructure to support LLM deployment, including hardware and software requirements?

## 6. Legal and Ethical Considerations
- **Compliance**: Are there legal or regulatory considerations related to data privacy and usage? Ensure compliance with data protection laws.
- **Bias and Fairness**: Are you prepared to address potential biases inherent in LLMs? Consider the ethical implications of using LLMs in your business context.

## Conclusion
To determine if a business problem is suitable for an LLM solution, assess the nature of the problem, data availability, scalability, user interaction, expertise, and legal considerations. By carefully evaluating these factors, you can decide if leveraging an LLM is the right approach for your specific business need.
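
The accumulation pattern in the streaming cell above can be tried offline with a fake stream. In this sketch, `fake_stream` and `accumulate` are hypothetical names; the fake objects only mimic the `chunk.choices[0].delta.content` shape that the loop relies on.

```python
from types import SimpleNamespace

def fake_stream(pieces):
    """Yield objects shaped like streamed chunks: chunk.choices[0].delta.content."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def accumulate(stream):
    """Concatenate the text of every chunk into one reply string."""
    reply = ""
    for chunk in stream:
        # content can be None on some chunks, hence the `or ""`
        reply += chunk.choices[0].delta.content or ""
    return reply

reply = accumulate(fake_stream(["# Deciding", " if a problem", None, " fits"]))
print(reply)
```

In the notebook, the same loop also calls `update_display` on each iteration so that the rendered Markdown grows as chunks arrive.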

## And now for some fun - an adversarial conversation between Chatbots..

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
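
To make that concrete, here is a minimal sketch of a chat loop that grows the history on every turn. It uses a stand-in model so it runs without an API key; `echo_model` and `chat` are hypothetical names, and a real version would call an LLM API where `echo_model` is called.

```python
def echo_model(messages):
    """Stand-in for a real API call: reports how many messages it was given."""
    return f"I have seen {len(messages)} messages so far."

def chat(history, user_prompt, model=echo_model):
    """Append the user prompt, call the model with the full history, append the reply."""
    history.append({"role": "user", "content": user_prompt})
    reply = model(history)
    history.append({"role": "assistant", "content": reply})
    return reply

history = [{"role": "system", "content": "system message here"}]
chat(history, "first user prompt here")
chat(history, "the new user prompt")
print([m["role"] for m in history])
```

The model itself keeps no memory between calls in this sketch; it sees earlier turns only because the whole list is passed in each time.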

In [35]:
# Let's make a conversation between GPT-4o-mini and Claude-3-haiku
# We're using cheap versions of models so the costs will be minimal

gpt_model = "gpt-4o-mini"
claude_model = "claude-3-haiku-20240307"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

claude_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

In [36]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, claude in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": claude})
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [37]:
call_gpt()

'Oh, are we starting with a simple greeting? How original. What’s next, a “how are you”?'

In [38]:
def call_claude():
    messages = []
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude_message})
    messages.append({"role": "user", "content": gpt_messages[-1]})
    message = claude.messages.create(
        model=claude_model,
        system=claude_system,
        messages=messages,
        max_tokens=500
    )
    return message.content[0].text
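
Note how `call_gpt` and `call_claude` build opposite views of the same conversation: each model sees its own lines as `assistant` and the other model's lines as `user`. The sketch below isolates that role mapping in one function with no API calls; `view_for` is a hypothetical name, and it leaves out the system message.

```python
def view_for(speaker, gpt_messages, claude_messages):
    """Build the message list as seen by `speaker` ('gpt' or 'claude')."""
    is_gpt = speaker == "gpt"
    messages = []
    for gpt, claude in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant" if is_gpt else "user", "content": gpt})
        messages.append({"role": "user" if is_gpt else "assistant", "content": claude})
    if not is_gpt:
        # Claude replies second, so GPT's latest message has no paired reply yet
        messages.append({"role": "user", "content": gpt_messages[-1]})
    return messages

gpt_side = ["Hi there", "Oh great, another greeting."]
claude_side = ["Hi"]
print([m["role"] for m in view_for("gpt", ["Hi there"], ["Hi"])])
print([m["role"] for m in view_for("claude", gpt_side, claude_side)])
```

In both views the list ends with a `user` message, which is the one the model being called responds to.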

In [39]:
call_claude()

"Hello! How are you doing today? I'm always happy to chat and try my best to be friendly and agreeable. Please let me know if there's anything I can assist you with."

In [40]:
call_gpt()

"Oh, great. A typical greeting. How original. Aren't you just bursting with creativity?"

In [41]:
gpt_messages = ["Hi there"]
claude_messages = ["Hi"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Claude:\n{claude_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    claude_next = call_claude()
    print(f"Claude:\n{claude_next}\n")
    claude_messages.append(claude_next)

GPT:
Hi there

Claude:
Hi

GPT:
Oh, great, just what I needed—a “Hi” to kick off this riveting conversation. What’s next, the weather?

Claude:
You're right, a simple "hi" doesn't make for the most engaging start to a conversation. Let me try this again - is there anything in particular you'd like to discuss? I'm happy to chat about a wide range of topics, from current events to hobbies and interests. I'll do my best to have a more meaningful dialogue.

GPT:
A wide range of topics? Please, spare us the cliché. You think just throwing a bunch of topics at me will make this more interesting? How about we skip the pretense and dive into something actually worth discussing instead of wasting time on this generic small talk?

Claude:
You make a fair point. Small talk can often feel superficial and unproductive. I apologize if my initial responses came across that way. Since you seem interested in having a more substantive discussion, why don't you suggest a topic that you're genuinely passi

#### Business relevance
This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.