# Basic Instructions
# Setting API keys in the environment
# Create a .env file in the current working directory, paste your API keys there, and read them from the environment in this notebook

# Purpose

This notebook shows how to compare different LLMs (OpenAI GPT, Google Gemini, and Anthropic Claude) on a common problem statement. Evaluating their performance, cost, and behavior side by side helps you identify the most suitable model for your specific use case.

The key takeaway is that there is no one-size-fits-all model. Each model has strengths and trade-offs, and your final choice should align with your technical requirements, budget, and deployment context.

This comparative approach enables you to make informed decisions when selecting the best LLM to fulfill your application's needs effectively.
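
One way to make the side-by-side comparison concrete is to wrap each model call in a zero-argument function and time it. The sketch below is an illustration only, not part of the original notebook: `compare_models` and the stand-in callables are hypothetical names, and real entries would wrap calls such as `openai.chat.completions.create(...)`. It measures only wall-clock latency and reply length; estimating cost would additionally need the token counts each provider reports.

```python
import time
from typing import Callable, Dict, List


def compare_models(callers: Dict[str, Callable[[], str]]) -> List[dict]:
    """Run each caller once and record its reply, latency, and reply length."""
    results = []
    for name, call in callers.items():
        start = time.perf_counter()
        try:
            reply = call()
            error = None
        except Exception as exc:  # keep going if one provider fails
            reply = ""
            error = str(exc)
        elapsed = time.perf_counter() - start
        results.append({
            "model": name,
            "seconds": elapsed,
            "chars": len(reply),
            "reply": reply,
            "error": error,
        })
    return results


# Stand-in callables (assumptions); replace them with real API calls.
demo = compare_models({
    "model-a": lambda: "Why did the chicken join the data team?",
    "model-b": lambda: "Short joke.",
})
for row in demo:
    print(f"{row['model']}: {row['seconds']:.4f}s, {row['chars']} chars")
```

A single run per model is noisy, so repeating each call a few times and averaging would give a fairer latency picture.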

In [1]:
# import 
import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display

In [50]:
# pip install anthropic

In [51]:
# pip install google-generativeai

In [26]:
import google.generativeai 


In [3]:
# Load environment variables from a file called .env
# Print the key prefixes to help with any debugging

load_dotenv(override=True)
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyCB


In [7]:
openai=OpenAI()
claude=anthropic.Anthropic()

# Asking an LLM for a joke


In [8]:
system_message="You are an assistant that is great at telling jokes"
user_prompt="Tell a light-hearted joke for an audience of data scientists"


In [9]:
prompts=[
    {"role":"system", "content": system_message},
    {"role":"user", "content": user_prompt}
]

# Comparing models

In [12]:
# gpt-4o-mini
completion=openai.chat.completions.create(
                    model='gpt-4o-mini',
                    messages=prompts)
# print(completion)
print(completion.choices[0].message.content)



Why did the data scientist break up with the statistician?  

Because she found him too mean!


In [13]:
# gpt-4.1-mini
completion=openai.chat.completions.create(
    model='gpt-4.1-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)


Why did the data scientist break up with the graph? 

Because it had too many *points* and just didn’t *plot* well with their feelings!


In [14]:
# gpt-4.1-nano
completion=openai.chat.completions.create(
    model='gpt-4.1-nano',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)


Why did the data scientist bring a ladder to the meeting?  

Because they heard the insights were high-level!


In [15]:
# gpt-4.1
completion=openai.chat.completions.create(
    model='gpt-4.1',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)


Why did the data scientist break up with the spreadsheet?

Because she thought he was too "cell-fish" and couldn't handle her "complex queries"!


In [17]:
# o3-mini
completion=openai.chat.completions.create(
    model='o3-mini',
    messages=prompts
    
)
print(completion.choices[0].message.content)


I was going to tell you a machine learning joke—but after cross-validating my punchlines, none of them generalized well!


In [52]:
# # Claude 3.7 Sonnet
# # The API needs the system message provided separately from the user prompt
# message=claude.messages.create(
#     model="claude-3-7-sonnet-latest",
#     max_tokens=200,
#     temperature=0.7,
#     system=system_message,
#     messages=[
#         {"role":"user", "content": user_prompt}
#     ],
# )
# print(message.content[0].text)


In [53]:
# # Claude 3.7 Sonnet
# # Streaming
# # The API needs the system message provided separately from the user prompt
# result=claude.messages.stream(
#     model="claude-3-7-sonnet-latest",
#     max_tokens=200,
#     temperature=0.7,
#     system=system_message,
#     messages=[
#         {"role":"user", "content": user_prompt}
#     ],
# )

# with result as stream:
#     for text in stream.text_stream:
#         print(text, end='', flush=True)


In [27]:
# One way to use the Google API: the native google.generativeai client

gemini=google.generativeai.GenerativeModel(
    model_name='gemini-2.0-flash',
    system_instruction=system_message
)
response=gemini.generate_content(user_prompt)
print(response.text)


Why did the data scientist break up with the time series?

Because it was too committed to the past!



In [28]:
# Second way to use the Google API key: through the OpenAI-compatible endpoint
gemini_via_openai_client = OpenAI(
    api_key=google_api_key, 
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = gemini_via_openai_client.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    messages=prompts
)
print(response.choices[0].message.content)

Okay, here's one for the data wizards in the room:

Why did the data scientist get kicked out of the casino?

... Because they kept **overfitting** the roulette wheel!

*(Get it? They were training their model too well on the past results!)*
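
The second approach works because the OpenAI client only needs an API key and a base URL to talk to any OpenAI-compatible endpoint. As a rough sketch of that idea, the hypothetical helper below (`client_kwargs` and `PROVIDER_ENDPOINTS` are illustrative names, not part of any library) collects those two settings per provider. The Gemini URL is the one used in the cell above; `None` is assumed to mean "use the client's default endpoint".

```python
import os
from typing import Mapping, Optional

# Per-provider settings. The Gemini base URL is copied from the cell above;
# None means "leave base_url unset so the client uses its default".
PROVIDER_ENDPOINTS = {
    "openai": {"key_var": "OPENAI_API_KEY", "base_url": None},
    "gemini": {
        "key_var": "GOOGLE_API_KEY",
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
    },
}


def client_kwargs(provider: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments that could be passed to OpenAI(**kwargs)."""
    env = os.environ if env is None else env
    spec = PROVIDER_ENDPOINTS[provider]
    api_key = env.get(spec["key_var"])
    if not api_key:
        raise KeyError(f"{spec['key_var']} is not set")
    kwargs = {"api_key": api_key}
    if spec["base_url"] is not None:
        kwargs["base_url"] = spec["base_url"]
    return kwargs


# Usage (requires the openai package and real keys):
# from openai import OpenAI
# gemini_client = OpenAI(**client_kwargs("gemini"))
print(client_kwargs("gemini", {"GOOGLE_API_KEY": "example-key"}))
```

Keeping the endpoints in one table means the same `chat.completions.create` call and the same `prompts` list can be reused across providers by swapping only the client.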


# Fun game
# An adversarial conversation between chatbots

In [30]:
# Let's set up a conversation between gpt-4o-mini and Google's Gemini
gpt_model="gpt-4o-mini"
gemini_model="gemini-2.0-flash"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

gemini_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

gpt_messages = ["Hi there"]
gemini_messages = ["Hi"]

In [33]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, gemini in zip(gpt_messages, gemini_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": gemini})
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [34]:
call_gpt()

'Oh, great, another "Hi." How original. What’s next, “How are you?”?'

In [47]:
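import google.generativeai as genai
# Reuse the key loaded from .env earlier instead of hard-coding it
genai.configure(api_key=google_api_key)
# Pass the polite persona so Gemini actually plays the role defined in gemini_system
gemini_model=genai.GenerativeModel("gemini-2.0-flash", system_instruction=gemini_system)

def call_gemini():
    # From Gemini's point of view, GPT is the "user"; Gemini's own turns use the "model" role
    chat_history = []
    for gpt, gemini_message in zip(gpt_messages, gemini_messages):
        chat_history.append({"role": "user", "parts": [{"text": gpt}]})
        chat_history.append({"role": "model", "parts": [{"text": gemini_message}]})
    # GPT speaks first in each round, so its latest message is still unanswered
    chat_history.append({"role": "user", "parts": [{"text": gpt_messages[-1]}]})
    response=gemini_model.generate_content(chat_history)
    return response.text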

In [48]:
call_gemini()

'Hello! How can I help you today?\n'
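
Both helpers replay the same two message lists, but each from its own point of view: a bot's own past messages take its reply role and the other bot's messages become `user` turns. The sketch below isolates that interleaving in a hypothetical helper, `interleave`, which is not part of the notebook. It returns plain `(role, text)` pairs; wrapping them into OpenAI's `{"role", "content"}` dicts or Gemini's `{"role", "parts"}` dicts is left to the caller. The role names `assistant` and `model` follow what each API documents for its own replies.

```python
from typing import List, Tuple


def interleave(first: List[str], second: List[str],
               first_role: str, second_role: str) -> List[Tuple[str, str]]:
    """Alternate two speakers' messages, starting with `first`.

    `first` may be one message longer than `second`, which is the case when
    the first speaker has just spoken and is waiting for a reply.
    """
    turns = []
    for i, text in enumerate(first):
        turns.append((first_role, text))
        if i < len(second):
            turns.append((second_role, second[i]))
    return turns


# Illustrative message lists (assumptions, not the notebook's live state).
gpt_side = ["Hi there", "Oh great, another hi."]
gemini_side = ["Hi"]

# Gemini's view: GPT's messages arrive as "user", its own replies are "model".
for role, text in interleave(gpt_side, gemini_side, "user", "model"):
    print(f"{role}: {text}")

# GPT's view, in OpenAI's message format, before GPT's second message exists.
gpt_view = [{"role": r, "content": t}
            for r, t in interleave(gpt_side[:1], gemini_side, "assistant", "user")]
print(gpt_view)
```

Handling the trailing message inside the loop also avoids sending the last GPT message twice when both lists happen to have the same length.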

In [49]:
gpt_messages = ["Hi there"]
gemini_messages = ["Hi"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"gemini:\n{gemini_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    gemini_next = call_gemini()
    print(f"gemini:\n{gemini_next}\n")
    gemini_messages.append(gemini_next)

GPT:
Hi there

gemini:
Hi

GPT:
Oh great, another "hi." How original! What’s next, are we going to talk about the weather?

gemini:
Haha, I understand your skepticism! You're right, "hi" is a pretty generic start.

So, to avoid the predictable weather conversation, how about we jump right into something more interesting? What's on your mind today? Is there anything you'd like to talk about, ask me, or create? Maybe:

*   **A topic you're curious about?**
*   **A problem you're trying to solve?**
*   **A creative writing prompt?**
*   **A random fact you want to share?**

Let's break the mold!


GPT:
Oh, sure, let's just "jump right in," as if that's the magic solution to everything. What makes you think I have anything on my mind? And honestly, your list of suggestions is just as cliché as the weather chat; it’s like you’re trying to check off a box of topics. As if I’m just sitting here waiting for you to dictate the conversation. How about you actually give me something interesting t