# The 2nd Way to call Frontier Models - via their APIs

In this notebook, we'll explore calling 3 Frontier Models using their APIs.

In the last experiment, we tried out a prompt that was ideally suited to LLMs. This time we will try something they're less good at - telling jokes. Let's see how they get on.

## Setting up your keys

If you haven't done so already, you'll need to create API keys from OpenAI, Anthropic and Google.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

When you get your API keys, you need to set them as environment variables.

EITHER (recommended) create a file called `.env` in the project root directory and set your keys there:
```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
```

OR enter the keys directly in the cells below.
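
Before making any calls, it can help to confirm that the keys were actually picked up. The following is a minimal sketch rather than part of the original notebook: `find_missing_keys` is a hypothetical helper name, and the placeholder string is assumed to match the default used in the loading cell of this notebook.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"]
PLACEHOLDER = "your-key-if-not-using-env"  # assumed to match the notebook's default


def find_missing_keys(env=None):
    """Return the required key names that are unset, empty, or still the placeholder."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "")
        if not value.strip() or value == PLACEHOLDER:
            missing.append(name)
    return missing


# Example with a made-up environment; call find_missing_keys() to check the real one
example_env = {"OPENAI_API_KEY": "sk-example", "GOOGLE_API_KEY": PLACEHOLDER}
print(find_missing_keys(example_env))
```

Running `find_missing_keys()` with no arguments after the keys are loaded should print an empty list if all three are in place.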

In [None]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import google.generativeai
import anthropic
from IPython.display import Markdown, display, update_display

In [None]:
# Load environment variables from a file called .env

load_dotenv()
os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')
os.environ['ANTHROPIC_API_KEY'] = os.getenv('ANTHROPIC_API_KEY', 'your-key-if-not-using-env')
os.environ['GOOGLE_API_KEY'] = os.getenv('GOOGLE_API_KEY', 'your-key-if-not-using-env')

In [None]:
# Connect to OpenAI, Anthropic and Google
# All 3 APIs are similar

gpt = OpenAI()

claude = anthropic.Anthropic()

google.generativeai.configure()
gemini = google.generativeai.GenerativeModel('gemini-pro')

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models - let me know what you think in the chat.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt
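
As a rough illustration of the list above, here is how those three pieces might be arranged for two of the providers. This is a sketch only: `build_openai_request` and `build_anthropic_request` are hypothetical helper names, and nothing here contacts an API. The one structural difference it shows is that OpenAI takes the system message as the first entry in `messages`, while Anthropic takes it as a separate `system` field and also needs `max_tokens`.

```python
def build_openai_request(model, system_message, user_prompt):
    """OpenAI-style arguments: the system message is the first item in `messages`."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ],
    }


def build_anthropic_request(model, system_message, user_prompt, max_tokens=200):
    """Anthropic-style arguments: the system message is a separate `system` field."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": system_message,
        "messages": [{"role": "user", "content": user_prompt}],
    }


system_text = "You are an assistant that is great at telling jokes"
user_text = "Tell a light-hearted joke for a room full of data scientists"

openai_kwargs = build_openai_request("gpt-4o-mini", system_text, user_text)
anthropic_kwargs = build_anthropic_request("claude-3-5-sonnet-20240620", system_text, user_text)

print([m["role"] for m in openai_kwargs["messages"]])
print([m["role"] for m in anthropic_kwargs["messages"]])
```

With the clients from this notebook, these dictionaries could be unpacked as `gpt.chat.completions.create(**openai_kwargs)` and `claude.messages.create(**anthropic_kwargs)`.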

Other parameters are available too, including *temperature*, which is typically set between 0 and 1: higher values give more random output, and lower values give more focused and deterministic output.
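
To build some intuition for temperature, here is a small sketch of the textbook formulation: the model's raw scores are divided by the temperature before being turned into probabilities. The scores below are made up, and the providers' actual sampling internals are an assumption here, not something this notebook documents.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing each score by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability lands on the top-scoring token, which is why the output becomes more predictable; at the higher temperature the alternatives get a bigger share.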


In [None]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for a room full of data scientists"

In [None]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [None]:
# GPT-3.5-Turbo

completion = gpt.chat.completions.create(model='gpt-3.5-turbo', messages=prompts)
print(completion.choices[0].message.content)

In [None]:
# Follow-up question: include the earlier exchange so the model can explain its joke
# A separate list keeps `prompts` unchanged for the cells that follow

followup_prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt},
    {"role": "assistant", "content": "Why did the statistician bring a ladder to the bar?\n\nBecause he heard the drinks were on the house!"},
    {"role": "user", "content": "Can you explain why that joke is relevant to Data Scientists?"}
  ]
completion = gpt.chat.completions.create(model='gpt-3.5-turbo', messages=followup_prompts)
print(completion.choices[0].message.content)
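
The cell above works because each chat call is independent: the model only sees the messages passed in that call, so a follow-up question has to carry the earlier turns with it. A minimal sketch of keeping that history, using a hypothetical `add_turn` helper and a stand-in for the model's reply:

```python
def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unexpected role: {role}")
    return history + [{"role": role, "content": content}]


history = [{"role": "system", "content": "You are an assistant that is great at telling jokes"}]
history = add_turn(history, "user", "Tell a light-hearted joke for a room full of data scientists")
history = add_turn(history, "assistant", "<the model's joke goes here>")  # stand-in reply
history = add_turn(history, "user", "Can you explain why that joke is relevant to Data Scientists?")

print([m["role"] for m in history])
```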

In [None]:
# GPT-4o-mini
# Temperature setting controls creativity

completion = gpt.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

In [None]:
# GPT-4o

completion = gpt.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

In [None]:
# Claude 3.5 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

message = claude.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

print(message.content[0].text)

In [None]:
# Claude 3.5 Sonnet again
# Now let's add in streaming back results

result = claude.messages.stream(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

In [None]:
# Gemini
# Note: only the user prompt is sent here; the system message is not passed

response = gemini.generate_content(user_prompt)
print(response.text)

In [None]:
# To be serious! GPT-4o-mini with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution?"}
  ]

In [None]:
# Let's have it stream back results

stream = gpt.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7,
    stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ''
    print(delta, end='')
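
The `or ''` in the loop above guards against chunks that carry no text, which would otherwise show up as `None`. Here is a small offline sketch of the same accumulation pattern; `collect_stream` is a hypothetical helper, and the fragments are made-up stand-ins for the `chunk.choices[0].delta.content` values.

```python
def collect_stream(deltas):
    """Join streamed text fragments, treating None as an empty string."""
    reply = ""
    for delta in deltas:
        reply += delta or ""
    return reply


# Made-up fragments; some chunks are assumed to arrive with no text at all
fake_deltas = ["Start ", "with ", None, "the ", "problem.", None]
print(collect_stream(fake_deltas))
```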

In [None]:
# Did you notice it responded in markdown? We can show that nicely:

stream = gpt.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    # Strip code-fence markers from a copy, so ordinary uses of the word "markdown" survive
    cleaned = reply.replace("```markdown", "").replace("```", "")
    update_display(Markdown(cleaned), display_id=display_handle.display_id)

In [None]:
# And just to show you how easy it is: let's generate an image

from IPython.display import Image, display
import base64

response = gpt.images.generate(
    model="dall-e-3",
    prompt="A photorealistic 3d image that represents the power of a Frontier LLM in solving real business use cases",
    size="1024x1024",
    quality="standard",
    n=1,
    response_format="b64_json"
)

# Extract the Base64 image data from the response
image_base64 = response.data[0].b64_json

# Decode the Base64 string into bytes
image_data = base64.b64decode(image_base64)

# Display the image in the notebook
display(Image(image_data))

In [None]:
response = gpt.images.generate(
    model="dall-e-3",
    prompt="A vibrant, pop-art style image that represents the power of a Frontier LLM in solving real business use cases",
    size="1024x1024",
    quality="standard",
    n=1,
    response_format="b64_json"
)

# Extract the Base64 image data from the response
image_base64 = response.data[0].b64_json

# Decode the Base64 string into bytes
image_data = base64.b64decode(image_base64)

# Display the image in the notebook
display(Image(image_data))

## Recap: first we tried 6 Frontier LLMs through their chat interfaces
## Then in this notebook we called Cloud APIs
## Now try the 3rd way to use LLMs - direct inference of Open Source Models with HuggingFace

Visit this Google Colab notebook: https://colab.research.google.com/drive/1CRgX6RVqnWZDexXLACbq91pX2I7O7Swu?usp=sharing