# LLM API Examples - The Big 3

This notebook shows you how to call the three major LLM providers:
1. **OpenAI** (GPT-4, GPT-4o, GPT-3.5)
2. **Google Gemini** (Gemini 2.0, Gemini 1.5)
3. **Anthropic Claude** (Claude 3.5 Sonnet, Claude Opus)

## What are LLMs?
Large Language Models (LLMs) are AI models that can:
- Generate human-like text
- Answer questions
- Have conversations
- Write code, summarize text, translate, and more

## Learning Path Check ✅
This notebook is the second step in a three-part sequence:
1. **Embeddings**  → Convert text to vectors for similarity/search
2. **LLM APIs**  → Generate intelligent responses
3. **Next: RAG** → Combine both (use embeddings to find context, LLM to generate answers)

## Setup: Install Required Libraries

In [None]:
# Uncomment and run if you need to install:
# !pip install openai google-genai anthropic python-dotenv

## Setup: API Keys

Add these to your `.env` file:
```
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AI...
ANTHROPIC_API_KEY=sk-ant-...
```

Get your API keys:
- **OpenAI**: https://platform.openai.com/api-keys
- **Google**: https://aistudio.google.com/apikey
- **Anthropic**: https://console.anthropic.com/settings/keys
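
Before running the examples, it can help to confirm that all three keys are actually visible to Python. The following is a minimal sketch, not part of any provider SDK: the helper name `missing_api_keys` is hypothetical, and the variable names are the ones from the `.env` example above. In the notebook you would call `load_dotenv()` first so the `.env` values are loaded into `os.environ`.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "ANTHROPIC_API_KEY"]


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_api_keys()
if missing:
    print("Missing keys:", ", ".join(missing))
else:
    print("All three API keys are set.")
```

Checking up front gives a readable message instead of a `KeyError` from `os.environ["..."]` halfway through a cell.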

---

# Example 1: OpenAI (GPT-4o)

**Model:** `gpt-4o` (GPT-4 omni - fast, intelligent, multimodal)

**Popular Models:**
- `gpt-4o` - Best balance of speed and intelligence
- `gpt-4o-mini` - Faster, cheaper, good for simple tasks
- `gpt-3.5-turbo` - Older legacy model; `gpt-4o-mini` is generally cheaper and more capable

In [None]:
from openai import OpenAI
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Initialize client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Simple chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Explain what machine learning is in one sentence."}
    ],
    temperature=0.7,  # 0-2, higher = more creative
    max_tokens=100    # Maximum response length
)

# Extract the response
answer = response.choices[0].message.content

print("OpenAI Response:")
print(answer)
print(f"\nTokens used: {response.usage.total_tokens}")

OpenAI Response:
Machine learning is a field of artificial intelligence that involves training algorithms to recognize patterns and make decisions based on data.

Tokens used: 49


---

# Example 2: Google Gemini

**Model:** `gemini-2.5-flash` (fast, general-purpose; used in the generation cells below)

**Popular Models:**
- `gemini-2.5-flash` - Fast, general-purpose (used in the cells below)
- `gemini-2.0-flash-exp` - Experimental 2.0 release
- `gemini-1.5-pro` - Older generation, strong reasoning
- `gemini-1.5-flash` - Older generation, fast and cost-effective

In [None]:
# List the models currently available to this API key

from google import genai
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Initialize client
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

models = client.models.list()
for model in models:
    print(f"Model: {model.name}")

Model: models/embedding-gecko-001
Model: models/gemini-2.5-flash
Model: models/gemini-2.5-pro
Model: models/gemini-2.0-flash-exp
Model: models/gemini-2.0-flash
Model: models/gemini-2.0-flash-001
Model: models/gemini-2.0-flash-exp-image-generation
Model: models/gemini-2.0-flash-lite-001
Model: models/gemini-2.0-flash-lite
Model: models/gemini-2.0-flash-lite-preview-02-05
Model: models/gemini-2.0-flash-lite-preview
Model: models/gemini-exp-1206
Model: models/gemini-2.5-flash-preview-tts
Model: models/gemini-2.5-pro-preview-tts
Model: models/gemma-3-1b-it
Model: models/gemma-3-4b-it
Model: models/gemma-3-12b-it
Model: models/gemma-3-27b-it
Model: models/gemma-3n-e4b-it
Model: models/gemma-3n-e2b-it
Model: models/gemini-flash-latest
Model: models/gemini-flash-lite-latest
Model: models/gemini-pro-latest
Model: models/gemini-2.5-flash-lite
Model: models/gemini-2.5-flash-image-preview
Model: models/gemini-2.5-flash-image
Model: models/gemini-2.5-flash-preview-09-2025
Model: models/gemini-2.5-

In [None]:
from google import genai
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Initialize client
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

# Simple chat completion
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain what machine learning is in one sentence.",
    config={
        "temperature": 0.7,
        # Raised from 100 to 1000: on Gemini 2.5 models, internal "thinking"
        # tokens appear to count against this limit as well
        "max_output_tokens": 1000
    }
)

# Extract the response
answer = response.text

print("Google Gemini Response:")
print(answer)


Google Gemini Response:
Machine learning enables computers to learn from data, identify patterns, and make decisions or predictions without being explicitly programmed for every task.


In [None]:
from google import genai
from google.genai import types
from dotenv import load_dotenv
import os

load_dotenv()
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain what machine learning is in one sentence.",
    config=types.GenerateContentConfig(
        temperature=0.7,
        max_output_tokens=1000  # Raise this if the response comes back empty or cut off
    )
)

answer = response.text
print("Google Gemini Response:")
print(answer)

# Check WHY it stopped
print(f"\nFinish reason: {response.candidates[0].finish_reason}")
print(f"\nTokens used:")
print(f"Input tokens: {response.usage_metadata.prompt_token_count}")
print(f"Output tokens: {response.usage_metadata.candidates_token_count}")
print(f"Total tokens: {response.usage_metadata.total_token_count}")


Google Gemini Response:
Machine learning is a method where computers learn from data to identify patterns and make predictions or decisions, without being explicitly programmed for every task.

Finish reason: STOP

Tokens used:
Input tokens: 10
Output tokens: 27
Total tokens: 994
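
The total above (994) is much larger than input plus output (10 + 27). On Gemini 2.5 models the gap is most likely internal "thinking" tokens, which are billed and counted in the total but not in `candidates_token_count`. The sketch below just does that arithmetic with the numbers from this run; the function name `hidden_tokens` is hypothetical. In the `google-genai` SDK, `response.usage_metadata.thoughts_token_count` should report the same figure directly (it may be `None` on models that do not think).

```python
def hidden_tokens(prompt_tokens, output_tokens, total_tokens):
    """Tokens counted in the total but not shown as prompt or visible output."""
    return total_tokens - prompt_tokens - output_tokens


# Numbers copied from the output above.
print(hidden_tokens(prompt_tokens=10, output_tokens=27, total_tokens=994))  # 957
```

This also suggests why a small `max_output_tokens` can produce an empty answer: the budget may be spent before any visible text is generated.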


---

# Example 3: Anthropic Claude 3.5

**Model:** `claude-3-5-sonnet-20241022` (Claude 3.5 Sonnet - strong at coding)

**Popular Models:**
- `claude-3-5-sonnet-20241022` - Best balance of speed and intelligence
- `claude-3-5-haiku-20241022` - Fastest, cheapest
- `claude-opus-4` - Most intelligent (expensive)

In [None]:
from anthropic import Anthropic
from dotenv import load_dotenv
import os

# Load environment variables
load_dotenv()

# Initialize client
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Simple chat completion
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=100,  # Maximum response length
    temperature=0.7, # 0-1, higher = more creative
    messages=[
        {"role": "user", "content": "Explain what machine learning is in one sentence."}
    ]
)

# Extract the response
answer = response.content[0].text

print("Anthropic Claude Response:")
print(answer)
print(f"\nTokens used: Input={response.usage.input_tokens}, Output={response.usage.output_tokens}")
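
With `max_tokens=100`, a longer answer would simply be cut off. Each provider reports why generation stopped, under a different field name and value. The helper below is a small sketch (the name `was_truncated` and the provider labels are hypothetical) that normalizes the three checks. The field locations in the comments are the ones used by the SDK calls in this notebook.

```python
# Stop-reason values that signal the output hit the token limit.
TRUNCATION_REASONS = {
    "openai": "length",          # response.choices[0].finish_reason
    "anthropic": "max_tokens",   # response.stop_reason
    "gemini": "MAX_TOKENS",      # response.candidates[0].finish_reason
}


def was_truncated(provider, reason):
    """Return True if `reason` means the response was cut off by the token limit.

    `reason` may be a plain string or an enum member (Gemini returns an enum),
    so the enum's name is used when one is available.
    """
    name = getattr(reason, "name", reason)
    return str(name) == TRUNCATION_REASONS[provider]


print(was_truncated("anthropic", "max_tokens"))  # True
print(was_truncated("openai", "stop"))           # False
```

For the Claude cell above, the call would be `was_truncated("anthropic", response.stop_reason)`.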

---

# Bonus: Conversation with Context (Multi-turn Chat)

These chat APIs are stateless, so real applications must keep the conversation history themselves and send it with every request. Here's how:

## OpenAI Conversation Example

In [None]:
from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Conversation history
messages = [
    {"role": "system", "content": "You are a helpful AI tutor."},
    {"role": "user", "content": "What is Python?"},
]

# First exchange
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})

print("User: What is Python?")
print(f"Assistant: {assistant_reply}\n")

# Follow-up question: the API is stateless, so the full history is sent again
messages.append({"role": "user", "content": "What are its main use cases?"})
response = client.chat.completions.create(model="gpt-4o", messages=messages)
assistant_reply = response.choices[0].message.content

print("User: What are its main use cases?")
print(f"Assistant: {assistant_reply}")

User: What is Python?
Assistant: Python is a high-level, interpreted programming language known for its simplicity and readability, making it an excellent choice for beginners and experienced developers alike. Created by Guido van Rossum and first released in 1991, Python emphasizes clear and concise code, utilizing indentation to define code blocks rather than brackets or keywords.

Python supports multiple programming paradigms, including procedural, object-oriented, and functional programming. It has a vast standard library and a thriving ecosystem of third-party packages, making it highly versatile for a wide range of applications. Some common use cases for Python include:

1. **Web Development**: Frameworks like Django and Flask enable the creation of robust web applications.
2. **Data Analysis and Machine Learning**: Libraries such as Pandas, NumPy, SciPy, and Scikit-learn empower scientists and analysts to process and analyze data efficiently.
3. **Artificial Intelligence**: Ten

## Claude Conversation Example

In [None]:
from anthropic import Anthropic
from dotenv import load_dotenv
import os

load_dotenv()
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Conversation history
messages = [
    {"role": "user", "content": "What is Python?"},
]

# First exchange
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=200,
    messages=messages
)
assistant_reply = response.content[0].text
messages.append({"role": "assistant", "content": assistant_reply})

print("User: What is Python?")
print(f"Assistant: {assistant_reply}\n")

# Follow-up question: the API is stateless, so the full history is sent again
messages.append({"role": "user", "content": "What are its main use cases?"})
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=200,
    messages=messages
)
assistant_reply = response.content[0].text

print("User: What are its main use cases?")
print(f"Assistant: {assistant_reply}")
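
Both conversation examples follow the same pattern: append the user message, send the whole list, append the reply. A small wrapper can capture that pattern once. This is an illustrative sketch, not an SDK feature: the `Conversation` class and the `fake_send` stand-in are hypothetical, and `send` is any function that takes the message list and returns the reply text.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps the message history and resends all of it on every turn."""

    def __init__(self, send: Callable[[List[Message]], str]):
        self.send = send
        self.messages: List[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the provider call cannot mutate our history.
        reply = self.send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: List[Message]) -> str:
    """Stand-in for a real API call; reports how much history it received."""
    return f"(reply to {len(messages)} message(s))"


chat = Conversation(fake_send)
print(chat.ask("What is Python?"))               # (reply to 1 message(s))
print(chat.ask("What are its main use cases?"))  # (reply to 3 message(s))
```

To use it with a real provider, replace `fake_send` with a function that wraps the call from the cells above, for example one that returns `client.chat.completions.create(model="gpt-4o", messages=msgs).choices[0].message.content`.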

---

# Quick Reference

## Common Parameters Explained

| Parameter | Range | What it does |
|-----------|-------|-------------|
| **temperature** | 0-2 (OpenAI, Google)<br>0-1 (Claude) | Controls randomness:<br>• 0 = deterministic, focused<br>• 1+ = creative, varied |
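| **max_tokens** | 1-model limit | Maximum length of the response, in tokens (named `max_output_tokens` in Gemini) |
| **top_p** | 0-1 | Nucleus sampling: sample only from the most likely tokens whose cumulative probability reaches `top_p`; usually tuned instead of temperature, not together with it |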
| **messages** | Array | Conversation history (role + content) |
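
To make `temperature` and `top_p` concrete, here is a toy sketch of the textbook formulation: temperature divides the logits before the softmax, and nucleus sampling keeps the smallest set of top tokens whose cumulative probability reaches `top_p`. The providers do not publish their exact sampling code, so treat this as an illustration of the idea, with hypothetical function names and made-up logits.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
print(softmax_with_temperature(logits, 0.5))  # sharply favors the first token
print(softmax_with_temperature(logits, 1.5))  # flatter, more varied sampling
print(top_p_filter([0.5, 0.3, 0.2], 0.7))     # keeps only the top two tokens
```

A temperature of exactly 0 is handled by the APIs as "always pick the most likely token", which is why the sketch rejects it rather than dividing by zero.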

## Message Roles

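- **system**: Sets the AI's behavior/personality. OpenAI takes it as a message with the `system` role, Claude takes it as a top-level `system` parameter, and Gemini takes it as `system_instruction` in the config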
- **user**: Your questions/prompts
- **assistant**: AI's responses
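
All three providers accept a system prompt, but each expects it in a different place. The sketch below builds the keyword arguments for each SDK call without making a network request. The function name `build_request` and the provider labels are hypothetical; the parameter names (`messages`, `system`, `config` with `system_instruction`) are the ones the SDKs used in this notebook accept.

```python
def build_request(provider, model, system_prompt, user_prompt):
    """Return the kwargs for each provider's call, with the system prompt
    placed where that provider expects it."""
    if provider == "openai":
        # Pass to client.chat.completions.create(**kwargs)
        return {
            "model": model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
        }
    if provider == "anthropic":
        # Pass to client.messages.create(**kwargs); max_tokens is required
        return {
            "model": model,
            "max_tokens": 100,
            "system": system_prompt,
            "messages": [{"role": "user", "content": user_prompt}],
        }
    if provider == "gemini":
        # Pass to client.models.generate_content(**kwargs)
        return {
            "model": model,
            "contents": user_prompt,
            "config": {"system_instruction": system_prompt},
        }
    raise ValueError(f"Unknown provider: {provider}")


kwargs = build_request(
    "anthropic",
    "claude-3-5-sonnet-20241022",
    "You are a helpful AI tutor.",
    "What is Python?",
)
print(kwargs["system"])  # You are a helpful AI tutor.
```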

## Cost Comparison (Approximate, Jan 2025)

| Provider | Model | Input (per 1M tokens) | Output (per 1M tokens) |
|----------|-------|----------------------|------------------------|
| OpenAI | gpt-4o | $2.50 | $10.00 |
| OpenAI | gpt-4o-mini | $0.15 | $0.60 |
| Google | gemini-1.5-pro | $1.25 | $5.00 |
| Google | gemini-1.5-flash | $0.075 | $0.30 |
| Anthropic | claude-3.5-sonnet | $3.00 | $15.00 |
| Anthropic | claude-3.5-haiku | $0.80 | $4.00 |
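
The table translates into a per-request estimate with simple arithmetic: tokens times price, divided by one million. The sketch below hard-codes the approximate prices from the table, so its results are only as current as those figures; the `estimate_cost` helper is hypothetical.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES_PER_MILLION = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gemini-1.5-pro": (1.25, 5.00),
    "gemini-1.5-flash": (0.075, 0.30),
    "claude-3.5-sonnet": (3.00, 15.00),
    "claude-3.5-haiku": (0.80, 4.00),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Approximate cost in USD for one request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# A request with 1,000 input tokens and 500 output tokens on gpt-4o:
print(f"${estimate_cost('gpt-4o', 1000, 500):.4f}")  # $0.0075
```

The token counts come from the usage fields printed in the examples, such as `response.usage` for OpenAI and Claude or `response.usage_metadata` for Gemini.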

## When to Use Which?

- **OpenAI GPT-4o**: Best for general tasks, great multimodal support
- **Google Gemini**: Great for free tier learning, good multilingual support
- **Claude**: Best for coding tasks, long context (200K tokens), strong reasoning
