# 01 - LLM Basics

**Welcome to Agentic AI!** This is your first notebook in the journey to building AI agents.

## Learning Objectives

By the end of this notebook, you will:
- Understand what Large Language Models (LLMs) are
- Set up API access for OpenAI and Anthropic
- Make your first API calls
- Understand key concepts: tokens, context windows, temperature

## Table of Contents

1. [What are LLMs?](#what-are-llms)
2. [Setup & Configuration](#setup)
3. [Your First API Call](#first-call)
4. [Understanding Tokens](#tokens)
5. [Parameters: Temperature & More](#parameters)
6. [Exercises](#exercises)
7. [Checkpoint](#checkpoint)

---
## 1. What are LLMs? <a id='what-are-llms'></a>

**Large Language Models (LLMs)** are AI systems trained on massive amounts of text data. They learn patterns in language and can:

- Generate human-like text
- Answer questions
- Summarize documents
- Write code
- And much more!

### How Do They Work? (Simplified)

1. **Training**: The model reads billions of text examples (books, websites, code)
2. **Pattern Learning**: It learns statistical patterns in language
3. **Generation**: Given a prompt, it repeatedly predicts a likely next token (a word or word fragment), building the response one token at a time
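
To make these three steps concrete, here is a toy sketch that "trains" on a tiny corpus by counting which word follows which, then "generates" by picking the most frequent follower. This is only an illustration of the idea: real LLMs use neural networks over tokens, not word-pair counts, and the names `train_bigram_model` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus: str) -> dict:
    """Training + pattern learning: count which word follows each word."""
    words = corpus.lower().split()
    follower_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def predict_next(model: dict, word: str):
    """Generation: return the most frequent follower, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = "the cat sat on the mat . the cat ate the fish . the cat slept ."
model = train_bigram_model(corpus)

print(predict_next(model, "the"))      # cat ("cat" follows "the" 3 times)
print(predict_next(model, "unicorn"))  # None (never seen in the corpus)
```

An LLM does something similar in spirit, but it conditions on the whole preceding context rather than a single word.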

### Popular LLMs

| Provider | Models | Strengths |
|----------|--------|----------|
| OpenAI | GPT-4, GPT-4o, GPT-4o-mini | General purpose, code, reasoning |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | Safety, long context, analysis |
| Google | Gemini Pro, Gemini Ultra | Multimodal, integration |
| Meta | Llama 3 | Open weights, customizable |

---
## 2. Setup & Configuration <a id='setup'></a>

First, let's set up our environment and API keys.

### Install Dependencies

Run this cell to ensure all packages are installed:

In [None]:
# GUIDED: Install required packages
# Uncomment and run if packages are not installed
# !pip install openai anthropic python-dotenv tiktoken

### Import Libraries

In [None]:
# GUIDED: Import required libraries
import os
import sys
from pathlib import Path

# Add the parent directory to the import path so shared utilities can be found
sys.path.append(str(Path.cwd().parent))

from dotenv import load_dotenv

# For direct API usage
from openai import OpenAI
from anthropic import Anthropic

print("Libraries imported successfully!")

### Configure API Keys

You'll need API keys from at least one provider:
- **OpenAI**: Get your key at [platform.openai.com](https://platform.openai.com/api-keys)
- **Anthropic**: Get your key at [console.anthropic.com](https://console.anthropic.com/)

Create a `.env` file in the `Agentic AI/` folder with:
```
OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
```

In [None]:
# GUIDED: Load API keys from .env file
load_dotenv(Path.cwd().parent / ".env")

# Check which keys are available
openai_key = os.getenv("OPENAI_API_KEY")
anthropic_key = os.getenv("ANTHROPIC_API_KEY")

print("API Key Status:")
print(f"  OpenAI: {'✓ Found' if openai_key else '✗ Not found'}")
print(f"  Anthropic: {'✓ Found' if anthropic_key else '✗ Not found'}")

if not openai_key and not anthropic_key:
    print("\n⚠️ No API keys found! Please add them to your .env file.")

---
## 3. Your First API Call <a id='first-call'></a>

Let's make your first call to an LLM!

### OpenAI API

In [None]:
# GUIDED: Make your first OpenAI API call
if openai_key:
    # Initialize the client (reads OPENAI_API_KEY from the environment)
    client = OpenAI()
    
    # Make a chat completion request
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Using the smaller, faster model
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello! What's your name?"}
        ]
    )
    
    # Extract the response
    answer = response.choices[0].message.content
    print("OpenAI Response:")
    print(answer)
else:
    print("OpenAI API key not found. Skipping this example.")

### Anthropic API

In [None]:
# GUIDED: Make your first Anthropic API call
if anthropic_key:
    # Initialize the client (reads ANTHROPIC_API_KEY from the environment)
    client = Anthropic()
    
    # Make a message request
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system="You are a helpful assistant.",
        messages=[
            {"role": "user", "content": "Hello! What's your name?"}
        ]
    )
    
    # Extract the response
    answer = response.content[0].text
    print("Anthropic Response:")
    print(answer)
else:
    print("Anthropic API key not found. Skipping this example.")
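
Notice the two differences between the calls above: OpenAI takes the system prompt as a message with `"role": "system"`, while Anthropic takes it as a separate `system` argument and also requires `max_tokens`. The sketch below shows one way to reshape OpenAI-style messages for Anthropic. The helper name `to_anthropic_kwargs` is hypothetical (it is not part of either SDK), and it assumes every message has plain string content.

```python
def to_anthropic_kwargs(openai_messages, model, max_tokens=1024):
    """Reshape OpenAI-style chat messages into keyword arguments
    suitable for Anthropic's client.messages.create()."""
    # Collect system text; Anthropic expects it outside the message list
    system_parts = [m["content"] for m in openai_messages if m["role"] == "system"]
    chat_messages = [m for m in openai_messages if m["role"] != "system"]

    kwargs = {"model": model, "max_tokens": max_tokens, "messages": chat_messages}
    if system_parts:
        kwargs["system"] = "\n\n".join(system_parts)
    return kwargs


openai_style = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello! What's your name?"},
]
kwargs = to_anthropic_kwargs(openai_style, model="claude-3-5-sonnet-20241022")
print(kwargs["system"])
print(kwargs["messages"])
```

With a configured client, the result could then be passed along as `client.messages.create(**kwargs)`.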

### Understanding the Response Structure

Let's examine what we get back from the API:

In [None]:
# GUIDED: Examine the full response object
if openai_key:
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello in 3 words."}]
    )
    
    print("Full Response Object:")
    print(f"  Model: {response.model}")
    print(f"  Finish Reason: {response.choices[0].finish_reason}")
    print(f"  Content: {response.choices[0].message.content}")
    print("\nUsage:")
    print(f"  Prompt tokens: {response.usage.prompt_tokens}")
    print(f"  Completion tokens: {response.usage.completion_tokens}")
    print(f"  Total tokens: {response.usage.total_tokens}")

---
## 4. Understanding Tokens <a id='tokens'></a>

**Tokens** are the basic units that LLMs process. A token is roughly:
- 4 characters in English
- 3/4 of a word
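
These two rules of thumb can be turned into a quick estimator for when no tokenizer is at hand. This is a rough sketch only: the function name `estimate_tokens` is made up here, averaging the two estimates is an arbitrary choice, and real counts depend on the tokenizer, the language, and the kind of text (code often tokenizes differently from prose).

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text from two rules of thumb."""
    by_characters = len(text) / 4        # ~4 characters per token
    by_words = len(text.split()) / 0.75  # a token is ~3/4 of a word
    # Average the two estimates and round up to stay on the safe side
    return math.ceil((by_characters + by_words) / 2)


sentence = "The quick brown fox jumps over the lazy dog."
print(estimate_tokens(sentence))  # 12 (44 chars -> 11, 9 words -> 12, average 11.5)
```

For anything where accuracy matters, such as billing, count with the model's actual tokenizer instead.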

### Why Tokens Matter

1. **Pricing**: You pay per token (input + output)
2. **Context Limits**: Each model has a maximum context window
3. **Speed**: More tokens = longer processing time

In [None]:
# GUIDED: Count tokens using tiktoken
import tiktoken

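# Get the encoder for GPT-4 (the cl100k_base encoding).
# Note: GPT-4o models use a different encoding (o200k_base), so their counts can differ.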
encoder = tiktoken.encoding_for_model("gpt-4")

# Example texts
texts = [
    "Hello!",
    "Hello, how are you today?",
    "The quick brown fox jumps over the lazy dog.",
    "def fibonacci(n):\n    if n <= 1: return n\n    return fibonacci(n-1) + fibonacci(n-2)"
]

print("Token Counting Examples:")
print("=" * 50)
for text in texts:
    tokens = encoder.encode(text)
    print(f"\nText: {text[:50]}{'...' if len(text) > 50 else ''}")
    print(f"Characters: {len(text)}")
    print(f"Tokens: {len(tokens)}")
    print(f"Ratio: {len(text)/len(tokens):.1f} chars per token")

### Context Windows

Each model has a maximum context size:

| Model | Context Window |
|-------|---------------|
| GPT-4o | 128K tokens |
| GPT-4o-mini | 128K tokens |
| Claude 3.5 Sonnet | 200K tokens |
| Claude 3 Opus | 200K tokens |

The context window must hold everything in a request: system prompt + conversation history + user message + the generated response.
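
One common way to stay within the window is to drop the oldest turns of a conversation while keeping the system prompt and leaving room for the response. Below is a minimal sketch of that idea. The function `trim_history` and its parameters are hypothetical, the demo counts words instead of real tokens to stay self-contained, and it ignores the small per-message overhead that real APIs add.

```python
def trim_history(messages, count_tokens, context_window, reserved_for_response):
    """Drop the oldest non-system messages until the prompt fits the budget."""
    budget = context_window - reserved_for_response
    system = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]

    def prompt_tokens(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    while history and prompt_tokens(system + history) > budget:
        history.pop(0)  # discard the oldest turn first
    return system + history


def count_words(text):
    # Stand-in for a real tokenizer, used only to keep the demo simple
    return len(text.split())


conversation = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Tell me about tokens."},
    {"role": "assistant", "content": "Tokens are text chunks."},
    {"role": "user", "content": "And context windows?"},
]
trimmed = trim_history(conversation, count_words,
                       context_window=20, reserved_for_response=10)
for message in trimmed:
    print(message["role"], "->", message["content"])
```

In practice you would pass a real token counter (for example one built on `tiktoken`) and would probably drop turns in user/assistant pairs so the trimmed history still starts with a user message.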

---
## 5. Parameters: Temperature & More <a id='parameters'></a>

You can control LLM behavior with various parameters.

### Temperature

Controls randomness in responses:
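- **0.0**: Near-deterministic; the model picks the most likely token at each step (small run-to-run differences can still occur)
- **0.7**: A common middle ground between consistency and variety
- **1.0+**: More random and creative; 1.0 is the API default for both OpenAI and Anthropic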
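
Under the hood, temperature is usually described as dividing the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below illustrates that with three made-up scores. The function name and the numbers are assumptions for illustration, and providers may implement sampling differently in detail.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Temperature 0 is typically handled as "always pick the top token"
        raise ValueError("temperature must be > 0 for this formula")
    scaled = [score / temperature for score in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.2, 0.7, 1.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At low temperature nearly all the probability lands on the top-scoring token; at higher temperature it spreads out, so the less likely tokens get sampled more often.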

In [None]:
# GUIDED: Experiment with temperature
if openai_key:
    client = OpenAI()
    
    prompt = "Write a one-sentence story about a robot."
    
    print("Temperature Comparison:")
    print("=" * 50)
    
    for temp in [0.0, 0.7, 1.2]:
        print(f"\nTemperature: {temp}")
        print("-" * 30)
        
        # Run 3 times to see variation
        for i in range(3):
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
                temperature=temp,
                max_tokens=50
            )
            print(f"  {i+1}. {response.choices[0].message.content}")

### Other Important Parameters

| Parameter | Description | Typical Values |
|-----------|-------------|----------------|
| `max_tokens` | Maximum response length | 100-4096 |
| `temperature` | Randomness | 0.0-2.0 (OpenAI), 0.0-1.0 (Anthropic) |
| `top_p` | Nucleus sampling | 0.0-1.0 |
| `stop` | Stop sequences | List of strings |
| `presence_penalty` | Penalize tokens that have already appeared, nudging toward new topics | -2.0 to 2.0 |
| `frequency_penalty` | Penalize tokens in proportion to how often they have appeared | -2.0 to 2.0 |
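
The `top_p` row deserves a short illustration. Nucleus sampling keeps only the smallest set of top-ranked tokens whose combined probability reaches `top_p`, then samples from that set. The sketch below shows the filtering step with made-up probabilities; `nucleus_filter` is a hypothetical name, not an SDK function.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize so the kept values sum to 1."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, probability in ranked:
        kept[token] = probability
        cumulative += probability
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {token: probability / total for token, probability in kept.items()}


# Made-up next-token probabilities
probs = {"robot": 0.5, "android": 0.3, "toaster": 0.15, "cloud": 0.05}

print(list(nucleus_filter(probs, 0.9)))  # ['robot', 'android', 'toaster']
print(list(nucleus_filter(probs, 0.4)))  # ['robot']
```

A lower `top_p` cuts off the unlikely tail entirely, which is a different lever from temperature: temperature reshapes the whole distribution, while `top_p` truncates it.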

In [None]:
# GUIDED: Using max_tokens to control response length
if openai_key:
    client = OpenAI()
    
    print("Max Tokens Comparison:")
    print("=" * 50)
    
    for max_tokens in [20, 50, 100]:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "Explain what Python is."}],
            max_tokens=max_tokens
        )
        content = response.choices[0].message.content
        finish = response.choices[0].finish_reason
        
        print(f"\nmax_tokens={max_tokens} (finish_reason: {finish}):")
        print(f"  {content}")
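
The `finish_reason` printed above tells you why generation ended, which is how you can detect a response that was cut off by `max_tokens`. Here is a small sketch of a helper that interprets it. The function name is made up, and the mapping covers only values commonly documented for OpenAI's Chat Completions API; other values may exist.

```python
def describe_finish_reason(finish_reason: str) -> str:
    """Translate an OpenAI-style finish_reason into a readable note."""
    explanations = {
        "stop": "Complete: the model finished naturally or hit a stop sequence.",
        "length": "Truncated: the max_tokens limit (or context limit) was reached.",
        "content_filter": "Withheld: content was omitted by a content filter.",
    }
    return explanations.get(finish_reason, f"Other finish reason: {finish_reason}")


for reason in ("stop", "length"):
    print(f"{reason}: {describe_finish_reason(reason)}")
```

If you see `length` more often than expected, raising `max_tokens` or asking for a shorter answer in the prompt are the usual remedies.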

---
## 6. Exercises <a id='exercises'></a>

Now it's your turn! Complete these exercises to practice what you've learned.

### Exercise 1: Basic API Call

Make an API call asking the model to explain what an AI agent is.

In [None]:
# TODO: Make an API call to explain what an AI agent is
# Use either OpenAI or Anthropic based on which key you have

# Your code here:


### Exercise 2: Token Estimation

Estimate the cost of processing a document. Given:
- Document: 5000 words
- Expected response: 500 words
- Model: GPT-4o-mini ($0.15/1M input, $0.60/1M output)

In [None]:
# TODO: Calculate the estimated cost
# Hint: ~0.75 words per token, i.e. about 1.33 tokens per word

# Your code here:
document_words = 5000
response_words = 500

# Calculate tokens

# Calculate cost


### Exercise 3: Temperature Experiment

Ask the model to generate a creative name for an AI startup. Run it at temperature 0.0 and 1.0. What differences do you observe?

In [None]:
# TODO: Compare responses at different temperatures

# Your code here:


---
## 7. Checkpoint <a id='checkpoint'></a>

Before moving to the next notebook, verify:

- [ ] You have at least one API key configured
- [ ] You successfully made an API call
- [ ] You understand what tokens are and why they matter
- [ ] You experimented with temperature and max_tokens
- [ ] You completed at least 2 exercises

### Next Steps

In the next notebook, we'll dive into **Prompt Engineering** - the art of crafting effective prompts to get better results from LLMs.

---
## Summary

**Key Takeaways:**

1. **LLMs** are AI systems that generate text by predicting the most likely next tokens
2. **API calls** follow a simple pattern: create a client, send messages, get response
3. **Tokens** are the units of processing - they affect cost, speed, and limits
4. **Temperature** controls creativity - lower = more predictable, higher = more random
5. **Context windows** limit how much text you can process at once

**Resources:**
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [Anthropic API Documentation](https://docs.anthropic.com)
- [Tiktoken Tokenizer](https://github.com/openai/tiktoken)