# Week 1: Understanding LLMs and API Basics

## Learning Objectives
By the end of this session, you will:
- Understand how large language models work at a conceptual level
- Successfully make API calls to OpenAI
- Understand key API parameters and their effects
- Build simple text generation scripts

## Part 1: How LLMs Work (Conceptual Overview)

### Tokens: The Building Blocks
- LLMs don't see words; they see **tokens**
- A token is roughly 3-4 characters of English text, or about 0.75 words (ratios differ for other languages and for code)
- "Hello world" ≈ 2-3 tokens
- This matters for cost and context limits!
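
The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, assuming English text and a ratio of about 4 characters or 0.75 words per token. `estimate_tokens` is a hypothetical helper, not part of any library; exact counts come from the model's tokenizer or from `response.usage`.

```python
import math

def estimate_tokens(text: str) -> int:
    """Rough token estimate from the chars-per-token and words-per-token heuristics."""
    by_chars = len(text) / 4               # assumption: ~4 characters per token
    by_words = len(text.split()) / 0.75    # assumption: ~0.75 words per token
    # Average the two heuristics and round up, so non-empty text never estimates to 0
    return math.ceil((by_chars + by_words) / 2)

print(estimate_tokens("Hello world"))
```

For "Hello world" the estimate lands in the 2-3 token range quoted above. Treat it as a budgeting aid only.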

### Prediction and Probability
- LLMs predict the next token based on all previous tokens
- They assign probabilities to many possible next tokens
- They don't "know" things - they predict statistically likely continuations
- Temperature controls randomness in selection
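
To make the idea of "probabilities over next tokens" concrete, here is a toy sketch. The probabilities are made up for illustration (a real model scores tens of thousands of candidate tokens), and `greedy_token` and `sample_token` are hypothetical helper names.

```python
import random

# Hypothetical next-token probabilities for the prompt "The sky is"
next_token_probs = {" blue": 0.62, " clear": 0.21, " falling": 0.09, " green": 0.08}

def greedy_token(probs: dict) -> str:
    """Pick the single most likely token."""
    return max(probs, key=probs.get)

def sample_token(probs: dict, rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded so the demo is repeatable
print("Greedy: ", greedy_token(next_token_probs))
print("Sampled:", [sample_token(next_token_probs, rng) for _ in range(5)])
```

Greedy selection always returns the same token, while sampling occasionally picks a less likely one. That difference is what temperature adjusts.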

### Key Limitations
- No real-time information (knowledge cutoff dates)
- Can "hallucinate" plausible-sounding but false information
- Cannot count tokens or characters perfectly
- Context window limits (how much text they can "remember")
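
The context window limit can be checked with simple arithmetic, since the prompt and the generated output generally share the same window. The sketch below assumes a 128,000-token window purely for illustration; `fits_in_context` is a hypothetical helper, and the real limit should be taken from your model's documentation.

```python
def fits_in_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output space fits in the window."""
    return prompt_tokens + max_output_tokens <= context_window

# Assumed window size, for illustration only
CONTEXT_WINDOW = 128_000

print(fits_in_context(prompt_tokens=120_000, max_output_tokens=4_000, context_window=CONTEXT_WINDOW))
print(fits_in_context(prompt_tokens=127_000, max_output_tokens=4_000, context_window=CONTEXT_WINDOW))
```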

## Part 2: Setting Up Your Environment

### Load Environment Variables
We use `python-dotenv` to load API keys from a `.env` file, which keeps them out of the source code.

In [None]:
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from .env file
load_dotenv()

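# Read the key once and fail early with a clear message if it is missing
api_key = os.getenv('OPENAI_API_KEY')
if not api_key:
    raise RuntimeError("OPENAI_API_KEY not found. Add it to your .env file.")

# Initialize the OpenAI client
client = OpenAI(api_key=api_key)

print("✓ Environment loaded successfully!")
print(f"✓ API key found: {api_key[:8]}...")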

## Part 3: Your First API Call

The basic structure of an OpenAI API call:
```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Your prompt here"}]
)
```
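
The `messages` argument is a plain list of dictionaries, each with a `role` and a `content` key. As a sketch, a small helper can assemble it; `build_messages` is a hypothetical name, not part of the OpenAI library, and the optional system message is covered in Part 4.

```python
from typing import Optional

def build_messages(user_prompt: str, system_prompt: Optional[str] = None) -> list:
    """Build the `messages` list for a chat completion request."""
    messages = []
    if system_prompt:
        # The system message, when present, goes first
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages

print(build_messages("Say hello!"))
print(build_messages("What is DNA?", system_prompt="You are a biology teacher."))
```

The returned list can be passed directly as `messages=` to `client.chat.completions.create(...)`.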

In [None]:
# Simple completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Say hello!"}
    ]
)

print(response.choices[0].message.content)

### Understanding the Response Object

Let's examine what the API returns:

In [None]:
# Make another call and explore the response
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What is 2+2?"}
    ]
)

print("Full response object:")
print(response)
print("\n" + "="*50 + "\n")

print("Just the content:")
print(response.choices[0].message.content)
print("\n" + "="*50 + "\n")

print("Token usage:")
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")

## Part 4: Key API Parameters

### Temperature (0.0 to 2.0)
Controls randomness:
- **0.0**: Near-deterministic; almost always picks the most likely token
- **0.7**: Balanced; a common choice for general use (the API default is 1.0)
- **1.5+**: Very random; output can become incoherent
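
The effect of temperature can be seen numerically with the common textbook formulation: divide each raw score (logit) by the temperature, then apply softmax. This is a sketch under that assumption. The logits are made up, `apply_temperature` is a hypothetical helper, and how the API applies temperature internally is not shown here.

```python
import math

def apply_temperature(logits: list, temperature: float) -> list:
    """Convert raw scores to probabilities, scaled by temperature (softmax)."""
    if temperature <= 0:
        # Treat 0 as greedy: all probability goes to the highest-scoring token
        best = logits.index(max(logits))
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temp in [0.0, 0.7, 1.5]:
    probs = apply_temperature(logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top token; higher temperatures flatten the distribution, so unlikely tokens are chosen more often.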

In [None]:
# Let's compare different temperatures
prompt = "Complete this sentence: The best thing about learning to code is"

for temp in [0.0, 0.7, 1.5]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=temp,
        max_tokens=50
    )
    print(f"Temperature {temp}:")
    print(response.choices[0].message.content)
    print("\n" + "-"*50 + "\n")

### Max Tokens
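Caps the number of tokens the model may generate. The response is cut off when the cap is reached (the model does not shorten its answer to fit), which makes this a hard ceiling for cost control.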
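
When a response hits the `max_tokens` cap, the API reports `finish_reason` as `"length"` on that choice. The sketch below checks for this. `was_truncated` is a hypothetical helper, and `fake_response` is a stand-in object shaped like a chat completion so the example runs without an API call.

```python
from types import SimpleNamespace

def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"

# Stand-in object shaped like a chat completion response, for illustration only
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="A large language model is"),
        )
    ]
)

if was_truncated(fake_response):
    print("Response was cut off; consider raising max_tokens.")
```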

In [None]:
# Compare different max_tokens
prompt = "Explain what a large language model is."

for max_tok in [20, 50, 150]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tok
    )
    print(f"Max tokens: {max_tok}")
    print(response.choices[0].message.content)
    print(f"Actual tokens used: {response.usage.completion_tokens}")
    print("\n" + "-"*50 + "\n")

### System Messages
Set the behavior and personality of the assistant:

In [None]:
# Without system message
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What is DNA?"}
    ]
)
print("Without system message:")
print(response.choices[0].message.content)
print("\n" + "="*50 + "\n")

# With system message
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a biology teacher explaining concepts to 10-year-olds. Use simple language and fun analogies."},
        {"role": "user", "content": "What is DNA?"}
    ]
)
print("With system message (10-year-old level):")
print(response.choices[0].message.content)

## Part 5: Practical Examples

### Example 1: Text Summarization

In [None]:
long_text = """
Large language models are artificial intelligence systems trained on vast amounts of text data. 
They learn patterns in language by predicting the next word in a sequence. These models have billions 
of parameters and can generate human-like text, answer questions, write code, and perform various 
language tasks. They work by converting text into numerical representations called tokens, processing 
these tokens through neural network layers, and generating probability distributions for likely next tokens.
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Summarize the following text in one sentence."},
        {"role": "user", "content": long_text}
    ],
    temperature=0.3
)

print("Summary:")
print(response.choices[0].message.content)

### Example 2: Text Classification

In [None]:
def classify_sentiment(text):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify the sentiment of the text as: positive, negative, or neutral. Respond with only one word."},
            {"role": "user", "content": text}
        ],
        temperature=0,
        max_tokens=10
    )
    return response.choices[0].message.content.strip()

# Test it
test_texts = [
    "I love this new feature!",
    "This is the worst experience ever.",
    "The product arrived on time."
]

for text in test_texts:
    sentiment = classify_sentiment(text)
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment}")
    print()
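
Even with a "respond with only one word" instruction, the model may occasionally return extra punctuation, capitalization, or a full sentence. One possible safeguard, sketched here with a hypothetical `normalize_sentiment` helper, is to validate the output against the expected labels before using it.

```python
VALID_LABELS = {"positive", "negative", "neutral"}

def normalize_sentiment(raw: str) -> str:
    """Map raw model output to one of the expected labels, or 'unknown'."""
    # Lowercase, then strip whitespace and common surrounding punctuation
    label = raw.strip().lower().strip(".!\"'")
    return label if label in VALID_LABELS else "unknown"

print(normalize_sentiment("Positive."))
print(normalize_sentiment("I think it's mixed"))
```

Returning an explicit `"unknown"` keeps unexpected outputs visible instead of silently passing them downstream.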

### Example 3: Information Extraction

In [None]:
def extract_info(text):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract the person's name, email, and phone number from the text. Format as: Name: X | Email: Y | Phone: Z"},
            {"role": "user", "content": text}
        ],
        temperature=0
    )
    return response.choices[0].message.content

contact_text = "Hi, I'm John Smith. You can reach me at john.smith@email.com or call me at 555-123-4567."

extracted = extract_info(contact_text)
print("Extracted information:")
print(extracted)
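
Because the system message requests a fixed `Name: X | Email: Y | Phone: Z` layout, the returned string can be turned into a dictionary. This is a minimal sketch: `parse_contact` is a hypothetical helper, and `sample` is an assumed model output that follows the requested format, which the model is not guaranteed to do.

```python
def parse_contact(line: str) -> dict:
    """Parse 'Name: X | Email: Y | Phone: Z' into a dictionary."""
    fields = {}
    for part in line.split("|"):
        # Split on the first colon only, so values containing ':' survive
        key, sep, value = part.partition(":")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields

# Assumed output, matching the format requested in the system message
sample = "Name: John Smith | Email: john.smith@email.com | Phone: 555-123-4567"
print(parse_contact(sample))
```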

## Part 6: Cost Awareness

Understanding and tracking your API costs:

In [None]:
# Pricing (as of late 2024, check current pricing!)
# gpt-4o-mini: ~$0.15 per 1M input tokens, ~$0.60 per 1M output tokens

def estimate_cost(response, model="gpt-4o-mini"):
    """Estimate the cost of an API call"""
    # Current pricing (verify at platform.openai.com/pricing)
    pricing = {
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},  # per 1M tokens
        "gpt-4o": {"input": 2.50, "output": 10.00}
    }
    
    input_cost = (response.usage.prompt_tokens / 1_000_000) * pricing[model]["input"]
    output_cost = (response.usage.completion_tokens / 1_000_000) * pricing[model]["output"]
    total_cost = input_cost + output_cost
    
    return {
        "input_tokens": response.usage.prompt_tokens,
        "output_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
        "cost_usd": total_cost,
        "cost_cents": total_cost * 100
    }

# Test it
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about programming."}]
)

print(response.choices[0].message.content)
print("\n" + "="*50)
cost_info = estimate_cost(response)
print(f"\nTokens used: {cost_info['total_tokens']}")
print(f"Estimated cost: ${cost_info['cost_usd']:.6f} (or {cost_info['cost_cents']:.4f} cents)")
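
A single call costs a fraction of a cent, so it helps to project the cost of many calls. The sketch below reuses the per-million-token prices from the cell above (verify them before relying on the numbers). `project_cost` is a hypothetical helper, and the workload of 1,000 calls at 500 prompt tokens and 200 completion tokens each is made up for illustration.

```python
# Prices in USD per 1M tokens, copied from the cell above; verify current pricing
PRICING = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def project_cost(calls: int, prompt_tokens: int, completion_tokens: int,
                 model: str = "gpt-4o-mini") -> float:
    """Estimate total USD cost for `calls` requests of the same size."""
    price = PRICING[model]
    per_call = (
        (prompt_tokens / 1_000_000) * price["input"]
        + (completion_tokens / 1_000_000) * price["output"]
    )
    return calls * per_call

for model in PRICING:
    print(f"{model}: ${project_cost(1_000, 500, 200, model):.4f}")
```

With these assumed prices, the same workload comes to about $0.20 on `gpt-4o-mini` and about $3.25 on `gpt-4o`, which shows how model choice drives cost.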

## Key Takeaways

1. **LLMs predict tokens** based on probability; they don't "know" facts
2. **API structure** is simple: model + messages + parameters
3. **Temperature** controls randomness (0 = near-deterministic, higher = more random)
4. **max_tokens** caps response length, which bounds the cost of each call
5. **System messages** shape the assistant's behavior
6. **Always monitor costs** - even small calls add up!

## Next Steps

In Week 2, we'll learn how to:
- Maintain conversation history
- Build multi-turn conversations
- Manage context effectively

Complete the assignment to practice these concepts!