# Foundations: Understanding LLMs and Basic Interactions

Welcome to your first hands-on notebook for learning AI agents! This notebook will teach you the fundamental building blocks of working with Large Language Models (LLMs) programmatically.

## What You'll Learn

In this notebook, you'll understand:
1. How to securely load API keys and configuration from environment files
2. The message structure used by most chat-style LLM APIs
3. How to design effective prompts that guide model behavior
4. How to implement conversation memory to maintain context
5. Why these foundations matter for building AI agents

## Prerequisites

- Basic Python knowledge (variables, functions, dictionaries, lists)
- Understanding that LLMs are AI models that generate text based on input

## Why This Matters

Before you can build sophisticated AI agents, you need to understand how to communicate with LLMs. Every agent system - from simple chatbots to complex multi-agent workflows - builds on these core concepts:
- Secure configuration management
- Message structure and roles
- Prompt engineering
- Conversation state management

Think of this as learning the alphabet before writing sentences. Master these basics, and everything else becomes much easier.

Let's dive in!

## Step 1: Secure Configuration Management

When working with LLMs, you'll need API keys and configuration. Never hardcode these in your code: a key committed to version control or shared inside a notebook can be copied and misused by anyone who sees it.

Instead, we use environment files (.env) that:
- Store secrets locally on your machine
- Are excluded from version control (via .gitignore)
- Can be loaded at runtime

Below is a minimal environment loader that doesn't require external dependencies. It searches upward from the current directory (up to five parent levels by default) to find a .env file, so it works regardless of which project subdirectory you run the notebook from.

In [None]:
# Minimal environment loader that does NOT require python-dotenv
# It searches upward from the current working directory for a `.env` file
import os
from pathlib import Path

def find_upwards(filename='.env', start_dir=None, max_levels=5):
    """
    Search upward through parent directories to find a file.
    
    Why search upward? When you run a notebook, the current directory might be
    the notebook folder, but your .env file is typically at the project root.
    This function climbs up the directory tree to find it.
    
    Args:
        filename: Name of file to find (default: '.env')
        start_dir: Where to start searching (default: current directory)
        max_levels: How many parent directories to check (default: 5)
    
    Returns:
        Path to the file if found, None otherwise
    """
    # Start from specified directory or current working directory
    start = Path(start_dir or Path.cwd())
    current = start.resolve()  # Get absolute path
    
    # Search up to max_levels parent directories
    for _ in range(max_levels + 1):
        candidate = current / filename  # Build full path
        if candidate.exists():
            return candidate  # Found it!
        
        # Move to parent directory
        if current.parent == current:  # Reached filesystem root
            break
        current = current.parent
    
    return None  # File not found

def load_dotenv_if_present(dotenv_path=None, max_levels=5):
    """
    Load environment variables from a .env file.
    
    This function:
    1. Finds the .env file (searching upward if needed)
    2. Parses each line for KEY=VALUE pairs
    3. Sets them as environment variables (accessible via os.environ)
    
    Args:
        dotenv_path: Specific path to .env (optional)
        max_levels: How many directories to search upward (default: 5)
    
    Returns:
        Dictionary of loaded variables
    """
    # If specific path provided, use it
    if dotenv_path:
        p = Path(dotenv_path)
        if not p.is_absolute():
            p = Path.cwd() / p
        if not p.exists():
            print('No .env file found at', p.resolve())
            return {}
    else:
        # Search upward from current directory
        p = find_upwards('.env', start_dir=Path.cwd(), max_levels=max_levels)
        if p is None:
            print('No .env file found searching up from', Path.cwd())
            return {}
    
    # Parse the .env file
    data = {}
    with p.open(encoding='utf-8') as f:  # Explicit encoding avoids platform-dependent defaults
        for line in f:
            line = line.strip()
            
            # Skip empty lines and comments
            if not line or line.startswith('#'):
                continue
            
            # Parse KEY=VALUE format
            if '=' not in line:
                continue
            
            key, value = line.split('=', 1)  # Split on first '=' only
            key = key.strip()
            value = value.strip().strip('"').strip("'")  # Remove quotes
            
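            # Set as environment variable (only if not already set).
            # Note: `data` records the value from the file even when an
            # existing environment variable takes precedence.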
            os.environ.setdefault(key, value)
            data[key] = value
    
    print(f'Loaded {len(data)} entries from {p}')
    return data

# Run the loader; it searches up to 5 parent directories by default
# and returns an empty dict (instead of raising) if no .env file is found
_loaded_env = load_dotenv_if_present()

# If no .env file found, that's okay! You can still run this notebook
# without connecting to external LLM APIs

Loaded 4 entries from C:\Training\Udacity\AI_Agents_LangGraph\.env


### What Just Happened?

The code above did three important things:

1. **Defined helper functions** to find and load .env files
2. **Searched upward** through parent directories to find your .env file
3. **Loaded environment variables** into `os.environ` so they're accessible throughout your code

**Key Programming Concepts:**
- **Path manipulation:** Using `pathlib.Path` for cross-platform file paths
- **File I/O:** Reading files line by line
- **String parsing:** Splitting KEY=VALUE pairs
- **Environment variables:** Setting values in `os.environ`

**Security Note:** Make sure `.env` is listed in your `.gitignore` so it never gets committed to version control. The loader above does not check this for you, and keeping the file out of your repository is what keeps your API keys safe.
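
Once variables are in `os.environ`, it helps to read them in a way that fails early and never prints the full secret. The following is a minimal sketch of that idea; `require_env`, `mask_secret`, and the variable name `EXAMPLE_API_KEY` are hypothetical names chosen for illustration and are not part of the loader above.

```python
import os

def require_env(name, default=None):
    """Return an environment variable, or raise a clear error if it is missing."""
    value = os.environ.get(name, default)
    if value is None or value == "":
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

def mask_secret(value, visible=4):
    """Show only the last few characters so logs never expose the full key."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]

# Demo with a placeholder value; EXAMPLE_API_KEY is a hypothetical name
os.environ.setdefault("EXAMPLE_API_KEY", "sk-demo-1234567890")
api_key = require_env("EXAMPLE_API_KEY")
print("Key loaded:", mask_secret(api_key))
```

Raising on a missing key surfaces configuration problems at startup rather than in the middle of an API call.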

## Step 2: Understanding LLM Message Structure

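Most chat-style LLM APIs (OpenAI, Azure, Google, Anthropic) use a similar message structure. The details vary by provider (for example, some accept the system prompt as a separate parameter rather than as a message), but the concepts carry over. Understanding this structure is crucial because it's the foundation of every agent interaction.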

### Message Roles

There are three key roles:

1. **system**: Sets the AI's behavior, personality, and constraints
   - "You are a helpful coding assistant"
   - "You are an expert in Python data analysis"
   - This message usually comes first in the list and shapes every response that follows

2. **user**: Input from the human or application
   - Questions, commands, data to analyze
   - What you want the LLM to process

3. **assistant**: Outputs from the LLM
   - The model's responses
   - Used when showing conversation history
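
Because a message list is plain data, it can be checked before it is sent anywhere. The sketch below is one possible validator, assuming only the three roles described above and string-only content; real providers may accept additional roles or richer content types, so treat `validate_messages` as a hypothetical helper rather than any API's rule set.

```python
VALID_ROLES = {"system", "user", "assistant"}

def validate_messages(messages):
    """Return a list of problems found in a message list (empty if none)."""
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: expected a dict, got {type(msg).__name__}")
            continue
        role = msg.get("role")
        if role not in VALID_ROLES:
            problems.append(f"message {i}: unknown role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: 'content' must be a string")
        if role == "system" and i != 0:
            problems.append(f"message {i}: system message should come first")
    return problems

sample = [
    {"role": "system", "content": "You are a helpful coding assistant"},
    {"role": "user", "content": "What does this function do?"},
]
print(validate_messages(sample) or "No problems found")
```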

### Why This Structure Matters

The LLM needs context to generate good responses. By maintaining a list of messages with roles, we can:
- Show the LLM the conversation history
- Control the AI's behavior with system messages
- Build multi-turn conversations where the LLM remembers context

Let's see this in action:

In [None]:
# Example: Building a message list for an LLM conversation
# Each message is a dictionary with 'role' and 'content' keys

messages = [
    # System message sets behavior - this is like giving instructions to the AI
    {"role": "system", "content": "You are a helpful assistant who explains technical concepts clearly using analogies."},
    
    # User message provides the query or command
    {"role": "user", "content": "Explain what an API is in simple terms."},
    
    # In a real conversation, the assistant's response would be added here
    # For example: {"role": "assistant", "content": "An API is like a waiter in a restaurant..."}
]

# Let's examine the structure
print("Message structure example:")
print("-" * 50)
for i, msg in enumerate(messages, 1):
    print(f"\nMessage {i}:")
    print(f"  Role: {msg['role']}")
    print(f"  Content: {msg['content'][:60]}..." if len(msg['content']) > 60 else f"  Content: {msg['content']}")

# This is the data structure you'll pass to LLM APIs
print("\n" + "=" * 50)
print("Full structure:", messages)

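Message structure example:
--------------------------------------------------

Message 1:
  Role: system
  Content: You are a helpful assistant who explains technical concepts ...

Message 2:
  Role: user
  Content: Explain what an API is in simple terms.

==================================================
Full structure: [{'role': 'system', 'content': 'You are a helpful assistant who explains technical concepts clearly using analogies.'}, {'role': 'user', 'content': 'Explain what an API is in simple terms.'}]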


### What You Just Learned

You now understand:
- **Messages are dictionaries** with "role" and "content" keys
- **System messages control behavior** - they're like giving instructions to an employee
- **User messages are inputs** - questions, commands, data
- **Assistant messages are outputs** - the LLM's responses
- **Order matters** - messages are processed sequentially

**In a real application:** You'd send this message list to an LLM API, get back an assistant message, add it to the list, and continue the conversation.
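
The send, receive, append cycle described above can be sketched without any network access. In this sketch `fake_llm` is a hypothetical stand-in for a real API call (it only echoes the last user message), and `chat_turn` is an illustrative helper name.

```python
def fake_llm(messages):
    """Stand-in for a real API call; echoes the last user message."""
    last_user = next(m["content"] for m in reversed(messages) if m["role"] == "user")
    return {"role": "assistant", "content": f"(stub reply to: {last_user})"}

def chat_turn(messages, user_input, llm=fake_llm):
    """Append the user's input, call the model, and append its reply."""
    messages.append({"role": "user", "content": user_input})
    reply = llm(messages)
    messages.append(reply)
    return reply["content"]

conversation = [{"role": "system", "content": "You are a helpful assistant."}]
chat_turn(conversation, "Explain what an API is in simple terms.")
chat_turn(conversation, "Can you give a second analogy?")

# The list now holds the whole exchange, in order
for msg in conversation:
    print(f"{msg['role']:>9}: {msg['content']}")
```

Swapping `fake_llm` for a function that calls a real provider would leave the rest of the loop unchanged, which is the main reason to keep the model call behind a single function.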

## Step 3: Prompt Engineering Basics

Prompt engineering is the art and science of crafting effective instructions for LLMs. Good prompts get better results.

### Key Principles

1. **Be specific**: Vague prompts get vague answers
2. **Provide context**: Relevant background, audience, and constraints help the model give a fitting answer
3. **Use examples**: Show the LLM what you want (few-shot prompting)
4. **Structure clearly**: Use formatting, sections, constraints
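
As an illustration of the "use examples" principle, worked examples can be placed in the message list as prior user/assistant turns ahead of the real query. This is a minimal sketch; `build_few_shot_messages` and the sentiment task are hypothetical and chosen only for demonstration.

```python
def build_few_shot_messages(system, examples, query):
    """Build a message list with worked examples placed before the real query.

    examples: list of (user_text, assistant_text) pairs.
    """
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages

few_shot = build_few_shot_messages(
    system="Classify the sentiment of each review as positive or negative. Reply with one word.",
    examples=[
        ("The battery lasts all day.", "positive"),
        ("It broke after a week.", "negative"),
    ],
    query="Setup was quick and painless.",
)
for m in few_shot:
    print(m["role"], "->", m["content"])
```

The system message here is also specific about the task and the output format, which covers the first and fourth principles at the same time.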

### Template Pattern

Often you'll want to reuse prompt structures with different values. Python's built-in `string.Template` is perfect for this - no external dependencies needed!

Let's create reusable prompt templates:

In [None]:
# Prompt template example using built-in Template
from string import Template

# Create a template with placeholders marked by $variable_name
# This allows you to fill in values without rewriting the whole prompt
template = Template('''System: $system

User: $user_prompt

Expected format: $output_format''')

# Now we can use this template with different values
example_1 = template.substitute(
    system='You are a data analyst who summarizes tables clearly',
    user_prompt='Describe the columns in this sales dataset',
    output_format='Bullet list with column name and data type'
)

print("Example 1 - Data Analysis Prompt:")
print(example_1)
print("\n" + "=" * 60 + "\n")

# Same template, different use case
example_2 = template.substitute(
    system='You are a code reviewer who provides constructive feedback',
    user_prompt='Review this Python function for security issues',
    output_format='List of issues with severity ratings'
)

print("Example 2 - Code Review Prompt:")
print(example_2)

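Example 1 - Data Analysis Prompt:
System: You are a data analyst who summarizes tables clearly

User: Describe the columns in this sales dataset

Expected format: Bullet list with column name and data type

============================================================

Example 2 - Code Review Prompt:
System: You are a code reviewer who provides constructive feedback

User: Review this Python function for security issues

Expected format: List of issues with severity ratings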


### Understanding Templates

**What happened here?**
- We defined a template with `$variable_name` placeholders
- Used `.substitute()` to fill in the placeholders
- Got different prompts from the same template structure

**Why this matters:**
- **Consistency**: All prompts follow the same structure
- **Reusability**: Write the template once, use it many times
- **Maintainability**: Update the template in one place
- **Testing**: Easy to test different values

**Programming Concepts:**
- String interpolation with Template
- Parameterized text generation
- Separation of structure from content
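
One detail worth knowing when reusing templates: `substitute()` raises `KeyError` if any placeholder is left without a value, while `safe_substitute()` leaves unknown placeholders in place. The short sketch below shows the difference using the same placeholder names as the template above; the variable names `prompt_template` and `values` are illustrative.

```python
from string import Template

prompt_template = Template(
    "System: $system\n\nUser: $user_prompt\n\nExpected format: $output_format"
)

# output_format is deliberately missing
values = {"system": "You are a data analyst", "user_prompt": "Describe the columns"}

# substitute() raises KeyError when a placeholder has no value
try:
    prompt_template.substitute(values)
except KeyError as exc:
    print("Missing placeholder:", exc)

# safe_substitute() leaves unknown placeholders untouched instead of raising
partial = prompt_template.safe_substitute(values)
print(partial)
```

Using `substitute()` is usually the safer default for prompts, since a half-filled template sent to a model is harder to notice than an exception.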

## Step 4: Memory Management

### Why Memory Matters

**The Challenge:**
- LLMs are stateless - they don't remember previous conversations
- Each call is independent - the model forgets everything after responding
- Context window limits - you can't send infinite conversation history

**The Solution: Memory Management**
- Store conversation history to maintain context
- Limit memory to stay within token/character budgets
- Convert memory into system prompts for context injection

**Real-World Applications:**
- Chatbots that remember user preferences
- Multi-turn conversations with context
- Personalized responses based on history
- Debugging and conversation analysis

### What You'll Learn
- How to build a Memory class from scratch
- Character-based memory limiting
- Converting history into system prompts
- Managing conversation context efficiently

In [None]:
# Memory class that limits stored characters and can produce a system prompt

class Memory:
    """
    A simple conversation memory manager with character-based limiting.
    
    Why we need this:
    - LLMs have finite context windows (from a few thousand to hundreds
      of thousands of tokens, depending on the model)
    - Can't send infinite conversation history
    - Need to keep most relevant/recent information
    
    How it works:
    - Stores messages in chronological order
    - Tracks total characters used
    - Removes oldest messages when limit exceeded
    - Can join recent messages into a compact string for system prompts
    """
    
    def __init__(self, char_limit=500):
        """
        Initialize memory with a character limit.
        
        Args:
            char_limit: Maximum characters to store (default 500)
                       In production, you might use token count instead
        
        Why char_limit:
        - Simple approximation of token usage
        - 1 token ≈ 4 characters for English text
        - Easy to understand and debug
        """
        self.char_limit = int(char_limit)
        self.history = []  # List of message dictionaries
        self._total_chars = 0  # Running count of characters
    
    def add(self, role, content):
        """
        Add a message to memory, removing old messages if limit exceeded.
        
        Args:
            role: 'user', 'assistant', or 'system'
            content: The message text
        
        Process:
        1. Create message dictionary
        2. Append to history
        3. Update character count
        4. Remove oldest messages until under limit
        """
        # Create message in same format as LLM APIs
        item = {"role": role, "content": content}
        self.history.append(item)
        self._total_chars += len(content)
        
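        # Trim oldest messages until we're under the limit
        # This is a sliding window approach - FIFO (First In, First Out)
        # Edge case: a single message longer than char_limit evicts
        # everything, including itself, leaving the history empty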
        while self._total_chars > self.char_limit and self.history:
            removed = self.history.pop(0)  # Remove from beginning
            self._total_chars -= len(removed['content'])
    
    def summarize(self, max_items=5):
        """
        Join the content of the most recent messages into one compact string.
        (No LLM is involved: this is plain concatenation, and roles are dropped.)
        
        Args:
            max_items: Number of recent messages to include
        
        Returns:
            String with messages separated by ' | '
        
        Why summarize:
        - Condense conversation for system prompts
        - Keep only most relevant context
        - Reduce token usage
        """
        # Take last max_items messages and join their content
        return ' | '.join(m['content'] for m in self.history[-max_items:])
    
    def to_system_prompt(self):
        """
        Convert memory into a system prompt string.
        
        Returns:
            String formatted for LLM system prompt
        
        Use case:
        - Inject conversation context into new LLM calls
        - Maintain continuity across LLM calls (history lives in memory only;
          persisting it across sessions would require saving it yourself)
        - Provide personalization context
        """
        if not self.history:
            return ''
        
        # Prefix with 'Memory summary:' so LLM knows this is historical context
        return 'Memory summary: ' + self.summarize()

# Demo: Building conversational memory
print("=== Memory Management Demo ===\n")

# Create memory with 200 character limit (small for demo purposes)
mem = Memory(char_limit=200)
print(f"Created Memory with {mem.char_limit} character limit\n")

# Add user preferences (might come from profile or conversation)
mem.add('user', 'I like data visualization and prefer seaborn for quick plots.')
print("Added user preference about seaborn")

# Add assistant acknowledgment
mem.add('assistant', 'Noted — I will suggest charts and colors.')
print("Added assistant response")

# Add more context
mem.add('user', 'I often work with time series.')
print("Added additional user context\n")

# Check what's in memory
print(f"Current history length: {len(mem.history)} messages")
print(f"Current character count: {mem._total_chars} / {mem.char_limit}\n")

# Generate system prompt for next LLM call
print('System prompt built from memory:')
print(mem.to_system_prompt())
print("\nThis would be sent to the LLM to maintain context!")

System prompt built from memory:
Memory summary: I like data visualization and prefer seaborn for quick plots. | Noted — I will suggest charts and colors. | I often work with time series.


### What Just Happened?

**Memory Management Flow:**
1. Created Memory instance with 200 character limit
2. Added 3 messages (user → assistant → user)
3. Memory automatically tracked character count
4. Generated system prompt from stored history
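
The demo stays well under its 200-character limit, so no message is ever evicted. To see the sliding window at work, here is a standalone sketch of the same FIFO trimming idea with a tighter limit; `trim_to_limit` is a hypothetical helper written for this illustration, not a method of the `Memory` class.

```python
def trim_to_limit(history, char_limit):
    """Drop the oldest messages until the total content length fits char_limit."""
    trimmed = list(history)  # copy so the caller's list is not modified
    total = sum(len(m["content"]) for m in trimmed)
    while total > char_limit and trimmed:
        removed = trimmed.pop(0)  # oldest first (FIFO)
        total -= len(removed["content"])
    return trimmed

history = [
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "user", "content": "c" * 40},
]
kept = trim_to_limit(history, char_limit=100)
print(len(kept), "of", len(history), "messages kept")
```

Three 40-character messages total 120 characters, so the oldest one is dropped to get back under 100.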

**Key Concepts:**
- **Sliding Window**: Old messages dropped when limit exceeded (FIFO)
- **Character Counting**: Simple approximation of token usage
- **Context Injection**: Memory converted to system prompt
- **State Management**: History preserved across interactions
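
Context injection can be sketched as a small function that merges fixed instructions with the memory string before the next call. The helper name `build_messages` and the choice to append memory after the base instructions are assumptions made for this example; other layouts (such as a separate message for memory) are equally plausible.

```python
def build_messages(base_system, memory_summary, user_input):
    """Combine fixed instructions, remembered context, and the new user input."""
    system_content = base_system
    if memory_summary:
        # Append remembered context after the fixed instructions
        system_content += "\n\n" + memory_summary
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": user_input},
    ]

msgs = build_messages(
    base_system="You are a helpful data assistant.",
    memory_summary="Memory summary: I often work with time series.",
    user_input="Which plot should I start with?",
)
print(msgs[0]["content"])
```

In practice the `memory_summary` argument would be the string returned by `mem.to_system_prompt()`.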

**Why This Design:**
- Simple and predictable behavior
- Easy to understand and debug
- Balances context retention with size limits
- A reasonable starting point for production once the upgrades below are added

**Production Considerations:**
- Use token counting instead of characters (for OpenAI models, the tiktoken library)
- Add timestamp metadata for time-based filtering
- Implement importance scoring (keep critical messages)
- Add compression/summarization for very long conversations
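
As one example of these upgrades, timestamp metadata and time-based filtering might look like the sketch below. `TimestampedMemory` is a hypothetical class written for illustration; it omits the character limit to keep the example short, and the extra `timestamp` key would need to be removed before sending messages to an LLM API.

```python
import time

class TimestampedMemory:
    """Hypothetical memory variant that records when each message arrived."""

    def __init__(self):
        self.history = []

    def add(self, role, content, timestamp=None):
        # Allow an explicit timestamp so behavior is easy to test
        ts = time.time() if timestamp is None else timestamp
        self.history.append({"role": role, "content": content, "timestamp": ts})

    def recent(self, max_age_seconds, now=None):
        """Return messages no older than max_age_seconds."""
        now = time.time() if now is None else now
        return [m for m in self.history if now - m["timestamp"] <= max_age_seconds]

tm = TimestampedMemory()
tm.add("user", "Old preference", timestamp=1_000.0)
tm.add("user", "Fresh preference", timestamp=1_900.0)

# With "now" at 2000 seconds, only messages from the last 600 seconds remain
fresh = tm.recent(max_age_seconds=600, now=2_000.0)
print([m["content"] for m in fresh])
```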

## Practice Exercises

### Exercise 1: Enhanced Memory with Metadata
**Objective**: Extend the Memory class to support message tagging and filtering.

**Requirements**:
- Add a `tags` parameter to the `add()` method
- Store tags with each message (e.g., `project`, `priority`, `topic`)
- Create a `filter_by_tag()` method that returns messages with specific tags
- Update `to_system_prompt()` to accept an optional tag filter

**Example Usage**:
```python
mem.add('user', 'Deploy to production', tags=['urgent', 'deployment'])
mem.add('user', 'Update documentation', tags=['docs', 'low-priority'])
urgent_prompt = mem.to_system_prompt(filter_tags=['urgent'])
```

**Why This Matters**:
- Prioritize important information in prompts
- Organize memory by topic/project
- Reduce noise in context injection
- Support multi-project conversations

---

### Exercise 2: Structured System Prompt Generator
**Objective**: Convert memory into formatted, readable system prompts.

**Requirements**:
- Write a `to_structured_prompt()` method
- Format as bullet points or numbered list
- Group by role (user preferences, assistant notes, system info)
- Include message count and freshness indicators

**Example Output**:
```
User Preferences (2 recent):
• Prefers seaborn for visualization
• Works with time series data

Assistant Notes (1 recent):
• Will suggest charts and colors
```

**Why This Matters**:
- Improves LLM comprehension with structure
- Makes prompts more readable for debugging
- Organizes information by type
- Better token efficiency with clear formatting

---

### Exercise 3: Token-Based Memory (Advanced)
**Objective**: Replace character counting with actual token counting.

**Requirements**:
- Install tiktoken: `pip install tiktoken`
- Replace `char_limit` with `token_limit`
- Use `tiktoken.encoding_for_model()` to get encoder
- Count tokens instead of characters in `add()` method

**Example Code Start**:
```python
import tiktoken

class TokenMemory(Memory):
    def __init__(self, token_limit=500, model="gpt-3.5-turbo"):
        self.encoder = tiktoken.encoding_for_model(model)
        # Continue implementation...
```

**Why This Matters**:
- Accurate token usage tracking
- Works with actual LLM limits
- Model-specific tokenization
- Production-ready implementation

---

### Challenge Exercise: Intelligent Memory Compression
**Objective**: Add automatic summarization when the memory limit is approached.

**Requirements**:
- Detect when approaching limit (e.g., 80% full)
- Use an LLM to summarize older messages
- Replace old messages with compressed summary
- Preserve most recent messages unchanged

**Why This Matters**:
- Maintain longer conversation context
- Automatic context management
- Balance detail with efficiency
- Advanced production pattern

---

## Next Steps

Continue with [02_code_examples.ipynb](02_code_examples.ipynb) for:
- Runnable code examples with tool calling
- Minimal agent loop implementation
- Hands-on practice with function stubs
- Building your first working agent

**What You've Learned So Far**:
1. Secure configuration management with .env files
2. LLM message structure and roles
3. Prompt engineering with templates
4. Memory management and context injection

You now have the foundational knowledge to build stateful AI agents!