# 🧠 Notebook 1: The Foundation - How LLMs Generate Text

## Conference Event Description Generator Demo

**Duration:** ~8 minutes  
**Learning Objective:** Understand the fundamentals of how Large Language Models generate text token-by-token, and see this in action through a practical conference management use case.

---

## 🎯 What We'll Cover

1. **LLM Theory Basics** - Tokens, transformers, and probability
2. **Text Generation Process** - How models predict the next token
3. **Practical Demo** - Building an event description generator
4. **Parameters Impact** - Temperature, top-p, and creativity control

---

## 🚀 From Theory to Practice

By the end of this notebook, you'll understand:
- How LLMs work at a fundamental level
- Why they sometimes "hallucinate" or make mistakes
- How to control their creativity and consistency
- Real business value through conference management automation

## 🧩 Part 1: LLM Fundamentals - The Token Game

Let's start with the basics. Every LLM works by:

1. **Tokenization** - Breaking text into smaller pieces (tokens)
2. **Context Understanding** - Looking at previous tokens to understand context
3. **Prediction** - Calculating probabilities for the next token
4. **Selection** - Choosing the next token based on those probabilities
5. **Repeat** - Continue until complete
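
Step 1 can be sketched with a toy splitter. This is only an illustration: production models use learned subword tokenizers, so real token boundaries and IDs will differ. `toy_tokenize` and `build_vocab` are hypothetical helper names, not part of any library.

```python
import re
from typing import Dict, List

def toy_tokenize(text: str) -> List[str]:
    """Split text into word and punctuation pieces (a stand-in for subword tokenization)."""
    return re.findall(r"\w+|[^\w\s]", text)

def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Give each distinct token an integer ID in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

text = "The conference will be held in the conference center."
tokens = toy_tokenize(text)
vocab = build_vocab(tokens)
token_ids = [vocab[tok] for tok in tokens]

print(tokens)
print(token_ids)  # the repeated word "conference" maps to the same ID both times
```

The model never sees the raw string, only the sequence of IDs. Note that "The" and "the" get different IDs here because the toy splitter is case-sensitive.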

### Think of it like autocomplete on steroids! 

When you type "The conference will be held in..." the model considers:
- What locations make sense?
- What style matches the context?
- What information is most relevant?

It then assigns a probability to every candidate next token, based on patterns learned from training data, and picks one.
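
The prediction and selection steps can be sketched as follows. The scores below are made up for illustration (a real model computes them over its whole vocabulary), and always taking the top token is just the simplest selection rule, not the only one.

```python
import math
from typing import Dict

def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert raw scores into probabilities that sum to 1."""
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - top) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for the token after "The conference will be held in"
logits = {"Amsterdam": 2.0, "the": 1.5, "June": 1.0, "banana": -3.0}
probs = softmax(logits)

# Greedy selection: take the single most likely token
next_token = max(probs, key=probs.get)

for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tok:>10}: {p:.3f}")
print("Greedy choice:", next_token)
```

Implausible continuations such as "banana" still receive a small non-zero probability, which matters once sampling replaces greedy selection.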

In [1]:
# Let's start by setting up our environment
import os
from dotenv import load_dotenv
import json
from typing import Dict, List, Optional
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

# Load environment variables
load_dotenv()

# Configure plotting
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("üöÄ Environment setup complete!")
print("üìä Plotting libraries ready")
print("üéØ Let's build our conference event generator!")

üöÄ Environment setup complete!
üìä Plotting libraries ready
üéØ Let's build our conference event generator!


In [None]:
# Set up Azure OpenAI for direct LLM access
# This shows the foundational approach before we use agent frameworks
import os
from openai import AzureOpenAI
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Azure OpenAI Configuration
endpoint = os.getenv("AZURE_OPENAI_ENDPOINT", "https://frederiekvandepitte4468-resource.cognitiveservices.azure.com/")
model_name = os.getenv("AZURE_OPENAI_MODEL", "gpt-4.1-nano")
deployment = os.getenv("AZURE_OPENAI_DEPLOYMENT", "gpt-4.1-nano")
api_key = os.getenv("AZURE_OPENAI_API_KEY", "<your-api-key>")
api_version = os.getenv("AZURE_OPENAI_API_VERSION", "2024-12-01-preview")

# Initialize Azure OpenAI client
try:
    client = AzureOpenAI(
        api_version=api_version,
        azure_endpoint=endpoint,
        api_key=api_key,
    )
    
    # Test connection with a simple prompt
    test_response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant. Respond with just 'Connection successful!' to confirm setup.",
            },
            {
                "role": "user",
                "content": "Test connection",
            }
        ],
        max_completion_tokens=10,
        temperature=0.0,
        model=deployment
    )
    
    print("✅ Azure OpenAI connection established")
    print(f"🎯 Using model: {model_name}")
    print(f"Endpoint: {endpoint}")
    print(f"🔗 Test response: {test_response.choices[0].message.content}")
    azure_client_ready = True
    
except Exception as e:
    print(f"⚠️ Azure OpenAI setup needed: {e}")
    print("💡 Please check your .env file configuration")
    print("📝 For demo purposes, we'll use a simulation")
    azure_client_ready = False
    client = None



In [None]:
# 🧪 Simple Example: Direct LLM API Call
# Let's see a basic example of how we talk to the LLM directly

if azure_client_ready and client:
    print("üî• Live Azure OpenAI Example:")
    print("-" * 30)
    
    simple_response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": "You are a conference marketing expert. Be concise and engaging.",
            },
            {
                "role": "user", 
                "content": "Write a 2-sentence teaser for a session called 'Introduction to Kubernetes for Beginners'",
            }
        ],
        max_completion_tokens=100,
        temperature=0.7,
        model=deployment
    )
    
    print("✨ AI Response:")
    print(simple_response.choices[0].message.content)
    print()
    print("🎯 This is the foundation - simple prompt → AI response!")
    
else:
    print("üß™ Simulated Example (Azure OpenAI not configured):")
    print("-" * 40)
    print("‚ú® AI Response:")
    print("Ready to dive into container orchestration? Join our beginner-friendly Kubernetes session where you'll learn to deploy, scale, and manage applications with confidence. Walk away with hands-on skills and the knowledge to streamline your development workflow!")
    print()
    print("üéØ This is what a real API call would return!")

## 🎨 Part 2: Text Generation Parameters - The Creativity Controls

Before we generate our first event description, let's understand the key parameters that control how creative or conservative our AI becomes:

### 🌡️ **Temperature** (0.0 - 2.0)
- **Low (0.0-0.3)**: Predictable, consistent, factual
- **Medium (0.4-0.7)**: Balanced creativity and reliability  
- **High (0.8-2.0)**: Creative, varied, sometimes unpredictable

### 🎯 **Top-p** (0.0 - 1.0)  
- Nucleus sampling: the model samples only from the smallest set of tokens whose cumulative probability reaches `top_p`
- 0.1 = Very focused, limited word choices
- 0.9 = Diverse vocabulary, more creative

### 🎲 **Max Tokens**
- Maximum length of the generated text, measured in tokens (`max_completion_tokens` in this notebook's API calls)
- Helps control response length and cost

### Why This Matters for Conference Management:
- **Event descriptions**: Medium creativity for engaging but accurate content
- **Speaker bios**: Low temperature for factual accuracy
- **Marketing copy**: Higher creativity for compelling messaging
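
To make the two sampling controls concrete, here is a small sketch of how they reshape a next-token distribution. The scores are invented, and `apply_temperature` and `top_p_filter` are hypothetical helpers that mirror the usual definitions; a hosted model's internals may differ in detail.

```python
import math
from typing import Dict

def apply_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Temperature 0 is usually understood as greedy selection; this sketch does not model it
        raise ValueError("temperature must be positive in this sketch")
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical scores for the word in "Join us for a ___ session"
logits = {"practical": 2.0, "hands-on": 1.0, "immersive": 0.5, "electrifying": -1.0}

for temperature in (0.2, 0.7, 2.0):
    probs = apply_temperature(logits, temperature)
    print(temperature, {tok: round(p, 3) for tok, p in probs.items()})

baseline = apply_temperature(logits, 1.0)
focused = top_p_filter(baseline, 0.5)   # only the top token survives
diverse = top_p_filter(baseline, 0.9)   # the least likely token is cut off
print("top_p=0.5:", list(focused))
print("top_p=0.9:", list(diverse))
```

Low temperature concentrates almost all probability on the top token, while high temperature flattens the distribution. Top-p then trims the unlikely tail before sampling.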

In [None]:
# Demo: Generate event descriptions with different creativity levels
def generate_event_description(session_info: Dict, temperature: float = 0.7) -> str:
    """
    Generate a conference session description using Azure OpenAI directly
    
    Args:
        session_info: Dictionary containing session details
        temperature: Controls creativity (0.0 = conservative, higher = more creative; valid range 0.0-2.0)
    """
    
    system_prompt = """You are an expert conference event description generator.
    Create engaging, professional, and informative descriptions.
    Focus on value proposition and clear outcomes for attendees.
    Keep descriptions between 150-300 words.
    Always include practical takeaways."""
    
    user_prompt = f"""
    Create an engaging conference session description for:
    
    Title: {session_info['title']}
    Speaker: {session_info['speaker']}
    Duration: {session_info['duration']} minutes
    Track: {session_info['track']}
    Level: {session_info['level']}
    Key Topics: {', '.join(session_info['topics'])}
    
    Include:
    - Compelling overview that highlights value
    - What attendees will learn
    - Practical takeaways
    - Who should attend
    
    Style: Professional but engaging, suitable for technical conference
    """
    
    if azure_client_ready and client:
        try:
            # Use real Azure OpenAI model
            response = client.chat.completions.create(
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                max_completion_tokens=500,
                temperature=temperature,
                top_p=1.0,
                frequency_penalty=0.0,
                presence_penalty=0.0,
                model=deployment
            )
            return response.choices[0].message.content
        except Exception as e:
            print(f"⚠️ API call failed: {e}")
            return generate_simulation(session_info, temperature)
    else:
        # Simulation for demo purposes
        return generate_simulation(session_info, temperature)

def generate_simulation(session_info: Dict, temperature: float) -> str:
    """Generate a simulated response for demo purposes"""
    creativity_styles = {
        0.2: "straightforward and factual",
        0.7: "engaging and professional", 
        1.2: "creative and dynamic"
    }
    
    style_desc = creativity_styles.get(temperature, "balanced")
    
    return f"""**{session_info['title']}**

Join {session_info['speaker']} for an intensive {session_info['duration']}-minute deep dive into cutting-edge {session_info['track']} technologies. This {session_info['level']}-level session will cover {', '.join(session_info['topics'])}, providing you with practical insights you can implement immediately.

**What You'll Learn:**
• Advanced techniques in {session_info['topics'][0]}
• Real-world implementation strategies  
• Best practices from industry leaders
• Hands-on examples and case studies

**Who Should Attend:**
Perfect for developers, architects, and technical leaders looking to stay ahead of the curve in {session_info['track']}.

**Takeaways:**
Leave with actionable knowledge and a clear roadmap for implementing these technologies in your organization.

*[Simulated response - {style_desc} style, temperature: {temperature}]*
"""

# Test session data
test_session = {
    "title": "Building Resilient Microservices with Event-Driven Architecture",
    "speaker": "Dr. Sarah Chen",
    "duration": 45,
    "track": "Architecture",
    "level": "Intermediate",
    "topics": ["Event Sourcing", "CQRS", "Message Queues", "Distributed Systems"]
}

print("üéØ Event Description Generator Ready!")
print("üìù Test session data loaded")
print("üöÄ Ready to demonstrate different creativity levels!")
print(f"üîß Using direct Azure OpenAI API calls (not agent framework yet)")

üéØ Event Description Generator Ready!
üìù Test session data loaded
üöÄ Ready to demonstrate different creativity levels!


In [4]:
# 🎭 Demo: Compare different creativity levels

print("🔥 CONSERVATIVE (Temperature: 0.2)")
print("=" * 50)
conservative_desc = generate_event_description(test_session, temperature=0.2)
print(conservative_desc)
print("\n")

print("⚖️ BALANCED (Temperature: 0.7)")
print("=" * 50)
balanced_desc = generate_event_description(test_session, temperature=0.7)
print(balanced_desc)
print("\n")

print("üé® CREATIVE (Temperature: 1.2)")
print("=" * 50)
creative_desc = generate_event_description(test_session, temperature=1.2)
print(creative_desc)
print("\n")

print("üéØ Key Observations:")
print("‚Ä¢ Conservative: Factual, predictable, safe for official use")
print("‚Ä¢ Balanced: Engaging while maintaining professionalism") 
print("‚Ä¢ Creative: More varied language, higher marketing appeal")
print("‚Ä¢ Choose temperature based on your conference's brand and audience!")


## 📊 Part 3: Understanding Token-by-Token Generation

Let's visualize how LLMs actually build text one token at a time. This helps explain why they sometimes make mistakes or "hallucinate"!

### The Process:
1. **Start with prompt** → Model reads your input
2. **Calculate probabilities** → For every possible next token  
3. **Sample based on temperature** → Choose next token
4. **Add to context** → Token becomes part of the story so far
5. **Repeat** → Until completion or max tokens reached

### Why This Matters:
- **Consistency**: Each token only "sees" what came before
- **Context limits**: Models have a finite context window, so anything outside it is no longer visible  
- **Probability**: Sometimes unlikely (but valid) tokens get selected
- **Hallucination**: Model confidently predicts plausible but incorrect information
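
The "Probability" point can be illustrated with a short sampling experiment. The distribution below is invented for the sake of the example, with "Atlantis" standing in for a fluent but wrong continuation; `sample_token` is a hypothetical helper built on the standard library's `random.Random.choices`.

```python
import random
from collections import Counter
from typing import Dict

# Hypothetical probabilities for the token after "The conference will be held in"
next_token_probs = {"Amsterdam": 0.70, "Berlin": 0.25, "Atlantis": 0.05}

def sample_token(probs: Dict[str, float], rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so the run is repeatable
draws = Counter(sample_token(next_token_probs, rng) for _ in range(1000))

for tok, count in draws.most_common():
    print(f"{tok:>10}: {count} of 1000 draws")
```

Even at 5%, the unlikely token is picked in some of the draws. Once it is in the context, every later token is conditioned on it, which is one way a single unlucky sample can turn into a confident-sounding mistake.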

In [None]:
# Simulate token-by-token generation process
def simulate_token_generation(prompt: str, max_tokens: int = 50):
    """
    Simulate how an LLM generates text token by token
    This is a simplified simulation for educational purposes
    """
    
    # Simulated token probabilities for conference-related text
    conference_tokens = [
        "conference", "session", "attendees", "speakers", "workshop",
        "presentation", "networking", "innovation", "technology", "insights",
        "practical", "hands-on", "experience", "learning", "industry",
        "will", "learn", "explore", "discover", "master", "understand"
    ]
    
    general_tokens = [
        "the", "and", "to", "of", "in", "for", "with", "on", "at", "by",
        "this", "that", "these", "those", "you", "your", "our", "we"
    ]
    
    tokens_generated = []
    current_text = prompt
    
    print(f"üöÄ Starting generation from: '{prompt}'")
    print("=" * 60)
    
    # Import once, outside the loop
    import random

    for step in range(max_tokens):
        # Simulate probability calculation based on context
        if len(tokens_generated) < 5:
            # Early tokens more likely to be general structure words
            candidates = general_tokens + conference_tokens[:5]
        else:
            # Later tokens more domain-specific
            candidates = conference_tokens + general_tokens[:8]
            if len(tokens_generated) >= 10:
                # Offer a sentence-ending token so the stopping condition below can trigger
                candidates = candidates + ["."]
        
        # Simulate token selection (uniform random choice; real models sample from learned probabilities)
        next_token = random.choice(candidates)
        
        tokens_generated.append(next_token)
        current_text += " " + next_token
        
        # Show progress every few tokens
        if step % 8 == 0 or step < 5:
            print(f"Step {step+1:2d}: '{next_token}' → {current_text}")
            
        # Simple stopping condition
        if next_token in [".", "!", "?"] and len(tokens_generated) > 10:
            break
    
    print("=" * 60)
    print(f"✅ Generated {len(tokens_generated)} tokens")
    print(f"📝 Final text: {current_text}")
    
    return tokens_generated, current_text

# Demo the process
print("üß™ Simulating Token-by-Token Generation")
print("(Simplified for demonstration - real models are much more sophisticated)")
print()

tokens, final_text = simulate_token_generation("This conference session about AI will")

## 🏢 Part 4: Business Value - From Theory to Practice

Now that you understand how LLMs work, let's see the **real business impact** for conference management:

### ⏰ **Time Savings**
- **Before**: 30-45 minutes per session description
- **After**: 2-3 minutes with AI assistance
- **Impact**: 90%+ time reduction for content creation
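
As a quick sanity check on the "90%+" figure, the arithmetic can be worked through using the before and after ranges quoted above. Those ranges are this notebook's own estimates, not measurements, and the 100-session conference is an assumed example size.

```python
def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Percentage of time saved when a task drops from before_minutes to after_minutes."""
    return 100 * (before_minutes - after_minutes) / before_minutes

# Least favourable pairing: shortest manual time, longest AI-assisted time
worst_case = percent_reduction(30, 3)
# Most favourable pairing: longest manual time, shortest AI-assisted time
best_case = percent_reduction(45, 2)

print(f"Time reduction: {worst_case:.1f}% to {best_case:.1f}%")

# Scaled up to an assumed 100-session conference, using the least favourable pairing
sessions = 100
hours_before = sessions * 30 / 60
hours_after = sessions * 3 / 60
print(f"{sessions} sessions: {hours_before:.0f} hours manually vs {hours_after:.0f} hours with AI assistance")
```

Even the least favourable pairing of the quoted ranges gives a 90% reduction, so the headline claim is consistent with the stated inputs.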

### ✨ **Quality Consistency**  
- Standardized tone and structure across all sessions
- No more writer's block or inconsistent messaging
- Professional quality even for last-minute additions

### 🎯 **Scalability**
- Generate descriptions for 100+ sessions in minutes
- Easy localization for international conferences
- Rapid iteration and A/B testing of messaging

### 💡 **Creative Enhancement**
- AI suggests angles you might not have considered  
- Helps non-writers create compelling content
- Maintains engagement while ensuring accuracy

In [None]:
# 🚀 Interactive Demo: Build Your Own Session Description

def interactive_session_builder():
    """
    Interactive tool to create conference session descriptions
    Demonstrates the practical value of LLM text generation
    """
    
    print("üé™ Conference Session Description Builder")
    print("=" * 50)
    print("Let's create a compelling session description together!")
    print()
    
    # In a real demo, these would be interactive inputs
    # For notebook purposes, we'll use example data
    
    sample_sessions = [
        {
            "title": "Kubernetes in Production: Lessons from the Trenches",
            "speaker": "Alex Rodriguez", 
            "duration": 60,
            "track": "DevOps",
            "level": "Advanced",
            "topics": ["Container Orchestration", "Production Deployments", "Monitoring", "Scaling"]
        },
        {
            "title": "Introduction to Machine Learning for Web Developers", 
            "speaker": "Dr. Maya Patel",
            "duration": 45,
            "track": "AI/ML",
            "level": "Beginner", 
            "topics": ["Neural Networks", "TensorFlow.js", "Browser ML", "Practical Applications"]
        },
        {
            "title": "Building Accessible React Components",
            "speaker": "Jordan Kim",
            "duration": 30, 
            "track": "Frontend",
            "level": "Intermediate",
            "topics": ["ARIA", "Screen Readers", "Keyboard Navigation", "Inclusive Design"]
        }
    ]
    
    for i, session in enumerate(sample_sessions, 1):
        print(f"üìã Session #{i}: {session['title']}")
        print("-" * 40)
        
        # Generate description with optimal settings for conference content
        description = generate_event_description(session, temperature=0.6)
        print(description)
        print()
        
        # Show the efficiency gain
        print("⚡ Generated in ~2 seconds vs ~30 minutes manual writing")
        print("🎯 Consistent quality and structure")
        print("📈 Ready for immediate use in conference program")
        print("=" * 60)
        print()

# Run the interactive demo
interactive_session_builder()

## 🎓 Key Takeaways - The Foundation is Set!

### What We've Learned:

1. **🧠 LLM Fundamentals**
   - Text generation is probabilistic, not deterministic
   - Models predict one token at a time based on context
   - Understanding this helps explain AI behavior and limitations

2. **🎚️ Parameter Control**
   - Temperature controls creativity vs consistency
   - Lower values for factual content, higher for creative writing
   - Choose settings based on your specific business needs

3. **💼 Business Impact**
   - 90%+ time reduction in content creation
   - Consistent quality and professional tone
   - Scalable solution for large-scale events

4. **🔧 Direct API Implementation**
   - Started with raw Azure OpenAI API calls
   - Simple message format: system prompt + user prompt
   - Direct control over all parameters (temperature, top_p, max_completion_tokens, etc.)

---

## 🚀 What's Next?

In **Notebook 2**, we'll evolve from direct API calls to **conversational AI**:
- Memory and context management
- Building conference chatbots with AGNO AI framework
- Handling complex multi-turn conversations

**The journey from completion to conversation begins!** 🎯