# Introduction to Large Language Models

**Interactive Notebook** - Section 9: Large Language Models

Welcome to the revolutionary world of Large Language Models! This notebook will guide you through the fundamental concepts of LLMs, from basic transformer architecture to practical applications and fine-tuning techniques.

## 🎯 Learning Objectives

By the end of this notebook, you will:
- ✅ Understand the transformer architecture that powers LLMs
- ✅ Learn about attention mechanisms and self-attention
- ✅ Explore popular LLMs (GPT, BERT, LLaMA, etc.)
- ✅ Implement text generation and completion tasks
- ✅ Learn about fine-tuning and prompt engineering
- ✅ Build practical LLM applications

## 📋 Prerequisites

- Completion of "Introduction to Deep Learning" notebook
- Understanding of neural networks and deep learning
- Basic NLP concepts (tokens, embeddings, sequences)
- Python programming and PyTorch/TensorFlow basics

**Estimated Time**: 4-5 hours

⚠️ **Note**: Some exercises require API access to LLM services. Free tier options are provided where possible.

## 🔧 Setup and Installation

Let's set up our environment with the necessary libraries for working with Large Language Models.

In [None]:
# Install required packages
!pip install -q transformers torch datasets accelerate sentencepiece tokenizers
!pip install -q matplotlib seaborn numpy pandas scikit-learn ipywidgets
!pip install -q openai cohere huggingface-hub

# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import torch
import torch.nn as nn
import transformers  # needed for transformers.__version__ below
from transformers import (
    AutoTokenizer, AutoModel, AutoModelForCausalLM, AutoModelForSequenceClassification,
    GPT2Tokenizer, GPT2LMHeadModel, BertTokenizer, BertModel,
    TrainingArguments, Trainer, pipeline
)
from datasets import load_dataset
import ipywidgets as widgets
from IPython.display import display, Markdown
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

# Set visualization style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ LLM environment setup complete!")
print(f"PyTorch version: {torch.__version__}")
print(f"Transformers version: {transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

## 🧠 What are Large Language Models?

Large Language Models (LLMs) are deep learning models trained on massive text corpora, typically to predict the next token or a masked token, which lets them process and generate human-like text. They're based on the transformer architecture, which uses self-attention to relate every position in a sequence to every other position.

### Key Concepts:

1. **Transformers**: The neural network architecture that revolutionized NLP
2. **Attention Mechanisms**: Allow models to focus on relevant parts of input
3. **Self-Attention**: Enables understanding relationships between words in a sequence
4. **Positional Encoding**: Provides information about word positions
5. **Tokenization**: Breaking text into smaller units (tokens)
6. **Embeddings**: Vector representations of tokens
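
To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product attention for a single head, written in plain Python. It assumes the queries, keys, and values are all the same hypothetical 2-d token vectors; a real transformer would first apply learned projection matrices and usually runs many heads in parallel.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-d vectors for three tokens (illustrative values only)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)  # Q = K = V = x for simplicity

for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence, and each row sums to 1.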

### Popular LLMs:
- **GPT (Generative Pre-trained Transformer)**: OpenAI's series of decoder-only, autoregressive models
- **BERT (Bidirectional Encoder Representations from Transformers)**: Google's encoder-only model, pre-trained with masked language modeling
- **LLaMA**: Meta's family of openly released large language models
- **Claude**: Anthropic's AI assistant
- **Gemini**: Google's multimodal AI model

In [None]:
# Visualize LLM evolution and architecture
def visualize_llm_evolution():
    fig, axes = plt.subplots(2, 2, figsize=(18, 12))
    fig.suptitle('Large Language Models: Evolution and Architecture', fontsize=16, fontweight='bold')

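    # LLM Evolution Timeline
    ax = axes[0, 0]
    # Each entry is (name, release year, parameters in millions).
    # NOTE: parameter counts for GPT-4, Claude 2, and Gemini have not been
    # published; the values used for them are illustrative placeholders only.
    models = [
        ('ELMo', 2018, 94),
        ('GPT-1', 2018, 117),
        ('BERT', 2018, 340),
        ('GPT-2', 2019, 1500),
        ('GPT-3', 2020, 175000),
        ('LLaMA', 2023, 65000),
        ('GPT-4', 2023, 1000000),
        ('Claude 2', 2023, 100000),
        ('Gemini', 2023, 1000000)
    ]
    
    years = [model[1] for model in models]
    params = [model[2] for model in models]
    names = [model[0] for model in models]
    
    # Create bubble chart; bubble area scales with log10(params) so that
    # the smallest models remain visible next to the largest ones
    scatter = ax.scatter(years, np.log10(params), s=[60 * np.log10(p) for p in params], c=range(len(models)), cmap='viridis', alpha=0.7)
    ax.set_xlabel('Year')
    ax.set_ylabel('log10(parameters in millions)')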
    ax.set_title('LLM Evolution Over Time')
    ax.grid(True, alpha=0.3)
    
    # Add labels
    for i, (name, year, param) in enumerate(models):
        ax.annotate(name, (year, np.log10(param)), xytext=(5, 5), textcoords='offset points', fontsize=8)
    
    # Transformer Architecture
    ax = axes[0, 1]
    ax.set_xlim(0, 10)
    ax.set_ylim(0, 8)
    
    # Input embedding
    ax.add_patch(plt.Rectangle((1, 6), 1, 1, fill=True, color='lightblue', alpha=0.7))
    ax.text(1.5, 6.5, 'Input\nEmbedding', ha='center', va='center', fontsize=8)
    
    # Positional encoding
    ax.add_patch(plt.Rectangle((2.5, 6), 1, 1, fill=True, color='lightgreen', alpha=0.7))
    ax.text(3, 6.5, 'Positional\nEncoding', ha='center', va='center', fontsize=8)
    
    # Encoder layers
    for i in range(3):
        y_pos = 4 - i*1.2
        ax.add_patch(plt.Rectangle((4, y_pos), 2, 0.8, fill=True, color='lightcoral', alpha=0.7))
        ax.text(5, y_pos+0.4, f'Encoder\nLayer {i+1}', ha='center', va='center', fontsize=8)
    
    # Decoder layers
    for i in range(3):
        y_pos = 4 - i*1.2
        ax.add_patch(plt.Rectangle((7, y_pos), 2, 0.8, fill=True, color='lightyellow', alpha=0.7))
        ax.text(8, y_pos+0.4, f'Decoder\nLayer {i+1}', ha='center', va='center', fontsize=8)
    
    # Output
    ax.add_patch(plt.Rectangle((9.5, 3), 0.5, 3, fill=True, color='lightpink', alpha=0.7))
    ax.text(9.75, 4.5, 'Output', ha='center', va='center', fontsize=8, rotation=90)
    
    ax.set_title('Transformer Architecture')
    ax.axis('off')
    
    # Attention Mechanism
    ax = axes[1, 0]
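    # Create an illustrative attention heatmap (random weights, not taken from a real model)
    attention_matrix = np.random.rand(8, 8)
    # Normalize each row to sum to 1, as softmax-produced attention weights would
    attention_matrix = attention_matrix / attention_matrix.sum(axis=1, keepdims=True)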
    
    im = ax.imshow(attention_matrix, cmap='Blues', aspect='equal')
    ax.set_title('Self-Attention Mechanism')
    ax.set_xlabel('Key Position')
    ax.set_ylabel('Query Position')
    
    # Add word labels
    words = ['The', 'cat', 'sat', 'on', 'the', 'mat', 'and', 'purred']
    ax.set_xticks(range(8))
    ax.set_yticks(range(8))
    ax.set_xticklabels(words, rotation=45)
    ax.set_yticklabels(words)
    
    plt.colorbar(im, ax=ax, shrink=0.8)
    
    # LLM Applications
    ax = axes[1, 1]
    applications = [
        'Text Generation', 'Translation', 'Summarization', 'Question Answering',
        'Code Generation', 'Sentiment Analysis', 'Named Entity Recognition', 'Dialogue Systems'
    ]
    
    # Create a circular layout
    angles = np.linspace(0, 2*np.pi, len(applications), endpoint=False)
    x = np.cos(angles) * 0.8
    y = np.sin(angles) * 0.8
    
    # Draw connections to center
    for xi, yi, app in zip(x, y, applications):
        ax.plot([0, xi], [0, yi], 'gray', alpha=0.3)
        ax.scatter(xi, yi, s=200, c='lightblue', alpha=0.7)
        ax.text(xi*1.2, yi*1.2, app, ha='center', va='center', fontsize=8)
    
    # Center node
    ax.scatter(0, 0, s=300, c='red', alpha=0.8)
    ax.text(0, 0, 'LLM', ha='center', va='center', fontsize=12, fontweight='bold')
    
    ax.set_xlim(-1.5, 1.5)
    ax.set_ylim(-1.5, 1.5)
    ax.set_title('LLM Applications')
    ax.axis('off')
    
    plt.tight_layout()
    plt.show()

visualize_llm_evolution()

## 🔄 Understanding Tokenization

Tokenization is the first step in working with LLMs. It breaks down text into smaller units called tokens that the model can process.
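
As a rough illustration of subword tokenization, the sketch below implements greedy longest-match-first splitting in the style of WordPiece, the scheme BERT uses (GPT-2 uses byte-level BPE and T5 uses SentencePiece, which behave differently). The vocabulary `toy_vocab` and the helper `wordpiece_tokenize` are hypothetical and exist only for this example; real vocabularies hold tens of thousands of entries.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into subword tokens."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # nothing matched, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens

# Hypothetical toy vocabulary mapping tokens to integer IDs
toy_vocab = {"[UNK]": 0, "token": 1, "##ization": 2, "##s": 3,
             "un": 4, "##believ": 5, "##able": 6}

for word in ["tokenization", "tokens", "unbelievable", "xyz"]:
    pieces = wordpiece_tokenize(word, toy_vocab)
    ids = [toy_vocab[p] for p in pieces]
    print(word, "->", pieces, ids)
```

The token IDs, not the strings, are what the model actually receives as input.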

In [None]:
# Explore tokenization with different models
def explore_tokenization():
    print("🔍 Exploring Tokenization")
    print("="*50)
    
    # Sample texts
    texts = [
        "Hello, world!",
        "Large Language Models are amazing!",
        "The quick brown fox jumps over the lazy dog.",
        "Artificial Intelligence will revolutionize technology."
    ]
    
    # Load different tokenizers
    tokenizers = {
        'GPT-2': GPT2Tokenizer.from_pretrained('gpt2'),
        'BERT': BertTokenizer.from_pretrained('bert-base-uncased'),
        'T5': AutoTokenizer.from_pretrained('t5-small')
    }
    
    # Tokenize each text with each tokenizer
    for text in texts:
        print(f"\n📝 Text: '{text}'")
        print(f"Characters: {len(text)}")
        print(f"Words: {len(text.split())}")
        
        for name, tokenizer in tokenizers.items():
            tokens = tokenizer.tokenize(text)
            token_ids = tokenizer.convert_tokens_to_ids(tokens)
            
            print(f"\n  {name}:")
            print(f"    Tokens: {tokens}")
            print(f"    Number of tokens: {len(tokens)}")
            print(f"    Token IDs: {token_ids}")
        
        print("\n" + "-"*50)
    
    # Visualize tokenization comparison
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    fig.suptitle('Tokenization Comparison Across Models', fontsize=16, fontweight='bold')
    
    text_to_analyze = "The transformer architecture has revolutionized natural language processing."
    
    for i, (name, tokenizer) in enumerate(tokenizers.items()):
        ax = axes[i//2, i%2]
        
        tokens = tokenizer.tokenize(text_to_analyze)
        
        # Create token length visualization
        token_lengths = [len(token) for token in tokens]
        colors = plt.cm.Set3(np.linspace(0, 1, len(tokens)))
        
        bars = ax.bar(range(len(tokens)), token_lengths, color=colors, alpha=0.7)
        ax.set_title(f'{name} Tokenization')
        ax.set_xlabel('Token Index')
        ax.set_ylabel('Token Length')
        ax.set_xticks(range(len(tokens)))
        ax.set_xticklabels(tokens, rotation=45, ha='right')
        
        # Add value labels
        for bar, length in zip(bars, token_lengths):
            ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.1,
                    str(length), ha='center', va='bottom')
        
        ax.grid(True, alpha=0.3)
    
    # Remove empty subplot
    if len(tokenizers) < 4:
        fig.delaxes(axes[1, 1])
    
    plt.tight_layout()
    plt.show()
    
    # Vocabulary size comparison
    print("\n📊 Vocabulary Sizes:")
    print("="*30)
    for name, tokenizer in tokenizers.items():
        vocab_size = tokenizer.vocab_size
        print(f"{name}: {vocab_size:,} tokens")
    
explore_tokenization()

## 🏗️ Building a Simple Text Generator with GPT-2

Let's build a simple text generator using a pre-trained GPT-2 model to understand how LLMs work.
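
Before loading a real model, it may help to see the generation loop in miniature. The sketch below uses a hypothetical lookup table, `toy_model`, that maps the previous token to next-token probabilities, and always picks the most likely token (greedy decoding). A real LLM conditions on the whole context and computes these probabilities with a neural network, but the loop has the same shape: predict, append, repeat.

```python
# Hypothetical next-token probabilities, keyed by the previous token only.
# A real LLM conditions on the entire context, not just the last token.
toy_model = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "mat": 0.2},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.6, "</s>": 0.4},
    "sat": {"</s>": 1.0},
    "mat": {"</s>": 1.0},
}

def greedy_generate(model, start="<s>", max_new_tokens=10):
    """Repeatedly append the most probable next token until end-of-sequence."""
    tokens = [start]
    for _ in range(max_new_tokens):
        next_probs = model[tokens[-1]]
        next_token = max(next_probs, key=next_probs.get)
        if next_token == "</s>":  # stop at the end-of-sequence token
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker

print(greedy_generate(toy_model))  # ['the', 'cat', 'sat']
```

Sampling from the probabilities instead of always taking the maximum gives more varied output.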

In [None]:
# Load pre-trained GPT-2 model and tokenizer
def load_gpt2_model():
    print("🔄 Loading GPT-2 model...")
    
    # Load tokenizer and model
    tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
    model = GPT2LMHeadModel.from_pretrained('gpt2')
    
    # Set pad token
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    
    # Move to GPU if available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    
    print(f"✅ Model loaded successfully on {device}")
    print(f"Model parameters: {model.num_parameters():,}")
    
    return tokenizer, model, device

tokenizer, model, device = load_gpt2_model()

# Text generation function
def generate_text(prompt, max_length=100, temperature=1.0, num_return_sequences=1):
    # Encode the prompt; the tokenizer returns input_ids and an attention_mask
    inputs = tokenizer(prompt, return_tensors='pt').to(device)
    
    # Generate text (max_length counts prompt tokens plus newly generated tokens)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            num_return_sequences=num_return_sequences,
            temperature=temperature,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            no_repeat_ngram_size=2
        )
    
    # Decode the generated text
    generated_texts = []
    for output in outputs:
        generated_text = tokenizer.decode(output, skip_special_tokens=True)
        generated_texts.append(generated_text)
    
    return generated_texts

# Test text generation
prompts = [
    "The future of artificial intelligence",
    "In a world where robots and humans coexist",
    "Machine learning algorithms are",
    "The key to successful data science is"
]

print("🎨 Text Generation Examples")
print("="*50)

for prompt in prompts:
    print(f"\n📝 Prompt: '{prompt}'")
    generated = generate_text(prompt, max_length=80, temperature=0.8)
    print(f"🤖 Generated: {generated[0]}")
    print("-" * 50)

In [None]:
# Interactive text generation with different parameters
def interactive_text_generation():
    # Create widgets
    prompt_text = widgets.Textarea(
        value='The future of technology is',
        placeholder='Enter your prompt here...',
        description='Prompt:',
        layout=widgets.Layout(width='80%', height='80px')
    )
    
    max_length_slider = widgets.IntSlider(
        value=50, min=20, max=200, step=10,
        description='Max Length:', style={'description_width': 'initial'}
    )
    
    temperature_slider = widgets.FloatSlider(
        value=0.8, min=0.1, max=2.0, step=0.1,
        description='Temperature:', style={'description_width': 'initial'}
    )
    
    generate_button = widgets.Button(
        description='Generate Text',
        button_style='success',
        layout=widgets.Layout(width='200px')
    )
    
    output_area = widgets.Output()
    
    def on_generate_clicked(b):
        with output_area:
            output_area.clear_output()
            
            prompt = prompt_text.value
            max_len = max_length_slider.value
            temp = temperature_slider.value
            
            if not prompt.strip():
                print("❌ Please enter a prompt!")
                return
            
            print(f"🔄 Generating text with parameters:")
            print(f"   Prompt: '{prompt}'")
            print(f"   Max Length: {max_len}")
            print(f"   Temperature: {temp}")
            print("\n🤖 Generated Text:")
            print("="*50)
            
            try:
                generated = generate_text(prompt, max_length=max_len, temperature=temp)
                print(generated[0])
                print("="*50)
                
                # Show generation statistics
                tokens = tokenizer.encode(generated[0])
                print(f"\n📊 Generation Stats:")
                print(f"   Total tokens: {len(tokens)}")
                print(f"   Prompt tokens: {len(tokenizer.encode(prompt))}")
                print(f"   Generated tokens: {len(tokens) - len(tokenizer.encode(prompt))}")
                
            except Exception as e:
                print(f"❌ Error during generation: {e}")
    
    generate_button.on_click(on_generate_clicked)
    
    # Display widgets
    display(widgets.VBox([
        prompt_text,
        widgets.HBox([max_length_slider, temperature_slider]),
        generate_button,
        output_area
    ]))
    
    # Show temperature explanation
    print("\n🌡️ Temperature Controls:")
    print("• Low temperature (0.1-0.5): More focused, conservative text")
    print("• Medium temperature (0.5-1.0): Balanced creativity")
    print("• High temperature (1.0-2.0): More diverse, creative text")

interactive_text_generation()
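
The effect of temperature can be sketched in a few lines: the model's raw scores (logits) are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The logits below are hypothetical and the helper name `softmax_with_temperature` is ours; this is an illustration of the idea rather than the exact code path inside `model.generate`.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens
logits = [2.0, 1.0, 0.1]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a temperature of 0.5 most of the probability mass sits on the top token, while at 2.0 the three candidates are much closer together, which is why sampled text becomes more diverse.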

## 🎯 Text Classification with BERT

Now let's use BERT for text classification to understand how LLMs can be used for specific NLP tasks.

In [None]:
# Load BERT model for text classification
def load_bert_classifier():
    print("🔄 Loading BERT classifier...")
    
    # Load tokenizer and model for sequence classification
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = AutoModelForSequenceClassification.from_pretrained(
        'textattack/bert-base-uncased-imdb',  # BERT fine-tuned on IMDB for sentiment analysis
        num_labels=2
    )
    
    # Move to GPU if available
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    
    print(f"✅ BERT classifier loaded on {device}")
    return tokenizer, model, device

bert_tokenizer, bert_model, bert_device = load_bert_classifier()

# Function to classify text
def classify_text(text, tokenizer, model, device):
    # Tokenize input
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Get predictions
    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class = torch.argmax(predictions, dim=-1).item()
        confidence = predictions[0][predicted_class].item()
    
    return predicted_class, confidence, predictions[0].cpu().numpy()

# Test sentiment classification
test_texts = [
    "This movie was absolutely fantastic! The acting was superb and the plot was engaging.",
    "I was really disappointed with this film. The story was predictable and the characters were boring.",
    "The movie had some good moments but overall it was just average.",
    "One of the worst movies I've ever seen. Complete waste of time and money.",
    "A masterpiece of cinema! Every scene was beautifully crafted and the performances were outstanding.",
    "The special effects were good, but the script needed a lot of work.",
]

print("🎬 Sentiment Analysis with BERT")
print("="*60)

# Assumes this checkpoint uses label 0 = negative and label 1 = positive
sentiment_labels = ['Negative', 'Positive']
results = []

for i, text in enumerate(test_texts, 1):
    predicted_class, confidence, probs = classify_text(text, bert_tokenizer, bert_model, bert_device)
    
    sentiment = sentiment_labels[predicted_class]
    
    print(f"\n📝 Text {i}: '{text[:60]}...'")
    print(f"🎭 Sentiment: {sentiment}")
    print(f"📊 Confidence: {confidence:.3f} ({confidence*100:.1f}%)")
    print(f"📈 Probabilities: Negative={probs[0]:.3f}, Positive={probs[1]:.3f}")
    
    results.append({
        'text': text,
        'sentiment': sentiment,
        'confidence': confidence,
        'negative_prob': probs[0],
        'positive_prob': probs[1]
    })

# Visualize results
df_results = pd.DataFrame(results)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Sentiment distribution
sentiment_counts = df_results['sentiment'].value_counts()
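# Map colors by label: value_counts() orders by frequency, not by class
label_colors = {'Negative': 'lightcoral', 'Positive': 'lightgreen'}
colors = [label_colors[label] for label in sentiment_counts.index]
ax1.pie(sentiment_counts.values, labels=sentiment_counts.index, autopct='%1.1f%%', colors=colors)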
ax1.set_title('Sentiment Distribution')

# Confidence scores
ax2.bar(range(len(df_results)), df_results['confidence'], color='lightblue', alpha=0.7)
ax2.set_xlabel('Text Index')
ax2.set_ylabel('Confidence Score')
ax2.set_title('Classification Confidence')
ax2.set_ylim(0, 1)
ax2.grid(True, alpha=0.3)

# Add sentiment labels on bars
for i, (_, row) in enumerate(df_results.iterrows()):
    ax2.text(i, row['confidence'] + 0.02, row['sentiment'][:4], 
            ha='center', va='bottom', fontsize=8)

plt.tight_layout()
plt.show()

## 🔍 Understanding Model Embeddings

Let's explore how LLMs represent text as embeddings and visualize the semantic relationships.
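
Semantic closeness between embeddings is usually measured with cosine similarity, the cosine of the angle between two vectors. Here is a minimal sketch using hypothetical 3-d vectors (`toy_embeddings`); real BERT embeddings have 768 dimensions, but the arithmetic is identical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-d embeddings (illustrative values, not model outputs)
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print("cat vs dog:", round(cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"]), 3))
print("cat vs car:", round(cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"]), 3))
```

Vectors pointing in similar directions score close to 1, so "cat" and "dog" come out far more similar than "cat" and "car" in this toy setup.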

In [None]:
# Extract and visualize embeddings
def explore_embeddings():
    print("🔍 Exploring Text Embeddings")
    print("="*40)
    
    # Load BERT model for embeddings
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')
    model.eval()
    
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    
    # Sample texts with semantic relationships
    texts = [
        'cat', 'dog', 'animal', 'pet',
        'car', 'truck', 'vehicle', 'transportation',
        'happy', 'sad', 'emotion', 'feeling',
        'computer', 'laptop', 'technology', 'device'
    ]
    
    # Get embeddings for each text
    embeddings = []
    
    for text in texts:
        # Tokenize and get embeddings
        inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        
        with torch.no_grad():
            outputs = model(**inputs)
            # Use CLS token embedding as sentence representation
            embedding = outputs.last_hidden_state[:, 0, :].cpu().numpy()
            embeddings.append(embedding.flatten())
    
    embeddings = np.array(embeddings)
    
    print(f"✅ Extracted embeddings for {len(texts)} words")
    print(f"Embedding dimension: {embeddings.shape[1]}")
    
    # Reduce dimensionality for visualization
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE
    
    # PCA
    pca = PCA(n_components=2)
    embeddings_2d_pca = pca.fit_transform(embeddings)
    
    # t-SNE (perplexity must be smaller than the number of samples, which is 16 here)
    tsne = TSNE(n_components=2, random_state=42, perplexity=5)
    embeddings_2d_tsne = tsne.fit_transform(embeddings)
    
    # Create visualization
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
    
    # PCA visualization
    categories = ['Animals', 'Vehicles', 'Emotions', 'Technology']
    category_colors = ['red', 'blue', 'green', 'orange']
    
    for i, (category, color) in enumerate(zip(categories, category_colors)):
        start_idx = i * 4
        end_idx = start_idx + 4
        
        ax1.scatter(embeddings_2d_pca[start_idx:end_idx, 0], 
                   embeddings_2d_pca[start_idx:end_idx, 1], 
                   c=color, label=category, s=100, alpha=0.7)
        
        # Add text labels
        for j in range(start_idx, end_idx):
            ax1.annotate(texts[j], (embeddings_2d_pca[j, 0], embeddings_2d_pca[j, 1]),
                        xytext=(5, 5), textcoords='offset points', fontsize=8)
    
    ax1.set_title('Word Embeddings (PCA)')
    ax1.set_xlabel('PCA Component 1')
    ax1.set_ylabel('PCA Component 2')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # t-SNE visualization
    for i, (category, color) in enumerate(zip(categories, category_colors)):
        start_idx = i * 4
        end_idx = start_idx + 4
        
        ax2.scatter(embeddings_2d_tsne[start_idx:end_idx, 0], 
                   embeddings_2d_tsne[start_idx:end_idx, 1], 
                   c=color, label=category, s=100, alpha=0.7)
        
        # Add text labels
        for j in range(start_idx, end_idx):
            ax2.annotate(texts[j], (embeddings_2d_tsne[j, 0], embeddings_2d_tsne[j, 1]),
                        xytext=(5, 5), textcoords='offset points', fontsize=8)
    
    ax2.set_title('Word Embeddings (t-SNE)')
    ax2.set_xlabel('t-SNE Component 1')
    ax2.set_ylabel('t-SNE Component 2')
    ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Find similar words
    print("\n🔗 Finding Similar Words:")
    print("="*30)
    
    from sklearn.metrics.pairwise import cosine_similarity
    
    similarity_matrix = cosine_similarity(embeddings)
    
    for i, word in enumerate(texts):
        # Get similarity scores for this word
        similarities = similarity_matrix[i]
        
        # Get top 3 most similar words (excluding itself)
        top_indices = np.argsort(similarities)[::-1][1:4]
        
        print(f"\nWords similar to '{word}':")
        for idx in top_indices:
            print(f"  • {texts[idx]} (similarity: {similarities[idx]:.3f})")
    
    return embeddings, embeddings_2d_pca, embeddings_2d_tsne

embeddings, pca_embeddings, tsne_embeddings = explore_embeddings()

## 🧪 Interactive Exercise: Prompt Engineering

Let's experiment with different prompting techniques to see how they affect model outputs.

In [None]:
# Interactive prompt engineering
def interactive_prompt_engineering():
    print("🔧 Interactive Prompt Engineering")
    print("="*40)
    
    # Different prompt techniques
    prompt_techniques = {
        'Basic': "{topic}",
        'Instruction': "Explain {topic} in simple terms.",
        'Role-playing': "You are an expert educator. Explain {topic} to a beginner.",
        'Step-by-step': "Explain {topic} step by step. Use clear examples.",
        'Analogy': "Explain {topic} using a simple analogy.",
        'Q&A': "What is {topic}? Why is it important? How does it work?"
    }
    
    topics = [
        'machine learning',
        'blockchain technology',
        'climate change',
        'quantum computing',
        'artificial intelligence ethics'
    ]
    
    # Create widgets
    topic_dropdown = widgets.Dropdown(
        options=topics,
        value=topics[0],
        description='Topic:',
        style={'description_width': 'initial'}
    )
    
    technique_dropdown = widgets.Dropdown(
        options=list(prompt_techniques.keys()),
        value='Basic',
        description='Technique:',
        style={'description_width': 'initial'}
    )
    
    generate_button = widgets.Button(
        description='Generate Prompt',
        button_style='info',
        layout=widgets.Layout(width='150px')
    )
    
    output_area = widgets.Output()
    
    def on_generate_clicked(b):
        with output_area:
            output_area.clear_output()
            
            topic = topic_dropdown.value
            technique = technique_dropdown.value
            
            # Generate prompt
            prompt_template = prompt_techniques[technique]
            prompt = prompt_template.format(topic=topic)
            
            print(f"🎯 {technique} Prompt:")
            print(f"\"{prompt}\"")
            print("\n" + "="*60)
            
            # Show technique explanation
            explanations = {
                'Basic': 'Simple and direct, good for straightforward queries',
                'Instruction': 'Clear guidance on what the model should do',
                'Role-playing': 'Helps the model adopt a specific perspective',
                'Step-by-step': 'Encourages structured, detailed responses',
                'Analogy': 'Makes complex topics more relatable',
                'Q&A': 'Prompts comprehensive coverage of multiple aspects'
            }
            
            print(f"💡 Why this works: {explanations[technique]}")
            print("\n📝 Expected Response Characteristics:")
            
            if technique == 'Basic':
                print("• Concise definition")
                print("• May vary in detail level")
            elif technique == 'Instruction':
                print("• Clear, structured explanation")
                print("• Appropriate complexity level")
            elif technique == 'Role-playing':
                print("• Educational tone")
                print("• Beginner-friendly language")
            elif technique == 'Step-by-step':
                print("• Sequential explanation")
                print("• Clear examples")
            elif technique == 'Analogy':
                print("• Relatable comparison")
                print("• Easy to understand")
            elif technique == 'Q&A':
                print("• Comprehensive coverage")
                print("• Multiple perspectives")
            
            # Show example response structure
            print("\n📋 Example Response Structure:")
            if technique == 'Step-by-step':
                print("1. Definition of key terms")
                print("2. Core concepts and principles")
                print("3. Step-by-step explanation")
                print("4. Practical examples")
                print("5. Summary and applications")
            elif technique == 'Q&A':
                print("• Direct answer to 'What is...'?")
                print("• Explanation of importance")
                print("• Breakdown of how it works")
                print("• Real-world applications")
    
    generate_button.on_click(on_generate_clicked)
    
    # Display widgets
    display(widgets.VBox([
        widgets.HBox([topic_dropdown, technique_dropdown]),
        generate_button,
        output_area
    ]))
    
    # Show prompt engineering tips
    print("\n💡 Prompt Engineering Tips:")
    print("="*40)
    print("1. Be specific and clear about what you want")
    print("2. Provide context and background information")
    print("3. Specify the desired format and length")
    print("4. Use examples to demonstrate expected output")
    print("5. Break complex tasks into smaller steps")
    print("6. Use constraints to guide the model")
    print("7. Iterate and refine based on results")

interactive_prompt_engineering()
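
Tip 4, using examples to demonstrate the expected output, is often called few-shot prompting. The sketch below shows one possible way to assemble such a prompt as a plain string. The helper `build_few_shot_prompt`, the "Review:/Sentiment:" layout, and the sample reviews are all assumptions made for illustration; any consistent format can work.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End with the unanswered query so the model completes the label
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

# Hypothetical examples demonstrating the expected output format
examples = [
    ("A wonderful, heartfelt film.", "Positive"),
    ("Dull plot and wooden acting.", "Negative"),
]

prompt = build_few_shot_prompt(
    "Classify the sentiment of each movie review.",
    examples,
    "The visuals were stunning but the story dragged.",
)
print(prompt)
```

Because the prompt ends right where the answer should begin, the model is nudged to continue with a label in the same format as the examples.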

## 🎯 Hands-on Challenge

Now it's your turn to apply what you've learned! Complete the following challenges:

In [None]:
# Challenge 1: Create a text summarization system
def challenge_1():
    print("📝 Challenge 1: Text Summarization System")
    print("="*50)
    
    # Sample long texts for summarization
    long_texts = [
        """Machine learning is a subset of artificial intelligence that enables computers to learn and improve from experience without being explicitly programmed. The process of learning begins with observations or data, such as examples, direct experience, or instruction, in order to look for patterns in data and make better decisions in the future based on the examples that we provide. The primary aim is to allow computers to learn automatically without human intervention or assistance and adjust actions accordingly. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or infeasible to develop conventional algorithms to perform the needed tasks.""",
        
        """Climate change refers to long-term shifts in temperatures and weather patterns. These shifts may be natural, such as through variations in the solar cycle. But since the 1800s, human activities have been the main driver of climate change, primarily due to burning fossil fuels (like coal, oil, and gas), which produces heat-trapping gases. Climate change is already impacting every region on Earth. The changes we're experiencing include more frequent and intense extreme weather events like heatwaves, storms, and droughts; rising sea levels; melting ice sheets; and warming oceans. These changes affect not only the environment but also human societies, disrupting economies, affecting food security, and threatening human health and safety. Addressing climate change requires global cooperation and significant changes in how we produce and consume energy."""
    ]
    
    # TODO: Complete the summarization task
    
    # Task 1: Basic extractive summarization
    def extractive_summary(text, num_sentences=3):
        """Simple extractive summarization using sentence importance"""
        import re
        from sklearn.feature_extraction.text import TfidfVectorizer
        
        # Split into sentences
        sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?|\!)\s', text)
        sentences = [s.strip() for s in sentences if s.strip()]
        
        if len(sentences) <= num_sentences:
            return text
        
        # Calculate TF-IDF scores
        vectorizer = TfidfVectorizer(stop_words='english')
        tfidf_matrix = vectorizer.fit_transform(sentences)
        
        # Get sentence scores
        sentence_scores = tfidf_matrix.sum(axis=1).A1
        
        # Get top sentences
        top_indices = sentence_scores.argsort()[-num_sentences:][::-1]
        top_indices = sorted(top_indices)  # Maintain original order
        
        summary = ' '.join([sentences[i] for i in top_indices])
        return summary
    
    print("\n✅ Task 1: Extractive summarization function created!")
    
    # Test the summarization
    for i, text in enumerate(long_texts, 1):
        print(f"\n📄 Text {i} (first 100 chars): {text[:100]}...")
        print(f"Original length: {len(text)} characters")
        
        summary = extractive_summary(text, num_sentences=2)
        print(f"\n📋 Summary ({len(summary)} characters):")
        print(summary)
        print("-" * 60)
    
    # Task 2: Create an abstractive summarization attempt with GPT-2
    print("\n✅ Task 2: Abstractive summarization with GPT-2")
    
    summarization_prompt = "Summarize the following text in 2-3 sentences:\n\n{text}\n\nSummary:"
    
    for i, text in enumerate(long_texts, 1):
        print(f"\n📄 Text {i}:")
        
        # Create prompt for summarization
        prompt = summarization_prompt.format(text=text)
        
        try:
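            # Generate summary. The prompt alone is long: if generate_text counts
            # prompt tokens toward max_length, 150 may leave no room for new text.
            generated = generate_text(prompt, max_length=150, temperature=0.7)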
            full_text = generated[0]
            
            # Extract summary part (after "Summary:")
            if "Summary:" in full_text:
                summary = full_text.split("Summary:")[1].strip()
                print(f"🤖 GPT-2 Summary: {summary}")
            else:
                print(f"🤖 GPT-2 Generation: {full_text}")
        
        except Exception as e:
            print(f"❌ Error: {e}")
        
        print("-" * 60)
    
    return extractive_summary

challenge_1()
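
Summing TF-IDF weights, as `extractive_summary` does, tends to favor long sentences simply because they contain more terms. The sketch below shows one possible adjustment: divide each sentence score by its word count. It is a stdlib-only illustration that swaps TF-IDF for plain word frequency; `frequency_summary`, `split_sentences`, the stop-word set, and the sample text are all hypothetical and not part of the notebook.

```python
import re
from collections import Counter

# Small illustrative stop-word set (an assumption, not the scikit-learn list).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "it", "that", "this", "for", "on", "with", "as", "by"}

def split_sentences(text):
    """Naive splitter: break after '.', '?' or '!' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]

def frequency_summary(text, num_sentences=2):
    """Pick the sentences with the highest average word frequency."""
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return text

    # Tokenize each sentence and drop stop words.
    tokenized = [
        [w for w in re.findall(r"[a-z']+", s.lower()) if w not in STOP_WORDS]
        for s in sentences
    ]
    freq = Counter(w for words in tokenized for w in words)

    # Average (not summed) frequency, so long sentences get no built-in advantage.
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]

    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    top = top[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))  # keep original order

sample = ("Solar power is clean. "
          "Solar power is cheap and solar panels last long. "
          "Bananas are yellow.")
print(frequency_summary(sample, num_sentences=1))
```

With a summed score the longer second sentence would win. With the averaged score the short first sentence ranks highest. Which behavior is preferable depends on the texts being summarized.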

In [None]:
# Challenge 2: Build a simple question-answering system
def challenge_2():
    print("❓ Challenge 2: Question-Answering System")
    print("="*50)
    
    # Sample context passages
    contexts = [
        """The Python programming language was created by Guido van Rossum and first released in 1991. Python emphasizes code readability and simplicity, making it an excellent choice for beginners. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python has a comprehensive standard library and a vast ecosystem of third-party packages. Popular frameworks like Django and Flask make web development easy, while libraries like NumPy, Pandas, and TensorFlow have made Python the dominant language in data science and machine learning.""",
        
        """Neural networks are computing systems inspired by biological neural networks. They consist of interconnected nodes (neurons) that process information through their connectionist structure. Each connection between neurons has a weight that adjusts as learning proceeds. Neural networks can learn to perform tasks by considering examples, generally without being programmed with task-specific rules. Deep learning uses neural networks with multiple layers between the input and output layers, enabling the learning of complex patterns in large datasets. Applications include image recognition, natural language processing, and autonomous vehicles."""
    ]
    
    # Questions about each context
    questions = [
        [
            "Who created Python?",
            "When was Python first released?",
            "What programming paradigms does Python support?",
            "Why is Python popular in data science?",
            "Name two Python web frameworks."
        ],
        [
            "What are neural networks inspired by?",
            "What adjusts during neural network learning?",
            "What is deep learning?",
            "What are some applications of neural networks?",
            "How do neural networks learn tasks?"
        ]
    ]
    
    # TODO: Complete the QA system
    
    # Task 1: Simple keyword-based QA
    def simple_keyword_qa(question, context):
        """Simple keyword-based question answering"""
        import re
        
        # Split context into sentences
        sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?|\!)\s', context)
        sentences = [s.strip() for s in sentences if s.strip()]
        
        # Extract keywords from question
        question_words = re.findall(r'\b\w+\b', question.lower())
        stop_words = {'what', 'who', 'when', 'where', 'why', 'how', 'is', 'are', 'was', 'were', 'the', 'a', 'an', 'and', 'or', 'but'}
        keywords = [w for w in question_words if w not in stop_words and len(w) > 2]
        
        # Score sentences based on keyword matches
        sentence_scores = []
        for sentence in sentences:
            sentence_lower = sentence.lower()
            score = sum(1 for keyword in keywords if keyword in sentence_lower)
            sentence_scores.append(score)
        
        # Return the sentence with the highest score (default=0 guards against an empty context)
        best_score = max(sentence_scores, default=0)
        if best_score > 0:
            best_sentence = sentences[sentence_scores.index(best_score)]
            return best_sentence
        else:
            return "I don't have enough information to answer that question."
    
    print("\n✅ Task 1: Keyword-based QA system created!")
    
    # Test the QA system
    for i, (context, context_questions) in enumerate(zip(contexts, questions), 1):
        print(f"\n📄 Context {i}:")
        print((context[:200] + "...") if len(context) > 200 else context)
        print("\n❓ Questions and Answers:")
        
        for question in context_questions:
            answer = simple_keyword_qa(question, context)
            print(f"Q: {question}")
            print(f"A: {answer}")
            print()
    
    # Task 2: Evaluate QA system performance
    print("\n✅ Task 2: QA System Evaluation")
    
    # Simple evaluation: count how many answers are something other than the fallback message
    def evaluate_qa_system(contexts, questions_list, qa_function):
        results = []
        
        for context, questions in zip(contexts, questions_list):
            context_results = []
            for question in questions:
                answer = qa_function(question, context)
                
                # Simple relevance check: any answer other than the fallback message counts as
                # relevant, so this measures coverage rather than correctness
                is_relevant = "I don't have enough information" not in answer
                
                context_results.append({
                    'question': question,
                    'answer': answer,
                    'relevant': is_relevant
                })
            
            results.append(context_results)
        
        return results
    
    evaluation_results = evaluate_qa_system(contexts, questions, simple_keyword_qa)
    
    # Calculate metrics
    total_questions = sum(len(context_results) for context_results in evaluation_results)
    relevant_answers = sum(
        sum(1 for result in context_results if result['relevant'])
        for context_results in evaluation_results
    )
    
    relevance_rate = relevant_answers / total_questions if total_questions > 0 else 0
    
    print("\n📊 QA System Performance:")
    print(f"Total questions: {total_questions}")
    print(f"Relevant answers: {relevant_answers}")
    print(f"Relevance rate: {relevance_rate:.2%}")
    
    return simple_keyword_qa

challenge_2()
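
The relevance rate above only checks that the system returned some sentence, not that the sentence contains the right answer. A stricter check is sketched below, assuming a short gold answer string is available for each question. `contains_answer`, `answer_accuracy`, the stand-in `first_sentence_qa`, and the third question are hypothetical. In the notebook you would pass the function returned by `challenge_2()` instead of the stand-in.

```python
def contains_answer(predicted, gold):
    """True if the gold answer string appears in the prediction (case-insensitive)."""
    return gold.lower() in predicted.lower()

def answer_accuracy(qa_function, context, qa_pairs):
    """Fraction of (question, gold answer) pairs whose prediction contains the gold answer."""
    if not qa_pairs:
        return 0.0
    hits = sum(
        contains_answer(qa_function(question, context), gold)
        for question, gold in qa_pairs
    )
    return hits / len(qa_pairs)

# Stand-in QA function for the demo: always answers with the first sentence.
def first_sentence_qa(question, context):
    return context.split(". ")[0] + "."

context = ("The Python programming language was created by Guido van Rossum "
           "and first released in 1991. "
           "Python emphasizes code readability and simplicity.")

# Gold answers taken from the context; the third question is made up for illustration.
qa_pairs = [
    ("Who created Python?", "Guido van Rossum"),
    ("When was Python first released?", "1991"),
    ("What does Python emphasize?", "readability"),
]

print(f"Answer accuracy: {answer_accuracy(first_sentence_qa, context, qa_pairs):.2%}")
```

Substring matching against a gold answer is still a rough measure, but it separates "returned something" from "returned something that contains the answer".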

In [None]:
# Challenge 3: Build a text classification pipeline
def challenge_3():
    print("🏷️ Challenge 3: Text Classification Pipeline")
    print("="*50)
    
    # Sample texts for classification
    sample_texts = [
        "I absolutely love this new smartphone! The camera quality is amazing and the battery life is incredible.",
        "The movie was disappointing. The plot was predictable and the acting was mediocre at best.",
        "The weather today is perfect for a walk in the park. Sunny and warm with a gentle breeze.",
        "I'm not sure about this restaurant. The food was okay but the service was slow.",
        "This new software update has completely ruined my user experience. Everything is slower now.",
        "The concert last night was fantastic! The band played all my favorite songs.",
        "I can't decide whether I like this book or not. The story is interesting but the writing style is odd.",
        "Excellent customer service! They resolved my issue quickly and were very polite."
    ]
    
    # True labels (for evaluation)
    true_labels = ['positive', 'negative', 'neutral', 'neutral', 'negative', 'positive', 'neutral', 'positive']
    
    # TODO: Complete the classification pipeline
    
    # Task 1: Create a simple rule-based classifier
    def rule_based_classifier(text):
        """Simple rule-based sentiment classifier"""
        text_lower = text.lower()
        
        positive_words = ['love', 'amazing', 'incredible', 'fantastic', 'excellent', 'perfect', 'great', 'awesome']
        negative_words = ['hate', 'terrible', 'awful', 'disappointing', 'predictable', 'mediocre', 'ruined', 'slow']
        
        # Substring matching: 'slow' also matches 'slower', and 'love' would match 'glove'
        positive_score = sum(1 for word in positive_words if word in text_lower)
        negative_score = sum(1 for word in negative_words if word in text_lower)
        
        if positive_score > negative_score:
            return 'positive'
        elif negative_score > positive_score:
            return 'negative'
        else:
            return 'neutral'
    
    print("\n✅ Task 1: Rule-based classifier created!")
    
    # Test the classifier
    predictions = []
    
    for i, text in enumerate(sample_texts, 1):
        prediction = rule_based_classifier(text)
        predictions.append(prediction)
        
        print(f"\n📝 Text {i}: {text}")
        print(f"🎭 Predicted sentiment: {prediction}")
        print(f"✅ True sentiment: {true_labels[i-1]}")
        print(f"{'✓ Correct!' if prediction == true_labels[i-1] else '✗ Incorrect'}")
    
    # Task 2: Evaluate classifier performance
    print("\n✅ Task 2: Classifier Evaluation")
    
    from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
    
    accuracy = accuracy_score(true_labels, predictions)
    print("\n📊 Classifier Performance:")
    print(f"Accuracy: {accuracy:.2%}")
    
    # Classification report
    print("\n📋 Classification Report:")
    print(classification_report(true_labels, predictions, zero_division=0))
    
    # Confusion matrix
    cm = confusion_matrix(true_labels, predictions, labels=['positive', 'negative', 'neutral'])
    
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                xticklabels=['positive', 'negative', 'neutral'],
                yticklabels=['positive', 'negative', 'neutral'])
    plt.title('Confusion Matrix - Rule-based Classifier')
    plt.xlabel('Predicted')
    plt.ylabel('Actual')
    plt.show()
    
    # Task 3: Compare with BERT classifier
    print("\n✅ Task 3: Comparison with BERT Classifier")
    
    bert_predictions = []
    bert_confidences = []
    
    # Map BERT output (binary) to our three labels
    for text in sample_texts:
        try:
            predicted_class, confidence, probs = classify_text(text, bert_tokenizer, bert_model, bert_device)
            
            # Convert binary sentiment to three classes based on confidence
            if predicted_class == 1 and confidence > 0.8:  # Positive and confident
                bert_pred = 'positive'
            elif predicted_class == 0 and confidence > 0.8:  # Negative and confident
                bert_pred = 'negative'
            else:  # Low confidence or middle ground
                bert_pred = 'neutral'
            
            bert_predictions.append(bert_pred)
            bert_confidences.append(confidence)
            
        except Exception as e:
            print(f"Error classifying text: {e}")
            bert_predictions.append('neutral')
            bert_confidences.append(0.0)
    
    bert_accuracy = accuracy_score(true_labels, bert_predictions)
    print("\n📊 BERT Classifier Performance:")
    print(f"Accuracy: {bert_accuracy:.2%}")
    
    # Compare both classifiers
    comparison_df = pd.DataFrame({
        'Text': [text[:50] + '...' for text in sample_texts],
        'True': true_labels,
        'Rule_Based': predictions,
        'BERT': bert_predictions,
        'BERT_Confidence': bert_confidences
    })
    
    print("\n📋 Classifier Comparison:")
    display(comparison_df)
    
    # Final comparison
    print("\n🏆 Final Results:")
    print(f"Rule-based accuracy: {accuracy:.2%}")
    print(f"BERT accuracy: {bert_accuracy:.2%}")
    
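    if bert_accuracy > accuracy:
        print(f"✅ BERT performs better by {(bert_accuracy - accuracy) * 100:.2f} percentage points")
    elif accuracy > bert_accuracy:
        print(f"✅ Rule-based performs better by {(accuracy - bert_accuracy) * 100:.2f} percentage points")
    else:
        print("✅ Both classifiers reach the same accuracy")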
    
    return rule_based_classifier

challenge_3()
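
Task 3 turns a binary sentiment model into a three-way classifier by treating low-confidence predictions as neutral. The sketch below isolates that mapping so the effect of the threshold is easier to see. `to_three_way` is a hypothetical helper, and the (class, confidence) pairs are made-up values, not outputs of the notebook's BERT model. As in the challenge code, class 1 is assumed to mean positive and class 0 negative.

```python
def to_three_way(predicted_class, confidence, threshold=0.8):
    """Map a binary prediction (0 = negative, 1 = positive) to three labels.

    Predictions whose confidence does not exceed `threshold` become 'neutral'.
    """
    if confidence > threshold:
        if predicted_class == 1:
            return "positive"
        if predicted_class == 0:
            return "negative"
    return "neutral"

# Made-up (class, confidence) pairs for illustration; not real model outputs.
example_outputs = [(1, 0.99), (0, 0.95), (1, 0.62), (0, 0.55)]

for threshold in (0.5, 0.8, 0.97):
    labels = [to_three_way(c, p, threshold) for c, p in example_outputs]
    print(f"threshold={threshold}: {labels}")
```

Raising the threshold moves more predictions into the neutral class. The value 0.8 used in the challenge is a judgment call and would normally be tuned on labeled data.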

## 🎓 Summary and Key Takeaways

### What You've Learned:

1. **LLM Fundamentals**:
   - Transformer architecture and attention mechanisms
   - Evolution of large language models
   - Tokenization and embeddings

2. **Text Generation**:
   - Using GPT-2 for creative text generation
   - Understanding temperature and sampling parameters
   - Interactive text generation with user controls

3. **Text Classification**:
   - Sentiment analysis with BERT
   - Understanding model confidence
   - Interpreting classification results

4. **Embeddings and Semantic Understanding**:
   - Extracting text embeddings from LLMs
   - Visualizing semantic relationships
   - Finding similar words using cosine similarity

5. **Practical Applications**:
   - Text summarization techniques
   - Question-answering systems
   - Custom text classification pipelines

6. **Prompt Engineering**:
   - Different prompting strategies
   - Crafting effective prompts
   - Understanding model behavior with different inputs
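
To make the temperature takeaway concrete, the sketch below shows how dividing logits by a temperature reshapes the probability distribution that sampling draws from. It uses only the standard library. `softmax_with_temperature` and the three logits are hypothetical and do not come from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

Lower temperatures concentrate probability on the top-scoring token, which makes output more predictable. Higher temperatures flatten the distribution, which makes output more varied.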

### 🚀 Next Steps:

- **Continue to the next notebook**: "Advanced LLM Techniques" for fine-tuning and advanced applications
- **Explore API-based LLMs**: Integrate with the OpenAI GPT, Claude, and Gemini APIs
- **Learn about fine-tuning**: Adapting pre-trained models for specific tasks
- **Study RAG (Retrieval-Augmented Generation)**: Combining LLMs with external knowledge

### 📚 Additional Resources:

- [Hugging Face Documentation](https://huggingface.co/docs/transformers/)
- [Attention Is All You Need (Original Transformer Paper)](https://arxiv.org/abs/1706.03762)
- [Prompt Engineering Guide](https://www.promptingguide.ai/)
- [OpenAI API Documentation](https://platform.openai.com/docs/api-reference)

**🎉 Congratulations! You've completed your first Large Language Models notebook!**
You now have a solid foundation in understanding and working with pre-trained language models such as GPT-2 and BERT.