# GPT-Style Text Generation
## AIAT 122 - Deep Learning

## Learning Objectives

By completing this notebook, you will:
- Understand GPT architecture and text generation
- Implement text generation with pre-trained GPT models
- Fine-tune GPT for specific tasks
- Apply GPT to real-world text generation problems

## Prerequisites

- Python 3.8+
- Understanding of transformers and attention mechanisms
- Familiarity with Hugging Face Transformers library

---

## Introduction: Why GPT?

GPT (Generative Pre-trained Transformer) models are decoder-only transformers trained to predict the next token, which makes them a natural fit for text generation:

- **Language Understanding**: Pre-trained on massive text corpora
- **Text Generation**: Can generate coherent, context-aware text
- **Task Adaptation**: Fine-tunable for specific applications
- **Real-World Applications**: Story writing, code completion, chatbots, content creation

**Real-World Application**: In this notebook, we'll use GPT for creative story generation and code completion, simulating real-world applications in:
- **Content Creation**: Automated article writing, creative storytelling
- **Software Development**: Code completion and generation
- **Customer Service**: Conversational AI chatbots
- **Education**: Personalized learning content generation

**Industry Impact**: GPT-family models underlie products such as ChatGPT and GitHub Copilot (originally built on OpenAI Codex, a GPT descendant), along with many other production AI systems.

In [None]:
%pip install transformers torch datasets -q

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, GPT2Config
from transformers import Trainer, TrainingArguments
import warnings
warnings.filterwarnings('ignore')

print('✅ Setup complete!')

## Part 1: Understanding GPT Architecture

In [None]:
# Load the pre-trained GPT-2 model ('gpt2' is the smallest checkpoint in the family)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# GPT-2 ships without a pad token, so reuse the end-of-sequence token for padding
tokenizer.pad_token = tokenizer.eos_token

print(f"Model parameters: {model.num_parameters():,}")
print(f"Vocabulary size: {len(tokenizer)} tokens")
print(f"Max context length: {model.config.n_positions} tokens")
print("\n✅ GPT-2 model loaded!")
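
The parameter count printed above can be reproduced by hand from the architecture. The sketch below assumes the standard hyperparameters of the smallest GPT-2 checkpoint (12 layers, hidden size 768, vocabulary of 50,257 tokens, 1,024 positions) and that the output layer shares its weights with the token embedding. The function name `gpt2_param_count` is illustrative and not part of any library.

```python
def gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=768, n_layer=12):
    """Count GPT-2 parameters from hyperparameters (LM head tied to the token embedding)."""
    token_emb = vocab_size * n_embd                # token embedding table
    pos_emb = n_positions * n_embd                 # learned position embeddings

    layer_norm = 2 * n_embd                        # scale + bias
    attn_qkv = n_embd * 3 * n_embd + 3 * n_embd    # fused query/key/value projection
    attn_out = n_embd * n_embd + n_embd            # attention output projection
    mlp_in = n_embd * 4 * n_embd + 4 * n_embd      # feed-forward expansion (4x wider)
    mlp_out = 4 * n_embd * n_embd + n_embd         # feed-forward projection back
    per_block = 2 * layer_norm + attn_qkv + attn_out + mlp_in + mlp_out

    final_norm = 2 * n_embd                        # layer norm after the last block
    return token_emb + pos_emb + n_layer * per_block + final_norm


total = gpt2_param_count()
print(f"{total:,}")  # 124,439,808
```

Roughly a third of the parameters sit in the embedding tables. The rest is spread evenly across the 12 transformer blocks.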

## Part 2: Text Generation Basics

In [None]:
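def generate_text(prompt, model, tokenizer, max_length=100, temperature=0.7, top_k=50, top_p=0.95):
    """
    Generate text from a prompt with a GPT model using sampling.

    Args:
        prompt: Input text prompt
        model: GPT model
        tokenizer: GPT tokenizer
        max_length: Maximum total length in tokens (prompt + generated text)
        temperature: Controls randomness (lower = more deterministic)
        top_k: Sample only from the k most likely tokens
        top_p: Nucleus sampling threshold (cumulative probability mass to keep)
    """
    # Encode the prompt; calling the tokenizer returns input_ids and attention_mask
    inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

    # Generate without tracking gradients (inference only)
    with torch.no_grad():
        outputs = model.generate(
            inputs['input_ids'],
            attention_mask=inputs['attention_mask'],  # avoids the missing-mask warning
            max_length=max_length,
            temperature=temperature,
            top_k=top_k,
            top_p=top_p,
            do_sample=True,  # sample instead of greedy decoding
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode the first (and only) sequence; the result includes the prompt
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text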

# Example 1: Story generation
prompt1 = "Once upon a time, in a world where artificial intelligence"
generated1 = generate_text(prompt1, model, tokenizer, max_length=150)
print("📖 Story Generation:")
print(generated1)
print("\n" + "="*60 + "\n")

# Example 2: Code completion (base GPT-2 was not trained specifically on code, so expect rough output)
prompt2 = "def calculate_fibonacci(n):"
generated2 = generate_text(prompt2, model, tokenizer, max_length=100, temperature=0.3)
print("💻 Code Completion:")
print(generated2)
print("\n✅ Text generation working!")
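
To make the `temperature`, `top_k`, and `top_p` arguments less of a black box, here is a minimal pure-Python sketch of how one sampling step might combine them. It is an illustration, not the Transformers library's code: the helper names (`filtered_distribution`, `sample_token`) and the toy logits are made up for this example.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filtered_distribution(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    # 1. Temperature: divide the logits before the softmax
    probs = softmax([x / temperature for x in logits])

    # 2. Top-k: keep the k most likely tokens and renormalize
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:top_k]
    total_k = sum(p for _, p in ranked)
    ranked = [(token_id, p / total_k) for token_id, p in ranked]

    # 3. Top-p: keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # 4. Renormalize the survivors so they sum to 1
    total = sum(p for _, p in kept)
    return {token_id: p / total for token_id, p in kept}


def sample_token(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filtered_distribution(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


toy_logits = [4.0, 3.0, 2.0, 1.0, 0.0]  # made-up scores for a five-token vocabulary
print(filtered_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.9))
print(sample_token(toy_logits, temperature=1.0, top_k=3, top_p=0.9))
```

With these toy values, top-k first drops tokens 3 and 4. Top-p then drops token 2, because tokens 0 and 1 together already cover more than 90% of the remaining probability mass.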

## Part 3: Real-World Application: Creative Writing Assistant

In [None]:
# Real-world scenario: Content creation for marketing
marketing_prompts = [
    "Our new AI-powered product helps businesses",
    "The future of technology is",
    "Customer satisfaction is achieved through"
]

print("📝 Marketing Content Generation:\n")
for i, prompt in enumerate(marketing_prompts, 1):
    generated = generate_text(prompt, model, tokenizer, max_length=80, temperature=0.8)
    print(f"Prompt {i}: {prompt}")
    print(f"Generated: {generated}\n")
    print("-" * 60 + "\n")

print("✅ Real-world application demonstrated!")

## Part 4: Controlling Generation Quality

In [None]:
# Compare different generation strategies
prompt = "The impact of artificial intelligence on healthcare"

print("🔧 Generation Strategies Comparison:\n")

# Strategy 1: High temperature (more varied, less predictable)
creative = generate_text(prompt, model, tokenizer, temperature=1.2, top_p=0.9)
print(f"Creative (temp=1.2, top_p=0.9): {creative[:200]}...\n")

# Strategy 2: Low temperature and a narrow nucleus (focused, more repetitive)
focused = generate_text(prompt, model, tokenizer, temperature=0.3, top_p=0.5)
print(f"Focused (temp=0.3, top_p=0.5): {focused[:200]}...\n")

# Strategy 3: Balanced (the defaults used by generate_text)
balanced = generate_text(prompt, model, tokenizer, temperature=0.7, top_p=0.95)
print(f"Balanced (temp=0.7, top_p=0.95): {balanced[:200]}...\n")

print("✅ Different strategies produce different styles!")
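
Because sampled text changes from run to run, it can help to look at the effect of temperature on the probability distribution itself. The sketch below uses made-up logits for four candidate tokens and measures how peaked the distribution is at the three temperatures compared above. The function names are illustrative.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(probs):
    """Shannon entropy in bits; higher means a flatter, more random distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


toy_logits = [2.0, 1.0, 0.5, 0.0]  # made-up scores, most likely token first
for t in (0.3, 0.7, 1.2):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temp={t}: top-token prob={probs[0]:.2f}, entropy={entropy_bits(probs):.2f} bits")
```

Lower temperatures push probability mass onto the top token, which gives the focused behavior. Higher temperatures flatten the distribution, so less likely tokens are sampled more often.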

## Part 5: Fine-tuning GPT for Specific Tasks

**Real-World Application**: Companies fine-tune GPT models for:
- Domain-specific content (legal, medical, technical)
- Brand voice consistency
- Task-specific outputs (summarization, Q&A)

**Note**: Full fine-tuning updates every model weight and typically needs a GPU, so the cell below only outlines the workflow instead of running it.

In [None]:
# Conceptual example: Fine-tuning setup
# In production, you would:
# 1. Prepare domain-specific dataset
# 2. Configure training arguments
# 3. Fine-tune model
# 4. Evaluate on test set

print("📚 Fine-tuning Concept:")
print("\n1. Prepare Dataset:")
print("   - Collect domain-specific text (e.g., medical articles)")
print("   - Format as text files or use Hugging Face datasets")
print("\n2. Configure Training:")
print("   - Learning rate: 5e-5")
print("   - Batch size: 4-8 (depending on GPU)")
print("   - Epochs: 3-5")
print("\n3. Fine-tune Model:")
print("   - Use Trainer API from transformers")
print("   - Monitor loss and perplexity")
print("\n4. Evaluate:")
print("   - Test on held-out data")
print("   - Compare with base model")
print("\n✅ Fine-tuning process understood!")
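
The "monitor loss and perplexity" step rests on a simple relationship: perplexity is the exponential of the mean cross-entropy loss (in nats) over the evaluated tokens. A minimal sketch, where the probabilities are hypothetical values chosen for illustration and not measurements from any model:

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood of the correct tokens)."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))


# Hypothetical probabilities each model assigns to the correct next token
# on the same held-out text (illustrative numbers only)
base_model_probs = [0.10, 0.05, 0.20, 0.08]
finetuned_probs = [0.30, 0.25, 0.40, 0.20]

print(f"Base model perplexity: {perplexity(base_model_probs):.1f}")
print(f"Fine-tuned perplexity: {perplexity(finetuned_probs):.1f}")
```

A model that assigns probability 0.25 to every correct token has perplexity 4: it is as uncertain as a uniform choice among four options. Lower perplexity on held-out domain text is the usual sign that fine-tuning helped.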

## Real-World Use Cases

### 1. Content Marketing
- Generate blog posts, social media content
- Maintain brand voice consistency
- Scale content production

### 2. Code Generation
- GitHub Copilot-style code completion
- Code documentation generation
- Bug fix suggestions

### 3. Conversational AI
- Customer service chatbots
- Virtual assistants
- Personalized recommendations

### 4. Education
- Personalized learning content
- Quiz generation
- Explanation generation

---

## Key Takeaways

1. **GPT Architecture**: Transformer-based decoder-only model
2. **Text Generation**: Controlled by temperature, top-k, top-p
3. **Fine-tuning**: Adapts pre-trained models to specific domains
4. **Real-World Impact**: Powers many production AI systems

## Next Steps

- Explore larger GPT models (GPT-3, GPT-4 via API)
- Fine-tune on your own dataset
- Implement RAG (Retrieval-Augmented Generation)
- Build a production text generation pipeline

---

**End of Notebook**