# KyleBot: A GPT-2 Chatbot for Learning

This notebook demonstrates how to build a generative AI chatbot using GPT-2. Each section includes educational comments to help you understand what's happening.

## What You'll Learn:
- How to load and use pre-trained language models
- Different text generation strategies (greedy, beam search, sampling)
- How to create a conversational interface
- Parameter tuning for better responses
- Building an interactive chat loop

In [1]:
# Cell 1: Import Libraries and Load Model
# This cell sets up everything we need for our chatbot

import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import random
import re

# Load the pre-trained GPT-2 model and tokenizer
# The tokenizer converts text to numbers that the model can understand
# The model contains the learned weights from training on massive amounts of text
print("Loading GPT-2 model and tokenizer...")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Set the model to evaluation mode (no training)
model.eval()

# GPT-2 has no padding token, so reuse the end-of-text token as the pad token
tokenizer.pad_token = tokenizer.eos_token

print("✅ GPT-2 loaded successfully!")
print(f"Model parameters: {model.num_parameters():,}")

Loading GPT-2 model and tokenizer...
✅ GPT-2 loaded successfully!
Model parameters: 124,439,808
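
The parameter count printed above can be reproduced by hand. The sketch below assumes the published configuration of the smallest GPT-2 checkpoint (vocabulary of 50,257 tokens, 1,024 positions, hidden size 768, 12 layers) and that the output head shares its weights with the token embedding. The variable names are illustrative and are not part of the `transformers` API.

```python
# Assumed configuration of the "gpt2" (small) checkpoint
vocab_size, n_positions, d_model, n_layers = 50257, 1024, 768, 12

def linear(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

layer_norm = 2 * d_model                      # scale and shift vectors
embeddings = vocab_size * d_model + n_positions * d_model

per_block = (
    layer_norm                                # norm before attention
    + linear(d_model, 3 * d_model)            # fused query/key/value projection
    + linear(d_model, d_model)                # attention output projection
    + layer_norm                              # norm before the feed-forward part
    + linear(d_model, 4 * d_model)            # feed-forward expansion
    + linear(4 * d_model, d_model)            # feed-forward contraction
)

# The output head reuses the token embedding matrix, so it adds nothing
total = embeddings + n_layers * per_block + layer_norm  # plus the final norm
print(f"{total:,}")  # 124,439,808
```

Roughly a third of the parameters sit in the embedding tables; the rest are spread evenly over the 12 transformer blocks.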


In [None]:
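# Cell 2: Text Generation Functions
# Each function implements one decoding strategy

def generate_response_greedy(prompt, max_new_tokens=50):
    """
    Greedy decoding: Always picks the single most likely next token
    - Deterministic: the same prompt always gives the same output
    - Fast, but prone to repetitive text
    """
    inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=max_new_tokens,
            num_return_sequences=1,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id
        )
    # Decode only the tokens generated after the prompt. Slicing the decoded
    # string by len(prompt) can misalign when decoding does not reproduce the
    # prompt character for character.
    new_tokens = outputs[0][inputs.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)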

def generate_response_sampling(prompt, max_new_tokens=50, temperature=0.8, top_k=50):
    """
    Sampling with temperature and top-k: More creative and diverse responses
    - temperature: Controls randomness (higher = more random)
    - top_k: Only considers the k most likely next tokens
    """
    inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=max_new_tokens,
            num_return_sequences=1,
            do_sample=True,
            temperature=temperature,
            top_k=top_k,
            pad_token_id=tokenizer.eos_token_id
        )
    # Decode only the tokens generated after the prompt
    new_tokens = outputs[0][inputs.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

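def generate_response_beam_search(prompt, max_new_tokens=50, num_beams=5):
    """
    Beam search: Keeps the num_beams best partial sequences at every step
    - no_repeat_ngram_size=2: No pair of consecutive tokens may appear twice
    - repetition_penalty=1.2: Lowers the score of tokens already used
    """
    inputs = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,
            num_beams=num_beams,
            no_repeat_ngram_size=2,
            repetition_penalty=1.2,
            pad_token_id=tokenizer.eos_token_id
        )
    # Decode only the tokens generated after the prompt
    new_tokens = outputs[0][inputs.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)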
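
The beam search function passes `no_repeat_ngram_size=2` and `repetition_penalty=1.2` to `model.generate`. The sketch below illustrates the idea behind each option on plain Python lists. `banned_next_tokens` and `penalize_repeats` are hypothetical helpers written for this illustration, not the `transformers` implementation.

```python
def banned_next_tokens(generated, ngram_size):
    """Tokens that would complete an n-gram already present in `generated`."""
    if len(generated) < ngram_size:
        return set()
    # The last (n - 1) tokens form the start of the next n-gram
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


def penalize_repeats(logits, generated_ids, penalty):
    """Return a copy of `logits` with already-generated token ids made less likely."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive score or multiplying a negative one both lower it
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


tokens = ["the", "cat", "sat", "on", "the"]
print(banned_next_tokens(tokens, ngram_size=2))  # {'cat'}
```

With `ngram_size=2`, "cat" is banned because "the cat" has already been generated. The penalty works on raw scores rather than on a ban list, so a repeated token can still win if its score is high enough.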

In [3]:
# Cell 3: Chatbot Class
# This class organizes our chatbot functionality

class KyleBot:
    def __init__(self, name="KyleBot"):
        self.name = name
        self.conversation_history = []
        self.generation_method = "sampling"  # Default method
        
    def add_to_history(self, user_input, bot_response):
        """Keep track of conversation for context"""
        self.conversation_history.append({
            "user": user_input,
            "bot": bot_response
        })
        
    def create_context_prompt(self, user_input):
        """Create a prompt with conversation history for context"""
        prompt = f"You are {self.name}, a knowledgeable AI. Provide a clear and concise definition of the user's topic.\n"
        if len(self.conversation_history) > 0:
            recent_history = self.conversation_history[-2:]
            context = "\n".join([
                f"User: {exchange['user']}\n{self.name}: {exchange['bot']}"
                for exchange in recent_history
            ])
            prompt += context + "\n"
        prompt += f"User: {user_input}\n{self.name}: "
        return prompt
    
    def generate_response(self, user_input, method=None, **kwargs):
        """Generate a response using the specified method"""
        method = method or self.generation_method
        # max_chars is a clean-up option, so keep it away from the generation functions
        max_chars = kwargs.pop("max_chars", 500)
        prompt = self.create_context_prompt(user_input)
        print(f"Using method: {method}, Prompt: {prompt}")  # Debug

        if method == "greedy":
            response = generate_response_greedy(prompt, **kwargs)
        elif method == "sampling":
            response = generate_response_sampling(prompt, **kwargs)
        elif method == "beam":
            response = generate_response_beam_search(prompt, **kwargs)
        else:
            # Unknown method names fall back to sampling
            response = generate_response_sampling(prompt, **kwargs)

        response = self.clean_response(response, max_chars=max_chars)
        self.add_to_history(user_input, response)
        return response
    
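    def clean_response(self, response, max_chars=500):
        """Clean up the generated response"""
        response = response.strip()
        name = re.escape(self.name)  # The name may contain regex metacharacters
        # Drop a leading speaker label
        response = re.sub(rf"^(?:{name}|User):\s*", "", response)
        # Keep only the first turn; the model often invents follow-on turns
        response = re.split(rf"\n(?:{name}|User):", response)[0].strip()
        if len(response) > max_chars:
            truncated = response[:max_chars]
            # Prefer to end on the last complete sentence within the limit
            last_end = max(truncated.rfind(char) for char in ['.', '!', '?'])
            response = truncated[:last_end + 1] if last_end != -1 else truncated
        return response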
    
    def set_generation_method(self, method):
        """Change the generation method"""
        valid_methods = ["greedy", "sampling", "beam"]
        if method in valid_methods:
            self.generation_method = method
            print(f"✅ Generation method set to: {method}")
        else:
            print(f"❌ Invalid method. Choose from: {valid_methods}")

# Create our chatbot instance
kylebot = KyleBot()
print("✅ KyleBot created and ready to chat!")

✅ KyleBot created and ready to chat!
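
`create_context_prompt` keeps a fixed window of the last two exchanges. GPT-2 can process at most 1,024 tokens of prompt and generated text combined, so an alternative is to keep as many recent exchanges as fit a budget. The sketch below works under assumptions: `trim_history` is a hypothetical helper, and its default `count_tokens` is a whitespace word count used only to keep the example self-contained. In this notebook, `len(tokenizer.encode(text))` would give a more faithful count.

```python
def trim_history(history, budget, count_tokens=lambda text: len(text.split())):
    """Return the most recent exchanges whose combined size fits `budget`."""
    kept, used = [], 0
    for exchange in reversed(history):          # newest first
        cost = count_tokens(exchange["user"]) + count_tokens(exchange["bot"])
        if used + cost > budget:
            break
        kept.append(exchange)
        used += cost
    return kept[::-1]                           # restore chronological order


history = [
    {"user": "Hi there", "bot": "Hello! How can I help?"},                       # 7 words
    {"user": "What is AI?", "bot": "AI is the study of intelligent machines."},  # 10 words
    {"user": "Thanks", "bot": "You are welcome."},                               # 4 words
]
print(len(trim_history(history, budget=15)))  # 2
```

The oldest exchange is dropped because adding its 7 words to the 14 already kept would exceed the budget of 15.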


In [4]:
# Cell 4: Compare Generation Methods
# Run the same prompt through each strategy to see how the outputs differ

test_prompt = "What is artificial intelligence?"

print("🤖 Testing different generation methods:")
print("=" * 50)

# Each call passes only the keyword arguments its generation function accepts.
# The beam search function already sets no_repeat_ngram_size and repetition_penalty.

# Test greedy decoding
print("\n1. GREEDY DECODING (always picks most likely word):")
response = kylebot.generate_response(test_prompt, method="greedy", max_new_tokens=100)
print(f"Response: {response}")

# Test sampling
print("\n2. SAMPLING (more creative, uses temperature and top-k):")
response = kylebot.generate_response(test_prompt, method="sampling", max_new_tokens=100, temperature=0.7, top_k=30)
print(f"Response: {response}")

# Test beam search
print("\n3. BEAM SEARCH (explores multiple possibilities):")
response = kylebot.generate_response(test_prompt, method="beam", max_new_tokens=100, num_beams=5)
print(f"Response: {response}")

print("\n" + "=" * 50)
print("💡 Notice how each method produces different styles of responses!")

In [None]:
# Cell 5: Interactive Chat Loop
# This is where you can actually chat with KyleBot!

def chat_with_kylebot():
    """Interactive chat interface"""
    print("🤖 Welcome to KyleBot! Let's chat!")
    print("💡 Commands:")
    print("   - Type 'quit' to exit")
    print("   - Type 'method: [greedy/sampling/beam]' to change generation method")
    print("   - Type 'history' to see conversation history")
    print("   - Type 'help' for this message")
    print("=" * 50)
    
    while True:
        try:
            # Get user input
            user_input = input("\n👤 You: ").strip()
            
            # Handle special commands
            if user_input.lower() == 'quit':
                print("👋 Goodbye! Thanks for chatting with KyleBot!")
                break
            elif user_input.lower() == 'help':
                print("💡 Commands:")
                print("   - Type 'quit' to exit")
                print("   - Type 'method: [greedy/sampling/beam]' to change generation method")
                print("   - Type 'history' to see conversation history")
                print("   - Type 'help' for this message")
                continue
            elif user_input.lower() == 'history':
                if kylebot.conversation_history:
                    print("\n📜 Conversation History:")
                    for i, exchange in enumerate(kylebot.conversation_history, 1):
                        print(f"{i}. You: {exchange['user']}")
                        print(f"   KyleBot: {exchange['bot']}")
                else:
                    print("📜 No conversation history yet.")
                continue
            elif user_input.lower().startswith('method:'):
                method = user_input.split(':', 1)[1].strip().lower()
                kylebot.set_generation_method(method)
                continue
            elif not user_input:
                continue
            
            # Generate and display response
            print(f"\n🤖 KyleBot ({kylebot.generation_method}): ", end="")
            response = kylebot.generate_response(user_input)
            print(response)
            
        except KeyboardInterrupt:
            print("\n👋 Goodbye! Thanks for chatting with KyleBot!")
            break
        except Exception as e:
            print(f"\n❌ Error: {e}")
            print("Try again or type 'quit' to exit.")

# Uncomment the line below to start chatting!
# chat_with_kylebot()

print("💡 To start chatting, uncomment the 'chat_with_kylebot()' line above and run this cell!")

In [None]:
# Cell 6: Experiment with Parameters
# This cell helps you understand how different parameters affect generation

def experiment_with_parameters():
    """Demonstrate how different parameters affect text generation"""
    
    test_prompt = "The future of technology is"
    
    print("🧪 Parameter Experimentation")
    print("=" * 50)
    
    # Test different temperatures
    print("\n🌡️  Temperature Effect (controls randomness):")
    temperatures = [0.1, 0.5, 1.0, 1.5]
    
    for temp in temperatures:
        response = generate_response_sampling(
            test_prompt, 
            max_new_tokens=50, 
            temperature=temp, 
            top_k=50
        )
        print(f"Temperature {temp}: {response}")
    
    # Test different top-k values
    print("\n🔝 Top-k Effect (limits word choices):")
    top_k_values = [10, 20, 50, 100]
    
    for k in top_k_values:
        response = generate_response_sampling(
            test_prompt, 
            max_new_tokens=50, 
            temperature=0.8, 
            top_k=k
        )
        print(f"Top-k {k}: {response}")
    
    # Test different beam search beam counts
    print("\n🔍 Beam Search Effect (number of parallel searches):")
    beam_counts = [1, 3, 5, 10]
    
    for beams in beam_counts:
        response = generate_response_beam_search(
            test_prompt,
            max_new_tokens=50,
            num_beams=beams
        )
        print(f"Beams {beams}: {response}")

# Run the experiment
experiment_with_parameters()

print("\n💡 Key Takeaways:")
print("   - Lower temperature = more focused, predictable responses")
print("   - Higher temperature = more creative, diverse responses")
print("   - Lower top-k = more conservative word choices")
print("   - More beams = potentially better quality but slower")
print("   - Greedy = fast but repetitive")
print("   - Sampling = creative but sometimes incoherent")
print("   - Beam search = balanced quality and coherence")

In [None]:
# Cell 7: Quick Chat Examples
# Try these examples to see KyleBot in action

print("🚀 Quick Chat Examples")
print("=" * 30)

# Example 1: Simple question
print("\n👤 You: What is machine learning?")
response = kylebot.generate_response("What is machine learning?", method="sampling", temperature=0.7)
print(f"🤖 KyleBot: {response}")

# Example 2: Creative prompt
print("\n👤 You: Write a short story about a robot")
response = kylebot.generate_response("Write a short story about a robot", method="sampling", temperature=1.0, max_new_tokens=50)
print(f"🤖 KyleBot: {response}")

# Example 3: Technical question
print("\n👤 You: How do neural networks work?")
response = kylebot.generate_response("How do neural networks work?", method="beam", num_beams=5, max_new_tokens=50)
print(f"🤖 KyleBot: {response}")

print("\n💡 Now try your own questions! Use the chat function above.")

## 🎓 What You've Learned

Congratulations! You've built a complete generative AI chatbot. Here's what you now understand:

### 🤖 **Model Loading & Setup**
- How to load pre-trained language models
- Tokenization (converting text to numbers)
- Model configuration and parameters

### 🧠 **Text Generation Strategies**
- **Greedy Decoding**: Always picks the most likely next token (fast, deterministic, prone to repetition)
- **Sampling**: Draws the next token at random from the likely candidates (creative, diverse)
- **Beam Search**: Keeps several candidate sequences and returns the highest-scoring one (balanced quality)
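
To see why beam search can beat greedy decoding, the sketch below runs both on a made-up next-token table in which the best first step does not lead to the best overall sequence. The table `NEXT` and both functions are hypothetical and exist only for this illustration. Real models condition on the whole prefix, not just the previous token.

```python
import math

# Toy next-token table: probabilities depend only on the previous token
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.4, "dog": 0.3, "fox": 0.3},
    "a": {"bird": 0.9, "fish": 0.1},
}

def greedy(steps):
    """Pick the single most likely token at every step."""
    seq, logp = ["<s>"], 0.0
    for _ in range(steps):
        token, p = max(NEXT[seq[-1]].items(), key=lambda kv: kv[1])
        seq.append(token)
        logp += math.log(p)
    return seq[1:], math.exp(logp)

def beam_search(steps, num_beams):
    """Keep the num_beams most likely partial sequences at every step."""
    beams = [(["<s>"], 0.0)]
    for _ in range(steps):
        candidates = [
            (seq + [token], logp + math.log(p))
            for seq, logp in beams
            for token, p in NEXT[seq[-1]].items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    best_seq, best_logp = beams[0]
    return best_seq[1:], math.exp(best_logp)

for name, (seq, prob) in [("greedy", greedy(2)), ("beam", beam_search(2, num_beams=2))]:
    print(f"{name}: {' '.join(seq)} (p = {prob:.2f})")
# greedy: the cat (p = 0.24)
# beam: a bird (p = 0.36)
```

Greedy commits to "the" because it is the likeliest first token, then has only weak continuations. Beam search keeps "a" alive long enough to find the stronger pair. With `num_beams=1` the two functions return the same sequence. The table only covers two steps, so larger `steps` values would need more entries.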

### ⚙️ **Key Parameters**
- **Temperature**: Controls randomness (0.1 = focused, 1.5 = creative)
- **Top-k**: Limits each choice to the k most likely tokens
- **Max New Tokens** (`max_new_tokens`): Caps how many tokens are generated per response
- **Num Beams**: Number of parallel searches in beam search
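
The effect of temperature and top-k is easiest to see on a small set of scores. The sketch below is a plain-Python illustration under assumptions: `next_token_probs` is a hypothetical helper, the logits are made up, and `top_k` is assumed to be at least 1. It divides the scores by the temperature, masks everything outside the top k, and applies a softmax.

```python
import math
import random

def next_token_probs(logits, temperature=1.0, top_k=None):
    """Softmax over logits after temperature scaling and optional top-k filtering."""
    scaled = [x / temperature for x in logits]
    if top_k is not None:
        # Scores below the k-th largest are masked out (ties at the cutoff survive)
        cutoff = sorted(scaled, reverse=True)[min(top_k, len(scaled)) - 1]
        scaled = [x if x >= cutoff else float("-inf") for x in scaled]
    peak = max(scaled)                           # subtract the max for stability
    exps = [math.exp(x - peak) for x in scaled]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0]   # made-up scores for four candidate tokens

for t in (0.1, 1.0, 1.5):
    print(t, [round(p, 3) for p in next_token_probs(logits, temperature=t)])

print([round(p, 3) for p in next_token_probs(logits, top_k=2)])  # [0.731, 0.269, 0.0, 0.0]

# Sampling then draws one index according to the filtered distribution
random.seed(0)
probs = next_token_probs(logits, temperature=0.8, top_k=3)
print(random.choices(range(len(logits)), weights=probs, k=1)[0])
```

At a temperature of 0.1 almost all probability lands on the top token, which is why low temperatures behave like greedy decoding. Higher temperatures flatten the distribution and give unlikely tokens a real chance.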

### 🏗️ **Software Architecture**
- Object-oriented design with the KyleBot class
- Conversation history management
- Interactive user interface
- Error handling and user commands

### 🚀 **Next Steps to Explore**
1. **Fine-tuning**: Train the model on your own data
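2. **Different Models**: Try larger GPT-2 checkpoints (`gpt2-medium`, `gpt2-large`) or other causal language models; encoder-only models such as BERT are not designed for open-ended generation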
3. **Web Interface**: Build a web app for your chatbot
4. **Memory**: Add long-term conversation memory
5. **Personality**: Customize the bot's responses
6. **Multi-turn**: Handle complex conversations

### 💡 **Pro Tips**
- Start with sampling (temperature 0.7-0.9) for most use cases
- Use beam search for factual or technical responses
- Greedy decoding is good for simple, predictable tasks
- Always clean and format your responses
- Keep conversation history for context

Happy learning! 🎉