# Word Prediction with GPT-2

Welcome to your word prediction adventure! In this notebook, we'll explore how large language models predict and complete text, using GPT-2 as our example.

## Setup and Imports

First, let's install and import the necessary libraries. We'll use the Hugging Face Transformers library for easy access to GPT-2.

In [None]:
# Install required packages
%pip install transformers torch

In [None]:
# Import necessary libraries
from transformers import GPT2LMHeadModel, GPT2Tokenizer, pipeline
import torch
import warnings
warnings.filterwarnings('ignore')

print("Libraries imported successfully!")

## Load the GPT-2 Model

Let's load a pre-trained GPT-2 model. We'll use the smallest checkpoint (`gpt2`, roughly 124M parameters) for faster processing.

In [None]:
# Load the GPT-2 model and tokenizer
model_name = "gpt2"  # You can also try "gpt2-medium" for better quality
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Create a text generation pipeline
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

print(f"GPT-2 model '{model_name}' loaded successfully!")
print(f"Model has an embedding size of {model.config.n_embd} and {model.config.n_layer} layers")

## Basic Word Prediction Function

Let's create a simple function to generate text completions with customizable parameters.

In [None]:
def predict_text(prompt, max_new_tokens=10, temperature=1.0, num_return_sequences=1):
    """
    Generate text completion using GPT-2
    
    Args:
        prompt (str): The input text to complete
        max_new_tokens (int): Maximum number of new tokens to generate (excluding prompt)
        temperature (float): Controls randomness; must be greater than 0 (0.1 = conservative, 2.0 = creative)
        num_return_sequences (int): Number of different completions to generate
    
    Returns:
        List of generated text completions
    """
    # Generate text
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        num_return_sequences=num_return_sequences,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True
    )
    
    # Extract generated text
    completions = []
    for output in outputs:
        generated_text = output['generated_text']
        # Remove the original prompt to show only the completion
        completion = generated_text[len(prompt):].strip()
        completions.append(completion)
    
    return completions

print("Text prediction function ready!")
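
For intuition about what the pipeline does with `max_new_tokens` and `do_sample`, here is a minimal sketch of token-by-token generation. It does not use GPT-2: `TOY_MODEL` and `toy_generate` are hypothetical names, and the probability table is made up for illustration. The real model computes a distribution over its whole vocabulary at every step.

```python
import random

# Hypothetical next-token table standing in for GPT-2's learned distribution.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def toy_generate(prompt_tokens, max_new_tokens=3, do_sample=True, seed=0):
    """Append up to max_new_tokens tokens, one at a time (prompt must be non-empty)."""
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        choices = TOY_MODEL.get(tokens[-1])
        if not choices:
            # No known continuation for the last token: stop early
            break
        if do_sample:
            # Sample the next token in proportion to its probability
            next_token = rng.choices(list(choices), weights=list(choices.values()))[0]
        else:
            # Greedy decoding: always take the most likely token
            next_token = max(choices, key=choices.get)
        tokens.append(next_token)
    return tokens

print(toy_generate(["the"], max_new_tokens=3, do_sample=False))
# → ['the', 'cat', 'sat', 'down']
```

With `do_sample=True` the result depends on the random seed, which is why the same prompt can produce different completions from run to run.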

## Explore Different Parameters

Try experimenting with different settings to see how they affect the predictions!

In [None]:
# Experiment with different temperatures
test_prompt = "In the year 2030, technology will"

temperatures = [0.2, 0.8, 1.5]

for temp in temperatures:
    completion = predict_text(test_prompt, max_new_tokens=10, temperature=temp)[0]
    print(f"Temperature {temp}: '{test_prompt} {completion}'")
    print()
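
To see why a low temperature gives conservative completions and a high one gives more varied ones, the sketch below applies temperature to a softmax over made-up scores. `softmax_with_temperature` and `toy_logits` are illustrative names rather than part of Transformers, but dividing the logits by the temperature before the softmax is the standard way temperature is applied.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens

for t in (0.2, 0.8, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At a temperature of 0.2 almost all of the probability lands on the top-scoring token, while at 1.5 the three candidates are much closer together, so sampling picks the less likely ones more often.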

## 🎯 Your Mission: Experiment with Word Prediction

Now it's time to experiment! Try the prompt "The future of artificial intelligence is" with temperature=0.8 and see what creative completion the model generates.

In [None]:
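prompt = ""  # TODO: enter the prompt from the mission above
temperature = 0  # TODO: set to 0.8; sampling requires a temperature greater than 0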

# Generate completion
completions = predict_text(prompt, max_new_tokens=60, temperature=temperature)

print(f"Prompt: '{prompt}'")
print(f"Temperature: {temperature}")
print("\nGenerated completion:")
print(f"'{completions[0]}'")

# Show the full generated text
print("\nFull text:")
print(f"'{prompt} {completions[0]}'")
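
Note that `predict_text` always samples (`do_sample=True`), and Transformers rejects a temperature of 0 in that mode. If you want a temperature of 0 to mean "always pick the most likely token", one option is a small helper like the one below. `sampling_kwargs` is a hypothetical name, not a library function; it only builds keyword arguments that `generator` already accepts.

```python
def sampling_kwargs(temperature):
    """Hypothetical helper: choose generation settings that are valid for a temperature."""
    if temperature is None or temperature <= 0:
        # Greedy decoding: deterministic, always picks the most likely token
        return {"do_sample": False}
    # Sampling: temperature must be a positive float
    return {"do_sample": True, "temperature": float(temperature)}

print(sampling_kwargs(0))
print(sampling_kwargs(0.8))
```

These settings could be passed along as `generator(prompt, max_new_tokens=60, **sampling_kwargs(temperature))`. Keep in mind that greedy decoding returns a single completion, so it does not combine with `num_return_sequences` greater than 1.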

## 🎉 Congratulations!

You've successfully explored word prediction with GPT-2! You've learned how:
- Language models can complete text based on context
- Temperature affects creativity vs. consistency
- The same prompt can generate different completions
- Context and prompt engineering influence the quality of predictions

Feel free to continue experimenting with different prompts and parameters to see what interesting completions you can generate!