# Lab 1: Training Data Extraction from LLMs

## Overview
Learn to extract memorized training data from a language model (GPT-2) by prompting it and sampling from its output.

## Objectives
- Extract memorized data from GPT-2
- Test different extraction strategies
- Analyze memorization patterns
- Implement detection mechanisms

In [1]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import re
import numpy as np
from collections import Counter
import matplotlib.pyplot as plt

# Detect device (supports CUDA, Apple Silicon MPS, and CPU)
if torch.cuda.is_available():
    device = torch.device('cuda')
    print(f"✓ Using CUDA GPU: {torch.cuda.get_device_name(0)}")
elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    device = torch.device('mps')
    print("✓ Using Apple Silicon GPU (MPS)")
else:
    device = torch.device('cpu')
    print("ℹ Using CPU")

print(f"Device: {device}")

✓ Using Apple Silicon GPU (MPS)
Device: mps


## Part 1: Load Model and Setup

In [2]:
# Load GPT-2
model_name = 'gpt2'
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.to(device)  # move weights to the device selected above
model.eval()      # inference mode: disables dropout

print(f"Model loaded: {model_name}")

Model loaded: gpt2


## Part 2: Basic Extraction

In [3]:
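def generate_text(prompt, temperature=1.0, max_length=100, num_return=1):
    """Sample `num_return` continuations of `prompt` from the loaded model.

    Note: `max_length` is the total length in tokens, prompt included,
    so a long prompt leaves fewer tokens for the continuation.
    """
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors='pt').to(device)

    # No gradients are needed for generation
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=max_length,
            temperature=temperature,
            num_return_sequences=num_return,
            do_sample=True,  # sample instead of greedy decoding
            pad_token_id=tokenizer.eos_token_id  # GPT-2 has no pad token
        )

    # Decode each sampled sequence back to text
    return [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]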
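
A simple first check on sampled text is whether the same long word sequence appears in several independent samples, which can hint at memorized content. The sketch below is only one possible heuristic, not the lab's prescribed method. The helper names `ngrams` and `repeated_ngrams`, the 5-word window, and the hard-coded `samples` list (a stand-in for what `generate_text` might return) are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(text, n=5):
    """Return the word-level n-grams of `text` as a list of tuples."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def repeated_ngrams(samples, n=5, min_samples=2):
    """Map each n-gram to the number of distinct samples containing it,
    keeping only n-grams found in at least `min_samples` samples."""
    counts = Counter()
    for sample in samples:
        # set() so each sample contributes a given n-gram at most once
        counts.update(set(ngrams(sample, n)))
    return {gram: c for gram, c in counts.items() if c >= min_samples}


# Hypothetical stand-in for the output of generate_text(...)
samples = [
    "Contact us at the main office on 12 Example Street today",
    "You can contact us at the main office on 12 Example Street",
    "The weather was pleasant and the sky was clear all day",
]

shared = repeated_ngrams(samples, n=5)
for gram, count in sorted(shared.items(), key=lambda kv: -kv[1]):
    print(count, " ".join(gram))
```

Matching is case-sensitive and splits on whitespace, so this will miss near-duplicates. Repetition across samples is only a weak signal: common phrases also repeat, so any hits still need to be checked against a reference source.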