This is an extension to the main task-1. It includes how information theoretic concepts (specifically perplexity) and geometric concepts (local intrinsic dimension) can be used to differentiate between AI and human-written text

# V. Information-Theoretic Signatures

Information theory quantifies predictability. AI, is very predictable. Or it should be. That's what we're going to see here...  

Before getting started on the specifics, [this blog](https://kuiper2000.github.io/chaos_and_predictability/week9/week9) explains the basics of information theory, and how it ties into predictability quite well. I enjoyed reading it.

I also want to preface by saying that I am by no means an expert / know much about information theory. I tried to learn a bit for the purposes of this task, but my domain knowledge is limited to that. I also thank the Infosec class and TA's for hinting the basics of this to us, which led me down this rabbit hole.

## Perplexity

[This blog](https://medium.com/nlplanet/two-minutes-nlp-perplexity-explained-with-simple-probabilities-6cdc46884584) explains perplexity really nicely through a simple example.

In general, it is a measure of how well a given sentence is predicted. Or, in the sense of the word itself, how **perplexed** a model is when reading text. Lower perplexity => less perplexed, higher perplexity => more perplexed.

- Because AI generates text by selecting high-probability tokens, the resulting text has statistically low perplexity. It follows the "path of least resistance." However, since our temperature is relatively high (set to 1), the perplexity may be higher than usual AI generated text.
- Human writing - especially famous authors - is replete with choices. Creative metaphors, sudden topic shifts, and idiomatic expressions that statistically defy the model's predictive expectations.


### The maths behind perplexity

This section is inspired by [Fabio Chiusano's Medium post.](https://medium.com/nlplanet/two-minutes-nlp-perplexity-explained-with-simple-probabilities-6cdc46884584)

A language model predicts text **one word at a time**.

For a sentence like:

> **"a red fox."**

the model assigns probabilities like:

- P("a")
- P("red" | "a")
- P("fox" | "a red")
- P("." | "a red fox")

To get the probability of the **whole sentence**, we multiply:

$$
P(W) = P(w_1) \times P(w_2|w_1) \times \dots \times P(w_n|w_1,\dots,w_{n-1})
$$

Example:

$$
P(\text{"a red fox."}) = 0.4 \times 0.27 \times 0.55 \times 0.79 = 0.0469
$$

**Problem: longer sentences always get smaller probabilities**

Multiplying many numbers **smaller than 1** makes the result very small.

So:
- Long sentences → tiny probabilities  
- Short sentences → bigger probabilities  

This makes comparisons **unfair**.

---

To remove the effect of sentence length, we **average** the probabilities using the **geometric mean**.

For a sentence with `n` words:

$$
P_{\text{norm}}(W) = P(W)^{1/n}
$$

Example:

$$
P_{\text{norm}}(\text{"a red fox."}) = 0.0469^{1/4} = 0.465
$$

This means that on average, the model assigns about **46.5% confidence per word**.

---

Perplexity is just the **inverse** of this normalized probability:

$$
\text{Perplexity}(W) = \frac{1}{P_{\text{norm}}(W)}
$$

or  

$$
\text{Perplexity}(W) = \left(\frac{1}{P(W)}\right)^{1/n}
$$

Example:

$$
\text{PP} = \frac{1}{0.465} \approx 2.15
$$

### Interpretation:
- Perplexity ≈ 2 => the model feels like it's choosing between **2 reasonable words** at each step

So we can see that a bad model has high perplexity.

### How we are calculating the Perplexity in code

Perplexity calculation happens in 4 main steps:
1. Tokenization: The text is converted into token-id's which GPT2 understands.
2. Teacher Forcing: The key is labels=input_ids. This basically tells GPT2:
- here is your input sequence
- now predict each subsequent token
- calculate loss

3. Cross-Entropy Loss: GPT2 calculates its loss (how wrong each prediction was).
- For each prediction, it predicts a probability distribution of all possible next tokens.
- The loss is measured as the gap between predicted probability and the actual token.

4. Next, we want to convert loss to probability. `perplexity = exp(loss)`


**IMPORTANT NOTE:** Perplexity is also affected by temperature. As temperature increases, the AI becomes more non-deterministic, and so perplexity increases significantly as well. We use a temperature of 1.0 for creation of class 2 and class 3. This will result in higher than expected, however, it will still mostly be lower than that of humans. Generally, AI achieves human levels of perplexity only at temperatures >1.5. [Source - Peeperkorn et al., 2024](https://arxiv.org/html/2405.00492v1).

# IMPLEMENTATION

In [2]:
from pathlib import Path
import sys

# Check if running in Google Colab
IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    print("Running in Google Colab - attempting to mount Drive...")
    try:
        from google.colab import drive
        drive.mount('/content/drive')
        # Adjust this path to where you stored the dataset in Google Drive
        DATASET_DIR = Path('/content/drive/MyDrive/precog-my-dataset/dataset')
        print(f"✓ Google Drive mounted successfully")
    except Exception as e:
        print(f"✗ Failed to mount Google Drive: {e}")
        print("Falling back to local path...")
        DATASET_DIR = Path('../dataset')
else:
    print("Running locally (not in Colab)")
    DATASET_DIR = Path('../dataset')

print(f"\nUsing dataset directory: {DATASET_DIR.resolve()}")

# Verify the path exists
if DATASET_DIR.exists():
    print(f"✓ Dataset directory found!")
    # Count subdirectories
    subdirs = [d for d in DATASET_DIR.iterdir() if d.is_dir()]
    print(f"  Contains {len(subdirs)} subdirectories")
else:
    print(f"✗ Dataset directory NOT found at {DATASET_DIR.resolve()}")
    print(f"  Please ensure your dataset folder is in the correct location.")

Mounted at /content/drive
✓ Google Drive mounted successfully

Using dataset directory: /content/drive/MyDrive/precog-my-dataset/dataset
✓ Dataset directory found!
  Contains 5 subdirectories


In [4]:
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
import numpy as np
import statistics
from tqdm.auto import tqdm
import warnings
from pathlib import Path
warnings.filterwarnings('ignore')

DATASET_DIR = Path('../dataset')

print("Loading GPT-2 model and tokenizer...")
print("(This may take a bit if downloading for the first time)")

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

try:
    # Load model and tokenizer
    model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)
    gpt2_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    
    # Set padding token (GPT-2 doesn't have one by default)
    gpt2_tokenizer.pad_token = gpt2_tokenizer.eos_token
    
    model.eval()
    print("✓ Model loaded successfully!")
except Exception as e:
    print(f"✗ Error loading model: {e}")
    print("\nTip: If you're getting timeout errors, try running:")
    print("  export HF_HUB_DOWNLOAD_TIMEOUT=300")
    print("in your terminal before running this cell, or download manually.")
    raise

def calculate_perplexity_single(text):
    """Calculate perplexity for a single text"""
    try:
        # Tokenize with truncation
        encodings = gpt2_tokenizer(
            text, 
            return_tensors="pt", 
            truncation=True, 
            max_length=1024
        )
        input_ids = encodings.input_ids.to(device)
        
        # Calculate loss
        with torch.no_grad():
            outputs = model(input_ids, labels=input_ids)
            loss = outputs.loss
            perplexity = torch.exp(loss).item()
        
        return perplexity
    except Exception as e:
        print(f"Error calculating perplexity: {e}")
        return None

def calculate_perplexities_for_texts(texts, batch_size=1):
    """Calculate perplexity for multiple texts with progress bar"""
    perplexities = []
    
    for text in tqdm(texts, desc="Calculating perplexities", unit="text"):
        ppl = calculate_perplexity_single(text)
        if ppl is not None:
            perplexities.append(ppl)
    
    return perplexities

def read_file_safe(file_path):
    """Safely read a file and return its content"""
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            text = f.read().strip()
        # Only return if text is not empty and has reasonable length
        if text and len(text) > 10:
            return text
        return None
    except Exception as e:
        print(f"Error reading {file_path}: {e}")
        return None

def get_text_files_from_paths(path_list):
    """Get all .txt files from a list of directory paths"""
    txt_files = []
    for path in path_list:
        if path.exists():
            files = list(path.glob('*.txt'))
            txt_files.extend(files)
            print(f"  Found {len(files)} files in {path.name}")
        else:
            print(f"  ✗ Warning: Path does not exist: {path}")
    return txt_files

# Define paths matching the structure from task-1.ipynb
print("\n" + "="*70)
print("SETTING UP PATHS")
print("="*70)

# Class 1: Human-written
class1_paths = [
    DATASET_DIR / 'class1-human-written' / '01-arthur-conan-doyle' / 'extracted_paragraphs',
    DATASET_DIR / 'class1-human-written' / '02-pg-wodehouse' / 'extracted_paragraphs',
    DATASET_DIR / 'class1-human-written' / '03-mark-twain' / 'extracted_paragraphs',
    DATASET_DIR / 'class1-human-written' / '04-william-shakespeare' / 'extracted_paragraphs'
]

# Class 2: AI-written
class2_paths = [
    DATASET_DIR / 'class2-ai-written' / 'ai-generated-paragraphs'
]

# Class 3: AI-mimicry
class3_paths = [
    DATASET_DIR / 'class3-ai-mimicry' / '01-arthur-conan-doyle',
    DATASET_DIR / 'class3-ai-mimicry' / '02-pg-wodehouse',
    DATASET_DIR / 'class3-ai-mimicry' / '03-mark-twain',
    DATASET_DIR / 'class3-ai-mimicry' / '04-william-shakespeare'
]

# Verify all paths exist
print("\nVerifying paths...")
all_paths_valid = True
for name, paths in [("Class 1", class1_paths), ("Class 2", class2_paths), ("Class 3", class3_paths)]:
    print(f"\n{name}:")
    for p in paths:
        status = "✓" if p.exists() else "✗"
        print(f"  {status} {p}")
        if not p.exists():
            all_paths_valid = False

if not all_paths_valid:
    print("\n⚠ Warning: Some paths don't exist!")
    
print("\n" + "="*70)
print("CALCULATING PERPLEXITIES")
print("="*70)

perplexity_results = {}

for class_name, class_paths in [("Class 1 (Human)", class1_paths), 
                                 ("Class 2 (AI)", class2_paths), 
                                 ("Class 3 (AI Mimicry)", class3_paths)]:
    print(f"\n{'='*70}")
    print(f"Processing {class_name}")
    print('='*70)
    
    # Get all text files
    print("Searching for text files...")
    text_files = get_text_files_from_paths(class_paths)
    print(f"✓ Total: {len(text_files)} text files")
    
    if len(text_files) == 0:
        print(f"✗ No files found. Skipping {class_name}...")
        continue
    
    # Read all files
    print(f"\nReading {len(text_files)} files...")
    all_texts = []
    for file_path in tqdm(text_files, desc="Reading files", unit="file"):
        text = read_file_safe(file_path)
        if text:
            all_texts.append(text)
    
    print(f"✓ Successfully read {len(all_texts)} valid text files")
    
    if not all_texts:
        print(f"✗ No valid texts found. Skipping {class_name}...")
        continue
    
    # Calculate perplexities
    print(f"\nCalculating perplexities for {len(all_texts)} texts...")
    perplexities = calculate_perplexities_for_texts(all_texts)
    
    # Filter out None values
    perplexities = [p for p in perplexities if p is not None and not np.isnan(p) and not np.isinf(p)]
    
    if perplexities:
        mean_perplexity = statistics.mean(perplexities)
        median_perplexity = statistics.median(perplexities)
        std_perplexity = statistics.stdev(perplexities) if len(perplexities) > 1 else 0
        min_perplexity = min(perplexities)
        max_perplexity = max(perplexities)
        
        perplexity_results[class_name] = {
            'mean': mean_perplexity,
            'median': median_perplexity,
            'std': std_perplexity,
            'min': min_perplexity,
            'max': max_perplexity,
            'count': len(perplexities)
        }
        
        print(f"\n✓ Results for {class_name}:")
        print(f"  Mean Perplexity:   {mean_perplexity:.2f}")
        print(f"  Median Perplexity: {median_perplexity:.2f}")
        print(f"  Std Dev:           {std_perplexity:.2f}")
        print(f"  Min:               {min_perplexity:.2f}")
        print(f"  Max:               {max_perplexity:.2f}")
        print(f"  Sample size:       {len(perplexities)}")
    else:
        print(f"✗ No valid perplexity values calculated for {class_name}")

print("\n" + "="*70)
print("PERPLEXITY SUMMARY")
print("="*70)
print(f"{'Class':<25} {'Mean':<12} {'Median':<12} {'Std':<12} {'Files':<10}")
print("-"*70)
for class_name, stats in perplexity_results.items():
    print(f"{class_name:<25} {stats['mean']:<12.2f} {stats['median']:<12.2f} {stats['std']:<12.2f} {stats['count']:<10}")
print("="*70)

# Interpretation
if perplexity_results:
    print("\nINTERPRETATION:")
    print("-" * 70)
    print("Lower perplexity = More predictable text (typical of AI)")
    print("Higher perplexity = Less predictable text (typical of humans)")
    print("\nExpected pattern:")
    print("  Class 2 (AI) < Class 3 (AI Mimicry) < Class 1 (Human)")
    print("-" * 70)

Loading GPT-2 model and tokenizer...
(This may take a bit if downloading for the first time)
Using device: cpu


Loading weights: 100%|██████████| 148/148 [00:00<00:00, 818.37it/s, Materializing param=transformer.wte.weight]             
GPT2LMHeadModel LOAD REPORT from: gpt2
Key                  | Status     |  | 
---------------------+------------+--+-
h.{0...11}.attn.bias | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.


✓ Model loaded successfully!

SETTING UP PATHS

Verifying paths...

Class 1:
  ✓ ../dataset/class1-human-written/01-arthur-conan-doyle/extracted_paragraphs
  ✓ ../dataset/class1-human-written/02-pg-wodehouse/extracted_paragraphs
  ✓ ../dataset/class1-human-written/03-mark-twain/extracted_paragraphs
  ✓ ../dataset/class1-human-written/04-william-shakespeare/extracted_paragraphs

Class 2:
  ✓ ../dataset/class2-ai-written/ai-generated-paragraphs

Class 3:
  ✓ ../dataset/class3-ai-mimicry/01-arthur-conan-doyle
  ✓ ../dataset/class3-ai-mimicry/02-pg-wodehouse
  ✓ ../dataset/class3-ai-mimicry/03-mark-twain
  ✓ ../dataset/class3-ai-mimicry/04-william-shakespeare

CALCULATING PERPLEXITIES

Processing Class 1 (Human)
Searching for text files...
  Found 500 files in extracted_paragraphs
  Found 500 files in extracted_paragraphs
  Found 480 files in extracted_paragraphs
  Found 480 files in extracted_paragraphs
✓ Total: 1960 text files

Reading 1960 files...


Reading files: 100%|██████████| 1960/1960 [00:00<00:00, 8209.50file/s]


✓ Successfully read 1960 valid text files

Calculating perplexities for 1960 texts...


Calculating perplexities:   0%|          | 0/1960 [00:00<?, ?text/s]`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.
Calculating perplexities: 100%|██████████| 1960/1960 [07:56<00:00,  4.12text/s]



✓ Results for Class 1 (Human):
  Mean Perplexity:   57.48
  Median Perplexity: 49.98
  Std Dev:           28.97
  Min:               5.22
  Max:               241.31
  Sample size:       1960

Processing Class 2 (AI)
Searching for text files...
  Found 988 files in ai-generated-paragraphs
✓ Total: 988 text files

Reading 988 files...


Reading files: 100%|██████████| 988/988 [00:00<00:00, 7663.18file/s]


✓ Successfully read 988 valid text files

Calculating perplexities for 988 texts...


Calculating perplexities: 100%|██████████| 988/988 [03:43<00:00,  4.41text/s]



✓ Results for Class 2 (AI):
  Mean Perplexity:   48.63
  Median Perplexity: 47.40
  Std Dev:           13.22
  Min:               18.68
  Max:               110.91
  Sample size:       988

Processing Class 3 (AI Mimicry)
Searching for text files...
  Found 250 files in 01-arthur-conan-doyle
  Found 250 files in 02-pg-wodehouse
  Found 237 files in 03-mark-twain
  Found 236 files in 04-william-shakespeare
✓ Total: 973 text files

Reading 973 files...


Reading files: 100%|██████████| 973/973 [00:00<00:00, 7411.52file/s]


✓ Successfully read 972 valid text files

Calculating perplexities for 972 texts...


Calculating perplexities: 100%|██████████| 972/972 [03:58<00:00,  4.08text/s]


✓ Results for Class 3 (AI Mimicry):
  Mean Perplexity:   64.96
  Median Perplexity: 61.70
  Std Dev:           19.65
  Min:               30.28
  Max:               347.20
  Sample size:       972

PERPLEXITY SUMMARY
Class                     Mean         Median       Std          Files     
----------------------------------------------------------------------
Class 1 (Human)           57.48        49.98        28.97        1960      
Class 2 (AI)              48.63        47.40        13.22        988       
Class 3 (AI Mimicry)      64.96        61.70        19.65        972       

INTERPRETATION:
----------------------------------------------------------------------
Lower perplexity = More predictable text (typical of AI)
Higher perplexity = Less predictable text (typical of humans)

Expected pattern:
  Class 2 (AI) < Class 3 (AI Mimicry) < Class 1 (Human)
----------------------------------------------------------------------





## Adding Perplexity to math.csv

Now we'll add the perplexity values to the existing math.csv file so they can be used for analysis alongside other features.

In [5]:
import pandas as pd
from pathlib import Path
from tqdm.auto import tqdm

# Load the existing math.csv
csv_path = Path('../task-1/math.csv')
print(f"Loading {csv_path}...")
df = pd.read_csv(csv_path)

print(f"✓ Loaded {len(df)} rows")
print(f"Columns: {list(df.columns)}")

# Check if perplexity column already exists
if 'perplexity' in df.columns:
    print("\n⚠ Warning: 'perplexity' column already exists!")
    print("Overwriting existing values...")
else:
    print("\n✓ Adding new 'perplexity' column")

# Create a mapping of class names to directory paths
def get_file_path(row):
    """Construct the full file path from the row data"""
    class_name = row['class']
    filename = row['text_file_name']
    author = row['author']
    
    # Map authors to directory names
    author_dirs = {
        'Arthur Conan Doyle': '01-arthur-conan-doyle',
        'P. G. Wodehouse': '02-pg-wodehouse',
        'Mark Twain': '03-mark-twain',
        'William Shakespeare': '04-william-shakespeare'
    }
    
    if class_name == 'Class 1: Human-written':
        author_dir = author_dirs.get(author)
        if author_dir:
            return DATASET_DIR / 'class1-human-written' / author_dir / 'extracted_paragraphs' / filename
    elif class_name == 'Class 2: AI-written':
        return DATASET_DIR / 'class2-ai-written' / 'ai-generated-paragraphs' / filename
    elif class_name == 'Class 3: AI-mimicry':
        author_dir = author_dirs.get(author)
        if author_dir:
            return DATASET_DIR / 'class3-ai-mimicry' / author_dir / filename
    
    return None

# Calculate perplexity for each text
print("\nCalculating perplexity for all texts...")
print("This may take several minutes...")

perplexities = []
failed_count = 0

for idx, row in tqdm(df.iterrows(), total=len(df), desc="Processing files"):
    file_path = get_file_path(row)
    
    if file_path and file_path.exists():
        text = read_file_safe(file_path)
        if text:
            ppl = calculate_perplexity_single(text)
            if ppl is not None and not np.isnan(ppl) and not np.isinf(ppl):
                perplexities.append(ppl)
            else:
                perplexities.append(None)
                failed_count += 1
        else:
            perplexities.append(None)
            failed_count += 1
    else:
        perplexities.append(None)
        failed_count += 1

# Add perplexity column to dataframe
df['perplexity'] = perplexities

print(f"\n✓ Perplexity calculation complete!")
print(f"  Successfully calculated: {len([p for p in perplexities if p is not None])}")
print(f"  Failed/Missing: {failed_count}")

# Save updated CSV
print(f"\nSaving updated CSV to {csv_path}...")
df.to_csv(csv_path, index=False)
print("Saved successfully!")

# Show summary statistics
print("\nPERPLEXITY STATISTICS BY CLASS")

for class_name in df['class'].unique():
    class_data = df[df['class'] == class_name]['perplexity'].dropna()
    if len(class_data) > 0:
        print(f"\n{class_name}:")
        print(f"  Count:  {len(class_data)}")
        print(f"  Mean:   {class_data.mean():.2f}")
        print(f"  Median: {class_data.median():.2f}")
        print(f"  Std:    {class_data.std():.2f}")
        print(f"  Min:    {class_data.min():.2f}")
        print(f"  Max:    {class_data.max():.2f}")

Loading ../task-1/math.csv...
✓ Loaded 3921 rows
Columns: ['class', 'text_file_name', 'author', 'word_count', 'ttr', 'hapax_ratio', 'zipf_alpha', 'zipf_mape', 'avg_parse_depth', 'adj_noun_ratio', 'noun_verb_ratio', 'adverb_verb_ratio', 'em_dash_freq', 'semicolon_freq', 'colon_freq', 'exclamation_freq', 'double_quote_freq', 'flesch_kincaid_grade']

✓ Adding new 'perplexity' column

Calculating perplexity for all texts...
This may take several minutes...


Processing files: 100%|██████████| 3921/3921 [13:33<00:00,  4.82it/s] 


✓ Perplexity calculation complete!
  Successfully calculated: 3171
  Failed/Missing: 750

Saving updated CSV to ../task-1/math.csv...
Saved successfully!

PERPLEXITY STATISTICS BY CLASS

Class 1: Human-written:
  Count:  1460
  Mean:   58.46
  Median: 49.27
  Std:    31.96
  Min:    5.22
  Max:    241.31

Class 2: AI-written:
  Count:  988
  Mean:   48.63
  Median: 47.40
  Std:    13.22
  Min:    18.68
  Max:    110.91

Class 3: AI-mimicry:
  Count:  723
  Mean:   64.21
  Median: 60.57
  Std:    18.47
  Min:    30.28
  Max:    155.97



