# Perplexity Comparison: Clean vs. Attacked Text

This notebook visualizes the impact of the **WordSwapEmbedding** attack on text quality.

We compare the perplexity scores of:
1.  **Clean Data:** Original text from the SQuAD validation set.
2.  **Attacked Data:** The same text after being modified by TextAttack.

**Hypothesis:** We expect the attack to raise perplexity, shifting the score distribution to the right (higher perplexity indicates less natural text).
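
For reference, perplexity is conventionally defined as the exponential of the average negative log-likelihood per token. The sketch below illustrates that definition with made-up token probabilities. The `perplexity` helper and its inputs are hypothetical and are not taken from this project's scoring scripts, which may use a different model or normalization.

```python
import math

def perplexity(token_log_probs):
    """Return exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Illustrative values only: a fluent sentence receives higher token probabilities
fluent = [math.log(0.5)] * 4
awkward = [math.log(0.1)] * 4
print(perplexity(fluent), perplexity(awkward))
```

Lower token probabilities give a higher perplexity, which is why a word-swap attack that produces less natural wording is expected to push scores upward.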

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import os

# Set style
plt.style.use('ggplot')

## 1. Load Data

In [None]:
# Define paths (the '../' prefix assumes the notebook runs one level below the project root; adjust as needed)
clean_path = '../tests/data/measures/perplexity_data.csv'
attacked_path = '../tests/data/measures/perplexity_attacked.csv'

# Check if files exist
if not os.path.exists(clean_path):
    print(f"WARNING: {clean_path} not found. Did you run 'run_perplexity_on_clean.py'?")
else:
    df_clean = pd.read_csv(clean_path)
    print(f"Loaded CLEAN data: {len(df_clean)} rows")

if not os.path.exists(attacked_path):
    print(f"WARNING: {attacked_path} not found. Did you run 'run_perplexity_on_attacked.py'?")
else:
    df_attacked = pd.read_csv(attacked_path)
    print(f"Loaded ATTACKED data: {len(df_attacked)} rows")

## 2. Statistical Summary

In [None]:
if 'df_clean' in locals():
    print("--- Clean Data Stats ---")
    print(df_clean['score'].describe())
    
if 'df_attacked' in locals():
    print("\n--- Attacked Data Stats ---")
    print(df_attacked['score'].describe())
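
Perplexity scores tend to have a long right tail, so means can be dominated by a few extreme values. As a possible complement to `describe()`, the sketch below compares medians and reports the share of attacked scores that exceed the clean median. The `summarize_shift` helper is hypothetical, uses only the standard library, and assumes the scores contain no missing values.

```python
from statistics import median

def summarize_shift(clean_scores, attacked_scores):
    """Compare two score samples using outlier-robust statistics."""
    clean = [float(s) for s in clean_scores]
    attacked = [float(s) for s in attacked_scores]
    clean_median = median(clean)
    attacked_median = median(attacked)
    # Share of attacked scores that exceed the clean median
    above = sum(s > clean_median for s in attacked) / len(attacked)
    return {
        "clean_median": clean_median,
        "attacked_median": attacked_median,
        "median_shift": attacked_median - clean_median,
        "attacked_above_clean_median": above,
    }

# Toy values for illustration, not real results
print(summarize_shift([10, 20, 30, 40], [20, 35, 50, 400]))
```

In this notebook it could be called as `summarize_shift(df_clean['score'], df_attacked['score'])`. A positive `median_shift` together with a share well above 0.5 would be consistent with the hypothesis.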

## 3. Visualization

We overlay the two histograms to visualize the shift in distribution.

In [None]:
if 'df_clean' in locals() and 'df_attacked' in locals():
    plt.figure(figsize=(12, 6))
    
    # Plot Clean Data
    plt.hist(df_clean['score'], bins=50, alpha=0.5, label='Clean (Original)', color='blue', range=(0, 200))
    
    # Plot Attacked Data
    plt.hist(df_attacked['score'], bins=50, alpha=0.5, label='Attacked (Adversarial)', color='red', range=(0, 200))
    
    plt.title('Perplexity Distribution: Clean vs. Attacked Text')
    plt.xlabel('Perplexity Score (Lower is better/more natural)')
    plt.ylabel('Frequency')
    plt.legend()
    # Zoom in on the main distribution. To see outliers, remove this line
    # and also drop range=(0, 200) from both plt.hist calls above.
    plt.xlim(0, 200)
    plt.show()
else:
    print("Cannot plot: Data missing.")
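
Because both histograms use `range=(0, 200)`, any score above 200 is left out of the bins entirely. It may be worth checking how much of each dataset that excludes before reading too much into the plot. The `share_outside` helper below is a hypothetical sketch that assumes the scores contain no missing values.

```python
def share_outside(scores, low=0.0, high=200.0):
    """Fraction of scores that fall outside the closed interval [low, high]."""
    values = [float(s) for s in scores]
    if not values:
        raise ValueError("scores is empty")
    outside = sum(1 for v in values if v < low or v > high)
    return outside / len(values)

# Toy values for illustration, not real results
print(share_outside([12.0, 85.5, 199.9, 250.0, 1200.0]))
```

For example, `share_outside(df_attacked['score'])` returning a much larger value than `share_outside(df_clean['score'])` would suggest that part of the shift lies beyond the plotted range.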