<a href="https://colab.research.google.com/github/0xHenriksson/essay-detect/blob/main/detect.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Essay Analyzer

This notebook tries to determine whether a given text, in this case a student's essay, was written by an AI model.

Method(s):
- Using the original assignment instructions, prompt popular model providers (OpenAI, Anthropic, Google) for ~10 complete essays each.
- Use an embedding model to create a vector embedding of each essay.
- Compare the semantic and syntactic structure of the student's essay with that of the model outputs, using several techniques.
- Try as many novel methods as possible.
- Visualize the embedding comparisons.

## Statistical Methods
- Word and phrase frequency
- Perplexity
- Stylometry
- Semantic coherence

## Mathematical Methods
- Total Variation distance
- Perturbation discrepancy detection

In [None]:
# special installs
!pip install transformers torch spacy nltk

In [None]:
# Download the spaCy language model
!python -m spacy download en_core_web_sm

In [None]:
import numpy as np
from transformers import AutoTokenizer, AutoModel
from scipy.spatial.distances import cosine
from nltk.tokenize import sent_tokenize
import torch
from typing import List, Tuple, Dict
import spacy
from collections import defaultdict

Load in the essay

In [7]:
# Mount Google Drive so the essay and assignment files can be read
from google.colab import auth, drive

# Authenticate the Colab user, then mount Drive
auth.authenticate_user()

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [9]:
# read the essay text
with open('/content/drive/MyDrive/detect/essay.txt', 'r') as file:
    essay_text = file.read()

essay_text

"Hedda Gabler Theme Essay\n\tLife is commonly filled with mishaps and unfair standards that may push one past their breaking point. From unfair treatment and oppression, to tragedy, life is filled with complex challenges that may affect every person in a different way, completely flipping the morals and minds of characters. Throughout Henrik Ibsen’s Play Hedda Gabler, Hedda, the main character, embodies the difficulties and struggles of life, especially for women throughout the 19th century. Hedda is trapped in a loop filled with tragedy and manipulation that leads to her eventually tragic demise, struggles with identity, and the oppressive constraints of societal gender roles, completely altering her character while also conveying a deeper message through her character. The journey of Hedda reveals key themes that serve to highlight the tragedies of Heddas life, the extreme inner turmoil regarding her identity, and her deep battle against the confining expectations placed upon women d

In [11]:
# read in the assignment text to be used for prompting
with open('/content/drive/MyDrive/detect/assignment.txt', 'r') as file:
  assignment_text = file.read()

assignment_text

'Fiction Analysis Essay:  Theme\nAP Literature and Composition  (100 Points)\n\nText Options:\nNative Son by Richard Wright\nWuthering Heights by Emily Bronte\nFrankenstein by Mary Shelley\n“Babylon Revisited” by F. Scott Fitzgerald\n“Hunters in the Snow” by Tobias Wolff\n\nPrompt:\nAuthors become famous for their ability to use complex literary devices to reveal a criticism, reflect a social or political issue, and/or discuss the complexity and truth of human nature into the meaning of their work as a whole.  Simply stated, to understand theme is to understand an author’s work.\n\nIn a fully-developed composition (no longer than four pages in MLA format), analyze several complex themes (approximately 3; better to go deep) in a work of your choice that we have read this semester. Make sure you take a unique approach while analyzing this piece, incorporate and properly use textual evidence, and fully support a clearly-defined original thesis.  Your essay should clearly reveal the writer

# Prompt Construction
Attempt to reconstruct a prompt similar to the one the student might have used when asking an AI to write the essay.

In [12]:
# title and author of the work being analyzed (book, play, poem, etc.)
media_title = "Hedda Gabler"
author = "Henrik Ibsen"

prompt = f"Write an essay about {media_title} by {author} that follows these requirements {assignment_text}"

prompt

'Write an essay about Hedda Gabler by Henrik Ibsen that follows these requirements Fiction Analysis Essay:  Theme\nAP Literature and Composition  (100 Points)\n\nText Options:\nNative Son by Richard Wright\nWuthering Heights by Emily Bronte\nFrankenstein by Mary Shelley\n“Babylon Revisited” by F. Scott Fitzgerald\n“Hunters in the Snow” by Tobias Wolff\n\nPrompt:\nAuthors become famous for their ability to use complex literary devices to reveal a criticism, reflect a social or political issue, and/or discuss the complexity and truth of human nature into the meaning of their work as a whole.  Simply stated, to understand theme is to understand an author’s work.\n\nIn a fully-developed composition (no longer than four pages in MLA format), analyze several complex themes (approximately 3; better to go deep) in a work of your choice that we have read this semester. Make sure you take a unique approach while analyzing this piece, incorporate and properly use textual evidence, and fully suppo

In [None]:
# perturb the prompt slightly
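
One possible way to perturb the prompt is to vary only the lead-in sentence and keep the assignment text fixed. The sketch below assumes a small hand-written list of lead-ins (`LEAD_INS`) and a hypothetical helper `perturb_prompt`; neither comes from the student's actual chat history, so treat the phrasings as guesses.

```python
import random

# Hypothetical lead-in phrasings a student might plausibly type.
# These are illustrative assumptions, not taken from any real chat log.
LEAD_INS = [
    "Write an essay about {title} by {author} that follows these requirements",
    "Please write an essay on {title} by {author} meeting these requirements",
    "Using the assignment below, write an essay analyzing {title} by {author}",
    "Write a theme analysis essay on {author}'s {title} based on this assignment",
]


def perturb_prompt(title, author, assignment_text, n=3, seed=0):
    """Return up to n distinct prompt variants that differ only in the lead-in."""
    rng = random.Random(seed)
    # sample() draws without replacement, so the variants never repeat
    templates = rng.sample(LEAD_INS, k=min(n, len(LEAD_INS)))
    return [
        f"{t.format(title=title, author=author)} {assignment_text}"
        for t in templates
    ]


variants = perturb_prompt("Hedda Gabler", "Henrik Ibsen", "Analyze several themes.")
for v in variants:
    print(v)
```

In the notebook, `media_title`, `author`, and `assignment_text` from the cells above would be passed in place of the toy arguments.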

In [None]:
class TextSimilarityAnalyzer:
    def __init__(self, model_name: str = "sentence-transformers/all-mpnet-base-v2"):
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name).to(self.device)
        self.nlp = spacy.load("en_core_web_sm")

    def get_embeddings(self, text: str) -> np.ndarray:
        """Generate sentence embeddings using transformer model."""
        inputs = self.tokenizer(text, return_tensors="pt", padding=True, truncation=True)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        with torch.no_grad():
            outputs = self.model(**inputs)
            embeddings = outputs.last_hidden_state.mean(dim=1)

        return embeddings.cpu().numpy()

    def syntactic_features(self, text: str) -> Dict:
        """Extract syntactic features using spaCy."""
        doc = self.nlp(text)
        features = defaultdict(int)

        # Analyze sentence structure
        for sent in doc.sents:
            features['sent_length'] += len(sent)
            features['dep_tree_depth'] += self._get_dep_tree_depth(sent.root)

        # POS tag distribution
        for token in doc:
            features[f'pos_{token.pos_}'] += 1

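        # Normalize every count by the token total; return {} for empty
        # text to avoid dividing by zero
        total_tokens = len(doc)
        if total_tokens == 0:
            return {}
        return {k: v / total_tokens for k, v in features.items()}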

    def _get_dep_tree_depth(self, root) -> int:
        """Calculate dependency tree depth recursively."""
        if not list(root.children):
            return 0
        return 1 + max(self._get_dep_tree_depth(child) for child in root.children)

    def compare_texts(self, original: str, generated_samples: List[str]) -> Dict[str, List[float]]:
        """Compare original text against multiple generated samples."""
        results = {
            'semantic_similarity': [],
            'syntactic_similarity': [],
            'combined_score': []
        }

        # Get features for original text
        orig_embedding = self.get_embeddings(original)
        orig_syntactic = self.syntactic_features(original)

        for sample in generated_samples:
            # Semantic similarity using embeddings
            sample_embedding = self.get_embeddings(sample)
            semantic_sim = 1 - cosine(orig_embedding.flatten(), sample_embedding.flatten())

            # Syntactic similarity using feature vectors
            sample_syntactic = self.syntactic_features(sample)
            syntactic_sim = self._compare_feature_dicts(orig_syntactic, sample_syntactic)

            # Combined score (weighted average)
            combined = 0.6 * semantic_sim + 0.4 * syntactic_sim

            results['semantic_similarity'].append(semantic_sim)
            results['syntactic_similarity'].append(syntactic_sim)
            results['combined_score'].append(combined)

        return results

    def _compare_feature_dicts(self, dict1: Dict, dict2: Dict) -> float:
        """Compare two feature dictionaries using cosine similarity."""
        keys = set(dict1.keys()) | set(dict2.keys())
        vec1 = np.array([dict1.get(k, 0) for k in keys])
        vec2 = np.array([dict2.get(k, 0) for k in keys])
        return 1 - cosine(vec1, vec2)

# Convenience wrapper around TextSimilarityAnalyzer
def analyze_submission(submission_text: str,
                      prompt: str,
                      model_outputs: List[str],
                      threshold: float = 0.85) -> Tuple[bool, Dict]:
    """
    Analyze if a submission is likely AI-generated.

    Args:
        submission_text: The text to analyze
        prompt: The original assignment prompt (currently unused)
        model_outputs: List of outputs from various LLMs using the same prompt
        threshold: Similarity threshold for flagging potential AI generation

    Returns:
        Tuple of (is_likely_ai: bool, detailed_results: Dict)
    """
    analyzer = TextSimilarityAnalyzer()
    results = analyzer.compare_texts(submission_text, model_outputs)

    # Calculate average similarities
    avg_semantic = np.mean(results['semantic_similarity'])
    avg_syntactic = np.mean(results['syntactic_similarity'])
    avg_combined = np.mean(results['combined_score'])

    # Flag if similarity exceeds threshold
    is_likely_ai = bool(avg_combined > threshold)  # cast numpy bool to a plain bool

    detailed_results = {
        'is_likely_ai': is_likely_ai,
        'similarity_scores': results,
        'averages': {
            'semantic': avg_semantic,
            'syntactic': avg_syntactic,
            'combined': avg_combined
        }
    }

    return is_likely_ai, detailed_results

In [None]:
# generate 10 essays from each model provider: openai, anthropic, google


Note to self: am I losing similarity by using the API rather than the chat window, which is what the student most likely used?

In [None]:
model_providers = [
    {"provider": "openai", "models": ["text-davinci-003", "gpt-3.5-turbo"]},
    {"provider": "anthropic", "models": ["claude-2"]},
    {"provider": "google", "models": ["PaLM 2"]}
]
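
A minimal sketch of the generation loop over a structure like `model_providers`. The actual API calls are left out on purpose: `generate_fn` is a hypothetical callable (provider, model, prompt) -> text that would wrap whichever client library each provider needs, and `fake_generate` is only a stand-in so the loop runs offline.

```python
from typing import Callable, Dict, List


def collect_essays(
    providers: List[Dict],
    prompt: str,
    generate_fn: Callable[[str, str, str], str],
    n_per_model: int = 10,
) -> List[Dict]:
    """Call generate_fn n_per_model times for every (provider, model) pair."""
    essays = []
    for entry in providers:
        for model in entry["models"]:
            for i in range(n_per_model):
                text = generate_fn(entry["provider"], model, prompt)
                essays.append({
                    "provider": entry["provider"],
                    "model": model,
                    "sample": i,
                    "text": text,
                })
    return essays


# Stand-in generator (assumption); replace with real API calls per provider.
def fake_generate(provider: str, model: str, prompt: str) -> str:
    return f"[{provider}/{model}] essay for: {prompt[:20]}"


demo_providers = [
    {"provider": "openai", "models": ["gpt-3.5-turbo"]},
    {"provider": "anthropic", "models": ["claude-2"]},
]
essays = collect_essays(
    demo_providers, "Write an essay about Hedda Gabler", fake_generate, n_per_model=2
)
print(len(essays))
```

Keeping the provider and model name next to each text makes it possible to group similarity scores per model later on.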

In [None]:
# create an embedding of the model outputs
# might have to upload to a vectordb

In [None]:
# plot the embedding

In [14]:
# compare semantic similarity
# output scores
# 3D plot with semantic similarity scores showing how close the essay is to the model outputs in the embedding

In [None]:
# compare syntactical similarity

Statistical Methods

In [None]:
# calculate word and phrase frequency, compare to model outputs
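
A rough sketch of the frequency comparison using only the standard library. It counts word n-grams and reports what fraction of the essay's n-grams also show up in any model output. The helper names (`ngram_counts`, `shared_ngram_fraction`) and the simple regex tokenizer are assumptions made for illustration; a real run might tokenize with spaCy instead.

```python
import re
from collections import Counter


def ngram_counts(text, n=1):
    """Count lowercase word n-grams in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def shared_ngram_fraction(essay, samples, n=2):
    """Fraction of the essay's n-gram occurrences that appear in any sample."""
    essay_counts = ngram_counts(essay, n)
    sample_vocab = set()
    for s in samples:
        sample_vocab.update(ngram_counts(s, n))
    total = sum(essay_counts.values())
    if total == 0:
        return 0.0
    shared = sum(c for gram, c in essay_counts.items() if gram in sample_vocab)
    return shared / total


essay = "Hedda is trapped by society"
samples = ["Hedda is trapped in a loop"]
print(ngram_counts(essay, 1).most_common(3))
print(shared_ngram_fraction(essay, samples, n=2))  # 2 of 4 bigrams are shared -> 0.5
```

A high shared fraction on longer n-grams (3 or 4 words) would be more telling than on single words, since topic vocabulary overlaps for any essay about the same play.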

In [None]:
# calculate essay perplexity and compare to model output perplexity
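
Perplexity is the exponential of the negative mean log-probability per token. The sketch below covers only that arithmetic; it assumes the per-token natural-log probabilities have already been obtained from some causal language model, and that scoring step is not shown here.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Toy input: every token was assigned probability 0.25 by the (assumed) model,
# so the perplexity works out to 1 / 0.25 = 4.
toy_logprobs = [math.log(0.25)] * 4
print(perplexity(toy_logprobs))
```

The essay's perplexity would then be compared with the perplexities of the model outputs under the same scoring model; lower values mean the scoring model finds the text more predictable.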

In [None]:
# compare stylometric features (e.g. sentence length, vocabulary richness, function-word rates)
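
One way to make the stylometry step concrete is a small feature dictionary per text. The feature set below (average sentence length, type-token ratio, average word length, and rates of a few function words) is an assumed starting point, not an established detector, and the regex sentence splitter is deliberately crude.

```python
import re

# Short, assumed list of function words; extend as needed.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is")


def stylometric_features(text):
    """Return simple style features for a text, or {} if it has no words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = len(words)
    if n_words == 0:
        return {}
    features = {
        "avg_sentence_len": n_words / max(len(sentences), 1),
        "type_token_ratio": len(set(words)) / n_words,
        "avg_word_len": sum(len(w) for w in words) / n_words,
    }
    # Relative frequency of each function word
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = words.count(fw) / n_words
    return features


feats = stylometric_features("The cat sat. The dog ran to the cat.")
print(feats["avg_sentence_len"], feats["type_token_ratio"])
```

The resulting dictionaries have the same shape as the output of `syntactic_features`, so they could be compared with the same dictionary-to-vector cosine approach.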

In [None]:
# compute and compare semantic coherence
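
A simple proxy for semantic coherence is the mean cosine similarity between embeddings of adjacent sentences. The sketch assumes the sentence embeddings already exist as plain lists of floats (for example, one `get_embeddings` call per sentence); the toy vectors are made up to show the arithmetic.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence_score(sentence_embeddings):
    """Mean cosine similarity between each pair of adjacent sentence embeddings."""
    pairs = list(zip(sentence_embeddings, sentence_embeddings[1:]))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Toy embeddings: sentences 1 and 2 point the same way, sentence 3 is orthogonal,
# so the adjacent similarities are 1.0 and 0.0.
toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(coherence_score(toy))
```

The essay's score can then be placed against the distribution of scores from the model outputs.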

Mathematical Methods

In [None]:
# calculate Total Variation distance between the essay and the model outputs
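
For two discrete distributions P and Q, the total variation distance is half the sum of |P(w) - Q(w)| over all outcomes w. The sketch below applies it to unigram word distributions, which is one assumed choice; the same function works for any pair of probability dictionaries (POS tags, n-grams, and so on).

```python
import re
from collections import Counter


def word_distribution(text):
    """Relative frequency of each lowercase word in text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()} if total else {}


def total_variation(p, q):
    """TV(P, Q) = 0.5 * sum over the joint support of |P(w) - Q(w)|."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in support)


p = word_distribution("a a b b")
q = word_distribution("a a c c")
print(total_variation(p, q))  # half of the probability mass differs -> 0.5
```

The value ranges from 0 (identical distributions) to 1 (no shared words), so the essay's distance to each model output can be compared directly.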

In [None]:
# Perturbation discrepancy detection method
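
The perturbation discrepancy compares the log-probability of the original text with the average log-probability of lightly rewritten copies of it; machine-generated text tends to sit near a local maximum, so the gap is larger. The sketch below covers only the scoring arithmetic and assumes the log-probabilities come from elsewhere: scoring with a language model and producing the perturbed copies (for example with a mask-filling model) are not shown. Dividing by the standard deviation is the normalized variant.

```python
import statistics


def perturbation_discrepancy(orig_logprob, perturbed_logprobs, normalize=True):
    """Gap between the original log-prob and the mean perturbed log-prob.

    With normalize=True the gap is divided by the sample standard deviation
    of the perturbed log-probs.
    """
    if len(perturbed_logprobs) < 2:
        raise ValueError("need at least two perturbed log-probabilities")
    gap = orig_logprob - statistics.mean(perturbed_logprobs)
    if not normalize:
        return gap
    spread = statistics.stdev(perturbed_logprobs)
    # Fall back to the raw gap if all perturbed scores are identical
    return gap / spread if spread > 0 else gap


# Toy numbers (made up): original scores -100, perturbed copies average -110
# with a sample standard deviation of 2.
raw = perturbation_discrepancy(-100.0, [-110.0, -112.0, -108.0], normalize=False)
norm = perturbation_discrepancy(-100.0, [-110.0, -112.0, -108.0])
print(raw, norm)
```

A larger positive score for the student's essay than for known human-written text would count as evidence toward AI generation, though any threshold would have to be calibrated on real data.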