# AI Essay Checker with Veridex

This notebook demonstrates how to build an academic integrity tool to detect AI-generated student essays using **Veridex**.

We will implement an `EssayChecker` that uses an ensemble of detection signals:
1. **Binoculars**: A high-accuracy, zero-shot detector comparing two language models.
2. **Perplexity**: A statistical measure of how "surprising" the text is to a model.

> **Note**: This use case is designed for educational institutions to flag content for review. It should never be used as the sole evidence of misconduct.
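
To make the perplexity signal concrete, here is a minimal sketch of the standard definition: the exponential of the negative mean log-probability per token. It does not use Veridex; the `perplexity` helper and the per-token probabilities are hypothetical values chosen for illustration, not real model output.

```python
import math

def perplexity(token_log_probs):
    """Return exp(-mean log-probability) for a list of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Hypothetical log-probabilities a model might assign to each token of a text.
predictable = [math.log(0.5)] * 4    # every token was given probability 0.5
surprising = [math.log(0.05)] * 4    # every token was given probability 0.05

ppl_predictable = perplexity(predictable)
ppl_surprising = perplexity(surprising)
print(ppl_predictable, ppl_surprising)
```

Lower perplexity means the model found the text more predictable, which is the property perplexity-based detectors rely on.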

## 1. Installation

First, we install `veridex` with the text extras. This will install dependencies like `transformers` and `torch`.

In [None]:
!pip install "veridex[text]"
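
If you want to confirm the install succeeded before loading any models, a small check like the one below may help. It uses only the standard library; `is_installed` is a hypothetical helper name, and it assumes the package's top-level module is named `veridex`, as the import in the next section suggests.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

# Assumed top-level module names; adjust if your environment differs.
for name in ("veridex", "transformers", "torch"):
    print(f"{name}: {'found' if is_installed(name) else 'MISSING'}")
```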

## 2. Implementation

We will create an `EssayChecker` class that encapsulates the logic for analyzing essays. It combines results from multiple signals to provide a confidence score and a recommendation.

In [None]:
from veridex.text import BinocularsSignal, PerplexitySignal

class EssayChecker:
    """Academic integrity checker for student essays"""
    
    def __init__(self):
        print("Initializing detectors... (this may take a moment to download models)")
        # Binoculars is the primary detector. The observer/performer models are
        # pinned explicitly so results stay consistent across runs.
        self.primary_detector = BinocularsSignal(observer_id="tiiuae/falcon-7b", performer_id="tiiuae/falcon-7b-instruct")
        # Note: the two Falcon models have 7B parameters each and need a lot of memory.
        # On Colab or other RAM-limited environments, you might use smaller models like:
        # self.primary_detector = BinocularsSignal(observer_id="distilgpt2", performer_id="gpt2")
        
        self.secondary_detector = PerplexitySignal()
        print("Initialization complete.")
    
    def analyze_essay(self, essay_text):
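        """
        Analyze a student essay and estimate how likely it is to be AI-generated.

        Returns a dict with 'status', 'recommendation', 'confidence' (the mean of
        the two detectors' confidences), and per-detector 'details'.
        """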
        # Run both detectors
        print("Running primary detector (Binoculars)...")
        primary_result = self.primary_detector.run(essay_text)
        
        print("Running secondary detector (Perplexity)...")
        secondary_result = self.secondary_detector.run(essay_text)
        
        # Ensemble decision (simple average for demonstration)
        avg_score = (primary_result.score + secondary_result.score) / 2
        avg_confidence = (primary_result.confidence + secondary_result.confidence) / 2
        
        # Conservative thresholds for academic use
        if avg_score > 0.85 and avg_confidence > 0.75:
            status = 'LIKELY_AI_GENERATED'
            recommendation = 'MANUAL_REVIEW_REQUIRED'
        elif avg_score > 0.6:
            status = 'UNCERTAIN'
            recommendation = 'CONSIDER_INTERVIEW'
        else:
            status = 'LIKELY_HUMAN_WRITTEN'
            recommendation = 'NONE'

        return {
            'status': status,
            'recommendation': recommendation,
            'confidence': avg_confidence,
            'details': {
                'binoculars_score': primary_result.score,
                'perplexity_score': secondary_result.score,
                # Raw perplexity comes from the perplexity detector, not Binoculars
                'perplexity_val': secondary_result.metadata.get('mean_perplexity')
            }
        }
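
The threshold logic in `analyze_essay` can be exercised without downloading any models. The sketch below copies the thresholds from the class into a standalone function, `classify_scores` (a hypothetical name, not part of Veridex), and feeds it stand-in result objects built with `types.SimpleNamespace`. The scores and confidences are made up for illustration.

```python
from types import SimpleNamespace

def classify_scores(avg_score, avg_confidence):
    """Map averaged score/confidence to (status, recommendation), as in analyze_essay."""
    if avg_score > 0.85 and avg_confidence > 0.75:
        return ("LIKELY_AI_GENERATED", "MANUAL_REVIEW_REQUIRED")
    if avg_score > 0.6:
        return ("UNCERTAIN", "CONSIDER_INTERVIEW")
    return ("LIKELY_HUMAN_WRITTEN", "NONE")

# Stand-ins for detector results; only .score and .confidence are needed here.
primary = SimpleNamespace(score=0.92, confidence=0.80)
secondary = SimpleNamespace(score=0.88, confidence=0.78)

avg_score = (primary.score + secondary.score) / 2
avg_confidence = (primary.confidence + secondary.confidence) / 2
status, recommendation = classify_scores(avg_score, avg_confidence)
print(status, recommendation)
```

Note that a high score paired with low confidence falls through to `UNCERTAIN` rather than `LIKELY_AI_GENERATED`, which is what keeps the top tier conservative.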

## 3. Analysis

Now let's test our checker on two sample texts: one written by a human and one generated by an AI.

In [None]:
# Initialize the checker
# WARNING: This will download model weights. The ~500MB - 2GB estimate applies to
# small models such as distilgpt2/gpt2; the default Falcon-7B pair is much larger.
checker = EssayChecker()

In [None]:
student_essay_human = """
The nuances of Shakespeare's Hamlet have been debated for centuries, yet the central theme of indecision remains timeless. 
I remember reading it for the first time in high school and feeling frustrated by his constant delaying. 
But as I've grown older, I kinda get it? Like, who hasn't been paralyzed by overthinking a big choice?
"""

student_essay_ai = """
Shakespeare's Hamlet serves as a profound exploration of the human condition, specifically the consequences of hesitation. 
The protagonist's inability to act is often cited as his tragic flaw, or hamartia. 
Through soliloquies such as "To be, or not to be," Shakespeare illuminates the internal conflict between action and contemplation.
"""

In [None]:
def print_report(title, text):
    print(f"--- Analyzing: {title} ---")
    try:
        result = checker.analyze_essay(text)
        print(f"Status: {result['status']}")
        print(f"Confidence: {result['confidence']:.2%}")
        print(f"Recommendation: {result['recommendation']}")
        print(f"Scores -> Binoculars: {result['details']['binoculars_score']:.2f}, Perplexity: {result['details']['perplexity_score']:.2f}")
    except Exception as e:
        print(f"Error: {e}")
    print("\n")

print_report("Human Written Sample", student_essay_human)
print_report("AI Generated Sample", student_essay_ai)

## Conclusion

Combining multiple signals makes detection more robust than relying on either signal alone. The 'AI' sample above may not be flagged with high confidence, especially because it is short and reads like polished human writing, but its scores should usually be clearly separated from those of the human text.
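
Since short texts give the detectors little to work with, one option is to skip or caveat essays below a minimum length before calling `analyze_essay`. This is a minimal sketch: `has_enough_text` is a hypothetical helper, and the 50-word cutoff is an arbitrary illustrative value, not a recommendation from Veridex.

```python
MIN_WORDS = 50  # hypothetical cutoff; tune it against your own data

def has_enough_text(essay_text, min_words=MIN_WORDS):
    """Return True if the essay has at least min_words whitespace-separated words."""
    return len(essay_text.split()) >= min_words

short_sample = "To be, or not to be, that is the question."
long_sample = " ".join(["word"] * 60)

for label, sample in (("short", short_sample), ("long", long_sample)):
    verdict = "analyze" if has_enough_text(sample) else "too short to score reliably"
    print(f"{label}: {verdict}")
```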