
# 🧠 Heuristic-Based - Annotating Human vs. AI Content

This notebook helps you compare AI-generated and human-edited text using:
- **Levenshtein distance** (for textual edits)
- **Cosine similarity** (from transformer embeddings)

We will also show how to build and deploy it to a **Streamlit app** and then to Streamlit Cloud.


**Levenshtein Distance (Edit Distance)**

This is a way of measuring how different two pieces of text are by counting the minimum number of edits needed to change one into the other.

**The allowed edits are:**

- Insertions (adding a character)
- Deletions (removing a character)
- Substitutions (replacing one character with another)

**Example:**
"cat" → "cut" requires 1 substitution (a → u), so the Levenshtein distance = 1.

So, the smaller the distance, the more similar the texts; a larger distance means they are more divergent.
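
To make the edit count concrete, here is a minimal pure-Python sketch of the classic dynamic-programming approach. The function names `levenshtein` and `normalized_levenshtein` are illustrative and are not part of any library used in this notebook. The normalization shown (dividing by the length of the longer string) is one common convention, assumed here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    # prev[j] = distance between the first i-1 characters of a and the first j of b
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute (or keep, if the characters match)
            ))
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Scale the raw distance to [0, 1] by the longer string's length (assumed convention)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


print(levenshtein("cat", "cut"))  # 1
```

Because the raw distance grows with text length, a normalized value is usually easier to compare across pairs of texts of different sizes.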



**Cosine Similarity (using embeddings from transformers)**

When comparing the meaning of sentences or longer texts, we often convert each text into a high-dimensional numeric vector using a transformer model such as BERT. These vectors, called embeddings, capture the meaning and context of the text.

Cosine similarity measures how closely the directions of two embedding vectors align, regardless of their lengths.

The value ranges from –1 to 1:

- 1 → very similar / pointing in the same direction
- 0 → no relation
- –1 → completely opposite meanings (rare in practice)

Think of two arrows in space: the more they point in the same direction, the higher their cosine similarity.
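
The arrow picture can be checked on tiny hand-made vectors before moving to real embeddings. The sketch below is a plain-Python illustration; `cosine_similarity` is a hypothetical helper written for this example (the notebook itself uses PyTorch later), and the 2-dimensional vectors are toy values rather than real embeddings.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # 1.0  (same direction, different length)
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))   # 0.0  (perpendicular)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite direction)
```

The first case shows why vector length does not matter: doubling one arrow leaves the score at 1.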

In [None]:
!pip uninstall -y transformers
!pip install transformers==4.40.1

In [None]:
!pip install textdistance torch

In [None]:
import textdistance
import torch
import matplotlib.pyplot as plt
from transformers import AutoTokenizer, AutoModel

In [None]:
# Example texts to compare: one AI-generated and one human-written
ai_text = "Artificial intelligence is transforming industries by automating complex tasks."
human_text = "AI is rapidly changing industries by automating both simple and complex operations."

In [None]:
# --- Part 1: Levenshtein Distance ---
# Calculate normalized Levenshtein distance between the two strings.
#   - Levenshtein distance counts how many edits (insertions, deletions, substitutions)
#     are needed to turn one string into the other.
#   - 'normalized_distance()' scales that value between 0 and 1
#       -> 0 = the texts are identical
#       -> 1 = completely different
lev_distance = textdistance.levenshtein.normalized_distance(ai_text, human_text)
print(f"Normalized Levenshtein Distance: {lev_distance:.3f} → This means the texts are about {lev_distance*100:.1f}% different (and roughly {100 - lev_distance*100:.1f}% similar)")

In [None]:
# Raw Levenshtein distance (minimum number of edits required to turn one string into another)
raw_dist = textdistance.levenshtein.distance(ai_text, human_text)
print(f"Raw Levenshtein Distance: {raw_dist} → This means {raw_dist} character-level edits are needed to turn one text into the other.")

In [None]:
# --- Part 2: Embedding Cosine Similarity using Transformers ---

# Specify the model to use for generating sentence embeddings.
# 'all-MiniLM-L6-v2' is a small and fast model from the Sentence-Transformers family,
# optimized for computing sentence-level semantic similarity.
model_name = "sentence-transformers/all-MiniLM-L6-v2"

# Load the tokenizer associated with the model.
# The tokenizer converts raw text into tokens and input IDs the model can process.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the actual transformer model.
# This model will generate context-aware embeddings (vectors) for the input text.
model = AutoModel.from_pretrained(model_name)


In [None]:
# Define a function to get the sentence embedding for any given text
def get_embedding(text):
    # Step 1: Tokenize the input text
    # - return_tensors="pt": returns PyTorch tensors
    # - truncation=True: cuts off text that's too long
    # - padding=True: ensures all sequences are the same length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

    # Step 2: Disable gradient tracking since we're not training (just getting embeddings)
    with torch.no_grad():
        # Step 3: Pass the tokenized input through the model
        output = model(**inputs)

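    # Step 4: Take the mean of the last hidden state across tokens (dim=1)
    # - This gives a single 384-dimensional vector (the hidden size of all-MiniLM-L6-v2)
    # - A plain mean is fine for one string at a time; a padded batch would need
    #   the attention mask so that padding tokens are excluded from the average
    return output.last_hidden_state.mean(dim=1)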

In [None]:
# Generate the sentence embedding for the AI-generated text
# This will be a 384-dimensional vector that captures the overall meaning of the sentence
embedding_ai = get_embedding(ai_text)

# Generate the sentence embedding for the human-written text
# Also returns a 384-dimensional vector representing the sentence's semantic meaning
embedding_human = get_embedding(human_text)

In [None]:
embedding_ai

In [None]:
embedding_human

In [None]:
# Calculate the cosine similarity between the two sentence embeddings
# - Cosine similarity measures how close the two vectors point in direction
# - Result ranges from -1 (opposite) to 1 (identical); 0 means unrelated
cos_sim = torch.nn.functional.cosine_similarity(embedding_ai, embedding_human).item()

# Print the similarity score, rounded to 3 decimal places
print(f"Cosine Similarity: {cos_sim:.3f}")

#1.0 → vectors point in the same direction (very similar)

#0.0 → vectors are 90° apart (not related)

#–1.0 → vectors point in opposite directions (very different)

A cosine similarity score of 0.861 means the two sentence embeddings are:

✅ Highly similar in meaning — but not identical.

In [None]:
# --- Attribution Confidence Leaderboard ---

# Combine normalized Levenshtein distance and cosine similarity into a weighted score for each category

# Score weights: you can tweak these based on importance
weight_lev = 0.4   # weight for character-level similarity (1 - lev_distance)
weight_cos = 0.6   # weight for semantic similarity

# Invert Levenshtein distance (because lower distance = more similar)
lev_sim = 1 - lev_distance

# Compute a combined confidence score for each category
score_ai_generated = (lev_sim * weight_lev) + (cos_sim * weight_cos)
score_human_ai_mix = (lev_sim * 0.5) + (cos_sim * 0.5)
score_human_from_seed = (lev_sim * 0.2) + (cos_sim * 0.8)

# Put scores into a dictionary
scores = {
    "AI-Generated": score_ai_generated,
    "Human-AI Co-Creation": score_human_ai_mix,
    "Human-Written (Assumed no AI applied)": score_human_from_seed
}

# Sort by highest score
sorted_scores = sorted(scores.items(), key=lambda x: x[1], reverse=True)

# Print leaderboard
print("\n🔍 Attribution Leaderboard:")
for label, score in sorted_scores:
    print(f" - {label:<30} → Confidence Score: {score:.2f}")

# Final decision: the label with the highest score
top_label, top_score = sorted_scores[0]
print(f"\n✅ Final Attribution: {top_label} (Confidence: {top_score:.2f})")

In [None]:
# 📊 Plot Confidence Score Graph
labels = list(scores.keys())
values = list(scores.values())

plt.figure(figsize=(8, 5))
bars = plt.bar(labels, values)
plt.ylim(0, 1)
plt.ylabel('Confidence Score')
plt.title('Attribution Confidence Leaderboard')
plt.xticks(rotation=15)

# Annotate bars with score values
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval + 0.02, f"{yval:.2f}", ha='center', va='bottom')

plt.tight_layout()
plt.show()

# ⚠️ **CAUTION: Heuristic-Based Attribution**

This method is **simple**, **transparent**, and useful for quick experimentation. However, it comes with notable limitations:

- ❌ **Not trained on labeled data** — relies on hand-crafted logic  
- 📏 **Lacks precision at scale** — not suitable for production environments  
- 🔄 **Easily fooled by paraphrasing or slight rewording**  
- 🧪 Performs best for **exploratory analysis**, **rapid prototyping**, or **educational demonstrations**
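
One consequence of the hand-crafted logic can be checked directly. The three leaderboard scores are all weighted averages of the same two numbers (`lev_sim` and `cos_sim`), so the ranking depends only on which of the two is larger, not on anything specific to each category. The sketch below reuses the weights from the leaderboard cell; the helper name `rank_labels`, the shortened labels, and the input values are illustrative assumptions, not results from the notebook.

```python
# (Levenshtein weight, cosine weight) per category, copied from the leaderboard cell
WEIGHTS = {
    "AI-Generated": (0.4, 0.6),
    "Human-AI Co-Creation": (0.5, 0.5),
    "Human-Written": (0.2, 0.8),
}


def rank_labels(lev_sim: float, cos_sim: float) -> list:
    """Return the category labels ordered from highest to lowest combined score."""
    scores = {
        label: w_lev * lev_sim + w_cos * cos_sim
        for label, (w_lev, w_cos) in WEIGHTS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# When cosine similarity is the larger value, the category with the highest cosine weight wins
print(rank_labels(lev_sim=0.4, cos_sim=0.9)[0])  # Human-Written

# When Levenshtein similarity is the larger value, the highest Levenshtein weight wins
print(rank_labels(lev_sim=0.9, cos_sim=0.4)[0])  # Human-AI Co-Creation
```

With these particular weights, "AI-Generated" sits between the other two categories on both measures, so it can tie for first place (when `lev_sim` equals `cos_sim`) but never win outright. Treat the leaderboard as a demonstration of score combination rather than as a classifier.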

---

## ✅ **When It’s Useful**

- Teaching or learning the **basics of content attribution**  
- Rapidly exploring **style differences** between texts  
- Building a **lightweight prototype** before investing in more complex models  
- Supplementing model outputs with **rule-based insights**

---

## ✅ **Suggested Best Practices**

- Use as a **baseline** to compare with ML or zero-shot methods  
- Combine with **metadata** (e.g., time of writing, writing tools used) to improve accuracy  
- Apply in **controlled environments** where risk of misclassification is low  
- Always pair with **human judgment** in sensitive or impactful use cases
