# 🧠 Zero-Shot Content Attribution with Transformers
Welcome! This notebook demonstrates **zero-shot text classification** for **AI content attribution**, using two transformer models fine-tuned for natural language inference (NLI):

1. `facebook/bart-large-mnli` – suited for **English** content.
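2. `joeddav/xlm-roberta-large-xnli` – a multilingual model built on XLM-RoBERTa (pretrained on about 100 languages, including several African and Asian languages) and fine-tuned on the 15-language XNLI dataset, so quality may vary for languages outside XNLI.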

---
## 🔍 What is Zero-Shot Classification?
Zero-shot classification lets a model assign text to user-defined categories **without task-specific labeled training data**. Each candidate label is turned into a hypothesis sentence (for example, "This example is AI-Generated."), and a natural language inference (NLI) model scores how strongly the text entails that hypothesis.
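
To make the NLI mechanism concrete, here is a minimal sketch of the two steps involved: building one hypothesis per label, then normalizing the entailment scores across labels. The template string matches the default used by the `transformers` zero-shot pipeline; the logits are made-up placeholder values, not model output, and the helper names are hypothetical.

```python
import math

def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["AI-Generated", "Human-AI Co-Creation", "Human-Written"]
hypotheses = build_hypotheses(labels)

# Hypothetical entailment logits, one per (text, hypothesis) pair.
# A real run would obtain these from the NLI model.
entailment_logits = [2.0, 0.5, -1.0]

# Normalize across labels so the scores sum to 1, then rank them.
scores = softmax(entailment_logits)
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)

for hypothesis in hypotheses:
    print(hypothesis)
for label, score in ranked:
    print(f"{label:<25} {score:.2f}")
```

The label whose hypothesis is most strongly entailed ends up first, which is the "final attribution" reported in the cells below.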

### ✨ Applications:
- AI-generated content detection
- Content moderation
- Multilingual classification

---

In [1]:
!pip install transformers --quiet

## 📘 English Zero-Shot Attribution using `facebook/bart-large-mnli`

In [2]:
from transformers import pipeline

# Load the English zero-shot classifier
classifier_en = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# English text sample
text_en = """This article was generated by an AI system to demonstrate the power of modern language models."""

# Define attribution categories
labels = ["AI-Generated", "Human-AI Co-Creation", "Human-Written"]

# Run classification
result_en = classifier_en(text_en, labels)

# Display results
print("🔎 English Attribution Result:")
for label, score in zip(result_en['labels'], result_en['scores']):
    print(f"{label:<25} → Confidence: {score:.2f}")

# Final decision
print(f"\n✅ Final Attribution: {result_en['labels'][0]} (Confidence: {result_en['scores'][0]:.2f})")


Device set to use cpu


🔎 English Attribution Result:
AI-Generated              → Confidence: 1.00
Human-AI Co-Creation      → Confidence: 0.00
Human-Written             → Confidence: 0.00

✅ Final Attribution: AI-Generated (Confidence: 1.00)
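
A score of 1.00 here is relative to the three candidate labels, not an absolute probability of AI authorship. By default the pipeline normalizes entailment scores across the labels so they sum to 1; passing `multi_label=True` scores each label independently instead. The sketch below contrasts the two normalizations in plain Python. The logits are made-up placeholders, and the variable names are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical (contradiction, entailment) logits per label; not model output.
nli_logits = {
    "AI-Generated": (-2.0, 3.0),
    "Human-AI Co-Creation": (1.0, -1.0),
    "Human-Written": (2.0, -2.5),
}

# Single-label mode: softmax over the entailment logits of all labels,
# so the scores compete with each other and sum to 1.
entailments = [entail for _, entail in nli_logits.values()]
single = dict(zip(nli_logits, softmax(entailments)))

# Multi-label mode: for each label, softmax over (contradiction, entailment)
# and keep the entailment probability, so labels are scored independently.
multi = {
    label: softmax([contra, entail])[1]
    for label, (contra, entail) in nli_logits.items()
}

for label in nli_logits:
    print(f"{label:<25} single={single[label]:.2f}  multi={multi[label]:.2f}")
```

With the real classifier, the independent scores would come from `classifier_en(text_en, labels, multi_label=True)`.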


## 🌍 Multilingual Zero-Shot Attribution using `joeddav/xlm-roberta-large-xnli`

In [3]:
from transformers import pipeline

# Load multilingual zero-shot classifier
classifier_multi = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

# Example: French text (change to other languages as needed)
text_multi = """Ce texte a été rédigé à l'aide d'un modèle d'IA génératif. Il explore les possibilités offertes par l'intelligence artificielle."""

# Attribution labels (kept in English here; the multilingual model can also take labels in other languages)
candidate_labels = ["AI-Generated", "Human-AI Co-Creation", "Human-Written"]

# Run classification
result_multi = classifier_multi(text_multi, candidate_labels)

# Display results
print("🌐 Multilingual Attribution Result:")
for label, score in zip(result_multi['labels'], result_multi['scores']):
    print(f"{label:<25} → Confidence: {score:.2f}")

# Final prediction
print(f"\n✅ Final Attribution: {result_multi['labels'][0]} (Confidence: {result_multi['scores'][0]:.2f})")


Some weights of the model checkpoint at joeddav/xlm-roberta-large-xnli were not used when initializing XLMRobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use cpu


🌐 Multilingual Attribution Result:
AI-Generated              → Confidence: 0.99
Human-AI Co-Creation      → Confidence: 0.01
Human-Written             → Confidence: 0.00

✅ Final Attribution: AI-Generated (Confidence: 0.99)
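
For non-English text, a hypothesis template written in the text's own language may help. This is an assumption worth testing, not a guarantee. The sketch below shows one way to pass a per-language template through the pipeline's `hypothesis_template` argument. The template wording, the `classify_with_template` helper, and the `fake_classifier` stand-in are all hypothetical; the stand-in exists only so the example runs without downloading a model.

```python
# Hypothetical templates: the wording and translations are illustrative assumptions.
HYPOTHESIS_TEMPLATES = {
    "en": "This text is {}.",
    "fr": "Ce texte est {}.",
    "es": "Este texto es {}.",
}

def classify_with_template(classifier, text, labels, lang="en"):
    """Call a zero-shot classifier with a template in the text's language.

    Falls back to the English template for languages without an entry.
    """
    template = HYPOTHESIS_TEMPLATES.get(lang, HYPOTHESIS_TEMPLATES["en"])
    return classifier(text, labels, hypothesis_template=template)

# Stand-in for the real pipeline. It records the hypotheses it would score
# and returns uniform placeholder scores in the pipeline's result layout.
seen_hypotheses = []

def fake_classifier(text, labels, hypothesis_template="This example is {}."):
    seen_hypotheses.append([hypothesis_template.format(label) for label in labels])
    uniform = 1.0 / len(labels)
    return {"sequence": text, "labels": list(labels), "scores": [uniform] * len(labels)}

result = classify_with_template(
    fake_classifier,
    "Ce texte a été rédigé à l'aide d'un modèle d'IA génératif.",
    ["AI-Generated", "Human-AI Co-Creation", "Human-Written"],
    lang="fr",
)
print(seen_hypotheses[-1])
```

With the real model, the equivalent call would be `classify_with_template(classifier_multi, text_multi, candidate_labels, lang="fr")`.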


# ⚠️ **CAUTION: Zero-Shot Content Attribution Using NLI Models**

Zero-shot classification with NLI models (like **BART-MNLI** or **XLM-RoBERTa-XNLI**) offers **great flexibility** and **requires no task-specific training**, but it's important to understand its **limitations**:

---

### ❗ Limitations

- ❌ **Not purpose-built** for detecting AI vs. human authorship  
- 🧠 **Limited interpretability** — decisions rely on abstract entailment probabilities  
- 🌐 **Multilingual support** is strong, but **can be inconsistent**, especially for **low-resource languages**  
- 🧪 **Highly sensitive** to **prompt and label phrasing** — small changes may shift results  
- ⏳ **Resource-intensive**: the two checkpoints used here are about 1.6 GB and 2.2 GB, so they may be **slow to run or deploy at scale**, especially on CPU  
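- 🚫 **Does not truly “understand” authorship**: scores reflect whether the text's content entails a label, not stylistic evidence of its source (both samples above explicitly say they were AI-generated, which likely explains the near-1.00 scores)  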
- 🔄 **Vulnerable** to **rephrased**, **translated**, or **AI-human mixed** content  

---

### ✅ **Suggested Best Practices**

- 🔁 Run **multiple prompts** with varied label phrasing and **aggregate results**  
- 🧩 **Combine with heuristics**, metadata, or simpler classifiers for **hybrid attribution**  
- 🧑‍💼 Use **human-in-the-loop** verification for **high-stakes or critical decisions**  
- 📢 Be **transparent about limitations** in reports, dashboards, and user interfaces  
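
The first and third practices above can be sketched together: average the scores from several runs that use different label phrasings, then send low-confidence cases to a human reviewer. Everything below is a hypothetical illustration. The alternative phrasings, the scores, the helper names, and the 0.80 threshold are assumptions, not values from the runs in this notebook.

```python
from collections import defaultdict

# Maps each label phrasing to its canonical category (hypothetical wording).
PHRASING_TO_CATEGORY = {
    "AI-Generated": "AI-Generated",
    "written by an AI": "AI-Generated",
    "Human-AI Co-Creation": "Human-AI Co-Creation",
    "written by a person together with an AI": "Human-AI Co-Creation",
    "Human-Written": "Human-Written",
    "written by a person": "Human-Written",
}

def aggregate_runs(runs, phrasing_to_category):
    """Average scores per canonical category across several classifier runs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for run in runs:
        for phrasing, score in zip(run["labels"], run["scores"]):
            category = phrasing_to_category[phrasing]
            totals[category] += score
            counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}

def decide(mean_scores, min_confidence=0.80):
    """Return the top category, or flag the case for human review."""
    top = max(mean_scores, key=mean_scores.get)
    if mean_scores[top] < min_confidence:
        return "Needs human review", mean_scores[top]
    return top, mean_scores[top]

# Placeholder results in the pipeline's output layout; not real model output.
runs = [
    {"labels": ["AI-Generated", "Human-Written", "Human-AI Co-Creation"],
     "scores": [0.90, 0.06, 0.04]},
    {"labels": ["written by an AI", "written by a person",
                "written by a person together with an AI"],
     "scores": [0.55, 0.30, 0.15]},
]

mean_scores = aggregate_runs(runs, PHRASING_TO_CATEGORY)
decision, confidence = decide(mean_scores)
print(f"Decision: {decision} (mean score of top category: {confidence:.2f})")
```

Here the two phrasings disagree enough that the averaged top score falls below the threshold, so the case is routed to a reviewer instead of being labeled automatically.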

---

### ✅ **When It’s Useful**
> Ideal for **experimentation**, **low-stakes applications**, or **supporting evidence** in a larger attribution pipeline — but not reliable as a standalone solution.
