# Email Spam Classification using an LLM

In this notebook, we'll:

1. Load a pre-trained model from Hugging Face (`facebook/bart-large-mnli`) in a zero-shot classification pipeline.
2. Read the `spam.csv` dataset.
3. Classify each email as **spam** or **ham** (not spam) by scoring those two candidate labels.

Because the pipeline picks from a fixed set of candidate labels, the prediction for each email is always exactly one class label.

In [27]:
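# 1. Install and import necessary libraries
# (torch is the backend used by the pipeline; scikit-learn is needed for the evaluation step)
!pip install transformers torch pandas tqdm scikit-learn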

from transformers import pipeline
import pandas as pd
from tqdm.auto import tqdm





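## 2. Load the model and create a zero-shot-classification pipeline

We'll use `facebook/bart-large-mnli`, a BART model fine-tuned on the MultiNLI natural language inference dataset. The zero-shot pipeline turns each candidate label (`spam` or `ham`) into a hypothesis and scores how strongly the email entails it, so the output is always one of the two labels.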

In [28]:
from transformers import pipeline

# Zero-shot pipeline backed by an NLI model
classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli",
    device=0
)


Device set to use mps:0
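
To make the mechanism concrete, here is a minimal sketch of what a zero-shot NLI classifier does in single-label mode: each candidate label becomes a hypothesis sentence, the model scores how strongly the text entails it, and a softmax over those entailment scores gives the label probabilities. The template string mirrors the pipeline's default `hypothesis_template`. The helper names are hypothetical and the logits are made-up numbers for illustration, not model output.

```python
import math

def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_labels(labels, entailment_logits):
    """Pair labels with softmax probabilities, highest probability first."""
    probs = softmax(entailment_logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)

labels = ["spam", "ham"]
hypotheses = build_hypotheses(labels)

# Hypothetical entailment logits, one per hypothesis (illustrative values only).
fake_logits = [0.4, 1.1]
ranking = rank_labels(labels, fake_logits)

print(hypotheses)
print(ranking[0][0])  # top-ranked label, analogous to res["labels"][0]
```

Since labels are scored as hypothesis sentences, their wording may matter: a word such as `ham` might carry little meaning for an NLI model, so more descriptive candidate labels could be worth trying.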


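## 3. Define the prompt template and classification function

The prompt template below is a few-shot prompt written for an instruction-following generative model: it asks for **only** the class label and gives one example per class. Note that `classify_email` does not use this template. It passes the raw email text to the zero-shot classifier with the candidate labels `spam` and `ham` and returns the top-scoring label.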


In [29]:
PROMPT_TEMPLATE = (
    "Perform a binary classification of the following email as 'spam' or 'ham'."
    " Respond with only one word: spam or ham.\n\n"
    "Use your knowledge and common sense to classify the email.\n\n"
    "Example 1:\n"
    "Email: Win cash prize now! Click here to claim your reward.\n"
    "Class: spam\n\n"
    "Example 2:\n"
    "Email: Hi team, attached is the report for our meeting tomorrow.\n"
    "Class: ham\n\n"
    "Email: {email_text}\n"
    "Class:"
)

def classify_email(text: str) -> str:
    """Return the top-scoring label ('spam' or 'ham') for one email."""
    res = classifier(
        text,
        candidate_labels=["spam", "ham"],
        multi_label=False
    )
    # res["labels"] is sorted by descending score
    return res["labels"][0].lower()
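
For comparison, here is a minimal sketch of how a few-shot template like the one above could be used with a generative model: fill the template, then normalize whatever text the model returns to a single label. `TEMPLATE` is a shortened stand-in for the notebook's template, `build_prompt` and `parse_label` are hypothetical helpers, and no model is called here. The strings passed to `parse_label` are invented examples of model output.

```python
from typing import Optional

# Shortened stand-in for the notebook's few-shot template (hypothetical).
TEMPLATE = (
    "Classify the following email as 'spam' or 'ham'. "
    "Respond with only one word.\n\n"
    "Email: {email_text}\n"
    "Class:"
)

def build_prompt(email_text: str) -> str:
    """Fill the template with one email."""
    return TEMPLATE.format(email_text=email_text)

def parse_label(generated: str) -> Optional[str]:
    """Map raw generated text to 'spam' or 'ham'; return None if neither leads the output."""
    words = generated.strip().lower().split()
    if not words:
        return None
    first = words[0].strip(".,:;!?'\"")
    return first if first in {"spam", "ham"} else None

prompt = build_prompt("Win a free cruise, reply now!")
print(prompt)
print(parse_label(" Spam.\n"))
print(parse_label("I think this is fine"))
```

Returning `None` for unparseable output keeps such cases visible instead of silently counting them as one of the two classes.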


## 4. Load the dataset and apply classification

We'll read `spam.csv`, classify each email, and add a new column with the predicted label.

In [30]:
# Load data
df = pd.read_csv('spam.csv', encoding='latin-1')
df.columns = ['text', 'target']

# Quick sanity check on first 5 emails
for idx, row in df.head(5).iterrows():
    pred = classify_email(row['text'])
    print(f"Email #{idx+1}: True={row['target']} | Predicted={pred}")

# Classify
tqdm.pandas(desc="Classifying emails")
df['predicted'] = df['text'].progress_apply(classify_email)

Email #1: True=ham | Predicted=ham
Email #2: True=ham | Predicted=spam
Email #3: True=spam | Predicted=ham
Email #4: True=ham | Predicted=ham
Email #5: True=ham | Predicted=ham


Classifying emails:   0%|          | 0/5572 [00:00<?, ?it/s]
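
Calling the classifier once per row can be slow on 5572 messages. The sketch below shows one possible alternative, grouping texts into batches before classification. `chunks`, `classify_in_batches`, and `fake_classify_batch` are hypothetical names, and the keyword rule only stands in for the model so that the example runs without downloading anything. With the real pipeline, `classify_batch` could wrap `classifier(batch, candidate_labels=["spam", "ham"])`, which returns one result dict per input when given a list. Whether this is actually faster depends on the hardware and batch size.

```python
from typing import Callable, Iterator, List

def chunks(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def classify_in_batches(
    texts: List[str],
    classify_batch: Callable[[List[str]], List[str]],
    batch_size: int = 32,
) -> List[str]:
    """Apply a batch classifier to all texts, preserving input order."""
    predictions: List[str] = []
    for batch in chunks(texts, batch_size):
        predictions.extend(classify_batch(batch))
    return predictions

def fake_classify_batch(batch: List[str]) -> List[str]:
    """Stand-in for the model call: a toy keyword rule, not a real classifier."""
    return ["spam" if "free" in text.lower() else "ham" for text in batch]

sample = ["FREE entry now", "See you at 5", "free ringtone", "Lunch?", "ok"]
preds = classify_in_batches(sample, fake_classify_batch, batch_size=2)
print(preds)
```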

## 5. Evaluate and display sample results

We'll compute the overall accuracy, the per-class precision, recall, and F1-score, and the confusion matrix.

In [31]:
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    classification_report,
    confusion_matrix
)

# Compute accuracy
accuracy = (df['predicted'] == df['target']).mean()
print(f"Accuracy: {accuracy:.2%}")

# Keep a sample of misclassifications for inspection
misclassified = df[df['predicted'] != df['target']].head(10)

# Normalize labels so the comparison ignores case and surrounding whitespace
df['target'] = df['target'].astype(str).str.strip().str.lower()
df['predicted'] = df['predicted'].str.strip().str.lower()

y_true = df['target']
y_pred = df['predicted']

# 1. Overall accuracy
acc = accuracy_score(y_true, y_pred)
print(f"Overall accuracy: {acc:.2%}")

# 2. Precision, recall, and F1 per class
precisions = precision_score(y_true, y_pred, labels=['ham','spam'], average=None)
rappels    = recall_score   (y_true, y_pred, labels=['ham','spam'], average=None)
f1s        = f1_score       (y_true, y_pred, labels=['ham','spam'], average=None)

for label, p, r, f in zip(['ham','spam'], precisions, rappels, f1s):
    print(f"Class {label:>4} → Precision: {p:.2%}, Recall: {r:.2%}, F1-score: {f:.2%}")

# 3. Full classification report
print("\nDetailed classification report:")
print(classification_report(
    y_true,
    y_pred,
    labels=['ham','spam'],
    target_names=['ham','spam'],
    digits=4
))

# 4. Confusion matrix
cm = confusion_matrix(y_true, y_pred, labels=['ham','spam'])
print("\nConfusion matrix (rows=true, columns=predicted):")
print(cm)


Accuracy: 67.28%
Overall accuracy: 67.28%
Class  ham → Precision: 86.88%, Recall: 73.28%, F1-score: 79.51%
Class spam → Precision: 14.18%, Recall: 28.51%, F1-score: 18.94%

Detailed classification report:
              precision    recall  f1-score   support

         ham     0.8688    0.7328    0.7951      4825
        spam     0.1418    0.2851    0.1894       747

    accuracy                         0.6728      5572
   macro avg     0.5053    0.5090    0.4922      5572
weighted avg     0.7713    0.6728    0.7139      5572


Confusion matrix (rows=true, columns=predicted):
[[3536 1289]
 [ 534  213]]
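
As a cross-check, the reported metrics can be recomputed by hand from the confusion matrix. The sketch below uses only the four counts printed above; `per_class_metrics` is a hypothetical helper, and the "always ham" baseline is an added point of comparison rather than something the notebook computes.

```python
# Confusion matrix from the run above: rows = true class, columns = predicted class.
cm = [
    [3536, 1289],  # true ham:  predicted ham, predicted spam
    [534, 213],    # true spam: predicted ham, predicted spam
]
labels = ["ham", "spam"]

def per_class_metrics(cm, index):
    """Return (precision, recall, f1) for the class at `index`."""
    tp = cm[index][index]
    fn = sum(cm[index]) - tp                 # rest of the row: missed items of this class
    fp = sum(row[index] for row in cm) - tp  # rest of the column: wrongly assigned to this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

# Accuracy of a trivial classifier that always predicts the most frequent true class.
majority_baseline = max(sum(row) for row in cm) / total

for i, label in enumerate(labels):
    p, r, f = per_class_metrics(cm, i)
    print(f"{label:>4}: precision={p:.4f} recall={r:.4f} f1={f:.4f}")
print(f"accuracy={accuracy:.4f}")
print(f"always-ham baseline accuracy={majority_baseline:.4f}")
```

With 4825 of the 5572 messages labeled ham, always predicting ham would reach about 86.6% accuracy, well above the 67.28% obtained here, so accuracy alone overstates how useful this classifier is on such an imbalanced dataset.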
