<a href="https://colab.research.google.com/github/Meetra21/LLM-Based-Spam-vs-Ham-SMS-Classifier/blob/main/LLM_Based_Spam_vs_Ham_SMS_Classifier_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **LLM-Based Spam vs Ham SMS Classifier**
Azam (Meetra) Nouri

This project builds a spam vs. non-spam SMS classifier by fine-tuning a pretrained LLM (GPT-2) on the public SMS Spam Collection dataset. Messages are tokenized with the model’s tokenizer and used to train a lightweight classification head on top of the LLM’s pretrained language representations. Training runs on a Colab GPU and produces a model that outputs spam/ham probabilities for new messages, demonstrating how a pretrained LLM can be adapted into a practical text-classification system.

**1) Install + check GPU**

Before training, we check that the GPU is active. This notebook was run on a Colab L4 GPU. If CUDA is not available, training falls back to the CPU and becomes significantly slower. Printing the GPU name also verifies that the expected hardware is in use.

In [None]:
# PURPOSE:
#   Install libraries for dataset loading + GPT-2 fine-tuning.
# WHY:
#   - datasets: easy access to spam datasets on Hugging Face
#   - transformers: pretrained GPT-2 + Trainer
#   - accelerate: required by Trainer for device placement and mixed precision
!pip install -q datasets transformers accelerate torch

# PURPOSE:
#   Confirm GPU is enabled in Colab.
import torch
print("# CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("# GPU:", torch.cuda.get_device_name(0))

# CUDA available: True
# GPU: NVIDIA L4


**2) Load an appropriate spam dataset (SMS Spam Collection)**

Next, we load a labeled dataset for spam detection: the SMS Spam Collection, via its Hugging Face mirror `ucirvine/sms_spam`. It contains 5,574 short SMS messages, each labeled as spam or ham (non-spam).

In [None]:
# PURPOSE:
#   Load a real spam-vs-ham dataset.
from datasets import load_dataset

# SMS Spam Collection (ham/spam), 5,574 messages
# Source: Hugging Face mirror of the UCI dataset
ds = load_dataset("ucirvine/sms_spam")  # returns a DatasetDict (usually only "train")

print(ds)
print(ds["train"][0])

DatasetDict({
    train: Dataset({
        features: ['sms', 'label'],
        num_rows: 5574
    })
})
{'sms': 'Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...\n', 'label': 0}


**3) Prepare labels + train/validation split**

We then ensure that labels are converted into a clean numeric format: 0 for ham (non-spam) and 1 for spam. After that, the dataset is split into training and validation sets. The training set is used to learn patterns, while the validation set is used to measure generalization on unseen data.

In [None]:
# PURPOSE:
#   Ensure labels are correctly mapped to integers:
#     ham -> 0
#     spam -> 1
# WHY:
#   If labels are already 0/1 and you treat them as strings,
#   you can accidentally convert EVERYTHING to ham (0),
#   which makes the model predict ham for all messages.

from collections import Counter

print("# Raw label examples:", [ds["train"][i]["label"] for i in range(10)])
print("# Raw label counts:", Counter(ds["train"]["label"]))

def normalize_labels(example):
    lab = example["label"]

    # Case 1: already numeric 0/1
    if isinstance(lab, (int, bool)):
        example["label"] = int(lab)
        return example

    # Case 2: numeric but stored as string "0"/"1"
    if isinstance(lab, str) and lab.strip() in {"0", "1"}:
        example["label"] = int(lab.strip())
        return example

    # Case 3: stored as "ham"/"spam"
    if isinstance(lab, str):
        s = lab.strip().lower()
        if s == "spam":
            example["label"] = 1
        elif s == "ham":
            example["label"] = 0
        else:
            raise ValueError(f"Unknown label value: {lab}")
        return example

    raise ValueError(f"Unhandled label type/value: {type(lab)} {lab}")

data = ds["train"].map(normalize_labels)

print("# Normalized label counts:", Counter(data["label"]))

# 80/20 train/validation split with a fixed seed for reproducibility
split = data.train_test_split(test_size=0.2, seed=42)
train_ds = split["train"]
val_ds   = split["test"]

# Raw label examples: [0, 0, 1, 0, 0, 1, 0, 0, 1, 1]
# Raw label counts: Counter({0: 4827, 1: 747})


# Normalized label counts: Counter({0: 4827, 1: 747})
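
The counts above show a strong class imbalance, which is worth quantifying before reading any accuracy number. The short sketch below (plain Python; the variable names are illustrative and not part of the notebook) computes the spam share and the accuracy of a trivial classifier that always predicts ham.

```python
# Label counts copied from the notebook output above.
label_counts = {0: 4827, 1: 747}  # 0 = ham, 1 = spam

total = sum(label_counts.values())
spam_share = label_counts[1] / total

# A classifier that always predicts the majority class (ham)
# is correct on every ham message and wrong on every spam message.
majority_baseline_acc = max(label_counts.values()) / total

print(f"total messages: {total}")
print(f"spam share: {spam_share:.3f}")
print(f"always-ham baseline accuracy: {majority_baseline_acc:.3f}")
```

Validation accuracy is therefore best compared against this baseline of roughly 0.866 rather than against 0.5.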


**4) Tokenize text with GPT-2 tokenizer**

LLMs do not train directly on raw text, so we convert each message into token IDs using the model’s tokenizer. The tokenizer also returns an attention mask so the model can distinguish real tokens from padding. With `truncation=True` and no explicit `max_length`, messages are truncated only at the tokenizer’s maximum length (1,024 tokens for GPT-2), which SMS messages rarely approach. A smaller cap (for example, `max_length=64`) could be set instead, since shorter sequences train faster.

In [None]:
# PURPOSE:
#   Tokenize the SMS messages into input_ids/attention_mask for GPT-2.
# WHY:
#   The model can only train on token IDs (numbers), not raw strings.

from transformers import AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# PURPOSE:
#   GPT-2 has no pad token by default. We set pad_token = eos_token so batching works.
tokenizer.pad_token = tokenizer.eos_token

# PURPOSE:
#   Identify which column contains the message text.
# WHY:
#   The dataset columns are ['sms', 'label'], so we use 'sms'.
text_col = "sms"

def tokenize_fn(batch):
    # PURPOSE:
    #   Convert a batch of SMS strings into token IDs.
    # truncation=True prevents very long texts from exceeding model limits.
    return tokenizer(batch[text_col], truncation=True)

# PURPOSE:
#   Apply tokenization to every row and remove the original text column afterward.
train_tok = train_ds.map(tokenize_fn, batched=True, remove_columns=[text_col])
val_tok   = val_ds.map(tokenize_fn, batched=True, remove_columns=[text_col])

print(train_tok[0].keys())  # should include input_ids, attention_mask, label

dict_keys(['label', 'input_ids', 'attention_mask'])
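
To make the roles of padding and the attention mask concrete, here is a minimal plain-Python sketch of what happens when variable-length token sequences are batched. The `pad_batch` helper and the token IDs are hypothetical and only illustrate the idea; in the notebook this work is done by the tokenizer and the data collator. The pad ID 50256 is GPT-2's end-of-text token, which the notebook reuses as the pad token.

```python
from typing import Dict, List

GPT2_EOS_ID = 50256  # GPT-2's end-of-text token id, reused here as the pad id


def pad_batch(sequences: List[List[int]], pad_id: int = GPT2_EOS_ID,
              max_length: int = 64) -> Dict[str, List[List[int]]]:
    """Truncate to max_length, then right-pad every sequence to the longest one."""
    truncated = [seq[:max_length] for seq in sequences]
    longest = max(len(seq) for seq in truncated)
    input_ids, attention_mask = [], []
    for seq in truncated:
        n_pad = longest - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Toy token ids (made up for illustration, not real tokenizer output).
batch = pad_batch([[11, 22, 33], [44]])
print(batch["input_ids"])
print(batch["attention_mask"])
```

Padding to the longest sequence in each batch, rather than to a fixed global length, keeps batches of short SMS messages small.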


**5) Load pretrained LLM GPT-2 as a sequence classifier**

We load the pretrained GPT-2 weights and attach a classification head on top. Instead of generating text, the model outputs two scores (logits), one for ham and one for spam. GPT-2 has no padding token by default, so the pad token (here the EOS token) is also set in the model config; the classifier needs it to locate the last real token in each padded sequence.

In [None]:
# PURPOSE:
#   Use GPT-2 pretrained weights but change the head to classification (2 labels).
# WHY:
#   We’re not generating text; we’re predicting spam vs ham.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=2
)

# PURPOSE:
#   Set padding token id inside the model config to match tokenizer.
# WHY:
#   GPT-2 classifies from the last non-padding token of each sequence, so the
#   model must know the pad id; without it, batches larger than 1 raise an error.
model.config.pad_token_id = tokenizer.pad_token_id

GPT2ForSequenceClassification LOAD REPORT from: gpt2
Key                  | Status     | 
---------------------+------------+-
h.{0...11}.attn.bias | UNEXPECTED | 
score.weight         | MISSING    | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.
- MISSING	:those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
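
The `score.weight` entry reported as MISSING is the new classification layer, which is why it starts from random initialization and has to be trained. Because GPT-2 reads left to right, the sequence-classification model scores each message from the hidden state of its last real token. The sketch below illustrates that lookup on right-padded attention masks; it is a simplified illustration of the idea rather than the library's implementation, and `last_token_index` is a hypothetical helper.

```python
from typing import List


def last_token_index(attention_mask: List[int]) -> int:
    """Return the position of the last real (non-padding) token in one row."""
    if 1 not in attention_mask:
        raise ValueError("row contains no real tokens")
    # Scan from the right for the first 1.
    return len(attention_mask) - 1 - attention_mask[::-1].index(1)


# Toy masks: 1 = real token, 0 = padding.
masks = [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1], [1, 0, 0, 0, 0]]
indices = [last_token_index(m) for m in masks]
print(indices)
```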


**6) Train with Trainer**

We choose training settings such as batch size, learning rate, and number of epochs. Mixed precision (FP16) is enabled on the GPU to speed up training and reduce memory usage. A data collator pads batches dynamically, and metrics (such as accuracy) are defined to track performance during evaluation.
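
To make these settings concrete, the arithmetic below estimates how many optimizer steps and evaluation batches they imply. It uses the values from the cell that follows (training batch size 16, evaluation batch size 32, 3 epochs) and the split sizes reported earlier (4,459 training and 1,115 validation rows). It assumes a single GPU and no gradient accumulation.

```python
import math

train_rows, val_rows = 4459, 1115   # sizes of the 80/20 split
train_batch_size, eval_batch_size = 16, 32
epochs = 3

# The last batch of each epoch may be smaller, hence the ceiling.
steps_per_epoch = math.ceil(train_rows / train_batch_size)
total_train_steps = steps_per_epoch * epochs
eval_batches = math.ceil(val_rows / eval_batch_size)

print("steps per epoch:", steps_per_epoch)
print("total training steps:", total_train_steps)
print("evaluation batches:", eval_batches)
```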

In [None]:
# PURPOSE: install the evaluate library used for metrics (accuracy, f1, etc.)
!pip install -q evaluate

In [None]:
# PURPOSE:
#   Build TrainingArguments in a way that works across different transformers versions.
# WHY:
#   The installed transformers (5.0.0) doesn't recognize 'evaluation_strategy';
#   the argument was renamed to 'eval_strategy' in newer releases.
#   This code inspects TrainingArguments.__init__ and only passes supported args.

import inspect
import transformers
from transformers import TrainingArguments, Trainer, DataCollatorWithPadding
import numpy as np
import torch
import evaluate

print("# transformers version:", transformers.__version__)

# PURPOSE:
#   Padding collator: pads each batch to the longest sequence in that batch.
# WHY:
#   SMS messages have different lengths; batching needs padding.
collator = DataCollatorWithPadding(tokenizer=tokenizer)

# PURPOSE:
#   Accuracy metric (evaluate library).
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # PURPOSE: compute accuracy from logits
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=preds, references=labels)

# ----------------------------
# Auto-detect supported args
# ----------------------------
sig = inspect.signature(TrainingArguments.__init__)
supported = set(sig.parameters.keys())

# We create a "desired" args dict (what we WANT)
desired = dict(
    output_dir="./gpt2_spam_cls",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=3,
    learning_rate=5e-5,
    fp16=torch.cuda.is_available(),
    logging_steps=50,
    report_to="none",
)

# These keys vary across versions; we add them only if supported.
# Older versions: evaluation_strategy
# Newer versions: eval_strategy
# Pass exactly one of them, preferring the newer name.
if "eval_strategy" in supported:
    desired["eval_strategy"] = "epoch"
elif "evaluation_strategy" in supported:
    desired["evaluation_strategy"] = "epoch"

if "save_strategy" in supported:
    desired["save_strategy"] = "epoch"
if "save_steps" in supported:
    desired["save_steps"] = 200  # only used for step-based saving; ignored with save_strategy="epoch"

# Filter out unsupported keys to prevent TypeError
safe_args = {k: v for k, v in desired.items() if k in supported}

print("# TrainingArguments keys used:", sorted(safe_args.keys()))

args = TrainingArguments(**safe_args)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_tok,
    eval_dataset=val_tok,
    data_collator=collator,
    compute_metrics=compute_metrics,
)

trainer.train()

# Final evaluation on the validation set (eval_dataset)
print(trainer.evaluate())

# transformers version: 5.0.0
# TrainingArguments keys used: ['eval_strategy', 'fp16', 'learning_rate', 'logging_steps', 'num_train_epochs', 'output_dir', 'per_device_eval_batch_size', 'per_device_train_batch_size', 'report_to', 'save_steps', 'save_strategy']


| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1     | 0.060115      | 0.047266        | 0.992825 |
| 2     | 0.012148      | 0.051266        | 0.992825 |
| 3     | 0.005732      | 0.048732        | 0.992825 |


{'eval_loss': 0.04873223975300789, 'eval_accuracy': 0.9928251121076234, 'eval_runtime': 1.1453, 'eval_samples_per_second': 973.552, 'eval_steps_per_second': 30.56, 'epoch': 3.0}
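
Because only about 13% of the messages are spam, accuracy alone can hide how well the spam class is handled. An accuracy of 0.9928 on 1,115 validation messages corresponds to 8 misclassified messages, but it does not say whether those are missed spam or wrongly flagged ham. The sketch below computes precision, recall, and F1 for the spam class in plain Python. The `spam_metrics` helper is hypothetical, and the labels and predictions are toy values made up for illustration, not results from this run.

```python
from typing import Dict, Sequence


def spam_metrics(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 for the positive class (1 = spam)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data: 4 spam messages followed by 6 ham messages.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = spam_metrics(toy_labels, toy_preds)
print(metrics)

# Number of errors implied by the reported validation accuracy.
errors = round((1 - 0.9928251121076234) * 1115)
print("misclassified validation messages:", errors)
```

A function like this could be called from `compute_metrics` on `preds` and `labels` to report spam-class metrics alongside accuracy.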


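**7) Fine-tune, evaluate, and test the classifier on new messages**

Fine-tuning in step 6 repeatedly fed tokenized batches into the model, computed the classification loss, and updated the weights to reduce errors. With training and evaluation complete, we now run the fine-tuned model on new messages and convert its two output scores into ham/spam probabilities with softmax.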

In [None]:
# PURPOSE:
#   Run inference: spam probability for a new message.
# WHY:
#   This is how we use the fine-tuned model as a spam detector.
import torch
import torch.nn.functional as F

def predict_spam(text):
    model.eval()  # make sure dropout is disabled during inference
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits
        probs = F.softmax(logits, dim=-1)[0].detach().cpu().tolist()
    # label 1 = spam (by our mapping)
    return {"ham_prob": probs[0], "spam_prob": probs[1]}

print(predict_spam("Congratulations! You won a free iPhone. Click this link now!"))
print(predict_spam("Are we still meeting at 3pm today?"))

{'ham_prob': 0.006192990578711033, 'spam_prob': 0.9938070178031921}
{'ham_prob': 0.9974614381790161, 'spam_prob': 0.0025385187473148108}
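
The probabilities above come from applying softmax to the model's two output scores (logits). As a worked illustration, the plain-Python sketch below reproduces that step and adds a hypothetical `is_spam` helper with an adjustable decision threshold. The logits are made-up values, and the 0.5 default is an assumption rather than a tuned setting.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_spam(spam_prob: float, threshold: float = 0.5) -> bool:
    """Flag a message as spam when its spam probability reaches the threshold."""
    return spam_prob >= threshold


# Made-up logits in the notebook's label order: [ham, spam].
ham_prob, spam_prob = softmax([-2.0, 3.0])
print(round(ham_prob, 4), round(spam_prob, 4))
print(is_spam(spam_prob), is_spam(spam_prob, threshold=0.999))
```

Raising the threshold trades missed spam for fewer legitimate messages flagged by mistake, which may matter more than raw accuracy in a real filter.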
