---
title: "Text Classification with Transformers Pipeline"
author: "Mohammed Adil Siraju"
date: "2025-09-26"
categories: [nlp, transformers, text-classification, spam-detection]
description: "Using Hugging Face transformers pipeline for text classification and spam detection."
---

This notebook demonstrates how to use Hugging Face's pipeline API for text classification, using a sentiment model as a rough stand-in for a spam detector.

## Importing Pipeline

Import the `pipeline` function from transformers. It bundles a model, its tokenizer, and the pre- and post-processing needed for a given task.

In [2]:
from transformers import pipeline

## Creating Text Classification Pipeline

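Create a text classification pipeline using a multilingual DistilBERT model fine-tuned for sentiment analysis. The model predicts `negative`, `neutral`, or `positive`; it was not trained on spam, so the spam labels used later are a heuristic mapping from sentiment.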

In [3]:
spam_classifier = pipeline(
    'text-classification',
    model='philschmid/distilbert-base-multilingual-cased-sentiment'
)

Device set to use cuda:0
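
The log line above shows that the pipeline picked a GPU on its own. To make that choice explicit, `pipeline` accepts a `device` argument (an integer GPU index, or `-1` for CPU). The sketch below assumes PyTorch is the backend; `choose_device` and `build_classifier` are hypothetical helper names, not part of the original notebook.

```python
def choose_device():
    """Return 0 (first GPU) if CUDA is usable, otherwise -1 (CPU)."""
    try:
        import torch  # assumption: PyTorch backend
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_classifier(model_name):
    """Create a text-classification pipeline on the chosen device."""
    # Imported lazily so choose_device() can be used on its own.
    from transformers import pipeline

    return pipeline("text-classification", model=model_name, device=choose_device())


device = choose_device()
print(f"Selected device index: {device}")
```

Calling `build_classifier('philschmid/distilbert-base-multilingual-cased-sentiment')` would then load the same model as above on whichever device is available.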


## Testing with Sample Documents

Test the classifier with various text samples including potential spam messages and normal communications.

In [5]:
docs = [
    "Congartulations! You've won 5oo INR Amazon gift voucher",
    'Hey Amit. Lets have a meeting tomorrow',
    'URGENT: Youre gmail has been comprimsed. Click this link to revive it',
    'URGENT: Youre Credit Card has been comprimsed. Click this link to revive it'
]

results = spam_classifier(docs)
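
When called on a list of strings, a text-classification pipeline returns one dictionary per input, each holding a `label` and a `score`. The sketch below pairs each document with its prediction using `zip`. The values in `example_results` are hypothetical stand-ins, not output from the model, so the snippet runs without downloading anything; `pair_predictions` is an illustrative helper name.

```python
# Hypothetical stand-ins; real values come from spam_classifier(docs).
example_docs = [
    "Hey Amit. Lets have a meeting tomorrow",
    "URGENT: Click this link",
]
example_results = [
    {"label": "neutral", "score": 0.80},
    {"label": "negative", "score": 0.92},
]


def pair_predictions(texts, predictions):
    """Return (text, label, score) tuples, one per input document."""
    if len(texts) != len(predictions):
        raise ValueError("expected exactly one prediction per document")
    return [(text, pred["label"], pred["score"]) for text, pred in zip(texts, predictions)]


paired = pair_predictions(example_docs, example_results)
for text, label, score in paired:
    print(f"{label:>8} ({score:.2f})  {text}")
```

Keeping the text next to its prediction makes it easier to spot which message produced which label.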

## Interpreting Results for Spam Detection

Map sentiment labels to spam categories and print each result with its score. The score is the model's confidence in the predicted sentiment label, not a calibrated spam probability.

In [9]:
# Heuristic: treat negative sentiment as spam, everything else as not spam.
label_mapping = {
    'negative': 'SPAM',
    'neutral': 'NOT SPAM',
    'positive': 'NOT SPAM',
}

for res in results:
    label = label_mapping[res['label']]
    score = res['score']
    print(f"Label: {label}, Confidence: {score:.2f}")

Label: SPAM, Confidence: 0.96
Label: NOT SPAM, Confidence: 0.80
Label: SPAM, Confidence: 0.92
Label: SPAM, Confidence: 0.94
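
One possible refinement is to flag a message only when the `negative` score clears a threshold, and to surface unexpected labels instead of raising a `KeyError`. The helper `to_spam_label`, the `UNKNOWN` category, and the `0.9` cutoff below are illustrative assumptions, not part of the original notebook or the model; the sample inputs are hypothetical.

```python
KNOWN_SENTIMENTS = {"negative", "neutral", "positive"}
SPAM_SENTIMENTS = {"negative"}  # same heuristic as label_mapping above


def to_spam_label(result, threshold=0.9):
    """Map one pipeline result dict to 'SPAM', 'NOT SPAM', or 'UNKNOWN'."""
    label = str(result.get("label", "")).lower()
    score = float(result.get("score", 0.0))
    if label not in KNOWN_SENTIMENTS:
        return "UNKNOWN"  # label the mapping does not cover
    if label in SPAM_SENTIMENTS and score >= threshold:
        return "SPAM"
    return "NOT SPAM"


# Hypothetical results in the pipeline's output format.
samples = [
    {"label": "negative", "score": 0.96},
    {"label": "negative", "score": 0.60},
    {"label": "positive", "score": 0.99},
    {"label": "LABEL_0", "score": 0.90},
]
for sample in samples:
    print(sample["label"], sample["score"], "->", to_spam_label(sample))
```

Raising the threshold trades missed spam for fewer false alarms; the right value depends on the data and would need to be checked against labeled examples.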


## Summary

This notebook demonstrated:
- Using Hugging Face pipelines for text classification
- Repurposing a sentiment model as a heuristic spam detector
- Processing multiple documents at once
- Interpreting and mapping model outputs

The pipeline API makes it easy to use pre-trained models for various NLP tasks!