hamzeesaid/nlp-text-classifier

🏷️ NLP Text Classifier

Fine-tune a transformer model (DistilBERT / RoBERTa) on custom text classification tasks. Includes training pipeline, evaluation, and a FastAPI inference endpoint.

🚀 Features

  • Fine-tuning pipeline: Train on any labeled text dataset in CSV format
  • Multi-class & multi-label: Configurable output heads
  • Experiment tracking: Weights & Biases integration
  • Model registry: Save / load fine-tuned models locally or on the HuggingFace Hub
  • FastAPI endpoint: Serve predictions with confidence scores
  • Evaluation dashboard: Confusion matrix, F1, precision/recall charts
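
The difference between the multi-class and multi-label heads, and where a confidence score comes from, can be sketched in plain Python. This is an illustration under assumptions, not the code in predictor.py: the function names, the 0.5 threshold, and the example labels are all hypothetical. A multi-class head would typically apply a softmax and return the single most likely label, while a multi-label head would apply an independent sigmoid per label and return every label above a threshold.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multiclass(logits, labels):
    """Multi-class: exactly one label, with its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def decode_multilabel(logits, labels, threshold=0.5):
    """Multi-label: every label whose sigmoid score reaches the threshold.

    The 0.5 default is an assumption; a real project would tune it.
    """
    selected = []
    for label, logit in zip(labels, logits):
        score = sigmoid(logit)
        if score >= threshold:
            selected.append((label, score))
    return selected


# Hypothetical logits for one input text.
label, confidence = decode_multiclass(
    [2.0, 0.5, -1.0], ["positive", "neutral", "negative"]
)
tags = decode_multilabel([3.0, -3.0, 0.0], ["billing", "shipping", "returns"])
```

In the multi-class case the confidences across labels always sum to 1, so a low top score signals an ambiguous input. In the multi-label case each score is independent, so zero, one, or several labels can be returned.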

🛠️ Tech Stack

Python · PyTorch · HuggingFace Transformers · Datasets · FastAPI · scikit-learn · Weights & Biases

📁 Project Structure

nlp-text-classifier/
├── app/
│   ├── main.py            # FastAPI inference server
│   ├── predictor.py       # Model loading + inference
│   └── schemas.py
├── train.py               # Training entry point
├── evaluate.py            # Evaluation + metrics
├── config.yaml            # Hyperparameter config
├── data/
│   └── sample_data.csv    # Example dataset format
├── notebooks/
│   └── exploration.ipynb
└── requirements.txt

⚡ Quick Start

1. Prepare Data

The training CSV needs a text column and a label column:

text,label
"This movie was great!",positive
"Terrible experience.",negative
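
Before training, it can help to check that a file matches this format and to see how string labels might map to integer ids. The sketch below uses only the standard library; `load_labeled_csv` and the alphabetical label ordering are assumptions for illustration, and train.py may handle this differently.

```python
import csv
import io


def load_labeled_csv(file_obj):
    """Read a CSV with 'text' and 'label' columns.

    Returns (texts, label_ids, label2id). Hypothetical helper, not part
    of the repository.
    """
    reader = csv.DictReader(file_obj)
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")

    texts, labels = [], []
    # Row numbers count the header as row 1.
    for row_number, row in enumerate(reader, start=2):
        text = (row["text"] or "").strip()
        label = (row["label"] or "").strip()
        if not text or not label:
            raise ValueError(f"row {row_number}: empty text or label")
        texts.append(text)
        labels.append(label)

    # Assumption: ids are assigned in alphabetical label order.
    label2id = {name: i for i, name in enumerate(sorted(set(labels)))}
    return texts, [label2id[name] for name in labels], label2id


# The two example rows from above, read from memory instead of disk.
sample = io.StringIO(
    'text,label\n'
    '"This movie was great!",positive\n'
    '"Terrible experience.",negative\n'
)
texts, label_ids, label2id = load_labeled_csv(sample)
```

For a file on disk, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.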

2. Train

python train.py --config config.yaml --data data/your_data.csv

3. Evaluate

python evaluate.py --model-path ./outputs/model
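
The metrics reported at this step (confusion matrix, precision, recall, F1) can be illustrated with a small standard-library sketch. It is not the implementation in evaluate.py, which lists scikit-learn in its stack; the function names are hypothetical, and macro averaging of F1 is an assumption since the averaging mode is not stated here.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 per label from (true, predicted) counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # counts[(t, p)] is one cell of the confusion matrix.
    counts = Counter(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    report = {}
    for label in labels:
        tp = counts[(label, label)]
        fp = sum(counts[(other, label)] for other in labels if other != label)
        fn = sum(counts[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(report):
    """Unweighted mean of per-class F1 (assumed averaging mode)."""
    return sum(m["f1"] for m in report.values()) / len(report)


# Toy predictions, not real model output.
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
report = per_class_metrics(y_true, y_pred)
overall_accuracy = accuracy(y_true, y_pred)
overall_f1 = macro_f1(report)
```

On an imbalanced dataset, macro F1 weights rare classes equally with common ones, which is why it can diverge from accuracy.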

4. Serve

uvicorn app.main:app --reload

📊 Supported Tasks

Task        Description
Sentiment   Positive / Negative / Neutral
Topic       Custom multi-class categories
Intent      User intent detection
Spam        Binary spam/ham detection

🧪 Benchmark Results (AG News dataset)

Model          Accuracy   F1
DistilBERT     93.8%      0.937
RoBERTa-base   95.2%      0.951

📄 License

MIT
