🏷️ NLP Text Classifier

Fine-tune a transformer model (DistilBERT or RoBERTa) on custom text classification tasks. The project includes a training pipeline, an evaluation script, and a FastAPI inference endpoint.

🚀 Features

Fine-tuning pipeline: Train on any labeled text dataset in CSV format
Multi-class & multi-label: Configurable output heads
Experiment tracking: Weights & Biases integration
Model registry: Save and load fine-tuned models locally or via the HuggingFace Hub
FastAPI endpoint: Serve predictions with confidence scores
Evaluation dashboard: Confusion matrix, F1, precision/recall charts
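
The difference between the multi-class and multi-label heads comes down to how raw logits become predictions. The following is a minimal sketch in plain Python, assuming a softmax over classes for multi-class output and an independent sigmoid with a 0.5 threshold for multi-label output. The function names, the label set, and the threshold are illustrative and are not taken from this repository.

```python
import math


def softmax(logits):
    """Convert logits to probabilities that sum to 1 (multi-class)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map a single logit to an independent probability (multi-label)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multiclass(logits, labels):
    """Pick exactly one label: the one with the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def decode_multilabel(logits, labels, threshold=0.5):
    """Pick every label whose sigmoid probability reaches the threshold."""
    scored = [(label, sigmoid(x)) for label, x in zip(labels, logits)]
    return [(label, prob) for label, prob in scored if prob >= threshold]


if __name__ == "__main__":
    labels = ["positive", "negative", "neutral"]  # hypothetical label set
    print(decode_multiclass([2.0, 0.5, -1.0], labels))
    print(decode_multilabel([2.0, 0.5, -1.0], labels))
```

With the same logits, the multi-class decoder returns a single label, while the multi-label decoder can return several labels or none.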

🛠️ Tech Stack

Python · PyTorch · HuggingFace Transformers · Datasets · FastAPI · scikit-learn · Weights & Biases

📁 Project Structure

nlp-text-classifier/
├── app/
│   ├── main.py            # FastAPI inference server
│   ├── predictor.py       # Model loading + inference
│   └── schemas.py
├── train.py               # Training entry point
├── evaluate.py            # Evaluation + metrics
├── config.yaml            # Hyperparameter config
├── data/
│   └── sample_data.csv    # Example dataset format
├── notebooks/
│   └── exploration.ipynb
└── requirements.txt

⚡ Quick Start

1. Prepare Data

Provide a CSV file with text and label columns, one example per row:

text,label
"This movie was great!",positive
"Terrible experience.",negative
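
Before training, it can help to check that a file matches this format and to see how string labels map to integer ids. The sketch below uses only the standard library. The helper name load_labeled_csv and the choice to skip incomplete rows are assumptions for illustration, not the behavior of train.py.

```python
import csv
import io


def load_labeled_csv(handle):
    """Read rows with `text` and `label` columns and build a label-to-id map."""
    reader = csv.DictReader(handle)
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")

    texts, labels = [], []
    for row in reader:
        text = (row["text"] or "").strip()
        label = (row["label"] or "").strip()
        if not text or not label:
            continue  # skip incomplete rows rather than guessing a label
        texts.append(text)
        labels.append(label)

    # Sort label names so the integer ids are stable across runs.
    label2id = {name: i for i, name in enumerate(sorted(set(labels)))}
    return texts, [label2id[name] for name in labels], label2id


if __name__ == "__main__":
    sample = io.StringIO(
        "text,label\n"
        '"This movie was great!",positive\n'
        '"Terrible experience.",negative\n'
    )
    texts, ids, label2id = load_labeled_csv(sample)
    print(texts, ids, label2id)
```

To check a real file, pass an open handle instead of the in-memory sample, for example open("data/your_data.csv", newline="", encoding="utf-8").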

2. Train

python train.py --config config.yaml --data data/your_data.csv
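
The command above passes two flags. As a rough sketch of how an entry point could read them, the snippet below parses --config and --data with argparse. It only illustrates the command-line interface: the help strings and the parse_args helper are assumptions, and the actual train.py may be organized differently.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Fine-tune a text classifier")
    parser.add_argument("--config", type=Path, required=True,
                        help="path to the YAML hyperparameter file")
    parser.add_argument("--data", type=Path, required=True,
                        help="path to the labeled CSV file")
    return parser.parse_args(argv)


# Explicit argv for demonstration; call parse_args() with no arguments
# to read the real command line (sys.argv) instead.
args = parse_args(["--config", "config.yaml", "--data", "data/your_data.csv"])
print(f"config={args.config} data={args.data}")
```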

3. Evaluate

python evaluate.py --model-path ./outputs/model
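
The metrics named in the feature list (confusion matrix, precision, recall, F1) can be computed from two lists of labels. The sketch below spells out the definitions using only the standard library. It is not the code in evaluate.py, and the helper names and sample labels are illustrative. Since scikit-learn is part of the tech stack, sklearn.metrics.confusion_matrix and sklearn.metrics.classification_report cover the same ground.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true_label, predicted_label) pairs."""
    return Counter(zip(y_true, y_pred))


def per_class_metrics(y_true, y_pred):
    """Return precision, recall, and F1 for each label seen in either list."""
    counts = confusion_counts(y_true, y_pred)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = counts[(label, label)]
        fp = sum(n for (t, p), n in counts.items() if p == label and t != label)
        fn = sum(n for (t, p), n in counts.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(m["f1"] for m in report.values()) / len(report)


if __name__ == "__main__":
    y_true = ["positive", "positive", "negative", "negative"]  # made-up labels
    y_pred = ["positive", "negative", "negative", "negative"]
    report = per_class_metrics(y_true, y_pred)
    for label, scores in report.items():
        print(label, scores)
    print("macro F1:", macro_f1(report))
```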

4. Serve

uvicorn app.main:app --reload
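
By default uvicorn listens on http://127.0.0.1:8000, and FastAPI serves interactive API docs at /docs, which is the place to confirm the real route and request schema. The client sketch below assumes a POST route named /predict that accepts a JSON body with a text field. Both the route and the payload shape are assumptions and should be checked against app/main.py and app/schemas.py.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://127.0.0.1:8000"):
    """Build a JSON POST request for a hypothetical /predict route."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",  # assumed route; check app/main.py
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://127.0.0.1:8000", timeout=10):
    """Send the request and decode the JSON body returned by the server."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, predict("This movie was great!") returns whatever JSON the endpoint produces, which per the feature list should include a confidence score.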

📊 Supported Tasks

Task	Description
Sentiment	Positive / Negative / Neutral
Topic	Custom multi-class categories
Intent	User intent detection
Spam	Binary spam/ham detection

🧪 Benchmark Results (AG News dataset)

Model	Accuracy	F1
DistilBERT	93.8%	0.937
RoBERTa-base	95.2%	0.951

📄 License

MIT
