Fine-tune a transformer model (DistilBERT / RoBERTa) on custom text classification tasks. Includes training pipeline, evaluation, and a FastAPI inference endpoint.
- Fine-tuning pipeline: Train on any labeled text dataset in CSV format
- Multi-class & multi-label: Configurable output heads
- Experiment tracking: Weights & Biases integration
- Model registry: Save/load fine-tuned models locally or push to the Hugging Face Hub
- FastAPI endpoint: Serve predictions with confidence scores
- Evaluation dashboard: Confusion matrix, F1, precision/recall charts
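The confidence scores returned by the inference endpoint are typically just a softmax over the model's output logits. A minimal stdlib sketch of that step (illustrative, not the repo's actual code):

```python
import math

def softmax(logits):
    """Convert raw model logits into confidence scores that sum to 1."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Highest logit -> highest confidence
scores = softmax([2.0, 0.5, -1.0])
```

Subtracting the maximum logit before exponentiating avoids overflow without changing the result.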
Python · PyTorch · HuggingFace Transformers · Datasets · FastAPI · scikit-learn · Weights & Biases
nlp-text-classifier/
├── app/
│ ├── main.py # FastAPI inference server
│ ├── predictor.py # Model loading + inference
│ └── schemas.py # Request/response schemas
├── train.py # Training entry point
├── evaluate.py # Evaluation + metrics
├── config.yaml # Hyperparameter config
├── data/
│ └── sample_data.csv # Example dataset format
├── notebooks/
│ └── exploration.ipynb
└── requirements.txt
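`config.yaml` holds the hyperparameters consumed by `train.py`; a hypothetical example of what it might contain (field names are illustrative and not guaranteed to match the repo's actual file):

```yaml
model_name: distilbert-base-uncased   # or roberta-base
num_labels: 3                         # size of the classification head
multi_label: false                    # true for multi-label tasks
max_length: 256
batch_size: 32
learning_rate: 2.0e-5
epochs: 3
output_dir: ./outputs/model
wandb_project: nlp-text-classifier    # Weights & Biases run grouping
```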
CSV with `text` and `label` columns:

```csv
text,label
"This movie was great!",positive
"Terrible experience.",negative
```

Train:

```bash
python train.py --config config.yaml --data data/your_data.csv
```

Evaluate:

```bash
python evaluate.py --model-path ./outputs/model
```

Serve the API:

```bash
uvicorn app.main:app --reload
```

| Task | Description |
|---|---|
| Sentiment | Positive / Negative / Neutral |
| Topic | Custom multi-class categories |
| Intent | User intent detection |
| Spam | Binary spam/ham detection |
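Whatever the task, the string labels in the CSV have to be mapped to integer ids before training. A stdlib sketch of that step (hypothetical helper; `train.py` presumably does the equivalent via HuggingFace Datasets):

```python
import csv
from io import StringIO

def load_dataset(csv_text):
    """Read a text/label CSV and map label strings to integer ids."""
    rows = list(csv.DictReader(StringIO(csv_text)))
    labels = sorted({r["label"] for r in rows})           # deterministic ordering
    label2id = {name: i for i, name in enumerate(labels)}
    texts = [r["text"] for r in rows]
    ids = [label2id[r["label"]] for r in rows]
    return texts, ids, label2id

sample = 'text,label\n"This movie was great!",positive\n"Terrible experience.",negative\n'
texts, ids, label2id = load_dataset(sample)
# label2id == {'negative': 0, 'positive': 1}; ids == [1, 0]
```

Sorting the label set before assigning ids keeps the mapping stable across runs.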

| Model | Accuracy | F1 |
|---|---|---|
| DistilBERT | 93.8% | 0.937 |
| RoBERTa-base | 95.2% | 0.951 |
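The F1 column is the harmonic mean of precision and recall (presumably averaged per class for multi-class runs, via scikit-learn in `evaluate.py`). The binary case in a few lines of stdlib Python:

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 8 true positives, 2 false positives, 2 false negatives:
# precision = recall = 0.8, so F1 = 0.8
score = f1_score(8, 2, 2)
```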
MIT