This repository contains fine-tuned transformer models for various NLP tasks. Below are the models that have been trained and their respective details.
- Model: noviciusss/agnewsDistilt
  - Base Model: DistilBERT Base Uncased
  - Dataset: AG News (SetFit/ag_news)
  - Task: News Article Classification
  - Classes: 4 categories (World, Sports, Business, Sci/Tech)
  - Performance:
    - Accuracy: ~94.7%
    - F1 Macro: ~94.7%
  - Training Details (see the configuration sketch below):
    - Epochs: 3
    - Learning Rate: 2e-5
    - Batch Size: 16 (train), 32 (eval)
    - Weight Decay: 0.01
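For orientation, here is a minimal sketch of how the hyperparameters above map onto the Hugging Face `Trainer` API. It is a hypothetical reconstruction, not the notebook code: the output directory is illustrative and the datasets are omitted.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical reconstruction of the AG News run from the hyperparameters above.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4
)

training_args = TrainingArguments(
    output_dir="agnews-distilbert",       # illustrative path
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    weight_decay=0.01,
    eval_strategy="epoch",   # "evaluation_strategy" on older transformers releases
    fp16=True,               # mixed-precision training (see notes below)
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=..., eval_dataset=..., tokenizer=tokenizer)
```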
- Model: noviciusss/RoBERTa-base_Banking77
  - Base Model: RoBERTa Base (FacebookAI/roberta-base)
  - Dataset: Banking77 (mteb/banking77)
  - Task: Banking Intent Classification
  - Classes: 77 banking-related intents (see the label-mapping sketch below)
  - Performance:
    - Accuracy: ~93.7%
    - F1 Macro: ~93.6%
  - Training Details:
    - Epochs: 5
    - Learning Rate: 2e-5
    - Batch Size: 16 (train), 32 (eval)
    - Weight Decay: 0.01
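The 77-way label space can be derived from the dataset itself. Below is a hypothetical sketch, assuming the `label` column of `mteb/banking77` is a ClassLabel feature; some mirrors ship plain integers, in which case the intent names come from the dataset card instead.

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification

dataset = load_dataset("mteb/banking77")

# Assumption: the label column exposes the 77 intent names as a ClassLabel feature.
label_names = dataset["train"].features["label"].names

id2label = dict(enumerate(label_names))
label2id = {name: i for i, name in id2label.items()}

model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/roberta-base",
    num_labels=len(label_names),
    id2label=id2label,
    label2id=label2id,
)
```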
- Model: noviciusss/flan-t5-base-samsum
  - Base Model: google/flan-t5-base with LoRA adapters (r=16, alpha=32, dropout=0.05)
  - Dataset: SAMSum (knkarthick/samsum)
  - Task: Dialogue Summarization
  - Performance:
    - ROUGE-1: ~49.0
    - ROUGE-L: ~41.0
    - BERTScore F1: ~72.3
    - METEOR: ~42.5
  - Training Details (see the LoRA sketch below):
    - Epochs: 3
    - Learning Rate: 1e-4
    - Batch Size: 8 (train), 8 (eval)
    - Generation Max Length: 128
    - `predict_with_generate` enabled for evaluation
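As a sketch of how the LoRA and generation settings above translate into `peft`/`transformers` objects (argument names follow current library releases; the output directory is illustrative and may differ from the notebook):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# LoRA adapters matching the hyperparameters listed above.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-samsum",   # illustrative path
    num_train_epochs=3,
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    generation_max_length=128,
    predict_with_generate=True,         # decode full summaries during evaluation
    eval_strategy="epoch",
)
```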
| Model | Dataset | Task | Key Metrics | Base Model |
|---|---|---|---|---|
| agnewsDistilt | AG News | News classification | Accuracy: 94.7%; F1 Macro: 94.7% | DistilBERT |
| RoBERTa-base_Banking77 | Banking77 | Intent classification | Accuracy: 93.7%; F1 Macro: 93.6% | RoBERTa |
| flan-t5-base-samsum | SAMSum | Dialogue summarization | ROUGE-1: 49.0; ROUGE-L: 41.0; BERTScore F1: 72.3 | FLAN-T5 + LoRA |
You can use these models directly from Hugging Face:
```python
from transformers import (
AutoTokenizer,
AutoModelForSequenceClassification,
AutoModelForSeq2SeqLM,
)
# AG News Classification
ag_news_tokenizer = AutoTokenizer.from_pretrained("noviciusss/agnewsDistilt")
ag_news_model = AutoModelForSequenceClassification.from_pretrained("noviciusss/agnewsDistilt")
# Banking Intent Classification
banking_tokenizer = AutoTokenizer.from_pretrained("noviciusss/RoBERTa-base_Banking77")
banking_model = AutoModelForSequenceClassification.from_pretrained("noviciusss/RoBERTa-base_Banking77")
# Dialogue Summarization
sam_tokenizer = AutoTokenizer.from_pretrained("noviciusss/flan-t5-base-samsum")
sam_model = AutoModelForSeq2SeqLM.from_pretrained("noviciusss/flan-t5-base-samsum")
inputs = sam_tokenizer("summarize: Alice met Bob to discuss the launch timeline.", return_tensors="pt")
summary_ids = sam_model.generate(**inputs, max_length=128)
summary = sam_tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```
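For the classification checkpoints, predictions come from an argmax over the logits. A minimal example reusing the AG News model and tokenizer loaded above (the headline is illustrative):

```python
import torch

text = "NASA announces a new mission to study the outer planets."
inputs = ag_news_tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = ag_news_model(**inputs).logits

predicted_id = int(logits.argmax(dim=-1))
# Resolves to a class name if the pushed config carries an id2label mapping.
print(ag_news_model.config.id2label[predicted_id])
```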
```
FineTunning/
├── AgNews_DistilBERT_model/
│   └── FineTuning_1.ipynb            # DistilBERT fine-tuning notebook
├── RoBERTa_base_Banking77/
│   └── RoBERTa_base_Banking77.ipynb  # RoBERTa fine-tuning notebook
├── flan-t5-base-samsum_lora/
│   └── Text_Summ_t5Base_SAMsum.ipynb # FLAN-T5 LoRA summarization notebook
└── README.md
```
- Framework: Transformers (Hugging Face)
- Hardware: GPU-accelerated training
- Evaluation Strategy: Per-epoch evaluation, with `predict_with_generate` enabled for the seq2seq run
- Adapters: LoRA applied to FLAN-T5 (r=16, alpha=32, dropout=0.05)
- Metrics: Accuracy and macro F1 for classification; ROUGE, BERTScore, METEOR, and BLEU for summarization (see the sketch after this list)
- Model Selection: Classification models track accuracy for the best checkpoint, while summarization tracks ROUGE-L
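As an illustration of the classification metrics, a typical `compute_metrics` hook built on the `evaluate` library (a sketch, not copied from the notebooks):

```python
import evaluate
import numpy as np

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    """Accuracy and macro F1 for the classification fine-tuning runs."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=predictions, references=labels)["accuracy"],
        "f1_macro": f1.compute(predictions=predictions, references=labels, average="macro")["f1"],
    }
```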
- All training runs use FP16 precision for efficiency
- Models are saved and pushed to Hugging Face Hub automatically after evaluation
- Training pipelines track task-appropriate metrics (accuracy/F1 or ROUGE/BERTScore/METEOR/BLEU)
- Data preprocessing includes consistent tokenization, padding, and task-specific prompts (e.g., the `summarize:` prefix; see the preprocessing sketch below)
- LoRA adapters keep the FLAN-T5 summarizer lightweight while preserving the base model weights
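A sketch of the kind of preprocessing used for the summarization data, assuming the SAMSum `dialogue`/`summary` columns and the `summarize:` prefix mentioned above; the input length of 512 is an assumption, not taken from the notebook.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

def preprocess(batch):
    """Add the task prefix, tokenize dialogues, and tokenize summaries as labels."""
    inputs = ["summarize: " + dialogue for dialogue in batch["dialogue"]]
    model_inputs = tokenizer(inputs, max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Dynamic padding is typically left to DataCollatorForSeq2Seq at batch time, e.g.:
# tokenized = load_dataset("knkarthick/samsum").map(preprocess, batched=True)
```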