- scikit-learn
  - Random Forest Classifier
  - MLP Classifier
- Hugging Face
  - DeBERTa
  - MiniLM
These models were tested on 10,000 articles each.
| | scikit-learn Models | DeBERTa-v3 | MiniLM |
|---|---|---|---|
| Top 3 Categories | 64.35% | 50.70% | 75.42% |
| Top 4 Categories | 65.54% | 48.50% | 73.10% |
| Top 5 Categories | 37.00% | 36.60% | 71.07% |
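
As a rough illustration of how figures like these could be computed, the sketch below measures accuracy over only the articles whose true label is among the k most frequent categories. This reading of "Top k Categories" is an assumption, as are the function names `top_k_categories` and `accuracy_on_top_k` and the toy labels; none of them come from the project code.

```python
from collections import Counter


def top_k_categories(labels, k):
    """Return the set of the k most frequent labels (hypothetical helper)."""
    return {label for label, _ in Counter(labels).most_common(k)}


def accuracy_on_top_k(y_true, y_pred, k):
    """Accuracy over articles whose true label is among the k most frequent.

    Assumption: "Top k Categories" means the evaluation set is restricted
    to the k most common categories. The project may define it differently.
    """
    keep = top_k_categories(y_true, k)
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t in keep]
    if not pairs:
        raise ValueError("no articles left after filtering")
    correct = sum(t == p for t, p in pairs)
    return correct / len(pairs)


# Toy labels for illustration only, not taken from the dataset.
y_true = ["politics", "politics", "sports", "sports", "sports", "tech"]
y_pred = ["politics", "sports", "sports", "sports", "tech", "tech"]

print(f"Top 2 categories: {accuracy_on_top_k(y_true, y_pred, 2):.2%}")
print(f"Top 3 categories: {accuracy_on_top_k(y_true, y_pred, 3):.2%}")
```

In a real run, `y_true` and `y_pred` would be the true and predicted categories for the 10,000 test articles, and the model would likely be trained on the same restricted set of categories.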
- Install Python version 3.8 or above
- Install Conda using `pip install conda`
- Install dependencies from the `environment.yml` file using `conda env create -f environment.yml`. This creates an environment called `text-ML`.
- Activate the environment using `conda activate text-ML`, or by selecting `text-ML` in your code editor's settings, if your code editor has that feature.
- Download the dataset from Kaggle
- Note that when the program is run on a single news article, it automatically uses the best-performing model, MiniLM.
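
The setup steps above can be sanity-checked before running anything. The following is a minimal sketch, not part of the project: `check_environment` is a hypothetical helper that verifies the Python version and reads the `CONDA_DEFAULT_ENV` variable, which `conda activate` sets to the name of the active environment.

```python
import os
import sys

MIN_PYTHON = (3, 8)        # minimum version named in the setup steps
EXPECTED_ENV = "text-ML"   # environment name created from environment.yml


def check_environment(version_info=sys.version_info, environ=os.environ):
    """Return a list of setup problems; an empty list means none were found."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or above is required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != EXPECTED_ENV:
        problems.append(
            f"expected conda environment {EXPECTED_ENV!r}, found {active!r}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Setup issue:", problem)
```

If the environment was selected through a code editor rather than `conda activate`, the variable may not be set, so a reported mismatch is a hint rather than proof of a broken setup.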