In this project I will use a public BBC dataset of 1,490 news articles, each labeled with one of five categories: business, entertainment, politics, sport, or tech. It is well suited to document classification and to the classification algorithms I will use in this project. I will evaluate the models with two techniques: k-fold cross-validation and a 90/10 train-test split. The goal is to build a system that accurately classifies previously unseen news articles into the correct category.
You can find more details in the project report.
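
To make the two evaluation techniques concrete, here is a minimal sketch of how the 1,490 article indices could be divided for a 90/10 train-test split and for k-fold cross-validation. It uses only the Python standard library. The function names, the fixed seed, and the choice of `k=5` are assumptions made for illustration, not details taken from the project. The actual code may rely on a library such as scikit-learn instead.

```python
import random

N_ARTICLES = 1490  # dataset size stated in the description above


def train_test_split_indices(n, test_fraction=0.1, seed=0):
    """Shuffle the indices 0..n-1 and hold out `test_fraction` of them for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, validation) index lists; every index is validated exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# 90/10 split: 1341 articles for training, 149 for testing.
train_idx, test_idx = train_test_split_indices(N_ARTICLES)
print(len(train_idx), len(test_idx))

# k-fold (k=5 is an assumed value): five validation folds of 298 articles each.
folds = list(k_fold_indices(N_ARTICLES, k=5))
print([len(val) for _, val in folds])
```

With a single split, the reported accuracy depends on which 149 articles happen to land in the test set. K-fold cross-validation averages the score over all folds, which gives a steadier estimate but requires training the model k times.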