# News Classification with Machine Learning

https://thecleverprogrammer.com/2021/10/07/news-classification-with-machine-learning/

When you visit a news website, you will usually find the articles grouped into categories. Popular categories on almost any news site include tech, entertainment, and sports. If you want to learn how to classify news into categories using machine learning, this article is for you.

## News Classification

News websites assign each article to a category before publishing it, so that visitors can easily click through to the type of news that interests them.

On many news websites, content managers classify the articles by hand. To save time, a site can instead deploy a machine learning model that reads the headline or the content of an article and predicts its category.

## News Classification using Python

In [5]:
import pandas as pd
import numpy as np

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

In [6]:
data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/bbc-news-data.csv", sep='\t')

In [7]:
data.tail()

| | category | filename | title | content |
|---|---|---|---|---|
| 2220 | tech | 397.txt | BT program to beat dialler scams | BT is introducing two initiatives to help bea... |
| 2221 | tech | 398.txt | Spam e-mails tempt net shoppers | Computer users across the world continue to i... |
| 2222 | tech | 399.txt | Be careful how you code | A new European directive could put software w... |
| 2223 | tech | 400.txt | US cyber security chief resigns | The man making sure US computer networks are ... |
| 2224 | tech | 401.txt | Losing yourself in online gaming | Online role playing games are time-consuming,... |


In [8]:
data.isnull().sum()

category    0
filename    0
title       0
content     0
dtype: int64

In [9]:
data.category.value_counts()

sport            511
business         510
politics         417
tech             401
entertainment    386
Name: category, dtype: int64
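
The counts above are easier to compare as shares of the whole dataset. The following is a minimal sketch that rebuilds the printed counts as a standalone `pandas` Series (so it runs without downloading the file) and derives the proportion of each category; the names `counts`, `shares`, and `majority_baseline` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Category counts copied from the value_counts() output above.
counts = pd.Series(
    {"sport": 511, "business": 510, "politics": 417, "tech": 401, "entertainment": 386}
)

# Share of each category in the dataset, rounded for display.
shares = (counts / counts.sum()).round(3)
print(shares)

# Accuracy of a trivial model that always predicts the largest category.
majority_baseline = counts.max() / counts.sum()
print(f"majority-class baseline: {majority_baseline:.3f}")
```

On the full DataFrame, `data.category.value_counts(normalize=True)` should give the same proportions. The classes are fairly balanced, so a model that always guessed the largest class would be right only about 23% of the time, which is a useful floor when judging a classifier.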

## News Classification Model

In [10]:
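# Keep only the headline (feature) and its category (label)
data = data[['title', 'category']]
x = np.array(data.title)
y = np.array(data.category)

# Convert each headline into a vector of word counts
cv = CountVectorizer()
X = cv.fit_transform(x)
# Hold out 33% of the headlines for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)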
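
To make the vectorization step more concrete, here is a small sketch of what `CountVectorizer` produces. The two headlines are invented for illustration and do not come from the BBC dataset; `demo_cv`, `vocab`, and `counts` are hypothetical names chosen to avoid clashing with the notebook's variables.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical headlines, invented for illustration (not from the dataset).
titles = [
    "Apple unveils new phone",
    "New phone sales beat forecasts",
]

demo_cv = CountVectorizer()
counts = demo_cv.fit_transform(titles)  # sparse matrix, one row per headline

# vocabulary_ maps each lowercased token to its column index;
# sorting by that index lists the tokens in column order.
vocab = sorted(demo_cv.vocabulary_, key=demo_cv.vocabulary_.get)
print(vocab)
# ['apple', 'beat', 'forecasts', 'new', 'phone', 'sales', 'unveils']

print(counts.toarray())
# [[1 0 0 1 1 0 1]
#  [0 1 1 1 1 1 0]]
```

Each column counts how often one word occurs in a headline, and word order is discarded. This bag-of-words matrix is the input that `MultinomialNB` is trained on.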

In [12]:
model = MultinomialNB()
model.fit(X_train, y_train)
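
The notebook trains the model but never scores it on the held-out split. One possible way to do that is sketched below. It is not the original author's code: it wraps the vectorizer and classifier in a scikit-learn pipeline so that the vocabulary is learned from the training split only, and it adds `stratify=labels`, which the original split does not use. The function name `evaluate_titles` and the toy headlines are hypothetical; with the real data you would pass `data.title` and `data.category` instead.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def evaluate_titles(titles, labels, test_size=0.33, random_state=42):
    """Fit on the training split only and return accuracy on the held-out split."""
    t_train, t_test, l_train, l_test = train_test_split(
        titles,
        labels,
        test_size=test_size,
        random_state=random_state,
        stratify=labels,  # assumption: keep class proportions equal in both splits
    )
    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(t_train, l_train)
    return accuracy_score(l_test, clf.predict(t_test))


# Hypothetical toy data, far too small to say anything about real performance.
toy_titles = [
    "New phone launched with faster chip",
    "Software update fixes security bug",
    "Chip maker unveils faster processor",
    "Security flaw found in popular software",
    "Phone maker reports software bug",
    "Online gaming network suffers security breach",
    "Striker scores twice in cup final",
    "Team wins league title after late goal",
    "Coach praises team after cup win",
    "Late goal seals derby win",
    "Champions league final goes to penalties",
    "Striker injured before title decider",
]
toy_labels = ["tech"] * 6 + ["sport"] * 6

accuracy = evaluate_titles(toy_titles, toy_labels)
print(f"held-out accuracy: {accuracy:.2f}")
```

The printed value on the toy data is not meaningful; the point is the shape of the evaluation. In the notebook itself, `model.score(X_test, y_test)` would report accuracy for the model as trained above.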

In [13]:
user = input("Enter a text: ")

Enter a text: Latest Apple iPhone SE 3 concept renders show a compact smartphone in the style of the iPhone 4


In [14]:
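# Vectorize the input with the already-fitted CountVectorizer, then predict.
# A separate name avoids overwriting the `data` DataFrame loaded earlier.
features = cv.transform([user])
output = model.predict(features)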

In [15]:
output

array(['tech'], dtype='<U13')
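
The prediction step can also be wrapped in a small helper that returns the class probabilities alongside the label. The sketch below is self-contained and trains on four invented headlines so it can run on its own; `classify_headline` and the toy data are hypothetical, and in the notebook you would pass the fitted `cv` and `model` instead.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB


def classify_headline(headline, vectorizer, classifier):
    """Return (predicted_label, {label: probability}) for a single headline."""
    features = vectorizer.transform([headline])  # sparse input is accepted directly
    probs = classifier.predict_proba(features)[0]
    by_label = dict(zip(classifier.classes_, probs))
    return classifier.predict(features)[0], by_label


# Hypothetical training headlines, invented for illustration.
train_titles = [
    "New phone launched by Apple",
    "Software update fixes security bug",
    "Team wins cup final",
    "Striker scores twice in derby",
]
train_labels = ["tech", "tech", "sport", "sport"]

toy_cv = CountVectorizer()
toy_model = MultinomialNB().fit(toy_cv.fit_transform(train_titles), train_labels)

label, probabilities = classify_headline("Apple phone software update", toy_cv, toy_model)
print(label)  # tech
print(probabilities)
```

Words that the vectorizer never saw during fitting are silently ignored, so a headline made up entirely of unseen words falls back to the class priors. Checking the probabilities is one way to spot such low-confidence predictions.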