In [1]:
#Importing libraries and dependencies
import pandas as pd
import numpy as np
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import cosine_similarity

In [2]:
#Reading Dataset
data = pd.read_csv("articles.csv", encoding="latin1")
#Drop the index column that was saved into the CSV
data = data.drop(columns=["Unnamed: 0"])

In [3]:
data

,Article,Title
0,Data analysis is the process of inspecting and...,Best Books to Learn Data Analysis
1,The performance of a machine learning algorith...,Assumptions of Machine Learning Algorithms
2,You must have seen the news divided into categ...,News Classification with Machine Learning
3,When there are only two classes in a classific...,Multiclass Classification Algorithms in Machin...
4,The Multinomial Naive Bayes is one of the vari...,Multinomial Naive Bayes in Machine Learning
5,You must have seen the news divided into categ...,News Classification with Machine Learning
6,Natural language processing or NLP is a subfie...,Best Books to Learn NLP
7,By using a third-party application or API to m...,Send Instagram Messages using Python
8,Twitter is one of the most popular social medi...,Pfizer Vaccine Sentiment Analysis using Python
9,The squid game is currently one of the most tr...,Squid Game Sentiment Analysis using Python


In [4]:
#Checking null values
data.isnull().sum()

Article    0
Title      0
dtype: int64

In [5]:
#Taking the Article column as a list
articles = data["Article"].tolist()

In [6]:
articles

['Data analysis is the process of inspecting and exploring data generated by a particular population to find the information needed to make decisions and draw conclusions. With the use of data in decision making, most businesses today need data analysts. So, if you want to know about the best books to learn data analysis, this article is for you. In this article, I will introduce you to some of the best books to learn data analysis.',
 'The performance of a machine learning algorithm on a particular dataset often depends on whether the features of the dataset satisfies the assumptions of that machine learning algorithm. Not all machine learning algorithms have assumptions that differentiate them from each other. So, in this article, I will take you through the assumptions of machine learning algorithms.',
 "You must have seen the news divided into categories when you go to a news website. Some of the popular categories that you'll see on almost any news website are tech, entertainment,

In [7]:
#Initializing TfidfVectorizer
#The documents are passed to fit_transform below; `input` only accepts "content", "filename" or "file"
uni_tfidf = text.TfidfVectorizer(stop_words="english")

In [8]:
uni_tfidf

TfidfVectorizer(stop_words='english')

In [9]:
uni_matrix = uni_tfidf.fit_transform(articles)

In [10]:
uni_matrix

<34x407 sparse matrix of type '<class 'numpy.float64'>'
	with 846 stored elements in Compressed Sparse Row format>

In [11]:
#Finding cosine similarity
uni_sim = cosine_similarity(uni_matrix)

In [12]:
uni_sim[5]

array([0.02014231, 0.07651482, 1.        , 0.1194975 , 0.04471043,
       1.        , 0.01901747, 0.03192296, 0.04904073, 0.02963846,
       0.00579617, 0.03875649, 0.0261384 , 0.05363   , 0.08863759,
       0.03396158, 0.0360815 , 0.07202181, 0.03821239, 0.03414062,
       0.04558645, 0.07698814, 0.05072101, 0.0368313 , 0.04157524,
       0.02838926, 0.04944574, 0.09175411, 0.08467406, 0.04707784,
       0.06024993, 0.08401534, 0.05252305, 0.03233971])
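
The row holds two entries equal to 1.0, at positions 2 and 5: one is article 5 compared with itself, the other is the verbatim copy of the same article stored at index 2. A small sketch on a made-up corpus (the three strings and the `toy_` names are assumptions for illustration) reproduces this, and also checks that on TF-IDF rows the cosine similarity equals a plain dot product, because `TfidfVectorizer` L2-normalizes each row by default:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus; documents 0 and 2 are verbatim duplicates.
toy_docs = [
    "news classification with machine learning",
    "best books to learn data analysis",
    "news classification with machine learning",
]

toy_matrix = TfidfVectorizer(stop_words="english").fit_transform(toy_docs)
toy_sim = cosine_similarity(toy_matrix)

# With the default norm="l2", every row has unit length,
# so the dot product of two rows is already their cosine similarity.
toy_dot = (toy_matrix @ toy_matrix.T).toarray()

print(np.round(toy_sim, 3))
```

Documents 0 and 2 score 1.0 against each other as well as against themselves, which is the pattern seen in `uni_sim[5]`.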

In [13]:
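def recommend_articles(x):
    #x is one row of uni_sim; argsort is ascending, so [-5:-1] takes the 2nd to 5th highest scores
    #(the highest is assumed to be the article itself), listed from least to most similar
    return ", ".join(data["Title"].loc[x.argsort()[-5:-1]])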
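
The slice `[-5:-1]` relies on the article itself holding the single highest score in its row. When another row ties at 1.0, as the duplicate at indices 2 and 5 does in `uni_sim[5]` above, the article's own title can land inside the slice. A hedged alternative, sketched with a hypothetical helper `recommend_indices` and a made-up similarity row, drops the query by position and returns the best match first:

```python
import numpy as np

def recommend_indices(sim_row, self_index, top_n=4):
    """Return positions of the top_n most similar items, best first,
    never including self_index. Hypothetical helper, not from the notebook."""
    order = np.argsort(sim_row)[::-1]      # most similar first
    order = order[order != self_index]     # remove the query item by position
    return order[:top_n]

# Made-up similarity row for item 1; item 2 is an exact duplicate of it.
row = np.array([0.2, 1.0, 1.0, 0.05, 0.6])
picked = recommend_indices(row, self_index=1, top_n=3)
print(picked)
```

In the notebook this could be wired up as `[recommend_indices(row, i) for i, row in enumerate(uni_sim)]`. The duplicate (item 2 here) is still recommended, since it is a different row; removing duplicates is a separate step.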

In [14]:
#Titles ordered by similarity to article 5, least similar first
data["Title"].loc[uni_sim[5].argsort()]

10                  Best Books to Learn Computer Vision
6                               Best Books to Learn NLP
0                     Best Books to Learn Data Analysis
12                 Best Python Frameworks to Build APIs
25                   Animated Scatter Plot using Python
9            Squid Game Sentiment Analysis using Python
7                  Send Instagram Messages using Python
33                          Swap Items of a Python List
15            Multilayer Perceptron in Machine Learning
19    Health Insurance Premium Prediction with Machi...
16                             Types of Neural Networks
23                DBSCAN Clustering in Machine Learning
18    For Loop Over Keys and Values in a Python Dict...
11                       Best Resources to Learn Python
24               K-Means Clustering in Machine Learning
4           Multinomial Naive Bayes in Machine Learning
20            Mean Shift Clustering in Machine Learning
29                        Applications of Deep L

In [15]:
data["Recommended Articles"] = [recommend_articles(x) for x in uni_sim]

In [16]:
data

,Article,Title,Recommended Articles
0,Data analysis is the process of inspecting and...,Best Books to Learn Data Analysis,"Introduction to Recommendation Systems, Best B..."
1,The performance of a machine learning algorith...,Assumptions of Machine Learning Algorithms,"Applications of Deep Learning, Best Books to L..."
2,You must have seen the news divided into categ...,News Classification with Machine Learning,"Language Detection with Machine Learning, Appl..."
3,When there are only two classes in a classific...,Multiclass Classification Algorithms in Machin...,"Assumptions of Machine Learning Algorithms, Be..."
4,The Multinomial Naive Bayes is one of the vari...,Multinomial Naive Bayes in Machine Learning,"Assumptions of Machine Learning Algorithms, Me..."
5,You must have seen the news divided into categ...,News Classification with Machine Learning,"Language Detection with Machine Learning, Appl..."
6,Natural language processing or NLP is a subfie...,Best Books to Learn NLP,"Language Detection with Machine Learning, Best..."
7,By using a third-party application or API to m...,Send Instagram Messages using Python,For Loop Over Keys and Values in a Python Dict...
8,Twitter is one of the most popular social medi...,Pfizer Vaccine Sentiment Analysis using Python,Use Cases of Different Machine Learning Algori...
9,The squid game is currently one of the most tr...,Squid Game Sentiment Analysis using Python,Apple Stock Price Prediction with Machine Lear...


In [17]:
#Printing Article
print(data["Article"][5])

You must have seen the news divided into categories when you go to a news website. Some of the popular categories that you'll see on almost any news website are tech, entertainment, and sports. If you want to know how to classify news categories using machine learning, this article is for you. In this article, I will walk you through the task of news classification with machine learning using Python.


In [18]:
#Printing Recommended Articles
print(data["Recommended Articles"][5])

Language Detection with Machine Learning, Apple Stock Price Prediction with Machine Learning, Multiclass Classification Algorithms in Machine Learning, News Classification with Machine Learning
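
The last recommendation is the article's own title: the dataset holds the same article at indices 2 and 5, so each copy scores 1.0 against the other. One possible remedy, assuming duplicates are unwanted, is to drop them before vectorizing. A minimal sketch on a made-up frame (`toy` and `deduped` are illustrative names):

```python
import pandas as pd

# Made-up frame with one verbatim duplicate, mirroring rows 2 and 5 of the notebook's data.
toy = pd.DataFrame({
    "Article": ["news text", "books text", "news text"],
    "Title": ["News Classification", "Best Books", "News Classification"],
})

# Keep the first copy of each article, then rebuild a 0..n-1 index so that
# positions returned by argsort still match the labels used by .loc.
deduped = toy.drop_duplicates(subset="Article").reset_index(drop=True)
print(deduped)
```

The `reset_index(drop=True)` call matters here because the notebook looks titles up with `.loc` using positions from `argsort`, which only lines up when the index is a gap-free range.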
