#### Recommendation systems

Recommendation systems are among the most widely used applications of data science, and many applications and websites rely on one today.

Many websites use a recommendation system to suggest articles to their readers.

Even the website you’re reading this article on uses a recommendation system to recommend articles to its readers.

#### Focus on content rather than user interest

To build an article recommendation system, we focus on the content of the articles rather than on the interests of individual users. For example, if a user is reading an article about clustering, the recommended articles should also be about clustering. To recommend articles based on content, we need to:

1) understand the content of the article

2) match that content against all the other articles

3) recommend the most suitable articles for the one the reader is currently reading
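
The three steps above can be sketched on a toy example. This is only an illustration under simplifying assumptions: the three short texts are hypothetical, the function names are made up for this sketch, and plain word counts stand in for a proper text representation.

```python
from collections import Counter
import math

def bag_of_words(text):
    """Step 1: represent an article as lowercase word counts."""
    return Counter(text.lower().split())

def cosine(c1, c2):
    """Step 2: cosine similarity between two word-count vectors."""
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    norm1 = math.sqrt(sum(v * v for v in c1.values()))
    norm2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

def most_similar(index, docs):
    """Step 3: index of the other article closest to docs[index]."""
    vectors = [bag_of_words(d) for d in docs]
    others = [i for i in range(len(docs)) if i != index]
    return max(others, key=lambda i: cosine(vectors[index], vectors[i]))

# Hypothetical toy articles, not taken from the dataset
docs = [
    "clustering groups similar data points",
    "kmeans clustering groups data into clusters",
    "stock price prediction with regression",
]
print(most_similar(0, docs))  # the second text shares the most words with the first
```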

#### Cosine similarity in machine learning

For this task, we can use cosine similarity.

Cosine similarity is a measure commonly used to build content-based recommendation systems.

It quantifies how similar two text documents are once each has been converted into a numeric vector.

So we can use cosine similarity to build an article recommendation system. 

#### Mechanism of Cosine Similarity

Cosine similarity scores a pair of vectors by the cosine of the angle between them.

For vectors with no negative entries, such as the TF-IDF vectors used below, the score ranges from 0 to 1.

A score of 1 means the two vectors point in the same direction, so the documents are as similar as possible.

A score of 0 means the two vectors have nothing in common.

Equivalently, a score of 1 corresponds to an angle of 0 degrees and a score of 0 to an angle of 90 degrees.
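
To make the relationship between score and angle concrete, here is a minimal sketch that computes the cosine of the angle between two vectors as their dot product divided by the product of their lengths. The helper name `cosine_sim` and the example vectors are chosen only for illustration.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

same_direction = cosine_sim([1, 2], [2, 4])   # angle of 0 degrees
orthogonal = cosine_sim([1, 0], [0, 3])       # angle of 90 degrees
in_between = cosine_sim([1, 0], [1, 1])       # angle of 45 degrees

print(round(same_direction, 3), round(orthogonal, 3), round(in_between, 3))
# prints: 1.0 0.0 0.707
```

Note that the length of the vectors does not matter: `[1, 2]` and `[2, 4]` score 1 because they point the same way.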

### Article Recommendation System using Python

In [33]:
# import numpy as np
# from sklearn.feature_extraction import text
# from sklearn.metrics.pairwise import cosine_similarity

import pandas as pd
df = pd.read_csv("article_recommendation.csv",
                 encoding='latin1')
df.head()

Unnamed: 0,Article,Title
0,Data analysis is the process of inspecting and...,Best Books to Learn Data Analysis
1,The performance of a machine learning algorith...,Assumptions of Machine Learning Algorithms
2,You must have seen the news divided into categ...,News Classification with Machine Learning
3,When there are only two classes in a classific...,Multiclass Classification Algorithms in Machin...
4,The Multinomial Naive Bayes is one of the vari...,Multinomial Naive Bayes in Machine Learning


In [8]:
df.shape

(34, 2)

In [18]:
# Convert the articles to TF-IDF vectors, the first step toward cosine similarity
articles = df['Article'].tolist()

from sklearn.feature_extraction import text
# The documents are passed to fit_transform, not to the constructor:
# `input` only accepts 'content', 'filename' or 'file'
uni_tfidf = text.TfidfVectorizer(stop_words='english')
uni_tfidf

TfidfVectorizer(stop_words='english')

In [19]:
uni_matrix = uni_tfidf.fit_transform(articles)
uni_matrix

<34x407 sparse matrix of type '<class 'numpy.float64'>'
	with 846 stored elements in Compressed Sparse Row format>

In [16]:
from sklearn.metrics.pairwise import cosine_similarity
uni_sim = cosine_similarity(uni_matrix)
uni_sim

array([[1.        , 0.02858003, 0.02014231, ..., 0.12022323, 0.00455773,
        0.02511323],
       [0.02858003, 1.        , 0.07651482, ..., 0.30365338, 0.27795728,
        0.00383369],
       [0.02014231, 0.07651482, 1.        , ..., 0.08401534, 0.05252305,
        0.03233971],
       ...,
       [0.12022323, 0.30365338, 0.08401534, ..., 1.        , 0.12620279,
        0.04275628],
       [0.00455773, 0.27795728, 0.05252305, ..., 0.12620279, 1.        ,
        0.02113943],
       [0.02511323, 0.00383369, 0.03233971, ..., 0.04275628, 0.02113943,
        1.        ]])

In [24]:
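# Recommend four similar articles for one row of the similarity matrix.
# argsort() sorts ascending, so [-5:-1] takes the four highest scores below the
# top one; the top score is normally the article itself (similarity 1).
def recommend_articles(x):
    return ", ".join(df["Title"].loc[x.argsort()[-5:-1]])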
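
The slice above relies on the article itself always sorting last. As an alternative sketch, not the approach used in this notebook, the article's own index can be excluded explicitly and the results returned from most to least similar. The function name `recommend_titles`, its parameters, and the toy titles and scores are all hypothetical.

```python
import numpy as np

def recommend_titles(sim_row, own_index, titles, k=4):
    """Return the k titles most similar to the article at own_index."""
    scores = np.asarray(sim_row, dtype=float).copy()
    scores[own_index] = -np.inf          # never recommend the article itself
    top = np.argsort(scores)[::-1][:k]   # indices from most to least similar
    return [titles[i] for i in top]

# Toy data (hypothetical titles and scores, not from the dataset)
titles = ["A", "B", "C", "D"]
sim = np.array([
    [1.0, 0.2, 0.7, 0.1],
    [0.2, 1.0, 0.3, 0.9],
    [0.7, 0.3, 1.0, 0.4],
    [0.1, 0.9, 0.4, 1.0],
])
recs = [recommend_titles(row, i, titles, k=2) for i, row in enumerate(sim)]
print(recs[0])  # prints: ['C', 'B']
```

Passing the row index explicitly keeps the function independent of how ties or duplicate articles happen to sort.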

In [27]:
df["Recommended Articles"] = [recommend_articles(x) for x in uni_sim]
df.head()

# A new "Recommended Articles" column now holds the recommended titles for each article.

Unnamed: 0,Article,Title,Recommended Articles
0,Data analysis is the process of inspecting and...,Best Books to Learn Data Analysis,"Introduction to Recommendation Systems, Best B..."
1,The performance of a machine learning algorith...,Assumptions of Machine Learning Algorithms,"Applications of Deep Learning, Best Books to L..."
2,You must have seen the news divided into categ...,News Classification with Machine Learning,"Language Detection with Machine Learning, Appl..."
3,When there are only two classes in a classific...,Multiclass Classification Algorithms in Machin...,"Assumptions of Machine Learning Algorithms, Be..."
4,The Multinomial Naive Bayes is one of the vari...,Multinomial Naive Bayes in Machine Learning,"Assumptions of Machine Learning Algorithms, Me..."


In [29]:
df["Article"][2]

"You must have seen the news divided into categories when you go to a news website. Some of the popular categories that you'll see on almost any news website are tech, entertainment, and sports. If you want to know how to classify news categories using machine learning, this article is for you. In this article, I will walk you through the task of news classification with machine learning using Python."

In [30]:
df["Title"][2]

'News Classification with Machine Learning'

In [28]:
# let’s see all the recommendations for an article:
df["Recommended Articles"][2]

'Language Detection with Machine Learning, Apple Stock Price Prediction with Machine Learning, Multiclass Classification Algorithms in Machine Learning, News Classification with Machine Learning'

In [31]:
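# Several of the recommendations (language detection, multiclass classification) are
# closely related to news classification, which suggests the recommender is matching on content.
# Note that the article's own title also appears in the list, which may point to a
# duplicate entry in the dataset or a tie at the top score.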