### Article Recommendation System
* A recommendation system is a machine learning application that suggests items to a user. A content-based recommender does this by measuring how similar pieces of content are to one another.
* Similarity can be measured in several ways. A content-based recommender typically builds a similarity matrix and uses it to suggest content similar to what the user has already accessed. The workflow is: acquire the data, extract the features that are useful for recommendation, convert the text into numeric vectors (for example with `CountVectorizer`), and then compute the cosine similarity between those vectors with scikit-learn. The result is the similarity matrix that drives the recommendations.

### Cosine Similarity in recommendation system
* Applying cosine similarity to the vectorized text yields a similarity matrix in which entry (i, j) scores how alike items i and j are. Sorting a row of the matrix by score puts the most similar content first.
* Cosine similarity compares the term-frequency vectors of two texts: the more terms they share, and the more often those shared terms occur in both, the closer the score is to 1. The content with the highest scores relative to what the user has read is recommended first.
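
The two points above can be sketched on a toy corpus. This is only an illustration: the three documents are invented, and `docs`, `counts`, `sim`, and `ranked` are hypothetical names that do not appear elsewhere in this notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy documents, invented for illustration only.
docs = [
    "machine learning with python",
    "deep learning with python",
    "cooking pasta at home",
]

# One vectorizer fitted on all documents gives every row the same columns.
counts = CountVectorizer().fit_transform(docs)

# Entry [i, j] is the cosine similarity between document i and document j.
sim = cosine_similarity(counts)

# Rank the other documents for document 0, most similar first.
scores = sorted(
    ((j, sim[0, j]) for j in range(len(docs)) if j != 0),
    key=lambda pair: pair[1],
    reverse=True,
)
ranked = [j for j, _ in scores]
print(ranked)
```

Document 1 shares three of its four terms with document 0, while document 2 shares none, so document 1 is ranked first.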

### Example of cosine similarity for Numbers

In [1]:
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

a = np.array([10, 5, 15, 7, 5])
b = np.array([5, 10, 17, 5, 3])
cosine = cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))
print(cosine)

[[0.92925111]]


**The cosine similarity is close to 1, which means the two arrays point in nearly the same direction and are therefore similar.**
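
The score above can be checked by hand from the definition of cosine similarity: the dot product of the two vectors divided by the product of their Euclidean norms. The helper `cosine` below is a hypothetical name written for this check, not a scikit-learn function.

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Same arrays as in the cell above.
a = np.array([10, 5, 15, 7, 5])
b = np.array([5, 10, 17, 5, 3])

score = cosine(a, b)
print(round(score, 8))
```

Here the dot product is 405 and the squared norms are 424 and 448, so the result should agree with the value printed by `cosine_similarity`.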

### Example of cosine similarity for Text

In [34]:
df=pd.DataFrame({
    "Article":['Machine learning is divided into two basic types, one is supervised and another is unsupervised.'],
    "Recommender":"Supervised, unsupervised are two types of machine learning algorithm that used in prediction."
})

In [35]:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

count_vec = CountVectorizer()
# Fit once on both texts so the two vectors share the same vocabulary (columns).
# Fitting each text separately gives unrelated columns and a meaningless score.
count_vec.fit(pd.concat([df['Article'], df['Recommender']]))
vec1 = count_vec.transform(df['Article'])
vec2 = count_vec.transform(df['Recommender'])
print('Count vector', vec1.toarray())
print('Count vector', vec2.toarray())

cos_sim = cosine_similarity(vec1, vec2)
cos_sim

Count vector [[0 1 1 0 1 1 0 1 3 1 1 0 1 0 1 0 1 1 1 0]]
Count vector [[1 0 0 1 0 0 1 0 0 1 1 1 0 1 1 1 1 1 1 1]]


array([[0.36313652]])

**The cosine similarity score is about 0.36: the two sentences share six of the 20 vocabulary terms, so they are only partially similar.**

## Article Recommendation for medium.com

In [26]:
pd.options.display.max_colwidth=100000000

In [36]:
data=pd.read_csv("D:\\PGP IN DATA SCIENCE with Careerera\\Data Sets\\ML Datasets\\articles.csv")
data.head()

Unnamed: 0,Article,Title
0,"Data analysis is the process of inspecting and exploring data generated by a particular population to find the information needed to make decisions and draw conclusions. With the use of data in decision making, most businesses today need data analysts. So, if you want to know about the best books to learn data analysis, this article is for you. In this article, I will introduce you to some of the best books to learn data analysis.",Best Books to Learn Data Analysis
1,"The performance of a machine learning algorithm on a particular dataset often depends on whether the features of the dataset satisfies the assumptions of that machine learning algorithm. Not all machine learning algorithms have assumptions that differentiate them from each other. So, in this article, I will take you through the assumptions of machine learning algorithms.",Assumptions of Machine Learning Algorithms
2,"You must have seen the news divided into categories when you go to a news website. Some of the popular categories that you'll see on almost any news website are tech, entertainment, and sports. If you want to know how to classify news categories using machine learning, this article is for you. In this article, I will walk you through the task of news classification with machine learning using Python.",News Classification with Machine Learning
3,"When there are only two classes in a classification problem, this is the problem of binary classification, just like that, classification with more than two classes is called multiclass classification. If you want to know the best machine learning algorithms for multiclass classification, this article is for you. In this article, I will introduce you to some of the best multiclass classification algorithms in machine learning.",Multiclass Classification Algorithms in Machine Learning
4,"The Multinomial Naive Bayes is one of the variants of the Naive Bayes algorithm in machine learning. It is very useful when the data is distributed in a multinomial way. This algorithm is especially preferred in classification tasks based on natural language processing. Spam detection is one of the applications where this algorithm can be used. If you have never used the Multinomial Naive Bayes algorithm before, this article is for you. In this article, I will take you through an introduction to the Multinomial Naive Bayes algorithm in machine learning and its implementation using Python.",Multinomial Naive Bayes in Machine Learning


**The dataset contains the text of each article and its title.**

**Let's use a cosine similarity matrix to recommend similar content to the reader.**

In [37]:
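articles = data["Article"].tolist()
# TF-IDF weights each term by how distinctive it is; English stop words are dropped.
tf = TfidfVectorizer(stop_words="english")
matrix = tf.fit_transform(articles)
# Pairwise cosine similarity between all articles (square matrix).
cos_sim = cosine_similarity(matrix)

def recommend_articles(x):
    # argsort is ascending, so [-5:-1] picks the 5th- to 2nd-highest scores and
    # skips the highest one, which is normally the article itself.
    return ", ".join(data["Title"].loc[x.argsort()[-5:-1]])

data["Recommended Articles"] = [recommend_articles(x) for x in cos_sim]
data.head()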

Unnamed: 0,Article,Title,Recommended Articles
0,"Data analysis is the process of inspecting and exploring data generated by a particular population to find the information needed to make decisions and draw conclusions. With the use of data in decision making, most businesses today need data analysts. So, if you want to know about the best books to learn data analysis, this article is for you. In this article, I will introduce you to some of the best books to learn data analysis.",Best Books to Learn Data Analysis,"Introduction to Recommendation Systems, Best Books to Learn Computer Vision, Best Books to Learn Deep Learning, Best Resources to Learn Python"
1,"The performance of a machine learning algorithm on a particular dataset often depends on whether the features of the dataset satisfies the assumptions of that machine learning algorithm. Not all machine learning algorithms have assumptions that differentiate them from each other. So, in this article, I will take you through the assumptions of machine learning algorithms.",Assumptions of Machine Learning Algorithms,"Applications of Deep Learning, Best Books to Learn Deep Learning, Naive Bayes Algorithm in Machine Learning, Use Cases of Different Machine Learning Algorithms"
2,"You must have seen the news divided into categories when you go to a news website. Some of the popular categories that you'll see on almost any news website are tech, entertainment, and sports. If you want to know how to classify news categories using machine learning, this article is for you. In this article, I will walk you through the task of news classification with machine learning using Python.",News Classification with Machine Learning,"Language Detection with Machine Learning, Apple Stock Price Prediction with Machine Learning, Multiclass Classification Algorithms in Machine Learning, News Classification with Machine Learning"
3,"When there are only two classes in a classification problem, this is the problem of binary classification, just like that, classification with more than two classes is called multiclass classification. If you want to know the best machine learning algorithms for multiclass classification, this article is for you. In this article, I will introduce you to some of the best multiclass classification algorithms in machine learning.",Multiclass Classification Algorithms in Machine Learning,"Assumptions of Machine Learning Algorithms, Best Books to Learn Deep Learning, Use Cases of Different Machine Learning Algorithms, Clustering Algorithms in Machine Learning"
4,"The Multinomial Naive Bayes is one of the variants of the Naive Bayes algorithm in machine learning. It is very useful when the data is distributed in a multinomial way. This algorithm is especially preferred in classification tasks based on natural language processing. Spam detection is one of the applications where this algorithm can be used. If you have never used the Multinomial Naive Bayes algorithm before, this article is for you. In this article, I will take you through an introduction to the Multinomial Naive Bayes algorithm in machine learning and its implementation using Python.",Multinomial Naive Bayes in Machine Learning,"Assumptions of Machine Learning Algorithms, Mean Shift Clustering in Machine Learning, Language Detection with Machine Learning, Naive Bayes Algorithm in Machine Learning"
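
In the output above, row 2 lists its own title among its recommendations, because the slice `[-5:-1]` only assumes that the article itself has the highest score. A minimal sketch of one alternative, which removes the query row by index instead, is shown below. It uses an invented three-row DataFrame in place of `articles.csv`; `toy`, `sim`, and `recommend` are hypothetical names, and this is not necessarily how the original notebook would handle it.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for articles.csv, invented for illustration only.
toy = pd.DataFrame({
    "Title": ["Intro to Regression", "Regression in Python", "Baking Bread"],
    "Article": [
        "linear regression predicts a numeric target",
        "fit a linear regression model in python",
        "knead the dough and bake the bread",
    ],
})

tfidf = TfidfVectorizer(stop_words="english").fit_transform(toy["Article"])
sim = cosine_similarity(tfidf)

def recommend(title, n=2):
    """Return up to n titles most similar to `title`, never the title itself."""
    # Position of the first row whose title matches the query.
    idx = int(np.flatnonzero(toy["Title"].to_numpy() == title)[0])
    # Positions sorted by similarity, most similar first.
    order = sim[idx].argsort()[::-1]
    # Drop the query row explicitly instead of assuming it sorts last.
    order = [i for i in order if i != idx][:n]
    return toy["Title"].iloc[order].tolist()

print(recommend("Intro to Regression"))
```

Filtering by index keeps an article out of its own recommendations even when another article ties with it or has the same title.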


#### In this way, many websites use recommendation systems to suggest content to their readers or viewers.