<a href="https://colab.research.google.com/github/SuvarnaDalin/Analytics-Projects/blob/master/Content_Based_Recommendation_System.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd

In [None]:
!git clone https://github.com/SuvarnaDalin/Data-for-Analysis.git

Cloning into 'Data-for-Analysis'...
remote: Enumerating objects: 18, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 18 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (18/18), done.


In [None]:
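# No lasting effect: each "!" command runs in its own subshell, so later cells still use paths relative to the original directory.
!cd Data-for-Analysis/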

In [None]:
!ls Data-for-Analysis/

Iris.csv  posts.csv  README.md	SampleSuperstore.csv  users.csv  views.csv


In [None]:
posts_data = pd.read_csv('Data-for-Analysis/posts.csv')
users_data = pd.read_csv('Data-for-Analysis/users.csv')
views_data = pd.read_csv('Data-for-Analysis/views.csv')

Recommendation System, based on:
# 1. Content-Based Filtering

In [None]:
posts_data.isnull().sum()

_id            0
title          0
category      28
 post_type     0
dtype: int64

In [None]:
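posts_copy = posts_data.copy()  # a real copy, so posts_data stays untouched
# Fill missing categories with the most frequent category
posts_copy['category'] = posts_copy['category'].fillna(posts_copy['category'].value_counts().index[0])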

In [None]:
posts_copy.isnull().sum()

_id           0
title         0
category      0
 post_type    0
dtype: int64
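
The imputation step above replaces missing `category` values with the most frequent category. A minimal sketch of that idea on a toy frame follows; `toy_posts` and its values are made up for illustration and are not taken from `posts.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are invented for illustration only.
toy_posts = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "category": ["Drawings", np.nan, "Drawings", "Poetry"],
})

toy_filled = toy_posts.copy()  # independent copy, toy_posts is left unchanged
most_common = toy_filled["category"].value_counts().index[0]
toy_filled["category"] = toy_filled["category"].fillna(most_common)

print(most_common)                           # Drawings
print(toy_filled["category"].tolist())       # ['Drawings', 'Drawings', 'Drawings', 'Poetry']
print(toy_posts["category"].isnull().sum())  # 1 (the original still has its gap)
```

Filling only the `category` column keeps the replacement value from leaking into other columns if they ever contain missing values.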

In [None]:
# Training the data - TF-IDF method

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

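# Newer scikit-learn versions may reject an integer min_df of 0; min_df=1 (the default) keeps every term
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 3), min_df=1, stop_words='english')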
tfidf_matrix = tf.fit_transform(posts_copy['category'])

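# TF-IDF rows are L2-normalised by default, so the linear kernel equals cosine similarity
cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)

results = {}  # post _id -> list of (score, similar post _id), best first

# Assumes posts_copy has the default 0..n-1 index, so idx matches the matrix row
for idx, row in posts_copy.iterrows():
    similar_indices = cosine_similarities[idx].argsort()[:-100:-1]  # the 99 highest-scoring rows
    similar_items = [(cosine_similarities[idx][i], posts_copy['_id'][i]) for i in similar_indices]
    # Drops the first entry on the assumption that it is the post itself; with tied scores it may not be
    results[row['_id']] = similar_items[1:]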
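
The cell above names its result `cosine_similarities` although it calls `linear_kernel`. A small sketch of why that works: `TfidfVectorizer` L2-normalises each row by default, so the plain dot product already is the cosine. The category strings here are hypothetical stand-ins, not rows from `posts.csv`.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical category strings, invented for illustration only.
toy_categories = ["Drawings", "Drawings", "Competition Laws", "Custom Laws"]

toy_vectorizer = TfidfVectorizer(analyzer="word", ngram_range=(1, 3), stop_words="english")
toy_tfidf = toy_vectorizer.fit_transform(toy_categories)

dot_scores = linear_kernel(toy_tfidf, toy_tfidf)      # plain dot products
cos_scores = cosine_similarity(toy_tfidf, toy_tfidf)  # explicit cosine

print(np.allclose(dot_scores, cos_scores))  # True
print(np.round(dot_scores, 3))
```

Identical category strings score 1.0, strings that share a term (here "laws") score between 0 and 1, and unrelated strings score 0.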

In [None]:
# Making Predictions

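def item(post_id):
    # Look up the title for a post _id, keeping only the part before any ' - '
    return posts_copy.loc[posts_copy['_id'] == post_id]['title'].tolist()[0].split(' - ')[0]

def recommend(user_id, num):
    # Despite its name, user_id must be a post _id: results is keyed by posts_copy['_id']
    print("Recommending " + str(num) + " products for the user " + user_id + "...")
    print("-------")
    recs = results[user_id][:num]
    for rec in recs:
        print("Recommended: " + item(rec[1]) + " (score:" + str(rec[0]) + ")")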

### Result: Recommend posts for the given post `_id`

In [None]:
# Pass any _id from the sample rows below to check the results.
# num can be 1 to 98: each results list holds at most 98 entries (99 neighbours minus the first).
#
#    _id                       title                          category                                           post_type
# 0  5d62abaa65218653a132c956  hello there                    Plant Biotechnology                                blog
# 1  5d6d39567fa40e1417a4931c  Ml and AI                      Artificial Intelligence|Machine Learning|Infor...  blog
# 2  5d7d23315720533e15c3b1ee  What is an Operating System ?  Operating Systems                                  blog
# 3  5d7d405e5720533e15c3b1f3  Lord Shiva                     Drawings                                           artwork
# 4  5d80dfbc6c53455f896e600e  How Competition law evolved?   Competition Laws                                   blog

In [None]:
recommend(user_id='5d7d405e5720533e15c3b1f3', num=10)

Recommending 10 products for the user 5d7d405e5720533e15c3b1f3...
-------
Recommended: Painting (score:1.0)
Recommended: Shree Ganesh Drawing (score:1.0)
Recommended: God Drawing (score:1.0)
Recommended: God (score:1.0)
Recommended: Shiva Portrait (score:1.0)
Recommended: Inside life (score:1.0)
Recommended: No one's worth hate🧡 (score:1.0)
Recommended: Love binds (score:1.0)
Recommended: Daaku (score:1.0)
Recommended: ROMAN REIGNS (score:1.0)


### Result: Recommend similar posts for the given post

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

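# Newer scikit-learn versions may reject an integer min_df of 0; min_df=1 (the default) keeps every term
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 3), min_df=1, stop_words='english')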
tfidf_matrix = tf.fit_transform(posts_copy['category'])

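# TF-IDF rows are L2-normalised by default, so the linear kernel equals cosine similarity
cosine_similarities = linear_kernel(tfidf_matrix, tfidf_matrix)
results = {}  # post title -> list of (score, similar post title); duplicate titles overwrite each other

# Assumes posts_copy has the default 0..n-1 index, so idx matches the matrix row
for idx, row in posts_copy.iterrows():
    similar_indices = cosine_similarities[idx].argsort()[:-100:-1]  # the 99 highest-scoring rows
    similar_items = [(cosine_similarities[idx][i], posts_copy['title'][i]) for i in similar_indices]
    # Drops the first entry on the assumption that it is the post itself; with tied scores it may not be
    results[row['title']] = similar_items[1:]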
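
The slice `argsort()[:-100:-1]` in the loop above can be hard to read at a glance. A short sketch on a made-up score row shows the pattern: `argsort()` orders positions from lowest to highest score, and a reversed slice that stops before position `-(k+1)` keeps the `k` best. The array `toy_scores` is hypothetical.

```python
import numpy as np

# Hypothetical similarity row, invented for illustration only.
toy_scores = np.array([0.2, 1.0, 0.0, 0.7, 0.5])

# Same pattern as [:-100:-1], but keeping the top 3 instead of the top 99.
top3 = toy_scores.argsort()[:-4:-1]
print(top3.tolist())  # [1, 3, 4]

# With at least 99 rows, [:-100:-1] yields 99 positions.
print(len(np.arange(200).argsort()[:-100:-1]))  # 99
```

Because the loop then drops the first of those 99 entries, each list in `results` holds at most 98 recommendations.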

In [None]:
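def item(title):
    # Return the stored title, keeping only the part before any ' - '; raises IndexError if the title is unknown
    return posts_copy.loc[posts_copy['title'] == title]['title'].tolist()[0].split(' - ')[0]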

def recommend(post_title, num):
    print("Recommending " + str(num) + " products similar to " + item(post_title) + "...")
    print("-------")
    recs = results[post_title][:num]
    for rec in recs:
        print("Recommended: " + item(rec[1]) + " (score:" + str(rec[0]) + ")")

In [None]:
# Pass the title of any post from the sample rows below to check the results.
# num can be 1 to 98: each results list holds at most 98 entries (99 neighbours minus the first).
#
#    _id                       title                          category                                           post_type
# 0  5d62abaa65218653a132c956  hello there                    Plant Biotechnology                                blog
# 1  5d6d39567fa40e1417a4931c  Ml and AI                      Artificial Intelligence|Machine Learning|Infor...  blog
# 2  5d7d23315720533e15c3b1ee  What is an Operating System ?  Operating Systems                                  blog
# 3  5d7d405e5720533e15c3b1f3  Lord Shiva                     Drawings                                           artwork
# 4  5d80dfbc6c53455f896e600e  How Competition law evolved?   Competition Laws                                   blog

In [None]:
recommend(post_title='How Competition law evolved?', num=10)

Recommending 10 products similar to How Competition law evolved?...
-------
Recommended: How Competition law evolved? (score:0.9999999999999998)
Recommended: Raghavan Committee (score:0.9999999999999998)
Recommended: Let's discuss some Case laws! (score:0.9999999999999998)
Recommended: Forms of Cartel. (score:0.9999999999999998)
Recommended: Custom laws (score:0.1881125750680073)
Recommended: What are Set Off and Carry Forward Losses (score:0.1356266594211393)
Recommended: Configure CI/CD Pipeline in GitLab and deployment to server via SSH (score:0.0)
Recommended: 3D composition. (score:0.0)
Recommended: Shiva Portrait (score:0.0)
Recommended: How Does A Person's Personal Development Affect His Business Leadership Ability? (score:0.0)
