# <center>Content-Based Video Recommendation System</center>

We are going to build an engine that computes the similarity between videos based on certain metrics and suggests the videos most similar to one that a user liked. Because the engine relies on video metadata (or content), this approach is known as content-based filtering.

Here, we build a content-based recommender that uses the 'Genre' of each video.

In [1]:
#Importing required modules
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics.pairwise import linear_kernel, cosine_similarity

We have collected information about a few videos and stored it in an Excel sheet.

In [84]:
#Importing video data
video=pd.read_excel(r'C:\Users\Shubhangi\Desktop\programming\machine_learning\data analysis\Datasets\video_topicc.xlsx')
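
The path above points to a file on the author's machine. For readers following along without that spreadsheet, a small stand-in frame can be built from rows printed in this notebook. This is only a sketch: `sample_video` is a hypothetical name, it holds just the first five rows, and the link column is left out because one URL is truncated in the printed output.

```python
import pandas as pd

# Stand-in for the spreadsheet: the first five rows printed in this notebook.
# `sample_video` is a hypothetical name; the link column is omitted.
sample_video = pd.DataFrame({
    "Video Name": ["Python", "Python", "Python", "API", "Python"],
    "Video Topic": [
        "Python introduction",
        "numpy full tutorial in Python",
        "pandas full tutorial in Python",
        "what is API",
        "OOPS Tutorial in Python",
    ],
    "Rating": [4, 7, 10, 9, 5],
})
print(sample_video.shape)
```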

In [85]:
#Printing the data
video

Unnamed: 0,Video Name,Video Topic,Youtube Link,Rating
0,Python,Python introduction,https://www.youtube.com/watch?v=CtbckFw0pJs&l...,4
1,Python,numpy full tutorial in Python,https://www.youtube.com/watch?v=Rbh1rieb3zc,7
2,Python,pandas full tutorial in Python,https://www.youtube.com/watch?v=RhEjmHeDNoA,10
3,API,what is API,https://www.youtube.com/watch?v=E0Qqpn8ymko,9
4,Python,OOPS Tutorial in Python,https://www.youtube.com/watch?v=ZDa-Z5JzLYM,5
5,Python,String Slicing and other function in Python,https://www.youtube.com/watch?v=lPZn7zcGXQo,8
6,Python,List functions tutorial in Python,https://www.youtube.com/watch?v=neTsPE9XFsQ,8
7,Python,Tuple in Python,https://www.youtube.com/watch?v=aVHNlC-cAjw,2
8,Python,Sets in python,https://www.youtube.com/watch?v=iVJv3zdgkD4,1
9,Python,If Else & Elif in python,https://www.youtube.com/watch?v=3VejIihDfwU,1


In [86]:
#Printing the first few rows
video.head()

Unnamed: 0,Video Name,Video Topic,Youtube Link,Rating
0,Python,Python introduction,https://www.youtube.com/watch?v=CtbckFw0pJs&l...,4
1,Python,numpy full tutorial in Python,https://www.youtube.com/watch?v=Rbh1rieb3zc,7
2,Python,pandas full tutorial in Python,https://www.youtube.com/watch?v=RhEjmHeDNoA,10
3,API,what is API,https://www.youtube.com/watch?v=E0Qqpn8ymko,9
4,Python,OOPS Tutorial in Python,https://www.youtube.com/watch?v=ZDa-Z5JzLYM,5


In [87]:
#Renaming the 'Video Name' column to 'Genre', since it holds each video's category
video.rename(columns={'Video Name':'Genre'},inplace=True)

In [88]:
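#Define a TF-IDF Vectorizer object. Remove English stop words such as 'the' and 'a'
#min_df=1 (the default) keeps every term; recent scikit-learn versions reject the integer min_df=0
tf = TfidfVectorizer(analyzer='word', ngram_range=(1, 2), min_df=1, stop_words='english')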
#Construct the required TF-IDF matrix by fitting and transforming the data
tfidf_matrix = tf.fit_transform(video['Genre'])

In [92]:
tfidf_matrix

<33x10 sparse matrix of type '<class 'numpy.float64'>'
	with 79 stored elements in Compressed Sparse Row format>

In [93]:
tfidf_matrix.shape

(33, 10)

In [60]:
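# Compute the cosine similarity matrix. TfidfVectorizer L2-normalizes each row by default,
# so the plain dot product from linear_kernel equals the cosine similarity
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)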
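
The following sketch checks, on toy data, that `linear_kernel` and `cosine_similarity` agree for TF-IDF vectors. The list `docs` is an assumption made for illustration. It also suggests why the scores printed in this notebook are all 0 or 1: videos that share a single-word genre label get identical vectors.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

docs = ["Python", "Python", "API"]  # toy genre labels, assumed for illustration
m = TfidfVectorizer().fit_transform(docs)

# Each TF-IDF row has unit L2 norm (norm='l2' is the default)
row_norms = np.sqrt(np.asarray(m.multiply(m).sum(axis=1)).ravel())
print(np.allclose(row_norms, 1.0))

# For unit-length rows, the dot product is the cosine similarity
dot = linear_kernel(m, m)
cos = cosine_similarity(m, m)
print(np.allclose(dot, cos))
print(dot)
```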

In [63]:
cosine_sim[1]

array([1., 1., 1., 0., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [64]:
#Construct a reverse map from video topics to row indices
video = video.reset_index()
titles = video['Video Topic']
indices = pd.Series(video.index, index=video['Video Topic'])
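
As a small illustration of the reverse map, the sketch below builds the same kind of Series on a toy frame and looks up a row position by topic. The names `toy` and `toy_indices` are hypothetical.

```python
import pandas as pd

# Toy frame with three topics taken from this notebook
toy = pd.DataFrame({
    "Video Topic": ["Python introduction", "what is API", "Tuple in Python"]
})

# Values are row positions, the index holds the topics
toy_indices = pd.Series(toy.index, index=toy["Video Topic"])

# Looking up a topic returns its row position
print(toy_indices["what is API"])
```

Note that this lookup assumes topics are unique. If a topic appears more than once, indexing by it returns a Series of positions rather than a single number.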

In [112]:
indices

Video Topic
Python introduction                                      0
numpy full tutorial in Python                            1
pandas full tutorial in Python                           2
what is API                                              3
OOPS Tutorial in Python                                  4
String Slicing and other function in Python              5
List functions tutorial in Python                        6
Tuple in Python                                          7
Sets in python                                           8
If Else & Elif in python                                 9
what is machine learning                                10
linear regression single variable ML                    11
linear regression multiple variable ML                  12
Logistic Regression ML                                  13
Support Vector Machine ML                               14
K Fold cross validation in ML                           15
K means clustering ML                       

In [65]:
# Function that takes a video topic as input and returns the most similar videos
def get_recommendations(title):
    # Get the index of the video that matches the title
    idx = indices[title]
    # Get the pairwise similarity scores of all videos with that video
    sim_scores = list(enumerate(cosine_sim[idx]))
    # Sort the videos by similarity score, highest first
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Skip the first entry and keep the next 20. The first entry is the video itself
    # only when no lower-indexed video ties its score of 1
    sim_scores = sim_scores[1:21]
    # Get the video indices
    video_indices = [i[0] for i in sim_scores]
    # Return the topics of the 20 most similar videos
    return titles.iloc[video_indices]
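
Because every video with the same genre scores 1 here, dropping the first sorted entry can discard a different video and keep the queried one (the 'Tuple in Python' result in this notebook lists itself). One possible variant, sketched below, filters out the queried index explicitly before sorting. The function name `recommend_excluding_self`, its parameters, and the toy data are all assumptions for illustration, not the author's code.

```python
import numpy as np
import pandas as pd

def recommend_excluding_self(title, indices, cosine_sim, titles, top_n=20):
    """Return up to top_n topics most similar to `title`, never the title itself."""
    idx = indices[title]
    # Drop the queried video by index instead of by sorted position
    scores = [(i, s) for i, s in enumerate(cosine_sim[idx]) if i != idx]
    # sorted() is stable, so tied scores keep their original order
    scores = sorted(scores, key=lambda x: x[1], reverse=True)[:top_n]
    return titles.iloc[[i for i, _ in scores]]

# Toy data: three Python videos tie at 1.0, the API video scores 0.0 against them
toy_titles = pd.Series(
    ["Python introduction", "Tuple in Python", "Sets in python", "what is API"],
    name="Video Topic",
)
toy_indices = pd.Series(toy_titles.index, index=toy_titles)
toy_sim = np.array([
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

result = recommend_excluding_self("Tuple in Python", toy_indices, toy_sim, toy_titles, top_n=2)
print(list(result))
```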


<h2>Get recommended videos based on a video title</h2>

In [111]:
get_recommendations('Tuple in Python').head(7)

1                  numpy full tutorial in Python
2                 pandas full tutorial in Python
4                        OOPS Tutorial in Python
5    String Slicing and other function in Python
6              List functions tutorial in Python
7                                Tuple in Python
8                                 Sets in python
Name: Video Topic, dtype: object

In [67]:
get_recommendations('Sunjara').head(5)

32         Rabba Rabba
26       kal ho naa ho
27            Hawayein
28    Dil diyan Gallan
29     Tera sang yarra
Name: Video Topic, dtype: object

In [113]:
get_recommendations('Logistic Regression ML').head(12)

11                 linear regression single variable ML
12               linear regression multiple variable ML
13                               Logistic Regression ML
14                            Support Vector Machine ML
15                        K Fold cross validation in ML
16                                K means clustering ML
17                                Naïve Bayes part 1 ML
18                                Naïve Bayes part 2 ML
19                          Training and testing in  ML
20    Simple Explanation of Convolutional nural Netw...
21                   What are features and labels in ML
22                   How does linear regression work ML
Name: Video Topic, dtype: object