# Netflix Movies and TV Shows Recommendation System

Algorithm: Content-Based Filtering with Cosine Similarity

Notebook summary

1. Import necessary libraries
2. Import the required dataset
3. Build the Movie recommendation engine
4. Build the TV Show Recommendation Engine
5. Recommendation engine function
6. Movies recommendation test
7. TV Shows recommendation test
8. Data visualization using plotly

In [1]:
! pip install neattext

Collecting neattext
  Downloading neattext-0.1.3-py3-none-any.whl.metadata (12 kB)
Downloading neattext-0.1.3-py3-none-any.whl (114 kB)
Installing collected packages: neattext
Successfully installed neattext-0.1.3


# 1. Import necessary libraries

In [2]:
import pandas as pd
import numpy as np
import neattext.functions as nfx
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")

# 2. Import the required dataset

In [9]:
df = pd.read_csv('netflix_titles.csv')
df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


In [4]:
df.shape

(8807, 12)

In [10]:
# Renaming Columns
df.rename(columns = {'listed_in': 'genres'}, inplace= True)

In [11]:
df.type.value_counts()

type,count
Movie,6131
TV Show,2676


# 3. Build the Movie recommendation engine

### a. Filter only movies from type

In [12]:
# Getting movies

movies_df = df[df['type'] == 'Movie'].reset_index(drop= True)
movies_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",,"September 24, 2021",2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...
2,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
3,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
4,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...


### b. Baseline EDA and cleaning

In [13]:
movies_df.type.value_counts()

type,count
Movie,6131


Checking for duplicates, in two ways:

1. Compare the number of movies with the number of unique show_id and title values; the counts should match.

2. Check directly for fully duplicated rows.
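
The two checks above can be sketched on a small stand-in frame. This is only an illustration: the `toy` DataFrame and its values are made up, and the last step (testing the title column on its own) is an extra that the notebook does not run.

```python
import pandas as pd

# Toy stand-in for movies_df; the values are invented for illustration.
toy = pd.DataFrame({
    "show_id": ["s1", "s2", "s3", "s4"],
    "title": ["A", "B", "B", "C"],
    "type": ["Movie"] * 4,
})

# Check 1: every row should have its own show_id and its own title.
ids_unique = toy["show_id"].nunique() == len(toy)
titles_unique = toy["title"].nunique() == len(toy)

# Check 2: count rows that are exact copies of an earlier row.
full_row_duplicates = int(toy.duplicated().sum())

# A repeated title with a different show_id is not caught by check 2,
# so the key column can also be tested on its own.
repeated_titles = toy.loc[toy["title"].duplicated(keep=False), "title"].unique().tolist()

print(ids_unique, titles_unique, full_row_duplicates, repeated_titles)
```

In the toy data, check 2 reports no duplicates even though the title "B" appears twice, which is why check 1 is worth running as well.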

In [16]:
movies_df.nunique()

Unnamed: 0,0
show_id,6131
type,1
title,6131
director,4354
cast,5445
country,651
date_added,1533
release_year,73
rating,17
duration,205


In [14]:
# Checking for duplicates
movies_df.duplicated().sum()

0

In [17]:
# Checking for null values

movies_df.isnull().sum()

Unnamed: 0,0
show_id,0
type,0
title,0
director,188
cast,475
country,440
date_added,0
release_year,0
rating,2
duration,3


In [106]:
movies_df.shape

(6131, 12)

In [18]:
# Filling missing ratings with the string 'NaN' so dropna() keeps those rows
movies_df['rating'] = movies_df['rating'].fillna('NaN')

# Dropping null values
movies_df.dropna(inplace= True)
movies_df = movies_df.reset_index(drop=True)
movies_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
1,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
2,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
3,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
4,s28,Movie,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,"September 20, 2021",2010,PG-13,103 min,Comedies,Mourning the loss of their beloved junior high...


In [19]:
movies_df.shape

(5186, 12)
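
The row count falls from 6131 to 5186 because `dropna()` removes every row that has a missing value in any column, while the earlier `fillna` protects rows whose only gap is the rating. A minimal sketch of that interaction, assuming a made-up `toy` frame rather than the real data:

```python
import numpy as np
import pandas as pd

# Toy frame with gaps; not the real Netflix data.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["X", np.nan, "Z", "W"],
    "rating": ["PG", "R", np.nan, "TV-MA"],
})

# Placeholder string so a missing rating no longer counts as null.
toy["rating"] = toy["rating"].fillna("NaN")

rows_before = len(toy)
cleaned = toy.dropna().reset_index(drop=True)
rows_dropped = rows_before - len(cleaned)

print(rows_dropped)               # only the row with a missing director is dropped
print(cleaned["title"].tolist())  # titles that survive the cleaning
```

Row "C" survives because its rating was filled first; row "B" is dropped because its director is still missing.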

In [20]:
movies_df.isnull().sum()

Unnamed: 0,0
show_id,0
type,0
title,0
director,0
cast,0
country,0
date_added,0
release_year,0
rating,0
duration,0


In [21]:
movies_df.columns

Index(['show_id', 'type', 'title', 'director', 'cast', 'country', 'date_added',
       'release_year', 'rating', 'duration', 'genres', 'description'],
      dtype='object')

In [22]:
movies_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
1,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
2,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
3,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
4,s28,Movie,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,"September 20, 2021",2010,PG-13,103 min,Comedies,Mourning the loss of their beloved junior high...


In [23]:
movies_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5186 entries, 0 to 5185
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       5186 non-null   object
 1   type          5186 non-null   object
 2   title         5186 non-null   object
 3   director      5186 non-null   object
 4   cast          5186 non-null   object
 5   country       5186 non-null   object
 6   date_added    5186 non-null   object
 7   release_year  5186 non-null   int64 
 8   rating        5186 non-null   object
 9   duration      5186 non-null   object
 10  genres        5186 non-null   object
 11  description   5186 non-null   object
dtypes: int64(1), object(11)
memory usage: 486.3+ KB


### c. Filtering features to be considered for content based filtering

In [24]:
# Selecting features for working

movies = movies_df[['title','director', 'cast', 'country', 'rating', 'genres']].copy()
movies.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...",TV-MA,"Dramas, Independent Movies, International Movies"
1,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,PG-13,"Comedies, Dramas"
2,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic",TV-MA,"Dramas, International Movies"
3,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,TV-14,"Comedies, International Movies, Romantic Movies"
4,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,PG-13,Comedies


In [26]:
movies.describe().T

Unnamed: 0,count,unique,top,freq
title,5186,5186,Sankofa,1
director,5186,3829,"Raúl Campos, Jan Suter",18
cast,5186,5062,Samuel West,10
country,5186,594,United States,1819
rating,5186,15,TV-MA,1741
genres,5186,268,"Dramas, International Movies",336


### d. Preparing data for vectorization

In [27]:
# Remove stopwords
movies['director'] = movies['director'].apply(nfx.remove_stopwords)
movies['cast'] = movies['cast'].apply(nfx.remove_stopwords)
movies['country'] = movies['country'].apply(nfx.remove_stopwords)
movies['genres'] = movies['genres'].apply(nfx.remove_stopwords)

# Remove special characters
movies['country'] = movies['country'].apply(nfx.remove_special_characters)

movies.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...",United States Ghana Burkina Faso United Kingdo...,TV-MA,"Dramas, Independent Movies, International Movies"
1,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,PG-13,"Comedies, Dramas"
2,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...",Germany Czech Republic,TV-MA,"Dramas, International Movies"
3,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,TV-14,"Comedies, International Movies, Romantic Movies"
4,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,PG-13,Comedies
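
In the output above, the commas disappear from the country column only. The sketch below suggests why that cleaning is limited to country. It uses a hypothetical `strip_special_characters` helper built on `re` as a rough stand-in for `nfx.remove_special_characters`; the real function may treat some characters differently.

```python
import re

def strip_special_characters(text):
    # Hypothetical stand-in for nfx.remove_special_characters:
    # keep ASCII letters, digits and whitespace, drop everything else.
    return re.sub(r"[^A-Za-z0-9\s]", "", text)

country = "Germany, Czech Republic"
cleaned_country = strip_special_characters(country)
print(cleaned_country)

# Applying the same cleaning to a comma-separated cast list would merge the
# names, so a later split on ',' could no longer tell the actors apart.
cast = "Adam Sandler, Kevin James"
merged_cast = strip_special_characters(cast)
print(merged_cast.split(","))
```

The country column is vectorized word by word, so losing the commas does no harm there; director, cast and genres are split on commas later and therefore need them.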


### e. Vectorizing features using Count vectorizer

In [28]:
countVector = CountVectorizer(binary= True)

In [29]:
movies['country']

Unnamed: 0,country
0,United States Ghana Burkina Faso United Kingdo...
1,United States
2,Germany Czech Republic
3,India
4,United States
...,...
5181,United Arab Emirates Jordan
5182,United States
5183,United States
5184,United States


In [30]:
movies['country'].value_counts()

country,count
United States,1819
India,868
United Kingdom,165
Canada,104
Egypt,90
...,...
United States South Korea Japan,1
Spain United Kingdom,1
Canada Norway,1
France Senegal Belgium,1


In [31]:
country = countVector.fit_transform(movies['country']).toarray()

In [32]:
country.shape

(5186, 121)

In [33]:
movies['country'][0]

'United States Ghana Burkina Faso United Kingdom Germany Ethiopia'

In [34]:
country[0]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])

In [35]:
movies['country'][1]

'United States'

In [36]:
country[1]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])

In [37]:
countVector = CountVectorizer(binary= True,
                             tokenizer=lambda x:x.split(','))
director = countVector.fit_transform(movies['director']).toarray()
cast = countVector.fit_transform(movies['cast']).toarray()
genres = countVector.fit_transform(movies['genres']).toarray()

In [38]:
director.shape , cast.shape, country.shape, genres.shape

((5186, 4255), (5186, 26411), (5186, 121), (5186, 36))
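
Splitting on ',' alone keeps the space that follows each comma, so a value in first position and the same value later in a list may end up as two different tokens. The sketch below compares the notebook's style of tokenizer with a stripped variant on two made-up genre strings. `split_and_strip` is a hypothetical alternative, not what the notebook uses, and switching to it would change the shapes shown above.

```python
from sklearn.feature_extraction.text import CountVectorizer

genres = ["Comedies, Dramas", "Dramas, Independent Movies"]

def split_raw(text):
    # Same behaviour as the lambda in the notebook.
    return text.split(",")

def split_and_strip(text):
    # Hypothetical variant: also trim the whitespace around each token.
    return [token.strip() for token in text.split(",")]

# token_pattern=None only silences the "token_pattern will not be used" warning.
raw = CountVectorizer(binary=True, tokenizer=split_raw, token_pattern=None)
raw.fit(genres)
stripped = CountVectorizer(binary=True, tokenizer=split_and_strip, token_pattern=None)
stripped.fit(genres)

raw_vocab = sorted(raw.vocabulary_)
stripped_vocab = sorted(stripped.vocabulary_)
print(raw_vocab)
print(stripped_vocab)
```

With the raw split, "dramas" and " dramas" occupy separate columns, so two titles that share the genre may not be matched on it.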

In [39]:
# Turning vectors to dataframe

binary_director = pd.DataFrame(director).transpose()
binary_cast = pd.DataFrame(cast).transpose()
binary_country = pd.DataFrame(country).transpose()
binary_genres = pd.DataFrame(genres).transpose()

In [40]:
# Concatenating DataFrames

movies_binary = pd.concat([binary_director, binary_cast,  binary_country, binary_genres], axis=0,ignore_index=True)
movies_binary.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,30813,30814,30815,30816,30817,30818,30819,30820,30821,30822
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5181,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5182,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5183,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5184,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
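
The combined matrix has 4255 + 26411 + 121 + 36 = 30823 columns, which matches the column labels 0 to 30822 above. The same side-by-side layout could also be built directly from the arrays, without the transposes. A minimal sketch with made-up toy blocks, assuming `np.hstack` as the joining step:

```python
import numpy as np

# Toy binary blocks for 3 titles: 2 director, 4 cast, 2 country and 3 genre columns.
director = np.array([[1, 0], [0, 1], [0, 1]])
cast = np.array([[1, 0, 0, 1], [0, 1, 0, 0], [0, 1, 1, 0]])
country = np.array([[1, 1], [1, 0], [0, 1]])
genres = np.array([[1, 0, 0], [0, 1, 0], [0, 1, 1]])

# Stack the blocks column-wise: one row per title, one column per feature.
features = np.hstack([director, cast, country, genres])
print(features.shape)
```

The result already has titles as rows, so it could be passed to `cosine_similarity` without a final transpose.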


### f. Calculating cosine similarity between movies on the complete vectorized feature set

In [41]:
movies_sim = cosine_similarity(movies_binary.T)
movies_sim

array([[1.        , 0.1118034 , 0.16269784, ..., 0.12909944, 0.11952286,
        0.12403473],
       [0.1118034 , 1.        , 0.        , ..., 0.21650635, 0.13363062,
        0.        ],
       [0.16269784, 0.        , 1.        , ..., 0.        , 0.        ,
        0.13453456],
       ...,
       [0.12909944, 0.21650635, 0.        , ..., 1.        , 0.15430335,
        0.        ],
       [0.11952286, 0.13363062, 0.        , ..., 0.15430335, 1.        ,
        0.        ],
       [0.12403473, 0.        , 0.13453456, ..., 0.        , 0.        ,
        1.        ]])

In [45]:
movies_sim.shape

(5186, 5186)
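
For binary vectors, cosine similarity reduces to the number of shared features divided by the square root of the product of the two feature counts. The sketch below works this out by hand for made-up feature sets; `binary_cosine` and the feature names are hypothetical and only illustrate the arithmetic behind each entry of the matrix.

```python
import math

def binary_cosine(features_a, features_b):
    """Cosine similarity of two binary vectors given as sets of active features."""
    if not features_a or not features_b:
        return 0.0
    shared = len(features_a & features_b)
    return shared / math.sqrt(len(features_a) * len(features_b))

# Made-up titles described by their active (value 1) features.
film_a = {"director:x", "cast:p", "cast:q", "genre:dramas"}
film_b = {"director:y", "cast:q", "genre:dramas", "genre:thrillers"}
film_c = {"director:z", "genre:comedies"}

sim_ab = binary_cosine(film_a, film_b)  # 2 shared / sqrt(4 * 4)
sim_ac = binary_cosine(film_a, film_c)  # nothing shared
sim_aa = binary_cosine(film_a, film_a)  # identical sets
print(sim_ab, sim_ac, sim_aa)
```

This also explains the 1.0 values on the diagonal of the matrix: every title shares all of its features with itself.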

In [46]:
movies

Unnamed: 0,title,director,cast,country,rating,genres
0,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...",United States Ghana Burkina Faso United Kingdo...,TV-MA,"Dramas, Independent Movies, International Movies"
1,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,PG-13,"Comedies, Dramas"
2,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...",Germany Czech Republic,TV-MA,"Dramas, International Movies"
3,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,TV-14,"Comedies, International Movies, Romantic Movies"
4,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,PG-13,Comedies
...,...,...,...,...,...,...
5181,Zinzana,Majid Al Ansari,"Ali Suliman, Saleh Bakri, Yasa, Ali Al-Jabri, ...",United Arab Emirates Jordan,TV-MA,"Dramas, International Movies, Thrillers"
5182,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,R,"Cult Movies, Dramas, Thrillers"
5183,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,R,"Comedies, Horror Movies"
5184,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,PG,"Children & Family Movies, Comedies"


In [44]:
movie_titles = movies['title']

In [141]:
movie_titles.head() , movie_titles.shape

(0         Sankofa
 1    The Starling
 2    Je Suis Karl
 3           Jeans
 4       Grown Ups
 Name: title, dtype: object,
 (5186,))

### g. Creating the Cosine similarity dataframe with all the movies

In [47]:
cosine_similarity_df = pd.DataFrame(movies_sim , index=movie_titles, columns=movie_titles)

In [48]:
cosine_similarity_df.head()

title,Sankofa,The Starling,Je Suis Karl,Jeans,Grown Ups,Dark Skies,Paranoia,Birth of the Dragon,Jaws,Jaws 2,...,Young Tiger,"Yours, Mine and Ours",اشتباك,Zed Plus,Zenda,Zinzana,Zodiac,Zombieland,Zoom,Zubaan
Sankofa,1.0,0.111803,0.162698,0.074536,0.11547,0.11547,0.111803,0.111803,0.111803,0.167705,...,0.070711,0.11547,0.162698,0.057735,0.129099,0.179284,0.111803,0.129099,0.119523,0.124035
The Starling,0.111803,1.0,0.0,0.083333,0.193649,0.129099,0.125,0.1875,0.1875,0.125,...,0.0,0.129099,0.0,0.129099,0.0,0.066815,0.1875,0.216506,0.133631,0.0
Je Suis Karl,0.162698,0.0,1.0,0.080845,0.0,0.0,0.0,0.0,0.0,0.060634,...,0.076696,0.0,0.117647,0.062622,0.140028,0.129641,0.0,0.0,0.0,0.134535
Jeans,0.074536,0.083333,0.080845,1.0,0.086066,0.0,0.083333,0.0,0.0,0.0,...,0.105409,0.0,0.080845,0.258199,0.19245,0.089087,0.0,0.096225,0.0,0.1849
Grown Ups,0.11547,0.193649,0.0,0.086066,1.0,0.133333,0.129099,0.129099,0.129099,0.129099,...,0.0,0.133333,0.0,0.066667,0.0,0.069007,0.129099,0.223607,0.138013,0.0


In [49]:
movies_sim.shape

(5186, 5186)
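
With titles on both axes, the similarity DataFrame can be queried by name. A minimal sketch on a made-up 4 x 4 matrix; `most_similar` is a hypothetical helper and assumes titles are unique, as the earlier `nunique` check suggests for the movies.

```python
import pandas as pd

titles = ["A", "B", "C", "D"]
sim = pd.DataFrame(
    [[1.0, 0.2, 0.5, 0.0],
     [0.2, 1.0, 0.1, 0.4],
     [0.5, 0.1, 1.0, 0.3],
     [0.0, 0.4, 0.3, 1.0]],
    index=titles, columns=titles,
)

def most_similar(sim_df, title, n=2):
    # Drop the title's own entry (always 1.0) before ranking the rest.
    return sim_df[title].drop(title).nlargest(n)

top = most_similar(sim, "A")
print(top.index.tolist())
print(top.tolist())
```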

# 4. Build the TV Show Recommendation Engine

### a. Filter only TV Shows from type

In [50]:
# Getting Tv Shows
tv_show = df[df['type'] == 'TV Show'].reset_index(drop= True)
tv_show.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
1,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
2,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
3,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
4,s6,TV Show,Midnight Mass,Mike Flanagan,"Kate Siegel, Zach Gilford, Hamish Linklater, H...",,"September 24, 2021",2021,TV-MA,1 Season,"TV Dramas, TV Horror, TV Mysteries",The arrival of a charismatic young priest brin...


### b. Baseline EDA and cleaning

In [51]:
# Checking for duplicates
tv_show.duplicated().sum()

0

In [52]:
# Checking for null values
tv_show.isnull().sum()

Unnamed: 0,0
show_id,0
type,0
title,0
director,2446
cast,350
country,391
date_added,10
release_year,0
rating,2
duration,0


In [53]:
# Filling missing directors with the string 'NaN' so dropna() keeps those rows
tv_show['director'] = tv_show['director'].fillna('NaN')

# Dropping null values
tv_show.dropna(inplace= True)
tv_show = tv_show.reset_index(drop=True)
tv_show.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
1,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
2,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
3,s16,TV Show,Dear White People,,"Logan Browning, Brandon P. Bell, DeRon Horton,...",United States,"September 22, 2021",2021,TV-MA,4 Seasons,"TV Comedies, TV Dramas",Students of color navigate the daily slights a...
4,s18,TV Show,Falsa identidad,,"Luis Ernesto Franco, Camila Sodi, Sergio Goyri...",Mexico,"September 22, 2021",2020,TV-MA,2 Seasons,"Crime TV Shows, Spanish-Language TV Shows, TV ...",Strangers Diego and Isabel flee their home in ...


### c. Filtering features to be considered for content based filtering

In [54]:
# Selecting features for working
tv_df = tv_show[['title','director', 'cast', 'country', 'rating', 'genres']].copy()
tv_df.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,TV-MA,"International TV Shows, TV Dramas, TV Mysteries"
1,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,TV-MA,"International TV Shows, Romantic TV Shows, TV ..."
2,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,TV-14,"British TV Shows, Reality TV"
3,Dear White People,,"Logan Browning, Brandon P. Bell, DeRon Horton,...",United States,TV-MA,"TV Comedies, TV Dramas"
4,Falsa identidad,,"Luis Ernesto Franco, Camila Sodi, Sergio Goyri...",Mexico,TV-MA,"Crime TV Shows, Spanish-Language TV Shows, TV ..."


In [149]:
tv_df.describe().T

Unnamed: 0,count,unique,top,freq
title,2013,2013,Blood & Water,1
director,2013,142,,1866
cast,2013,1980,David Attenborough,14
country,2013,184,United States,618
rating,2013,9,TV-MA,881
genres,2013,219,Kids' TV,161


### d. Preparing data for vectorization

In [150]:
# Remove stopwords
tv_df['cast'] = tv_df['cast'].apply(nfx.remove_stopwords)
tv_df['country'] = tv_df['country'].apply(nfx.remove_stopwords)
tv_df['genres'] = tv_df['genres'].apply(nfx.remove_stopwords)

# Remove special characters
tv_df['country'] = tv_df['country'].apply(nfx.remove_special_characters)

tv_df.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,TV-MA,"International TV Shows, TV Dramas, TV Mysteries"
1,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,TV-MA,"International TV Shows, Romantic TV Shows, TV ..."
2,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,TV-14,"British TV Shows, Reality TV"
3,Dear White People,,"Logan Browning, Brandon P. Bell, DeRon Horton,...",United States,TV-MA,"TV Comedies, TV Dramas"
4,Falsa identidad,,"Luis Ernesto Franco, Camila Sodi, Sergio Goyri...",Mexico,TV-MA,"Crime TV Shows, Spanish-Language TV Shows, TV ..."


### e. Vectorizing features using Count vectorizer

In [55]:
# Vectorizing Data
countVector = CountVectorizer(binary= True)
country = countVector.fit_transform(tv_df['country']).toarray()

countVector = CountVectorizer(binary= True,
                             tokenizer=lambda x:x.split(','))
cast = countVector.fit_transform(tv_df['cast']).toarray()
genres = countVector.fit_transform(tv_df['genres']).toarray()

In [56]:
# Turning vectors to dataframe
tv_binary_cast = pd.DataFrame(cast).transpose()
tv_binary_country = pd.DataFrame(country).transpose()
tv_binary_genres = pd.DataFrame(genres).transpose()

In [57]:
# Concatenating DataFrames
tv_binary = pd.concat([tv_binary_cast,  tv_binary_country, tv_binary_genres], axis=0,ignore_index=True)
tv_binary.T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,14061,14062,14063,14064,14065,14066,14067,14068,14069,14070
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2008,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2009,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2010,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2011,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### f. Calculating cosine similarity between TV shows on the complete vectorized feature set

In [58]:
tv_sim = cosine_similarity(tv_binary.T)
tv_sim

array([[1.        , 0.05892557, 0.        , ..., 0.08908708, 0.05455447,
        0.1132277 ],
       [0.05892557, 1.        , 0.        , ..., 0.06299408, 0.        ,
        0.16012815],
       [0.        , 0.        , 1.        , ..., 0.        , 0.09449112,
        0.        ],
       ...,
       [0.08908708, 0.06299408, 0.        , ..., 1.        , 0.        ,
        0.12104551],
       [0.05455447, 0.        , 0.09449112, ..., 0.        , 1.        ,
        0.        ],
       [0.1132277 , 0.16012815, 0.        , ..., 0.12104551, 0.        ,
        1.        ]])

In [59]:
tv_df.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,TV-MA,"International TV Shows, TV Dramas, TV Mysteries"
1,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,TV-MA,"International TV Shows, Romantic TV Shows, TV ..."
2,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,TV-14,"British TV Shows, Reality TV"
3,Dear White People,,"Logan Browning, Brandon P. Bell, DeRon Horton,...",United States,TV-MA,"TV Comedies, TV Dramas"
4,Falsa identidad,,"Luis Ernesto Franco, Camila Sodi, Sergio Goyri...",Mexico,TV-MA,"Crime TV Shows, Spanish-Language TV Shows, TV ..."


In [60]:
tvshow_titles = tv_df['title']

In [61]:
tvshow_titles.head() , tvshow_titles.shape

(0                    Blood & Water
 1                     Kota Factory
 2    The Great British Baking Show
 3                Dear White People
 4                  Falsa identidad
 Name: title, dtype: object,
 (2013,))

### g. Creating the Cosine similarity dataframe with all the TV shows

In [62]:
cosine_similarity_df_tvshows = pd.DataFrame(tv_sim , index=tvshow_titles, columns=tvshow_titles)

In [63]:
cosine_similarity_df_tvshows

title,Blood & Water,Kota Factory,The Great British Baking Show,Dear White People,Falsa identidad,Resurrection: Ertugrul,Love on the Spectrum,Sex Education,Angry Birds,Chhota Bheem,...,Wild Arabia,Winsanity,Winter Sun,World's Busiest Cities,Yeh Meri Family,Yo-Kai Watch,Yu-Gi-Oh! Arc-V,Yunus Emre,Zak Storm,Zindagi Gulzar Hai
Blood & Water,1.000000,0.058926,0.000000,0.058926,0.051031,0.109109,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.109109,0.000000,0.064550,0.000000,0.000000,0.089087,0.054554,0.113228
Kota Factory,0.058926,1.000000,0.000000,0.000000,0.000000,0.077152,0.000000,0.083333,0.102062,0.096225,...,0.000000,0.000000,0.077152,0.000000,0.365148,0.000000,0.000000,0.062994,0.000000,0.160128
The Great British Baking Show,0.000000,0.000000,1.000000,0.102062,0.000000,0.000000,0.158114,0.306186,0.000000,0.000000,...,0.433013,0.176777,0.000000,0.400892,0.000000,0.111803,0.000000,0.000000,0.094491,0.000000
Dear White People,0.058926,0.000000,0.102062,1.000000,0.072169,0.077152,0.000000,0.083333,0.000000,0.000000,...,0.117851,0.288675,0.077152,0.109109,0.000000,0.182574,0.000000,0.062994,0.154303,0.080064
Falsa identidad,0.051031,0.000000,0.000000,0.072169,1.000000,0.066815,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.066815,0.000000,0.000000,0.000000,0.000000,0.054554,0.000000,0.069338
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Yo-Kai Watch,0.000000,0.000000,0.111803,0.182574,0.000000,0.000000,0.000000,0.091287,0.000000,0.000000,...,0.129099,0.316228,0.000000,0.119523,0.000000,1.000000,0.210819,0.000000,0.169031,0.000000
Yu-Gi-Oh! Arc-V,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.210819,1.000000,0.000000,0.000000,0.000000
Yunus Emre,0.089087,0.062994,0.000000,0.062994,0.054554,0.174964,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.174964,0.000000,0.069007,0.000000,0.000000,1.000000,0.000000,0.121046
Zak Storm,0.054554,0.000000,0.094491,0.154303,0.000000,0.000000,0.000000,0.077152,0.094491,0.089087,...,0.109109,0.267261,0.000000,0.101015,0.000000,0.169031,0.000000,0.000000,1.000000,0.000000


In [64]:
tv_sim.shape

(2013, 2013)

# 5. Recommendation engine function

In [65]:
def recommend(title):
    """Return the five titles most similar to `title`, or None if the title is unknown."""
    if title in movies_df['title'].values:
        # Row position of the title; movies_df rows line up with movies_sim rows
        movies_index = movies_df[movies_df['title'] == title].index.item()
        scores = dict(enumerate(movies_sim[movies_index]))
        sorted_scores = dict(sorted(scores.items(), key=lambda x: x[1], reverse=True))

        selected_movies_index = list(sorted_scores.keys())
        selected_movies_score = list(sorted_scores.values())

        # Copy so that adding a column does not write into a slice of movies_df
        rec_movies = movies_df.iloc[selected_movies_index].copy()
        rec_movies['similiarity'] = selected_movies_score

        movie_recommendation = rec_movies.reset_index(drop=True)
        return movie_recommendation[1:6]  # Skip the first row, normally the queried title itself

    elif title in tv_show['title'].values:
        # Row position of the title; tv_show rows line up with tv_sim rows
        tv_index = tv_show[tv_show['title'] == title].index.item()
        scores = dict(enumerate(tv_sim[tv_index]))
        sorted_scores = dict(sorted(scores.items(), key=lambda x: x[1], reverse=True))

        selected_tv_index = list(sorted_scores.keys())
        selected_tv_score = list(sorted_scores.values())

        # Copy so that adding a column does not write into a slice of tv_show
        rec_tv = tv_show.iloc[selected_tv_index].copy()
        rec_tv['similiarity'] = selected_tv_score

        tv_recommendation = rec_tv.reset_index(drop=True)
        return tv_recommendation[1:6]  # Skip the first row, normally the queried title itself

    else:
        print("Title not in dataset. Please check spelling.")
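
The function above drops the first ranked row on the assumption that it is the queried title. If another title happened to have an identical feature vector (similarity 1.0), that assumption might not hold. A possible alternative, sketched here with NumPy on a made-up matrix, excludes the queried row explicitly; `top_n_indices` is a hypothetical helper and not part of the notebook.

```python
import numpy as np

def top_n_indices(sim_matrix, index, n=5):
    """Positions of the n rows most similar to `index`, excluding `index` itself."""
    scores = np.asarray(sim_matrix[index], dtype=float).copy()
    scores[index] = -np.inf           # never recommend the queried title
    order = np.argsort(scores)[::-1]  # highest score first
    return order[:n].tolist()

# Made-up symmetric similarity matrix for four titles.
toy_sim = np.array([
    [1.0, 0.2, 0.5, 0.0],
    [0.2, 1.0, 0.1, 0.4],
    [0.5, 0.1, 1.0, 0.3],
    [0.0, 0.4, 0.3, 1.0],
])

nearest = top_n_indices(toy_sim, 0, n=2)
print(nearest)
```

The returned positions could then be passed to `movies_df.iloc[...]` or `tv_show.iloc[...]`, in the same way the function above uses its sorted indices.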

# 6. Movies recommendation test

In [66]:
movies.head()

Unnamed: 0,title,director,cast,country,rating,genres
0,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...",United States Ghana Burkina Faso United Kingdo...,TV-MA,"Dramas, Independent Movies, International Movies"
1,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,PG-13,"Comedies, Dramas"
2,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...",Germany Czech Republic,TV-MA,"Dramas, International Movies"
3,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,TV-14,"Comedies, International Movies, Romantic Movies"
4,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,PG-13,Comedies


In [67]:
recommend("Sankofa")

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s4307,Movie,ROMA,Alfonso Cuarón,"Yalitza Aparicio, Marina de Tavira","Mexico, United States","December 14, 2018",2018,R,135 min,"Dramas, Independent Movies, International Movies","Director Alfonso Cuarón delivers a vivid, emot...",0.372678
2,s3008,Movie,WHAT DID JACK DO?,David Lynch,David Lynch,United States,"January 20, 2020",2020,TV-14,17 min,"Dramas, Independent Movies",A detective interrogates a monkey who is suspe...,0.365148
3,s7511,Movie,Morris from America,Chad Hartigan,"Markees Christmas, Craig Robinson, Lina Keller...","Germany, United States","November 1, 2018",2016,R,91 min,"Dramas, Independent Movies, International Movies",When his father moves from the U.S. to Heidelb...,0.358569
4,s6864,Movie,God's Own Country,Francis Lee,"Josh O'Connor, Alec Secareanu, Ian Hart, Gemma...",United Kingdom,"May 1, 2018",2017,TV-MA,105 min,"Dramas, Independent Movies, International Movies","In Yorkshire, a withdrawn gay farmer begins a ...",0.353553
5,s1139,Movie,The Pianist,Roman Polański,"Adrien Brody, Thomas Kretschmann, Frank Finlay...","United Kingdom, France, Poland, Germany, Unite...","April 1, 2021",2002,R,149 min,"Dramas, Independent Movies, International Movies",Famed Polish pianist Wladyslaw Szpilman strugg...,0.35
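
The `similiarity` column holds the cosine similarity between the feature text of the queried title and that of each recommendation. The sketch below works through the calculation on two short strings. It assumes plain whitespace tokens and raw term counts; the notebook's actual vectorizer settings are not repeated here, and `cosine_similarity_counts` is a hypothetical helper, not a scikit-learn function.

```python
import math
from collections import Counter

def cosine_similarity_counts(text_a, text_b):
    """Cosine similarity between two texts using raw token counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the tokens the two texts share.
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no direction to compare
    return dot / (norm_a * norm_b)

# Two of three tokens are shared, so the score is 2 / (sqrt(3) * sqrt(3)).
print(cosine_similarity_counts("adam sandler comedies", "adam sandler dramas"))
```

Titles that share more tokens (cast, director, genres) relative to the length of their feature text score closer to 1.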


In [68]:
recommend("Grown Ups")

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s5534,Movie,Sandy Wexler,Steven Brill,"Adam Sandler, Jennifer Hudson, Kevin James, Te...",United States,"April 14, 2017",2017,TV-14,131 min,Comedies,When a hapless but dedicated talent manager si...,0.483046
2,s1880,Movie,Hubie Halloween,Steve Brill,"Adam Sandler, Kevin James, Julie Bowen, Ray Li...",United States,"October 7, 2020",2020,PG-13,104 min,"Comedies, Horror Movies","Hubie's not the most popular guy in Salem, Mas...",0.439155
3,s6304,Movie,Big Daddy,Dennis Dugan,"Adam Sandler, Joey Lauren Adams, Jon Stewart, ...",United States,"October 1, 2020",1999,PG-13,93 min,Comedies,Dumped by his girlfriend because he refuses to...,0.414039
4,s6019,Movie,50 First Dates,Peter Segal,"Adam Sandler, Drew Barrymore, Rob Schneider, S...",United States,"December 1, 2020",2004,PG-13,99 min,"Comedies, Romantic Movies",After falling for a pretty art teacher who has...,0.4
5,s1618,Movie,Natalie Palamides: Nate - A One Man Show,Phil Burgers,Natalie Palamides,United States,"December 1, 2020",2020,TV-MA,60 min,Comedies,"Tough talk takes a soft turn as Nate, played b...",0.34641


In [69]:
recommend("Child's Play")

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s6416,Movie,Candyman,Bernard Rose,"Virginia Madsen, Tony Todd, Xander Berkeley, K...","United States, United Kingdom","October 1, 2019",1992,R,99 min,"Cult Movies, Horror Movies",Grad student Helen Lyle unintentionally summon...,0.276026
2,s8238,Movie,The Car,Elliot Silverstein,"James Brolin, Kathleen Lloyd, John Marley, R.G...",United States,"June 1, 2020",1977,PG,96 min,"Cult Movies, Horror Movies","In his small Southwestern town, sheriff Wade P...",0.276026
3,s797,Movie,Hostel: Part III,Scott Spiegel,"Kip Pardue, Brian Hallisay, John Hensley, Sara...",United States,"June 2, 2021",2011,R,88 min,"Cult Movies, Horror Movies",In this installment in the popular horror fran...,0.266667
4,s4525,Movie,Tales From the Hood 2,"Rusty Cundieff, Darin Scott","Keith David, Bryan Batt, Alexandria Deberry, B...",United States,"October 10, 2018",2018,R,110 min,"Cult Movies, Horror Movies, Independent Movies",Buckle up for an anthology of socially conscio...,0.266667
5,s6545,Movie,Cult of Chucky,Don Mancini,"Fiona Dourif, Michael Therriault, Adam Hurtig,...",United States,"October 3, 2017",2017,R,90 min,Horror Movies,Following a string of murders in the asylum wh...,0.266667


In [70]:
recommend('Hubie Halloween')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s28,Movie,Grown Ups,Dennis Dugan,"Adam Sandler, Kevin James, Chris Rock, David S...",United States,"September 20, 2021",2010,PG-13,103 min,Comedies,Mourning the loss of their beloved junior high...,0.439155
2,s5534,Movie,Sandy Wexler,Steven Brill,"Adam Sandler, Jennifer Hudson, Kevin James, Te...",United States,"April 14, 2017",2017,TV-14,131 min,Comedies,When a hapless but dedicated talent manager si...,0.353553
3,s4483,Movie,ADAM SANDLER 100% FRESH,Steve Brill,Adam Sandler,United States,"October 23, 2018",2018,TV-MA,74 min,Stand-Up Comedy,"From ""Heroes"" to ""Ice Cream Ladies"" – Adam San...",0.338062
4,s6019,Movie,50 First Dates,Peter Segal,"Adam Sandler, Drew Barrymore, Rob Schneider, S...",United States,"December 1, 2020",2004,PG-13,99 min,"Comedies, Romantic Movies",After falling for a pretty art teacher who has...,0.29277
5,s7518,Movie,Mr. Deeds,Steve Brill,"Adam Sandler, Winona Ryder, Peter Gallagher, J...",United States,"August 1, 2020",2002,PG-13,97 min,"Comedies, Romantic Movies","After inheriting a media empire, humble Longfe...",0.29277


In [71]:
recommend('After')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s1502,Movie,After We Collided,Roger Kumble,"Josephine Langford, Hero Fiennes Tiffin, Dylan...",United States,"December 22, 2020",2020,R,105 min,"Dramas, Romantic Movies","Tessa fell hard and fast for Hardin, but after...",0.726722
2,s5512,Movie,Rodney King,Spike Lee,Roger Guenveur Smith,United States,"April 28, 2017",2017,TV-MA,53 min,Dramas,Roger Guenveur Smith gives voice to the man at...,0.33541
3,s1331,Movie,The World We Make,Brian Baugh,"Caleb Castille, Rose Reid, Kevin Sizemore, Gre...",United States,"February 10, 2021",2019,PG,108 min,"Dramas, Romantic Movies",A teenage equestrian and a local football play...,0.333333
4,s5688,Movie,Blue Jay,Alex Lehmann,"Sarah Paulson, Mark Duplass, Clu Gulager",United States,"December 6, 2016",2016,TV-MA,81 min,"Dramas, Independent Movies, Romantic Movies",Two former high school sweethearts unexpectedl...,0.333333
5,s6391,Movie,Burlesque,Steve Antin,"Cher, Christina Aguilera, Alan Cumming, Eric D...",United States,"December 16, 2019",2010,PG-13,119 min,"Dramas, Romantic Movies","After leaving Iowa with stars in her eyes, Ali...",0.322749


In [72]:
recommend('Pianist')

Title not in dataset. Please check spelling.
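
The lookup is an exact string match, so `'Pianist'` is rejected even though `The Pianist` is in the dataset. One possible way to make the message more helpful is to suggest close matches with the standard library's `difflib`. This is a sketch only: `suggest_titles` is a hypothetical helper, and the short `titles` list stands in for `movies_df['title']`.

```python
import difflib

def suggest_titles(query, known_titles, n=3, cutoff=0.6):
    """Return up to n known titles that look similar to the query."""
    # Compare case-insensitively, but report titles in their original spelling.
    by_lower = {title.lower(): title for title in known_titles}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower), n=n, cutoff=cutoff
    )
    return [by_lower[match] for match in matches]

# Stand-in for the title column of the dataset.
titles = ["The Pianist", "Grown Ups", "Sankofa", "After"]
print(suggest_titles("Pianist", titles))
```

The `cutoff` value trades recall against noise: a lower cutoff suggests more titles, including weaker matches.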


# 7. TV Shows recommendation test

In [73]:
tv_show.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description
0,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
1,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
2,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
3,s16,TV Show,Dear White People,,"Logan Browning, Brandon P. Bell, DeRon Horton,...",United States,"September 22, 2021",2021,TV-MA,4 Seasons,"TV Comedies, TV Dramas",Students of color navigate the daily slights a...
4,s18,TV Show,Falsa identidad,,"Luis Ernesto Franco, Camila Sodi, Sergio Goyri...",Mexico,"September 22, 2021",2020,TV-MA,2 Seasons,"Crime TV Shows, Spanish-Language TV Shows, TV ...",Strangers Diego and Isabel flee their home in ...


In [74]:
recommend('Blood & Water')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s1956,TV Show,The School Nurse Files,,"Jung Yu-mi, Nam Joo-hyuk",South Korea,"September 25, 2020",2020,TV-MA,1 Season,"International TV Shows, TV Dramas, TV Mysteries",Wielding a light-up sword through the dark cor...,0.308607
2,s5039,TV Show,Re:Mind,,Keyakizaka46,Japan,"February 15, 2018",2017,TV-MA,1 Season,"International TV Shows, TV Dramas, TV Mysteries","Eleven high school classmates awaken, restrain...",0.273861
3,s4031,TV Show,Disappearance,,"Nelly Karim, Mohamed Mamdouh, Hesham Selim",Egypt,"March 8, 2019",2018,TV-MA,1 Season,"International TV Shows, TV Dramas, TV Mysteries",A university lecturer in Russia returns to Egy...,0.231455
4,s1515,TV Show,Diamond City,,"Noxee Maqashalala, Angela Sithole, Nambitha Be...",South Africa,"December 18, 2020",2019,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Dramas",A prominent prosecuting attorney must defend h...,0.226455
5,s194,TV Show,D.P.,,"Jung Hae-in, Koo Kyo-hwan, Kim Sung-kyun, Son ...",", South Korea","August 27, 2021",2021,TV-MA,1 Season,"International TV Shows, TV Dramas",A young private’s assignment to capture army d...,0.216506


In [75]:
recommend('Bridgerton')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s639,TV Show,Sex/Life,,"Sarah Shahi, Mike Vogel, Adam Demos, Margaret ...",United States,"June 25, 2021",2021,TV-MA,1 Season,"Romantic TV Shows, TV Dramas",A woman's daring sexual past collides with her...,0.288675
2,s1723,TV Show,DASH & LILY,,"Midori Francis, Austin Abrams, Dante Brown, Tr...",United States,"November 10, 2020",2020,TV-14,1 Season,"Romantic TV Shows, TV Comedies, TV Dramas",Opposites attract at Christmas as cynical Dash...,0.272166
3,s4571,TV Show,Hot Date,,"Emily Axford, Brian Murphy",United States,"October 1, 2018",2018,TV-MA,1 Season,"Romantic TV Shows, TV Comedies",Interconnected sketches and performances skewe...,0.25
4,s489,TV Show,Virgin River,,"Alexandra Breckenridge, Martin Henderson, Tim ...",United States,"July 9, 2021",2021,TV-14,3 Seasons,"Romantic TV Shows, TV Dramas","Searching for a fresh start, a nurse practitio...",0.246183
5,s5285,TV Show,No Tomorrow,,"Joshua Sasse, Tori Anderson, Jonathan Langdon,...",United States,"September 5, 2017",2016,TV-PG,1 Season,"Romantic TV Shows, TV Comedies, TV Dramas",Her straitjacketed life turned topsy-turvy by ...,0.246183


In [76]:
recommend('Kota Factory')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s3294,TV Show,Little Things,,"Dhruv Sehgal, Mithila Palkar",India,"November 9, 2019",2019,TV-MA,3 Seasons,"International TV Shows, Romantic TV Shows, TV ...",A cohabiting couple in their 20s navigate the ...,0.471405
2,s7873,TV Show,Rishta.com,,"Shruti Seth, Kavi Shastri, Siddhant Karnick, K...",India,"March 15, 2018",2010,TV-14,1 Season,"International TV Shows, Romantic TV Shows, TV ...",Partners at an Indian matrimonial agency face ...,0.408248
3,s7454,TV Show,Midnight Misadventures With Mallika Dua,,Mallika Dua,India,"April 1, 2019",2018,TV-14,1 Season,"International TV Shows, Stand-Up Comedy & Talk...","In this talk show, comedian Mallika Dua serves...",0.387298
4,s8776,TV Show,Yeh Meri Family,,"Vishesh Bansal, Mona Singh, Akarsh Khurana, Ah...",India,"August 31, 2018",2018,TV-PG,1 Season,"International TV Shows, TV Comedies","In the summer of 1998, middle child Harshu bal...",0.365148
5,s1590,TV Show,Bhaag Beanie Bhaag,,"Swara Bhasker, Dolly Singh, Ravi Patel, Varun ...",India,"December 4, 2020",2020,TV-MA,1 Season,"International TV Shows, Romantic TV Shows, TV ...","Facing disapproving parents, a knotty love lif...",0.365148


In [77]:
recommend('Breaking Bad')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s2932,TV Show,Better Call Saul,,"Bob Odenkirk, Jonathan Banks, Michael McKean, ...",United States,"February 9, 2020",2018,TV-MA,4 Seasons,"Crime TV Shows, TV Comedies, TV Dramas","This Emmy-nominated prequel to ""Breaking Bad"" ...",0.447214
2,s679,TV Show,The Assassination of Gianni Versace,,"Edgar Ramírez, Darren Criss, Ricky Martin, Pen...",United States,"June 19, 2021",2018,TV-MA,1 Season,"Crime TV Shows, TV Dramas, TV Thrillers","Defining moments in Andrew Cunanan's life, sta...",0.430331
3,s4080,TV Show,Unsolved,,"Josh Duhamel, Jimmi Simpson, Bokeem Woodbine",United States,"February 27, 2019",2018,TV-MA,1 Season,"Crime TV Shows, TV Dramas",Ride along for a dramatized version of the rea...,0.39036
4,s6842,TV Show,Get Shorty,,"Ray Romano, Chris O'Dowd",United States,"November 1, 2018",2017,TV-MA,1 Season,"Crime TV Shows, TV Comedies, TV Dramas",Organized crime enforcer Miles Daly strives to...,0.39036
5,s1981,TV Show,The Blacklist,,"James Spader, Megan Boone, Diego Klattenhoff, ...",United States,"September 18, 2020",2019,TV-14,7 Seasons,"Crime TV Shows, TV Dramas, TV Thrillers","After turning himself in, a brilliant fugitive...",0.358057


In [78]:
recommend('Peaky Blinders')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s2185,TV Show,Get Even,,"Kim Adis, Mia McKenna-Bruce, Bethany Antonia, ...",United Kingdom,"July 31, 2020",2020,TV-PG,1 Season,"British TV Shows, Crime TV Shows, Internationa...","In a secret act of skillful revenge, four priv...",0.32075
2,s1126,TV Show,Murder Maps,,Nicholas Day,United Kingdom,"April 1, 2021",2017,TV-MA,2 Seasons,"British TV Shows, Crime TV Shows, Docuseries",Dramatic reenactments paired with archival sou...,0.31427
3,s1130,TV Show,Secrets of Great British Castles,,Dan Jones,United Kingdom,"April 1, 2021",2016,TV-PG,2 Seasons,"British TV Shows, Docuseries, International TV...",Join historian Dan Jones on a journey back in ...,0.31427
4,s1318,TV Show,Nadiya Bakes,,Nadiya Hussain,United Kingdom,"February 12, 2021",2021,TV-G,1 Season,"British TV Shows, International TV Shows, Real...",Delightful cakes and heavenly breads pop from ...,0.31427
5,s1428,TV Show,Inside the World’s Toughest Prisons,,Paul Connolly,United Kingdom,"January 8, 2021",2021,TV-MA,5 Seasons,"British TV Shows, Crime TV Shows, Docuseries",Investigative journalist Paul Connolly becomes...,0.31427


In [79]:
recommend('Elite')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s110,TV Show,La casa de papel,,"Úrsula Corberó, Itziar Ituño, Álvaro Morte, Pa...",Spain,"September 3, 2021",2021,TV-MA,5 Seasons,"Crime TV Shows, International TV Shows, Spanis...",Eight thieves take hostages and lock themselve...,0.438357
2,s1433,TV Show,The Idhun Chronicles,Maite Ruiz De Austri,"Michelle Jenner, Itzan Escamilla, Sergio Mur, ...",Spain,"January 8, 2021",2021,TV-14,2 Seasons,"Anime Series, International TV Shows, Spanish-...",A boy suddenly orphaned fights his parents' ki...,0.344265
3,s5279,TV Show,Apaches,,"Alberto Ammann, Eloy Azorín, Verónica Echegui,...",Spain,"September 8, 2017",2016,TV-MA,1 Season,"Crime TV Shows, International TV Shows, Spanis...",A young journalist is forced into a life of cr...,0.344265
4,s6792,TV Show,Four Seasons in Havana,,"Jorge Perugorría, Carlos Enrique Almirante, Ma...","Spain, Cuba","December 9, 2016",2016,TV-MA,1 Season,"Crime TV Shows, International TV Shows, Spanis...","As Havana slowly revolves through the year, wi...",0.3114
5,s1841,TV Show,Someone Has to Die,Manolo Caro,"Carmen Maura, Cecilia Suárez, Ester Expósito, ...","Mexico, Spain","October 16, 2020",2020,TV-MA,1 Season,"Crime TV Shows, International TV Shows, Spanis...","In conservative 1950s Spain, the alleged relat...",0.30429


In [80]:
recommend('Narcos')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genres,description,similiarity
1,s2922,TV Show,Narcos: Mexico,,"Michael Peña, Diego Luna, Tenoch Huerta, Joaqu...","Mexico, United States","February 13, 2020",2020,TV-MA,2 Seasons,"Crime TV Shows, TV Action & Adventure, TV Dramas",Witness the birth of the Mexican drug war in t...,0.27735
2,s2416,TV Show,Queen of the South,,"Alice Braga, Veronica Falcón, Justina Machado,...","United States, Mexico, Spain, Malta","June 6, 2020",2018,TV-MA,4 Seasons,"Crime TV Shows, TV Action & Adventure, TV Dramas",Forced to work for a cartel that recently kill...,0.229081
3,s4656,TV Show,Marvel's Iron Fist,,"Finn Jones, Jessica Henwick, David Wenham, Jes...",United States,"September 7, 2018",2018,TV-MA,2 Seasons,"Crime TV Shows, TV Action & Adventure, TV Dramas",Danny Rand resurfaces 15 years after being pre...,0.225877
4,s750,TV Show,L.A.’s Finest,,"Jessica Alba, Gabrielle Union",United States,"June 9, 2021",2021,TV-MA,2 Seasons,"Crime TV Shows, TV Action & Adventure, TV Come...","In this spinoff of the ""Bad Boys"" franchise, t...",0.21598
5,s4080,TV Show,Unsolved,,"Josh Duhamel, Jimmi Simpson, Bokeem Woodbine",United States,"February 27, 2019",2018,TV-MA,1 Season,"Crime TV Shows, TV Dramas",Ride along for a dramatized version of the rea...,0.21598


# 8. Data visualization using plotly

In [81]:
import plotly.graph_objects as go

In [82]:
def Table(df, title="Top 5 Recommendations"):
    # Render the recommendations as a five-column Plotly table
    fig = go.Figure(data=[go.Table(
        columnorder=[1, 2, 3, 4, 5],
        columnwidth=[20, 20, 20, 30, 50],
        header=dict(values=['Type', 'Title', 'Country', 'Genre(s)', 'Description'],
                    line_color='black', font=dict(color='black', family="Gravitas One", size=20), height=40,
                    fill_color='#FF6865',
                    align='center'),
        cells=dict(values=[df.type, df.title, df.country, df.genres, df.description],
                   font=dict(color='black', family="Lato", size=16),
                   fill_color='#FFB3B2',
                   align='left'))
    ])

    # The title is a parameter because recommend() returns TV shows as well as movies
    fig.update_layout(height=700,
                      title={'text': title, 'font': {'size': 22, 'family': 'Gravitas One'}},
                      title_x=0.5
                      )
    fig.show()

In [84]:
Table(recommend('Narcos'))
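
When a title is missing, `recommend` prints a message and returns nothing, so passing its result straight to `Table` would fail on the attribute access. A small guard can skip rendering in that case. This is a sketch under that assumption; `render_recommendations` is a hypothetical wrapper, and the two function arguments stand in for `recommend` and `Table`.

```python
def render_recommendations(title, recommend_fn, render_fn):
    """Render recommendations for a title; return False if none were found."""
    result = recommend_fn(title)
    if result is None:
        # recommend_fn has already reported the missing title.
        return False
    render_fn(result)
    return True

# Stand-in functions so the sketch runs without the notebook's data.
def fake_recommend(title):
    return ["Narcos: Mexico"] if title == "Narcos" else None

rendered = []
print(render_recommendations("Narcos", fake_recommend, rendered.append))
print(render_recommendations("Pianist", fake_recommend, rendered.append))
```

In the notebook this would be called as `render_recommendations('Narcos', recommend, Table)`.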