# Netflix Recommendation System using Python

Netflix is a subscription-based streaming platform that lets users watch movies and TV shows without advertisements. One reason for its popularity is its recommendation system, which suggests movies and TV shows based on each user’s interests. If you are a data science student and want to learn how to create a Netflix recommendation system, this article is for you. It takes you through building one using Python.

## How the Netflix Recommendation System Works
The recommendation system of Netflix shows you movies and TV shows according to your interests. Netflix has a lot of data because of its user base. Its recommendation system predicts a personalised catalogue for you based on factors like:

- your viewing history
- the viewing history of other users whose tastes and preferences are similar to yours
- the genres, category, description, and other information about the content that you watched in the past

Genre is one of the most valuable of these factors, because it lets Netflix recommend content even to new users who have no viewing history yet. The section below takes you through building a Netflix recommendation system using Python.
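
To make the genre idea concrete before touching the dataset, here is a minimal sketch that scores titles by how many genres they share. The titles and genres come from the dataset preview shown later in this article; the helper name `genre_overlap` and the use of Jaccard overlap are my own assumptions for illustration, not how Netflix actually scores content.

```python
# Toy catalogue built from the dataset preview; genre_overlap is a hypothetical helper
catalogue = {
    "#Alive": {"horror movies", "international movies", "thrillers"},
    "#AnneFrank - Parallel Stories": {"documentaries", "international movies"},
    "#cats_the_mewvie": {"documentaries", "international movies"},
    "(Un)Well": {"reality tv"},
}

def genre_overlap(a, b):
    """Jaccard overlap: number of shared genres divided by all genres of both titles."""
    return len(a & b) / len(a | b)

query = "#cats_the_mewvie"

# Score every other title against the query title
scores = {
    title: genre_overlap(catalogue[query], genres)
    for title, genres in catalogue.items()
    if title != query
}

# The title with the largest overlap is the best genre match
best = max(scores, key=scores.get)
print(best, scores[best])
```

A title with identical genres gets a score of 1, and a title with no genre in common gets 0. The rest of the article uses the same idea with word counts and cosine similarity instead of sets.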

In [2]:
import numpy as np
import pandas as pd
from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import cosine_similarity

df = pd.read_csv("netflixData.csv")

In [3]:
df

Unnamed: 0,Show Id,Title,Description,Director,Genres,Cast,Production Country,Release Date,Rating,Duration,Imdb Score,Content Type,Date Added
0,cc1b6ed9-cf9e-4057-8303-34577fb54477,(Un)Well,This docuseries takes a deep dive into the luc...,,Reality TV,,United States,2020.0,TV-MA,1 Season,6.6/10,TV Show,
1,e2ef4e91-fb25-42ab-b485-be8e3b23dedb,#Alive,"As a grisly virus rampages a city, a lone man ...",Cho Il,"Horror Movies, International Movies, Thrillers","Yoo Ah-in, Park Shin-hye",South Korea,2020.0,TV-MA,99 min,6.2/10,Movie,"September 8, 2020"
2,b01b73b7-81f6-47a7-86d8-acb63080d525,#AnneFrank - Parallel Stories,"Through her diary, Anne Frank's story is retol...","Sabina Fedeli, Anna Migotto","Documentaries, International Movies","Helen Mirren, Gengher Gatti",Italy,2019.0,TV-14,95 min,6.4/10,Movie,"July 1, 2020"
3,b6611af0-f53c-4a08-9ffa-9716dc57eb9c,#blackAF,Kenya Barris and his family navigate relations...,,TV Comedies,"Kenya Barris, Rashida Jones, Iman Benson, Genn...",United States,2020.0,TV-MA,1 Season,6.6/10,TV Show,
4,7f2d4170-bab8-4d75-adc2-197f7124c070,#cats_the_mewvie,This pawesome documentary explores how our fel...,Michael Margolis,"Documentaries, International Movies",,Canada,2020.0,TV-14,90 min,5.1/10,Movie,"February 5, 2020"
...,...,...,...,...,...,...,...,...,...,...,...,...,...
5962,62b8b682-f191-4c10-aa04-32319329bd8d,الف مبروك,"On his wedding day, an arrogant, greedy accoun...",Ahmed Nader Galal,"Comedies, Dramas, International Movies","Ahmed Helmy, Laila Ezz El Arab, Mahmoud El Fis...",Egypt,2009.0,TV-14,115 min,7.4/10,Movie,"April 25, 2020"
5963,5bed77ab-5e31-4216-8b51-44c9a35442e6,دفعة القاهرة,A group of women leaves Kuwait to attend unive...,,"International TV Shows, TV Dramas","Bashar al-Shatti, Fatima Al Safi, Maram Baloch...",,2019.0,TV-14,1 Season,,TV Show,
5964,4661ec0c-8692-4661-bc76-a96412b311fd,海的儿子,"Two brothers start a new life in Singapore, wh...",,"International TV Shows, TV Dramas","Li Nanxing, Christopher Lee, Jesseca Liu, Appl...",,2016.0,TV-14,1 Season,,TV Show,
5965,145c93a7-1924-403c-a933-4ede8ad66f26,반드시 잡는다,After people in his town start turning up dead...,Hong-seon Kim,"Dramas, International Movies, Thrillers",Baek Yoon-sik,South Korea,2017.0,TV-MA,110 min,6.5/10,Movie,"February 28, 2018"


In [4]:
df.isnull().sum()

Unnamed: 0,0
Show Id,0
Title,0
Description,0
Director,2064
Genres,0
Cast,530
Production Country,559
Release Date,3
Rating,4
Duration,3


In [5]:
df = df[["Title", "Description", "Content Type", "Genres"]]

In [6]:
df

Unnamed: 0,Title,Description,Content Type,Genres
0,(Un)Well,This docuseries takes a deep dive into the luc...,TV Show,Reality TV
1,#Alive,"As a grisly virus rampages a city, a lone man ...",Movie,"Horror Movies, International Movies, Thrillers"
2,#AnneFrank - Parallel Stories,"Through her diary, Anne Frank's story is retol...",Movie,"Documentaries, International Movies"
3,#blackAF,Kenya Barris and his family navigate relations...,TV Show,TV Comedies
4,#cats_the_mewvie,This pawesome documentary explores how our fel...,Movie,"Documentaries, International Movies"
...,...,...,...,...
5962,الف مبروك,"On his wedding day, an arrogant, greedy accoun...",Movie,"Comedies, Dramas, International Movies"
5963,دفعة القاهرة,A group of women leaves Kuwait to attend unive...,TV Show,"International TV Shows, TV Dramas"
5964,海的儿子,"Two brothers start a new life in Singapore, wh...",TV Show,"International TV Shows, TV Dramas"
5965,반드시 잡는다,After people in his town start turning up dead...,Movie,"Dramas, International Movies, Thrillers"


In [7]:
df=df.dropna()

In [8]:
df

Unnamed: 0,Title,Description,Content Type,Genres
0,(Un)Well,This docuseries takes a deep dive into the luc...,TV Show,Reality TV
1,#Alive,"As a grisly virus rampages a city, a lone man ...",Movie,"Horror Movies, International Movies, Thrillers"
2,#AnneFrank - Parallel Stories,"Through her diary, Anne Frank's story is retol...",Movie,"Documentaries, International Movies"
3,#blackAF,Kenya Barris and his family navigate relations...,TV Show,TV Comedies
4,#cats_the_mewvie,This pawesome documentary explores how our fel...,Movie,"Documentaries, International Movies"
...,...,...,...,...
5962,الف مبروك,"On his wedding day, an arrogant, greedy accoun...",Movie,"Comedies, Dramas, International Movies"
5963,دفعة القاهرة,A group of women leaves Kuwait to attend unive...,TV Show,"International TV Shows, TV Dramas"
5964,海的儿子,"Two brothers start a new life in Singapore, wh...",TV Show,"International TV Shows, TV Dramas"
5965,반드시 잡는다,After people in his town start turning up dead...,Movie,"Dramas, International Movies, Thrillers"


In [9]:
df.isnull().sum()

Unnamed: 0,0
Title,0
Description,0
Content Type,0
Genres,0


In [10]:
df['Description'][0]

'This docuseries takes a deep dive into the lucrative wellness industry, which touts health and healing. But do the products live up to the promises?'

In [11]:
df['Genres'] = df['Genres'].str.lower()
# pandas 2.0 and later treat the pattern as a literal string unless regex=True is passed
df['Genres'] = df['Genres'].str.replace(r'https?://\S+|www\.\S+', ' ', regex=True)  # URLs
df['Genres'] = df['Genres'].str.replace(r'<.*?>+', ' ', regex=True)  # HTML tags
df['Genres'] = df['Genres'].str.replace(r'\d+', '', regex=True)  # digits
df['Genres'] = df['Genres'].str.replace("'", '', regex=False)  # apostrophes
df['Genres'] = df['Genres'].str.replace('\n', '', regex=False)  # newlines
df['Genres'] = df['Genres'].str.replace('\r', '', regex=False)  # carriage returns
df['Genres'] = df['Genres'].str.replace(r'\s+', ' ', regex=True)  # repeated whitespace
# The commas between genres stay in place; the TextBlob tokenizer used later drops punctuation

In [12]:
df

Unnamed: 0,Title,Description,Content Type,Genres
0,(Un)Well,This docuseries takes a deep dive into the luc...,TV Show,reality tv
1,#Alive,"As a grisly virus rampages a city, a lone man ...",Movie,"horror movies, international movies, thrillers"
2,#AnneFrank - Parallel Stories,"Through her diary, Anne Frank's story is retol...",Movie,"documentaries, international movies"
3,#blackAF,Kenya Barris and his family navigate relations...,TV Show,tv comedies
4,#cats_the_mewvie,This pawesome documentary explores how our fel...,Movie,"documentaries, international movies"
...,...,...,...,...
5962,الف مبروك,"On his wedding day, an arrogant, greedy accoun...",Movie,"comedies, dramas, international movies"
5963,دفعة القاهرة,A group of women leaves Kuwait to attend unive...,TV Show,"international tv shows, tv dramas"
5964,海的儿子,"Two brothers start a new life in Singapore, wh...",TV Show,"international tv shows, tv dramas"
5965,반드시 잡는다,After people in his town start turning up dead...,Movie,"dramas, international movies, thrillers"
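
To see what this kind of cleaning does to a single string, here is a small sketch with Python's built-in `re` module. The function name `clean_text` and the sample string are made up for illustration. It also shows why the regex flag matters: a literal replacement leaves a pattern such as `\d+` untouched, which is how `Series.str.replace` behaves in pandas 2.0 and later unless `regex=True` is passed.

```python
import re

def clean_text(text):
    """Lowercase a string and strip URLs, HTML tags, digits and extra whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"<.*?>+", " ", text)                  # HTML tags
    text = re.sub(r"\d+", "", text)                      # digits
    text = re.sub(r"\s+", " ", text)                     # collapse runs of whitespace
    return text.strip()

# Hypothetical messy input, not taken from the dataset
sample = "Horror Movies,\n <b>International</b> Movies 2020  www.example.com"
cleaned = clean_text(sample)
print(cleaned)

# A literal replacement does not interpret the pattern, so the digit survives
literal = "season 2".replace(r"\d+", "")
print(literal)
```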


In [13]:
from textblob import TextBlob
from nltk.stem import PorterStemmer
pr=PorterStemmer()

In [14]:
from sklearn.feature_extraction.text import CountVectorizer

In [15]:
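def lemmafn(text):
    # Despite its name, this helper stems rather than lemmatizes:
    # TextBlob splits the text into words and PorterStemmer reduces each word to its stem
    words = TextBlob(text).words
    return [pr.stem(word) for word in words]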

In [16]:
!python -m textblob.download_corpora

[nltk_data] Downloading package brown to /root/nltk_data...
[nltk_data]   Unzipping corpora/brown.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.
[nltk_data] Downloading package conll2000 to /root/nltk_data...
[nltk_data]   Unzipping corpora/conll2000.zip.
[nltk_data] Downloading package movie_reviews to /root/nltk_data...
[nltk_data]   Unzipping corpora/movie_reviews.zip.
Finished.


In [18]:
x =df["Genres"].tolist()

In [19]:
# With a callable analyzer, scikit-learn ignores ngram_range, so only single stemmed tokens are counted
vect = CountVectorizer(max_features=10000, analyzer=lemmafn)

In [20]:
x=vect.fit_transform(x)
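
If you are wondering what the vectorizer produces, the sketch below builds the same kind of count matrix by hand with the standard library. It is a simplified illustration under my own assumptions: tokens are split on whitespace and not stemmed, and the three genre strings are a hand-picked sample.

```python
from collections import Counter

# Three genre strings in the style of the cleaned dataset
docs = [
    "reality tv",
    "horror movies international movies thrillers",
    "documentaries international movies",
]

# Tokenise on whitespace (the analyzer in this article also stems each token)
tokenised = [doc.split() for doc in docs]

# Vocabulary: every distinct token, sorted so that each one gets a fixed column
vocabulary = sorted({token for tokens in tokenised for token in tokens})

# One row per document, one column per vocabulary entry, each cell is a token count
matrix = []
for tokens in tokenised:
    counts = Counter(tokens)
    matrix.append([counts[token] for token in vocabulary])

print(vocabulary)
for row in matrix:
    print(row)
```

Each row is the numeric fingerprint of one title's genres. Titles with similar genres end up with rows that point in a similar direction, which is what the similarity step measures.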



In [21]:
from sklearn.metrics.pairwise import cosine_similarity

In [22]:
sim = cosine_similarity(x)
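
Cosine similarity compares two count vectors by the angle between them: the dot product of the vectors divided by the product of their lengths. Below is a minimal pure-Python sketch of that formula. The function name `cosine` and the three example vectors are assumptions for illustration, and returning 0.0 for an all-zero vector is a convention chosen for this sketch.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for empty vectors in this sketch
    return dot / (norm_a * norm_b)

# Hypothetical count vectors over a seven-word vocabulary
horror = [0, 1, 1, 2, 0, 1, 0]
documentary = [1, 0, 1, 1, 0, 0, 0]
reality = [0, 0, 0, 0, 1, 0, 1]

vectors = [horror, documentary, reality]

# Pairwise similarity matrix: entry [i][j] compares vector i with vector j
sim_matrix = [[cosine(a, b) for b in vectors] for a in vectors]
for row in sim_matrix:
    print([round(value, 3) for value in row])
```

Vectors that share no words score 0, and a vector compared with itself scores 1, so the diagonal of the matrix is always 1.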

In [23]:
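# Map each title to its row position in df (and therefore in sim);
# when a title occurs more than once, keep its first occurrence
indices = pd.Series(range(len(df)), index=df['Title'])
indices = indices[~indices.index.duplicated(keep='first')]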

In [24]:
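def netFlix_recommendation(title, similarity=sim):
    # Look up the row number stored for the requested title
    index = indices[title]
    # Pair every row position with its similarity to the requested title
    similarity_scores = list(enumerate(similarity[index]))
    # Rank by similarity, highest first
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # Keep the ten best matches; the requested title itself scores 1.0, so it can appear among them
    similarity_scores = similarity_scores[0:10]
    movieindices = [i[0] for i in similarity_scores]
    return df['Title'].iloc[movieindices]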

In [26]:
netFlix_recommendation("(Un)Well")

Unnamed: 0,Title
0,(Un)Well
68,60 Days In
305,Alone
322,America's Next Top Model
406,Are You The One
468,Awake: The Million Dollar Game
615,Best Leftovers Ever!
694,Black Ink Crew New York
720,Bling Empire
843,Buried by the Bernards
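
Notice that the first result is the title we asked about. If you would rather leave it out, one possible variation is sketched below in plain Python. Everything in it is an assumption for illustration: the four titles, the similarity values, and the names `position` and `recommend` are not part of the code above.

```python
# Hypothetical titles and similarity scores, one row and one column per title
titles = ["(Un)Well", "60 Days In", "#Alive", "#blackAF"]
similarity = [
    [1.0, 1.0, 0.0, 0.5],
    [1.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0, 1.0],
]

# Title -> row position, keeping the first occurrence of a repeated title
position = {}
for i, name in enumerate(titles):
    position.setdefault(name, i)

def recommend(title, top_n=2):
    """Return the top_n most similar titles, excluding the title itself."""
    if title not in position:
        return []  # unknown title: nothing to recommend
    query = position[title]
    row = similarity[query]
    # Rank every other row position by its similarity to the query, highest first
    candidates = [i for i in range(len(titles)) if i != query]
    ranked = sorted(candidates, key=lambda i: row[i], reverse=True)
    return [titles[i] for i in ranked[:top_n]]

print(recommend("(Un)Well"))
print(recommend("Not In Catalogue"))
```

Returning an empty list for an unknown title is a design choice in this sketch; raising an error with a helpful message would work just as well.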
