# Movie Recommender System

### Content Based Filtering
Content-based filtering relies on similarities between the items themselves.
For a content-based model, we first need to measure how similar the movies are to one another, based on the features or attributes of each movie. We can then recommend movies similar to one the user is already interested in.
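As a rough illustration of the idea, the sketch below ranks movies by how many features they share. The catalogue, the feature names, and the helper functions are all made up for this example, and Jaccard overlap is used only as a simple stand-in for the cosine similarity computed later in this notebook.

```python
# Hypothetical toy catalogue: each movie is described by a set of features.
catalogue = {
    "Space Saga": {"scifi", "adventure", "aliens"},
    "Star Quest": {"scifi", "adventure", "robots"},
    "Love Letters": {"romance", "drama"},
}

def jaccard(a, b):
    """Share of features two items have in common (0 = none, 1 = identical)."""
    return len(a & b) / len(a | b)

def most_similar(title, catalogue):
    """Rank every other movie by feature overlap with `title`."""
    target = catalogue[title]
    scores = {
        other: jaccard(target, features)
        for other, features in catalogue.items()
        if other != title
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = most_similar("Space Saga", catalogue)
print(ranking)
```

The movie sharing two of four distinct features ranks first, and the movie with no shared features ranks last.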


In [246]:
#Importing Libraries
import numpy as np
import pandas as pd
import ast
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


##### Loading Datasets
The movies dataset contains columns such as budget, genres, title, id, keywords, overview, popularity, and runtime.
The credits dataset consists of movie_id, title, cast, and crew.

In [247]:
movies=pd.read_csv('tmdb_5000_movies.csv')
credits=pd.read_csv('tmdb_5000_credits.csv')


In [248]:
movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800


In [249]:
credits.head()

Unnamed: 0,movie_id,title,cast,crew
0,19995,Avatar,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,Pirates of the Caribbean: At World's End,"[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,Spectre,"[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,The Dark Knight Rises,"[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,John Carter,"[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."


##### Merging the movies and credits dataset

In [250]:
movies=movies.merge(credits,on='title')
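Merging on `title` assumes that titles are unique. When two different movies share a title, every matching pair produces a row, so the merged table can end up with more rows than either input. The sketch below shows the effect on made-up data (`movies_toy` and `credits_toy` are hypothetical); merging on the id columns is one possible alternative, not what this notebook does.

```python
import pandas as pd

# Hypothetical miniature tables; "Twins" is a title shared by two different movies.
movies_toy = pd.DataFrame({"id": [1, 2, 3], "title": ["Avatar", "Twins", "Twins"]})
credits_toy = pd.DataFrame({"movie_id": [1, 2, 3], "title": ["Avatar", "Twins", "Twins"]})

# Merge on the (non-unique) title: each "Twins" row matches both credits rows.
by_title = movies_toy.merge(credits_toy, on="title")

# Merge on the unique id instead: one row per movie.
by_id = movies_toy.merge(credits_toy, left_on="id", right_on="movie_id")

print(len(by_title), len(by_id))
```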

In [251]:
movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,movie_id,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,19995,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."


In [252]:
movies.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4809 entries, 0 to 4808
Data columns (total 23 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   budget                4809 non-null   int64  
 1   genres                4809 non-null   object 
 2   homepage              1713 non-null   object 
 3   id                    4809 non-null   int64  
 4   keywords              4809 non-null   object 
 5   original_language     4809 non-null   object 
 6   original_title        4809 non-null   object 
 7   overview              4806 non-null   object 
 8   popularity            4809 non-null   float64
 9   production_companies  4809 non-null   object 
 10  production_countries  4809 non-null   object 
 11  release_date          4808 non-null   object 
 12  revenue               4809 non-null   int64  
 13  runtime               4807 non-null   float64
 14  spoken_languages      4809 non-null   object 
 15  status               

##### Extracting necessary columns from merged dataset

In [253]:
movies=movies[['movie_id','overview','title','genres','keywords','cast','crew']]


In [254]:
movies.isnull().sum()

movie_id    0
overview    3
title       0
genres      0
keywords    0
cast        0
crew        0
dtype: int64

Dropping the rows that contain a null value (here, the 3 movies without an overview)

In [255]:
movies.dropna(inplace=True)

In [256]:
movies.isnull().sum()


movie_id    0
overview    0
title       0
genres      0
keywords    0
cast        0
crew        0
dtype: int64

In [257]:
movies.duplicated().sum()

0

In [258]:
movies.iloc[0].genres

'[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}, {"id": 14, "name": "Fantasy"}, {"id": 878, "name": "Science Fiction"}]'

In [259]:
# Function to extract the 'name' values from a stringified list of dicts (used for genres and keywords)
def convert(obj):
    L = []
    for i in ast.literal_eval(obj):
        L.append(i['name'])
    return L
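To make the parsing step concrete, here is a small self-contained version applied to a shortened genres string. `ast.literal_eval` turns the text into a real Python list of dicts; `json.loads` is shown only as an alternative that gives the same result for this input, on the assumption that the column holds valid JSON.

```python
import ast
import json

def convert(obj):
    """Parse a stringified list of dicts and collect each 'name' value."""
    return [item["name"] for item in ast.literal_eval(obj)]

raw = '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'

names = convert(raw)
names_via_json = [item["name"] for item in json.loads(raw)]
print(names)
```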


In [260]:
movies['genres']=movies['genres'].apply(convert)

In [261]:
movies['keywords']=movies['keywords'].apply(convert)

In [262]:
# Function to get the names of the first three cast members of a movie
def get_cast(obj):
    return [member['name'] for member in ast.literal_eval(obj)[:3]]


movies['cast']=movies['cast'].apply(get_cast)

In [263]:
movies.head()

Unnamed: 0,movie_id,overview,title,genres,keywords,cast,crew
0,19995,"In the 22nd century, a paraplegic Marine is di...",Avatar,"[Action, Adventure, Fantasy, Science Fiction]","[culture clash, future, space war, space colon...","[Sam Worthington, Zoe Saldana, Sigourney Weaver]","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,285,"Captain Barbossa, long believed to be dead, ha...",Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]","[ocean, drug abuse, exotic island, east india ...","[Johnny Depp, Orlando Bloom, Keira Knightley]","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,206647,A cryptic message from Bond’s past sends him o...,Spectre,"[Action, Adventure, Crime]","[spy, based on novel, secret agent, sequel, mi...","[Daniel Craig, Christoph Waltz, Léa Seydoux]","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,49026,Following the death of District Attorney Harve...,The Dark Knight Rises,"[Action, Crime, Drama, Thriller]","[dc comics, crime fighter, terrorist, secret i...","[Christian Bale, Michael Caine, Gary Oldman]","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,49529,"John Carter is a war-weary, former military ca...",John Carter,"[Action, Adventure, Science Fiction]","[based on novel, mars, medallion, space travel...","[Taylor Kitsch, Lynn Collins, Samantha Morton]","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."


In [264]:
#Function to get the name of the director from crew member list
def get_director(obj):
    L=[]
    for i in ast.literal_eval(obj):
        if i['job']=='Director':
            L.append(i['name'])
            break
    return L

In [265]:
movies['crew']=movies['crew'].apply(get_director)

In [266]:
movies.head()

Unnamed: 0,movie_id,overview,title,genres,keywords,cast,crew
0,19995,"In the 22nd century, a paraplegic Marine is di...",Avatar,"[Action, Adventure, Fantasy, Science Fiction]","[culture clash, future, space war, space colon...","[Sam Worthington, Zoe Saldana, Sigourney Weaver]",[James Cameron]
1,285,"Captain Barbossa, long believed to be dead, ha...",Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]","[ocean, drug abuse, exotic island, east india ...","[Johnny Depp, Orlando Bloom, Keira Knightley]",[Gore Verbinski]
2,206647,A cryptic message from Bond’s past sends him o...,Spectre,"[Action, Adventure, Crime]","[spy, based on novel, secret agent, sequel, mi...","[Daniel Craig, Christoph Waltz, Léa Seydoux]",[Sam Mendes]
3,49026,Following the death of District Attorney Harve...,The Dark Knight Rises,"[Action, Crime, Drama, Thriller]","[dc comics, crime fighter, terrorist, secret i...","[Christian Bale, Michael Caine, Gary Oldman]",[Christopher Nolan]
4,49529,"John Carter is a war-weary, former military ca...",John Carter,"[Action, Adventure, Science Fiction]","[based on novel, mars, medallion, space travel...","[Taylor Kitsch, Lynn Collins, Samantha Morton]",[Andrew Stanton]


In [267]:
movies['overview'] = movies['overview'].apply(lambda x:x.split()) # Splitting the overview into a list of words

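Removing the spaces inside multi-word entries (for example "Science Fiction" becomes "ScienceFiction"), so that each genre, keyword, or name is treated as a single token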

In [269]:
movies['genres']=movies['genres'].apply(lambda x:[i.replace(" ","")for i in x])
movies['keywords']=movies['keywords'].apply(lambda x:[i.replace(" ","")for i in x])
movies['cast']=movies['cast'].apply(lambda x:[i.replace(" ","")for i in x])
movies['crew']=movies['crew'].apply(lambda x:[i.replace(" ","")for i in x])
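The reason for joining the words can be seen with a two-name sketch. With spaces, a tokenizer splits each name into separate words, so two unrelated people who share a first name also share a token; without spaces, each name stays distinct. The example uses `CountVectorizer` with default settings on made-up input.

```python
from sklearn.feature_extraction.text import CountVectorizer

with_spaces = ["sam worthington", "sam mendes"]
without_spaces = ["samworthington", "sammendes"]

# The learned vocabulary shows which tokens the vectorizer sees.
vocab_spaces = sorted(CountVectorizer().fit(with_spaces).vocabulary_)
vocab_joined = sorted(CountVectorizer().fit(without_spaces).vocabulary_)

print(vocab_spaces)   # 'sam' is a token shared by both names
print(vocab_joined)   # one token per person
```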

##### Cleaned Dataset

In [270]:
movies.head()

Unnamed: 0,movie_id,overview,title,genres,keywords,cast,crew
0,19995,"[In, the, 22nd, century,, a, paraplegic, Marin...",Avatar,"[Action, Adventure, Fantasy, ScienceFiction]","[cultureclash, future, spacewar, spacecolony, ...","[SamWorthington, ZoeSaldana, SigourneyWeaver]",[JamesCameron]
1,285,"[Captain, Barbossa,, long, believed, to, be, d...",Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]","[ocean, drugabuse, exoticisland, eastindiatrad...","[JohnnyDepp, OrlandoBloom, KeiraKnightley]",[GoreVerbinski]
2,206647,"[A, cryptic, message, from, Bond’s, past, send...",Spectre,"[Action, Adventure, Crime]","[spy, basedonnovel, secretagent, sequel, mi6, ...","[DanielCraig, ChristophWaltz, LéaSeydoux]",[SamMendes]
3,49026,"[Following, the, death, of, District, Attorney...",The Dark Knight Rises,"[Action, Crime, Drama, Thriller]","[dccomics, crimefighter, terrorist, secretiden...","[ChristianBale, MichaelCaine, GaryOldman]",[ChristopherNolan]
4,49529,"[John, Carter, is, a, war-weary,, former, mili...",John Carter,"[Action, Adventure, ScienceFiction]","[basedonnovel, mars, medallion, spacetravel, p...","[TaylorKitsch, LynnCollins, SamanthaMorton]",[AndrewStanton]


Creating a new column 'tags' that combines all the important information about each movie (overview, genres, keywords, cast, and director)

In [272]:
movies['tags']=movies['overview'] + movies['genres'] + movies['keywords'] + movies['cast'] + movies['crew']


In [273]:
movies.head()

Unnamed: 0,movie_id,overview,title,genres,keywords,cast,crew,tags
0,19995,"[In, the, 22nd, century,, a, paraplegic, Marin...",Avatar,"[Action, Adventure, Fantasy, ScienceFiction]","[cultureclash, future, spacewar, spacecolony, ...","[SamWorthington, ZoeSaldana, SigourneyWeaver]",[JamesCameron],"[In, the, 22nd, century,, a, paraplegic, Marin..."
1,285,"[Captain, Barbossa,, long, believed, to, be, d...",Pirates of the Caribbean: At World's End,"[Adventure, Fantasy, Action]","[ocean, drugabuse, exoticisland, eastindiatrad...","[JohnnyDepp, OrlandoBloom, KeiraKnightley]",[GoreVerbinski],"[Captain, Barbossa,, long, believed, to, be, d..."
2,206647,"[A, cryptic, message, from, Bond’s, past, send...",Spectre,"[Action, Adventure, Crime]","[spy, basedonnovel, secretagent, sequel, mi6, ...","[DanielCraig, ChristophWaltz, LéaSeydoux]",[SamMendes],"[A, cryptic, message, from, Bond’s, past, send..."
3,49026,"[Following, the, death, of, District, Attorney...",The Dark Knight Rises,"[Action, Crime, Drama, Thriller]","[dccomics, crimefighter, terrorist, secretiden...","[ChristianBale, MichaelCaine, GaryOldman]",[ChristopherNolan],"[Following, the, death, of, District, Attorney..."
4,49529,"[John, Carter, is, a, war-weary,, former, mili...",John Carter,"[Action, Adventure, ScienceFiction]","[basedonnovel, mars, medallion, spacetravel, p...","[TaylorKitsch, LynnCollins, SamanthaMorton]",[AndrewStanton],"[John, Carter, is, a, war-weary,, former, mili..."


##### New dataset with movie_id, title and tags of movies

In [274]:
new_movie_df=movies[['movie_id','title','tags']].copy() # copy so later column assignments do not act on a slice of movies
new_movie_df.head()

Unnamed: 0,movie_id,title,tags
0,19995,Avatar,"[In, the, 22nd, century,, a, paraplegic, Marin..."
1,285,Pirates of the Caribbean: At World's End,"[Captain, Barbossa,, long, believed, to, be, d..."
2,206647,Spectre,"[A, cryptic, message, from, Bond’s, past, send..."
3,49026,The Dark Knight Rises,"[Following, the, death, of, District, Attorney..."
4,49529,John Carter,"[John, Carter, is, a, war-weary,, former, mili..."


In [275]:
new_movie_df['tags']=new_movie_df['tags'].apply(lambda x:" ".join(x))



In [276]:
new_movie_df['tags'][0]

'In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization. Action Adventure Fantasy ScienceFiction cultureclash future spacewar spacecolony society spacetravel futuristic romance space alien tribe alienplanet cgi marine soldier battle loveaffair antiwar powerrelations mindandsoul 3d SamWorthington ZoeSaldana SigourneyWeaver JamesCameron'

Converting the words in the tags column to lowercase

In [277]:
new_movie_df['tags']=new_movie_df['tags'].apply(lambda x:x.lower())



In [278]:
new_movie_df.head()

Unnamed: 0,movie_id,title,tags
0,19995,Avatar,"in the 22nd century, a paraplegic marine is di..."
1,285,Pirates of the Caribbean: At World's End,"captain barbossa, long believed to be dead, ha..."
2,206647,Spectre,a cryptic message from bond’s past sends him o...
3,49026,The Dark Knight Rises,following the death of district attorney harve...
4,49529,John Carter,"john carter is a war-weary, former military ca..."


Importing the Porter stemmer from nltk

In [145]:
from nltk.stem.porter import PorterStemmer
ps=PorterStemmer()

In [279]:
def stem(text):
    # Helper function: reduce each word to its stem so that inflected forms share one token
    return " ".join(ps.stem(word) for word in text.split())
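A quick check of what stemming does, using three inflected forms of the same verb. This sketch assumes nltk is installed; the Porter stemmer is rule-based and needs no extra corpus download.

```python
from nltk.stem.porter import PorterStemmer

ps = PorterStemmer()

def stem(text):
    """Stem every whitespace-separated word and join the results."""
    return " ".join(ps.stem(word) for word in text.split())

# Three inflections of one verb should collapse to a single stem.
stemmed = stem("loved loving loves")
print(stemmed)
```

Because the three forms collapse to one token, two tag strings that use different forms of a word still count as sharing it.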



In [280]:
new_movie_df['tags']=new_movie_df['tags'].apply(stem)



In [148]:
cv = CountVectorizer(max_features=8000,stop_words='english') # bag-of-words counts: keep the 8000 most frequent terms and drop English stop words

In [149]:
vectors=cv.fit_transform(new_movie_df['tags']).toarray()

In [150]:
cv.get_feature_names() # removed in scikit-learn 1.2; newer versions use cv.get_feature_names_out()



['000',
 '007',
 '10',
 '100',
 '10th',
 '11',
 '12',
 '12th',
 '13',
 '14',
 '15',
 '150',
 '15th',
 '16',
 '16th',
 '17',
 '17th',
 '18',
 '1863',
 '1890',
 '18th',
 '18thcenturi',
 '19',
 '1910',
 '1920',
 '1927',
 '1930',
 '1930s',
 '1937',
 '1940',
 '1940s',
 '1941',
 '1944',
 '1945',
 '1950',
 '1950s',
 '1955',
 '1959',
 '1960',
 '1960s',
 '1962',
 '1964',
 '1965',
 '1967',
 '1969',
 '1970',
 '1970s',
 '1971',
 '1972',
 '1973',
 '1974',
 '1976',
 '1977',
 '1979',
 '1980',
 '1980s',
 '1984',
 '1985',
 '1986',
 '1987',
 '1990',
 '1994',
 '1995',
 '1996',
 '1997',
 '1999',
 '19th',
 '19thcenturi',
 '20',
 '200',
 '2000',
 '2001',
 '2002',
 '2003',
 '2004',
 '2007',
 '2008',
 '2009',
 '2011',
 '2012',
 '20th',
 '21st',
 '21stcenturi',
 '22nd',
 '23',
 '24',
 '25',
 '27',
 '28',
 '29',
 '30',
 '300',
 '35',
 '3d',
 '40',
 '400',
 '47',
 '50',
 '500',
 '51',
 '60',
 '60s',
 '70',
 '7th',
 '80',
 'aaron',
 'aaroneckhart',
 'aarontaylor',
 'abandon',
 'abbi',
 'abbiecornish',
 'abduct',


In [151]:
cosine_similarity(vectors).shape


(4806, 4806)

In [152]:
similarity=cosine_similarity(vectors) # finding the cosine similarity between every pair of movie vectors
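Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures the angle between them rather than their magnitude. The NumPy sketch below computes the same quantity by hand on three small made-up count vectors; `cosine_sim_matrix` is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np

def cosine_sim_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid dividing by zero for all-zero rows
    unit = X / norms                 # scale every row to length 1
    return unit @ unit.T             # dot products of unit vectors

vectors_toy = [[1, 1, 1], [1, 0, 0], [0, 2, 1]]
sim = cosine_sim_matrix(vectors_toy)
print(np.round(sim, 3))
```

Each row is perfectly similar to itself, and the second and third rows share no terms, so their similarity is zero.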

In [153]:
sorted(list(enumerate(similarity[0])),reverse=True,key=lambda x:x[1])[1:6] 

[(2409, 0.2504897164340598),
 (1216, 0.24845199749997662),
 (3730, 0.23333333333333334),
 (539, 0.23008949665421108),
 (507, 0.22360679774997896)]

#### Recommend Function

In [285]:
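def recommend(movie):
    # Positional row of the movie: similarity is indexed by position, while the
    # DataFrame labels have gaps after dropna(), so labels cannot be used directly
    movie_index = np.flatnonzero(new_movie_df['title'].values == movie)[0]
    distances = similarity[movie_index]
    # Sort by similarity, skip the movie itself (first entry) and keep the next five
    movies_list=sorted(list(enumerate(distances)),reverse=True,key=lambda x:x[1])[1:6]

    for i in movies_list:
        print(new_movie_df.iloc[i[0]].title)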
    
        

###### Calling the function 
    

In [287]:
recommend('Iron Man')

Iron Man 3
Iron Man 2
Avengers: Age of Ultron
The Avengers
Captain America: Civil War


#### Exporting new_movie_df and the similarity matrix as pickle files for use in the front-end

In [289]:
import pickle
with open('movies_dict.pkl','wb') as f:
    pickle.dump(new_movie_df.to_dict(),f)
with open('similarity.pkl','wb') as f:
    pickle.dump(similarity,f)
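On the consuming side, the dictionary can be loaded and turned back into a DataFrame. The sketch below shows the round trip on a small made-up table written to a temporary directory; how the actual front-end loads the files is not shown in this notebook, so this is only one plausible way to do it.

```python
import os
import pickle
import tempfile

import pandas as pd

toy_df = pd.DataFrame({"movie_id": [1, 2], "title": ["A", "B"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "movies_dict.pkl")

    # Save the table as a plain dict, as done above.
    with open(path, "wb") as f:
        pickle.dump(toy_df.to_dict(), f)

    # Load it again and rebuild the DataFrame.
    with open(path, "rb") as f:
        restored = pd.DataFrame(pickle.load(f))

print(restored)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.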

#### A content-based filtering model does not need data about other users or their ratings, because its recommendations are derived from item features and the interests of one particular user.


# Collaborative Filtering

Collaborative filtering makes recommendations by using the interactions and data the system has collected from other users.
In the item-based variant used here, a user's missing rating for a movie is estimated from the ratings that user gave to other, similar movies.

Step 1: Find the similarity between all pairs of items.

Step 2: Use those similarities to generate the missing ratings in the table.
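The two steps can be sketched on a small made-up ratings table. The prediction rule used here, a similarity-weighted average over positively correlated items, is one common choice and an assumption of this example; the notebook below instead ranks candidates by similarity multiplied by rating.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings: rows are users, columns are movies, NaN = not rated.
toy = pd.DataFrame(
    {
        "A": [5, 4, 1, 5],
        "B": [1, 2, 5, 1],
        "C": [5, 5, 2, np.nan],
        "D": [4, 5, 1, 3],
    },
    index=["u1", "u2", "u3", "u4"],
)

# Step 1: similarity between every pair of items (Pearson correlation,
# computed over the users who rated both items).
item_sim = toy.corr(method="pearson")

# Step 2: estimate u4's missing rating for C from the movies u4 has rated,
# weighting each known rating by that movie's similarity to C.
known = toy.loc["u4"].dropna()
weights = item_sim.loc["C", known.index]
positive = weights[weights > 0]          # ignore negatively correlated items
predicted = (positive * known[positive.index]).sum() / positive.sum()

print(round(predicted, 2))
```

Movies A and D are rated much like C by the other users, so the estimate for C falls between the ratings u4 gave to A and D.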

In [300]:
#Importing the necessary libraries
from math import sqrt
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.metrics import pairwise_distances
from scipy.spatial.distance import cosine, correlation

#### Loading the datasets
The ratings dataset contains the ratings given by 668 users, together with their userId. The movies dataset contains each movie's id, title, and genres. The links_small dataset maps each movieId from the ratings dataset to its corresponding IMDb and TMDb ids.

In [301]:
ratings = pd.read_csv('ratings.csv')
movies = pd.read_csv('movies.csv')
links = pd.read_csv('links_small.csv')

In [302]:
ratings = pd.merge(movies, ratings) # merging movies and ratings dataset
movie_links = pd.merge(movies,links) # merging the movies and links dataset

In [303]:
ratings.head()

Unnamed: 0,movieId,title,genres,userId,rating,timestamp
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,2,5.0,859046895
1,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,5,4.0,1303501039
2,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,8,5.0,858610933
3,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,11,4.0,850815810
4,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,14,4.0,851766286


In [304]:
user_rating_pivot = ratings.pivot_table(index=['userId'],columns=['title'],values='rating')
user_rating_pivot.head()

title,'71 (2014),'Hellboy': The Seeds of Creation (2004),'Round Midnight (1986),'Til There Was You (1997),"'burbs, The (1989)",'night Mother (1986),(500) Days of Summer (2009),*batteries not included (1987),...And Justice for All (1979),10 (1979),...,[REC] (2007),[REC]² (2009),[REC]³ 3 Génesis (2012),a/k/a Tommy Chong (2005),eXistenZ (1999),loudQUIETloud: A Film About the Pixies (2006),xXx (2002),xXx: State of the Union (2005),¡Three Amigos! (1986),À nous la liberté (Freedom for Us) (1931)
userId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,,,,,,,,,,,...,,,,,,,,,,
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
5,,,,,,,,,,,...,,,,,,,,,,


Computing the pairwise Pearson correlation between movies, keeping only pairs rated by at least 50 users in common
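The `min_periods` argument used in the next cell sets the minimum number of users who must have rated both movies before a correlation is reported; pairs with fewer common raters come out as NaN, which is why the preview of the matrix is mostly empty. A small sketch on made-up ratings:

```python
import numpy as np
import pandas as pd

# Hypothetical ratings: rows are users, columns are movies.
toy = pd.DataFrame(
    {
        "A": [5, 4, 1, 2, np.nan],
        "B": [4, 5, 2, 1, 3],
        "C": [np.nan, np.nan, 5, 4, 2],
    }
)

loose = toy.corr(method="pearson", min_periods=1)
strict = toy.corr(method="pearson", min_periods=4)

# A and B share 4 raters, A and C share only 2.
print(strict.loc["A", "B"], strict.loc["A", "C"])
```

A higher threshold trades coverage for reliability: correlations based on only a handful of shared raters are noisy.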

In [305]:
correlation_matrix = user_rating_pivot.corr(method='pearson', min_periods=50)
correlation_matrix.head()

title,'71 (2014),'Hellboy': The Seeds of Creation (2004),'Round Midnight (1986),'Til There Was You (1997),"'burbs, The (1989)",'night Mother (1986),(500) Days of Summer (2009),*batteries not included (1987),...And Justice for All (1979),10 (1979),...,[REC] (2007),[REC]² (2009),[REC]³ 3 Génesis (2012),a/k/a Tommy Chong (2005),eXistenZ (1999),loudQUIETloud: A Film About the Pixies (2006),xXx (2002),xXx: State of the Union (2005),¡Three Amigos! (1986),À nous la liberté (Freedom for Us) (1931)
title,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
'71 (2014),,,,,,,,,,,...,,,,,,,,,,
'Hellboy': The Seeds of Creation (2004),,,,,,,,,,,...,,,,,,,,,,
'Round Midnight (1986),,,,,,,,,,,...,,,,,,,,,,
'Til There Was You (1997),,,,,,,,,,,...,,,,,,,,,,
"'burbs, The (1989)",,,,,,,,,,,...,,,,,,,,,,


In [306]:
movie_links.head()

Unnamed: 0,movieId,title,genres,imdbId,tmdbId
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy,114709,862.0
1,2,Jumanji (1995),Adventure|Children|Fantasy,113497,8844.0
2,3,Grumpier Old Men (1995),Comedy|Romance,113228,15602.0
3,4,Waiting to Exhale (1995),Comedy|Drama|Romance,114885,31357.0
4,5,Father of the Bride Part II (1995),Comedy,113041,11862.0


In [307]:
user_given_ratings = user_rating_pivot.loc[1].dropna() # movies rated by user 1

In [308]:
def recommend(userId):
    user_given_ratings = user_rating_pivot.loc[userId].dropna()
    scored = []
    for title, rating in user_given_ratings.items():
        # Finding movies similar to one the user has already rated.
        sims = correlation_matrix[title].dropna()
        # Scaling the similarity values by the rating the user gave that movie.
        scored.append(sims * rating)
    # A movie can appear more than once: once for each rated movie it correlates with.
    similar_items = pd.concat(scored).sort_values(ascending=False)
    # Filtering out the movies the user has already rated.
    watched_list = [t for t in similar_items.index if t in user_given_ratings.index]
    filtered_similar_items = similar_items.drop(watched_list)
    return filtered_similar_items.head(10)

#### Calling the recommend function for userId 1

In [309]:
recommend(1)



Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb (1964)    3.421993
Shrek 2 (2004)                                                                 3.415921
Austin Powers: International Man of Mystery (1997)                             3.199867
Matrix Revolutions, The (2003)                                                 3.084335
Monty Python's Life of Brian (1979)                                            3.005823
Outbreak (1995)                                                                2.816667
Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb (1964)    2.717731
Pirates of the Caribbean: The Curse of the Black Pearl (2003)                  2.674226
Taxi Driver (1976)                                                             2.646429
Green Mile, The (1999)                                                         2.590774
dtype: float64
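The same title appears twice in the output above because each rated movie contributes its own score for every movie it correlates with, and those scores are listed separately. One possible refinement, which is not part of this notebook, is to add up the scores per title before ranking. The Series below is made up to show the idea.

```python
import pandas as pd

# Hypothetical scored candidates; "Movie X" was suggested by two rated movies.
similar_items = pd.Series(
    [3.4, 2.7, 3.1, 1.0],
    index=["Movie X", "Movie X", "Movie Y", "Movie Z"],
)

# Sum the scores that share a title, then rank.
combined = similar_items.groupby(level=0).sum().sort_values(ascending=False)
print(combined)
```

Summing rewards movies that are supported by several of the user's ratings; taking the mean or the maximum per title would be other reasonable choices.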

#### Exporting the datasets and correlation matrix as pickle files

In [310]:
import pickle
# Map each output file to the object it stores
artifacts = {
    'user_given_ratings.pkl': user_given_ratings.to_dict(),
    'user_rating_pivot1.pkl': user_rating_pivot,
    'correlation_matrix.pkl': correlation_matrix,
    'new_movies1.pkl': movies,
    'movie_links.pkl': movie_links,
}
for filename, obj in artifacts.items():
    with open(filename, 'wb') as f:
        pickle.dump(obj, f)