# <center> Movie Recommendation Systems </center>

The rapid growth of data collection has led to a new era of information. Data is being used to create more efficient systems and this is where Recommendation Systems come into play. Recommendation Systems are a type of information filtering systems as they improve the quality of search results and provides items that are more relevant to the search item or are realted to the search history of the user.

They are used to predict the rating or preference that a user would give to an item. Almost every major tech company has applied them in some form or the other: Amazon uses it to suggest products to customers, YouTube uses it to decide which video to play next on autoplay, and Facebook uses it to recommend pages to like and people to follow. Moreover, companies like Netflix and Spotify depend highly on the effectiveness of their recommendation engines for their business and sucees.

In [8]:
pip install evaluate

Collecting evaluate
  Downloading https://files.pythonhosted.org/packages/90/50/0cc73b299fd941cb12d7ed39e0ccf8e18fe78dd6c16b951abe5477b3cd82/evaluate-0.0.3.tar.gz
Collecting xgboost (from evaluate)
  Downloading https://files.pythonhosted.org/packages/b1/11/cba4be5a737c6431323b89b5ade818b3bbe1df6e8261c6c70221a767c5d9/xgboost-1.0.2-py3-none-win_amd64.whl (24.6MB)
Collecting lightgbm (from evaluate)
  Downloading https://files.pythonhosted.org/packages/1f/cb/a8ec24334c35a7d0c87b4e4e056bd2137573c7c1bd81c760b79a2f370254/lightgbm-2.3.1-py2.py3-none-win_amd64.whl (544kB)
Building wheels for collected packages: evaluate
  Building wheel for evaluate (setup.py): started
  Building wheel for evaluate (setup.py): finished with status 'done'
  Created wheel for evaluate: filename=evaluate-0.0.3-cp37-none-any.whl size=6863 sha256=91b30596c2b6327504867decb947184fd74cb01eef2db5f64c8a6256421c27f2
  Stored in directory: C:\Users\kjosh\AppData\Local\pip\Cache\wheels\de\51\a5\ebdce3e18b99539f31d3624ed21

In [8]:
%matplotlib inline
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import evaluate
from sklearn import datasets
from scipy import stats
from ast import literal_eval
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.metrics.pairwise import linear_kernel, cosine_similarity
from nltk.stem.snowball import SnowballStemmer
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import wordnet
from surprise import SVD
from surprise import Dataset
from surprise.model_selection import cross_validate


import warnings; warnings.simplefilter('ignore')

# 1. Simple Recommender

The System recommends the same movies to users with similar demographic features. Since each user is different , this approach is considered to be too simple. The basic idea behind this system is that movies that are more popular and critically acclaimed will have a higher probability of being liked by the average audience. This model does not give personalized recommendations based on the user.

The implementation of this model is extremely trivial. All we have to do is sort our movies based on ratings and popularity and display the top movies of our list. As an added step, we can pass in a genre argument to get the top movies of a particular genre.

In [130]:
df=pd.read_csv(r'C:\Users\avias\Desktop\Movie\movies_data.csv')
df.head()

Unnamed: 0,adult,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,...,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,"{'id': 10194, 'name': 'Toy Story Collection', ...",30000000,"[{'id': 16, 'name': 'Animation'}, {'id': 35, '...",http://toystory.disney.com/toy-story,862,tt0114709,en,Toy Story,"Led by Woody, Andy's toys live happily in his ...",...,1995-10-30,373554033.0,81.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,,Toy Story,False,7.7,5415.0
1,False,,65000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 14, '...",,8844,tt0113497,en,Jumanji,When siblings Judy and Peter discover an encha...,...,1995-12-15,262797249.0,104.0,"[{'iso_639_1': 'en', 'name': 'English'}, {'iso...",Released,Roll the dice and unleash the excitement!,Jumanji,False,6.9,2413.0
2,False,"{'id': 119050, 'name': 'Grumpy Old Men Collect...",0,"[{'id': 10749, 'name': 'Romance'}, {'id': 35, ...",,15602,tt0113228,en,Grumpier Old Men,A family wedding reignites the ancient feud be...,...,1995-12-22,0.0,101.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Still Yelling. Still Fighting. Still Ready for...,Grumpier Old Men,False,6.5,92.0
3,False,,16000000,"[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'nam...",,31357,tt0114885,en,Waiting to Exhale,"Cheated on, mistreated and stepped on, the wom...",...,1995-12-22,81452156.0,127.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Friends are the people who let you be yourself...,Waiting to Exhale,False,6.1,34.0
4,False,"{'id': 96871, 'name': 'Father of the Bride Col...",0,"[{'id': 35, 'name': 'Comedy'}]",,11862,tt0113041,en,Father of the Bride Part II,Just when George Banks has recovered from his ...,...,1995-02-10,76578911.0,106.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,Just When His World Is Back To Normal... He's ...,Father of the Bride Part II,False,5.7,173.0


In [131]:
df['genres'] = df['genres'].fillna('[]').apply(literal_eval).apply(lambda x: [i['name'] for i in x] if isinstance(x, list) else [])

We will use IMDB's weighted rating formula to construct my chart. Mathematically, it is represented as follows:

Weighted Rating (WR) =  (v/v+m.R)+(m/v+m.C) 
where,

v is the number of votes for the movie
c is the minimum votes required to be listed in the chart
R is the average rating of the movie
m is the mean vote across the whole report

To find an appropriate value for m, the minimum votes required to be listed in the chart. We will use 95th percentile as our cutoff. In other words, for a movie to feature in the charts, it must have more votes than at least 95% of the movies in the list.

In [132]:
votes= df[df['vote_count'].notnull()]['vote_count'].astype('int')
votes_average= df[df['vote_average'].notnull()]['vote_average'].astype('int')
df['year'] = pd.to_datetime(df['release_date'], errors='coerce').apply(lambda x: str(x).split('-')[0] if x != np.nan else np.nan)
m = votes_average.mean()
c=votes.quantile(0.95)
print('c =',c)
print('m =',m)

c = 434.0
m = 5.244896612406511


In [133]:
qualified = df[(df['vote_count'] >= m) & (df['vote_count'].notnull()) & (df['vote_average'].notnull())][['title', 'year', 'vote_count', 'vote_average', 'popularity', 'genres']]
qualified['vote_count'] = qualified['vote_count'].astype('int')
qualified['vote_average'] = qualified['vote_average'].astype('int')
qualified.shape

(28801, 6)

In [134]:
def weighted_rating(x):
    v = x['vote_count']
    R = x['vote_average']
    return (v/(v+c) * R) + (c/(c+v) * m)

In [184]:
qualified['wr'] = qualified.apply(weighted_rating, axis=1)
qualified = qualified.sort_values('wr', ascending=False).head(350)
qualified.head(40)

Unnamed: 0,title,year,vote_count,vote_average,popularity,genres,wr
15480,Inception,2010,14075,8,29.1081,"[Action, Thriller, Science Fiction, Mystery, A...",7.917588
12481,The Dark Knight,2008,12269,8,123.167,"[Drama, Action, Crime, Thriller]",7.905871
22879,Interstellar,2014,11187,8,32.2135,"[Adventure, Drama, Science Fiction]",7.897107
2843,Fight Club,1999,9678,8,63.8696,[Drama],7.881753
4863,The Lord of the Rings: The Fellowship of the Ring,2001,8892,8,32.0707,"[Adventure, Fantasy, Action]",7.871787
292,Pulp Fiction,1994,8670,8,140.95,"[Thriller, Crime]",7.86866
314,The Shawshank Redemption,1994,8358,8,51.6454,"[Drama, Crime]",7.864
7000,The Lord of the Rings: The Return of the King,2003,8226,8,29.3244,"[Adventure, Fantasy, Action]",7.861927
351,Forrest Gump,1994,8147,8,48.3072,"[Comedy, Drama, Romance]",7.860656
5814,The Lord of the Rings: The Two Towers,2002,7641,8,29.4235,"[Adventure, Fantasy, Action]",7.851924


We got Inception, The Dark Knight and Interstellar at the very top of our chart. 

We then constructed our function that builds charts for particular genres. For this, we will use relax our default conditions to the 80th percentile instead of 95.

In [136]:
s = df.apply(lambda x: pd.Series(x['genres']),axis=1).stack().reset_index(level=1, drop=True)
s.name = 'genre'
gen_md = df.drop('genres', axis=1).join(s)

In [137]:
def build_chart(genre, percentile=0.80):
    df = gen_md[gen_md['genre'] == genre]
    vote_counts = df[df['vote_count'].notnull()]['vote_count'].astype('int')
    vote_averages = df[df['vote_average'].notnull()]['vote_average'].astype('int')
    C = vote_averages.mean()
    m = vote_counts.quantile(percentile)
    
    qualified = df[(df['vote_count'] >= m) & (df['vote_count'].notnull()) & (df['vote_average'].notnull())][['title', 'year', 'vote_count', 'vote_average', 'popularity']]
    qualified['vote_count'] = qualified['vote_count'].astype('int')
    qualified['vote_average'] = qualified['vote_average'].astype('int')
    
    qualified['wr'] = qualified.apply(lambda x: (x['vote_count']/(x['vote_count']+m) * x['vote_average']) + (m/(m+x['vote_count']) * C), axis=1)
    qualified = qualified.sort_values('wr', ascending=False).head(350)
    
    return qualified

Here we got our Top 30 rance Movies (Romance almost didn't feature at all in our Generic Top Chart despite being one of the most popular movie genres).

In [138]:
build_chart('Romance').head(30)

Unnamed: 0,title,year,vote_count,vote_average,popularity,wr
10309,Dilwale Dulhania Le Jayenge,1995,661,9,34.457,8.709932
351,Forrest Gump,1994,8147,8,48.3072,7.98165
876,Vertigo,1958,1162,8,18.2082,7.876591
40251,Your Name.,2016,1030,8,34.461252,7.861619
883,Some Like It Hot,1959,835,8,11.8451,7.831401
1132,Cinema Paradiso,1988,834,8,14.177,7.831212
19901,Paperman,2012,734,8,7.19863,7.8099
37863,Sing Street,2016,669,8,10.672862,7.792904
882,The Apartment,1960,498,8,11.9943,7.729211
38718,The Handmaiden,2016,453,8,16.727405,7.705364


# Content Based Recommender

This system uses item metadata, such as genre, director, description, actors, etc. for movies, to make these recommendations. The general idea behind these recommender systems is that if a person liked a particular item, he or she will also like an item that is similar to it.

We built two Content Based Recommenders based on:

1. Movie Overviews and Taglines
2. Movie Cast, Crew, Keywords and Genre




In [139]:
df2 = pd.read_csv(r'C:\Users\avias\Desktop\Movie\small_data.csv')
df2 = df2[df2['tmdbId'].notnull()]['tmdbId'].astype('int')

In [140]:
df = df.drop([19730, 29503, 35587])
df['id'] = df['id'].astype('int')

In [141]:
fd = df[df['id'].isin(df2)]
fd.shape

(9099, 25)

### Movie Description Based Recommender

We tried to build a recommender using movie descriptions and taglines.

In [142]:
fd['tagline'] = fd['tagline'].fillna('')
fd['description'] = fd['overview'] + fd['tagline']
fd['description'] = fd['description'].fillna('')

In [143]:
fd = fd.reset_index()
titles = fd['title']
indices = pd.Series(fd.index, index=fd['title'])

In [144]:
tf = TfidfVectorizer(analyzer='word',ngram_range=(1, 2),min_df=0, stop_words='english')
tfidf_matrix = tf.fit_transform(fd['description'])

In [145]:
tfidf_matrix.shape

(9099, 268124)

### Cosine Similarity

We used the Cosine Similarity to calculate a numeric quantity that denotes the similarity between two movies. Mathematically, it is defined as follows:

cosine(x,y)=(x.y⊺)/(||x||.||y||) 


In [146]:
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
cosine_sim[0]

array([1.        , 0.00680476, 0.        , ..., 0.        , 0.00344913,
       0.        ])

In [147]:
def get_recommendations(title):
    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    sim_scores = sim_scores[1:31]
    movie_indices = [i[0] for i in sim_scores]
    return titles.iloc[movie_indices]

In [148]:
get_recommendations('Superman').head(10)

7718                   All Star Superman
2114    Superman IV: The Quest for Peace
2112                         Superman II
7879           Superman and the Mole-Men
6447                    Superman Returns
8312                        Man of Steel
6693                          Mr. Brooks
5476                        Spider-Man 2
8485                               Enemy
4895                        The Freshman
Name: title, dtype: object

In [149]:
get_recommendations('Toy Story').head(10)

2502               Toy Story 2
7535               Toy Story 3
6193    The 40 Year Old Virgin
2547           Man on the Moon
6627              Factory Girl
4702    What's Up, Tiger Lily?
889      Rebel Without a Cause
6554    For Your Consideration
4988          Rivers and Tides
1599                 Condorman
Name: title, dtype: object

We see that for Superman, our system is able to identify it as a Superman film and subsequently recommend other Superman films as its top recommendations. But unfortunately, that is all this system can do at the moment. This is not of much use to most people as it doesn't take into considerations very important features such as cast, crew, director and genre, which determine the rating and the popularity of a movie.

We would be using much more suggestive metadata than Overview and Tagline. Next we would build a more sophisticated recommender that takes genre, keywords, cast and crew into consideration.

### Metadata Based Recommender

To build this recommender, we will need to merge our current dataset with the crew and the keyword datasets.

In [150]:
df3=pd.read_csv(r'C:\Users\avias\Desktop\Movie\credits.csv')
df4=pd.read_csv(r'C:\Users\avias\Desktop\Movie\keywords.csv')

In [151]:
df4['id'] = df4['id'].astype('int')
df3['id'] = df3['id'].astype('int')
df['id'] = df['id'].astype('int')

In [152]:
df.shape

(45463, 25)

In [153]:
df = df.merge(df3, on='id')
df = df.merge(df4, on='id')

In [154]:
fd = df[df['id'].isin(df2)]
fd.shape

(9219, 28)

We now have our cast, crew, genres and credits, all in one dataframe.

Crew: From the crew, we will only pick the director as our feature since the others don't contribute that much to the feel of the movie.

Cast: Choosing Cast is a little more tricky. Lesser known actors and minor roles do not really affect people's opinion of a movie. So we would only select the major characters and their respective actors.

In [155]:
fd['cast'] = fd['cast'].apply(literal_eval)
fd['crew'] = fd['crew'].apply(literal_eval)
fd['keywords'] = fd['keywords'].apply(literal_eval)
fd['cast_size'] = fd['cast'].apply(lambda x: len(x))
fd['crew_size'] = fd['crew'].apply(lambda x: len(x))

In [156]:
def director_name(x):
    for i in x:
        if i['job'] == 'Director':
            return i['name']
    return np.nan

In [157]:
fd['director'] = fd['crew'].apply(director_name)

In [158]:
fd['cast'] = fd['cast'].apply(lambda x: [i['name'] for i in x] if isinstance(x, list) else [])
fd['cast'] = fd['cast'].apply(lambda x: x[:3] if len(x) >=3 else x)

In [159]:
fd['keywords'] = fd['keywords'].apply(lambda x: [i['name'] for i in x] if isinstance(x, list) else [])

The approach of building this recommender is extremely hacky. So we decided of creating a metadata dump for every movie which consists of genres, director, main actors and keywords. Then we will use a Count Vectorizer to create our count matrix.

We narrowed down the Spaces and converted to Lowercase from all the features so system will not confuse between Johnny Depp and Johnny Galecki and also mentioning the director names three times will give exact result.

In [160]:
fd['cast'] = fd['cast'].apply(lambda x: [str.lower(i.replace(" ", "")) for i in x])

In [161]:
fd['director'] = fd['director'].astype('str').apply(lambda x: str.lower(x.replace(" ", "")))
fd['director'] = fd['director'].apply(lambda x: [x,x, x])

In [162]:
key = fd.apply(lambda x: pd.Series(x['keywords']),axis=1).stack().reset_index(level=1, drop=True)
key.name = 'keyword'

In [163]:
key = key.value_counts()
key[:10]

independent film        610
woman director          550
murder                  399
duringcreditsstinger    327
based on novel          318
violence                264
love                    222
musical                 219
sex                     219
suspense                212
Name: keyword, dtype: int64

In [164]:
key = key[key > 1]
stemmer = SnowballStemmer('english')
stemmer.stem('dogs')

'dog'

In [165]:
def keywords_filter(x):
    words = []
    for i in x:
        if i in s:
            words.append(i)
    return words

In [166]:
fd['keywords'] = fd['keywords'].apply(keywords_filter)
fd['keywords'] = fd['keywords'].apply(lambda x: [stemmer.stem(i) for i in x])
fd['keywords'] = fd['keywords'].apply(lambda x: [str.lower(i.replace(" ", "")) for i in x])

In [167]:
fd['soup'] = fd['keywords'] + fd['cast'] + fd['director'] + fd['genres']
fd['soup'] = fd['soup'].apply(lambda x: ' '.join(x))

In [168]:
count = CountVectorizer(analyzer='word',ngram_range=(1, 2),min_df=0, stop_words='english')
count_matrix = count.fit_transform(fd['soup'])

In [169]:
cosine_sim = cosine_similarity(count_matrix, count_matrix)

In [170]:
get_recommendations('The Dark Knight').head(15)

3900                                  The Formula
284                      The Shawshank Redemption
2547                              Man on the Moon
538                         Hellraiser: Bloodline
859                                Shall We Dance
1180                             That Old Feeling
2058                           A Walk on the Moon
4719                       The Day of the Dolphin
6768                               Tekkonkinkreet
7623                              Never Let Me Go
2100    Star Wars: Episode I - The Phantom Menace
4588                                A Chorus Line
3845                                       Piñero
1265                                 Career Girls
2301                                        Tommy
Name: title, dtype: object

In [171]:
get_recommendations('The Godfather').head(15)

3347                Death on the Nile
1008                      The Shining
1488                         Cimarron
5536                Little Black Book
128                           Amateur
16              Sense and Sensibility
24                  Leaving Las Vegas
27                         Persuasion
41                        Restoration
44      How To Make An American Quilt
47              When Night Is Falling
69                       Bed of Roses
79                 Angels and Insects
112                    The Star Maker
121               Up Close & Personal
Name: title, dtype: object

### Popularity and Ratings

It recommends movies regardless of ratings and popularity.
Therefore, we will add a mechanism to remove bad movies and return movies which are popular and have had a good critical response.

We will take the top 25 movies based on similarity scores and calculate the vote of the 60th percentile movie. Then, we will calculate the weighted rating of each movie using IMDB's formula.

In [172]:
def improved_recommendations(title):
    idx = indices[title]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    sim_scores = sim_scores[1:26]
    movie_indices = [i[0] for i in sim_scores]
    
    movies = fd.iloc[movie_indices][['title', 'vote_count', 'vote_average', 'year']]
    vote_counts = movies[movies['vote_count'].notnull()]['vote_count'].astype('int')
    vote_averages = movies[movies['vote_average'].notnull()]['vote_average'].astype('int')
    C = vote_averages.mean()
    m = vote_counts.quantile(0.60)
    qualified = movies[(movies['vote_count'] >= m) & (movies['vote_count'].notnull()) & (movies['vote_average'].notnull())]
    qualified['vote_count'] = qualified['vote_count'].astype('int')
    qualified['vote_average'] = qualified['vote_average'].astype('int')
    qualified['wr'] = qualified.apply(weighted_rating, axis=1)
    qualified = qualified.sort_values('wr', ascending=False).head(10)
    return qualified

In [173]:
improved_recommendations('The Godfather')

Unnamed: 0,title,vote_count,vote_average,year,wr
212,Before Sunrise,984,7,1995,6.462824
24,Leaving Las Vegas,365,7,1995,6.046665
16,Sense and Sensibility,364,7,1995,6.04547
257,Little Women,222,7,1994,5.838849
151,Belle de Jour,163,7,1967,5.724096
196,The Umbrellas of Cherbourg,119,7,1964,5.622577
199,Total Eclipse,112,6,1995,5.39979
261,Like Water for Chocolate,70,6,1992,5.349772
45,How To Make An American Quilt,38,6,1995,5.305689
138,Up Close & Personal,51,5,1996,5.219145


In [174]:
improved_recommendations('Pulp Fiction')

Unnamed: 0,title,vote_count,vote_average,year,wr
1078,Reservoir Dogs,3821,8,1992,7.718986
20254,Django Unchained,10297,7,2012,6.929017
13756,Inglourious Basterds,6598,7,2009,6.891679
6794,Kill Bill: Vol. 1,5091,7,2003,6.862133
28345,The Hateful Eight,4405,7,2015,6.842588
7340,Kill Bill: Vol. 2,4061,7,2004,6.830542
1667,Jackie Brown,1580,7,1997,6.62179
162,Die Hard: With a Vengeance,2094,6,1995,5.870366
69,From Dusk Till Dawn,1644,6,1996,5.842293
11978,Death Proof,1359,6,2007,5.817225


## Collaborative 

In [13]:
conda install -c conda-forge scikit-surprise

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.


In [16]:

reader = Reader()
ratings=pd.read_csv("ratings_small.csv")
ratings.head()


Unnamed: 0,userId,movieId,rating,timestamp
0,1,31,2.5,1260759144
1,1,1029,3.0,1260759179
2,1,1061,3.0,1260759182
3,1,1129,2.0,1260759185
4,1,1172,4.0,1260759205


In [19]:
ratings[ratings['userId'] == 1]

Unnamed: 0,userId,movieId,rating,timestamp
0,1,31,2.5,1260759144
1,1,1029,3.0,1260759179
2,1,1061,3.0,1260759182
3,1,1129,2.0,1260759185
4,1,1172,4.0,1260759205
5,1,1263,2.0,1260759151
6,1,1287,2.0,1260759187
7,1,1293,2.0,1260759148
8,1,1339,3.5,1260759125
9,1,1343,2.0,1260759131


# Hybrid Recommendor

We tried to build a simple hybrid recommender that brings together:

Input: User ID and the Title of a Movie

In [179]:
def convert_int(x):
    try:
        return int(x)
    except:
        return np.nan

In [180]:
id_map = pd.read_csv(r'C:\Users\avias\Desktop\Movie\small_data.csv')[['movieId', 'tmdbId']]
id_map['tmdbId'] = id_map['tmdbId'].apply(convert_int)
id_map.columns = ['movieId', 'id']
id_map = id_map.merge(fd[['title', 'id']], on='id').set_index('title')
#id_map = id_map.set_index('tmdbId')

In [181]:
indices_map = id_map.set_index('id')

In [182]:
def hybrid(userId, title):
    idx = indices[title]
    tmdbId = id_map.loc[title]['id']
    movie_id = id_map.loc[title]['movieId']
    sim_scores = list(enumerate(cosine_sim[int(idx)]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    sim_scores = sim_scores[1:26]
    movie_indices = [i[0] for i in sim_scores]
    movies = fd.iloc[movie_indices][['title', 'vote_count', 'vote_average', 'release_date', 'id']]
    return movies.head(10)

In [183]:
hybrid(1, 'Avatar')

Unnamed: 0,title,vote_count,vote_average,release_date,id
3373,Heart and Souls,84.0,6.6,1993-08-13,12187
4945,Truly Madly Deeply,14.0,6.6,1990-11-10,18317
8495,Heaven Can Wait,40.0,6.9,1943-08-11,18727
15563,TiMER,91.0,5.9,2009-05-14,39545
1335,The Preacher's Wife,55.0,5.4,1996-12-13,21539
1600,A Life Less Ordinary,130.0,6.2,1997-10-24,8067
2174,The Butcher's Wife,35.0,5.6,1991-10-25,20096
4054,Down to Earth,98.0,4.9,2001-02-16,16300
4490,Chances Are,66.0,6.5,1989-03-10,3064
4496,Dream a Little Dream,26.0,5.8,1989-03-03,15142


# Conclusion:

We created recommenders using demographic , content- based and collaborative filtering. While demographic filtering is very elemantary and cannot be used practically, Hybrid Systems can take advantage of content-based and collaborative filtering as the two approaches are proved to be almost complimentary. This model was very baseline and only provides a fundamental framework to start with.