Building the simple recommender is fairly straightforward. The
steps are as follows:
1. Choose a metric (or score) to rate the movies on
2. Decide on the prerequisites for the movie to be featured
on the chart
3. Calculate the score for every movie that satisfies the
conditions
4. Output the list of movies in decreasing order of their
scores

In [1]:
import pandas as pd

In [2]:
movies = pd.read_csv('tmdb_5000_movies.csv')
credits = pd.read_csv('tmdb_5000_credits.csv')

In [3]:
movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800


In [7]:
movies.shape

(4803, 20)

Weighted Rating (WR) = ((v/(v+m))*R)+((m/(v+m))*C)

v is the number of votes garnered by the movie
m is the minimum number of votes required for the
movie to be in the chart (the prerequisite)
R is the mean rating of the movie
C is the mean rating of all the movies in the dataset
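To get a feel for how this formula behaves, here is a minimal sketch with made-up numbers. The helper name wr_sketch and the values chosen for m and C are assumptions for illustration only, not taken from the dataset. It weights R by v/(v+m) and C by m/(v+m), the same weights used by the weighted_rating function later in the notebook.

```python
def wr_sketch(v, R, m, C):
    """Blend a movie's own mean rating R with the global mean C."""
    return (v / (v + m)) * R + (m / (v + m)) * C

# Hypothetical threshold and global mean, chosen only for illustration.
m_demo, C_demo = 1000, 6.0

# The same 9.0-rated movie with very few, exactly m, and very many votes.
few_votes = wr_sketch(v=10, R=9.0, m=m_demo, C=C_demo)
at_threshold = wr_sketch(v=1000, R=9.0, m=m_demo, C=C_demo)
many_votes = wr_sketch(v=100_000, R=9.0, m=m_demo, C=C_demo)

print(few_votes, at_threshold, many_votes)
```

With few votes the score stays close to C, at v = m it sits halfway between R and C, and with many votes it approaches R. This is why a movie with a handful of very high ratings cannot jump to the top of the chart.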

In [5]:
m = movies['vote_count'].quantile(0.80)
m

957.6000000000004

We can see that only 20% of the movies have gained more than about 957.6 votes (the 80th percentile of vote_count). Therefore, our value of m is approximately 957.6.
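The value of m is not a whole number because Series.quantile interpolates linearly between neighbouring values by default. A small sketch on invented vote counts (the numbers below are not from the dataset) shows the idea:

```python
import pandas as pd

# Toy vote counts, invented for illustration only.
votes = pd.Series([10, 20, 30, 40, 50])

# The 80th percentile falls between 40 and 50 and is interpolated linearly.
threshold = votes.quantile(0.80)

# Fraction of entries strictly above the threshold.
share_above = (votes > threshold).mean()

print(threshold, share_above)
```

Here the threshold lands between two observed values, and one entry in five lies above it, mirroring the "only 20% of the movies" reading of m.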

Another prerequisite that we want in place is the runtime. We
will only consider movies that are at least 45 minutes and at
most 300 minutes long. Let us define a new DataFrame,
q_movies, which will hold all the movies that qualify to appear in
the chart.

In [6]:
q_movies = movies[(movies['runtime'] >= 45) & (movies['runtime'] <= 300)]
q_movies.shape

(4761, 20)

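We see that 4761 of the 4803 movies in our dataset made the cut. Note that the cell above filters on runtime only; the vote count threshold m is not applied as a filter here, which is why movies with fewer than m votes still appear in the final output.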
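If both prerequisites should be enforced, the vote count condition can be combined with the runtime condition in a single boolean mask. A minimal sketch on a toy DataFrame follows; the titles, values and the threshold m_toy are invented for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'vote_count': [50, 2000, 1500, 3000],
    'runtime': [100.0, 30.0, 120.0, float('nan')],
})
m_toy = 1000  # hypothetical vote threshold

# Each condition is wrapped in parentheses because & binds tighter than >=.
mask = (
    (toy['vote_count'] >= m_toy)
    & (toy['runtime'] >= 45)
    & (toy['runtime'] <= 300)
)

# .copy() returns an independent DataFrame rather than a slice of toy.
qualified = toy[mask].copy()

print(qualified['title'].tolist())
```

Movie A fails the vote threshold, B is too short, and D has a missing runtime (comparisons with NaN evaluate to False), so only C qualifies.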

The final value that we need before we calculate our
scores is C, the mean rating of all the movies in the dataset.

In [8]:
C = movies['vote_average'].mean()
C

6.092171559442016

We can see that the average rating of a movie is approximately
6.09/10. It seems that TMDB users happen to be fairly strict
with their ratings. Now that we have the value of C, we can go
about calculating our score for each movie.

First, let us define a function that computes the rating for a
movie, given its features and the values of m and C:

In [9]:
def weighted_rating(x, m=m, C=C):
    v = x['vote_count']
    R = x['vote_average']
    return (v/(v+m) * R) + (m/(m+v) * C)
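As a sanity check, the same arithmetic can be done by hand for a single movie. The sketch below plugs in the values of m and C printed earlier, together with the vote_count (8205) and vote_average (8.5) listed for The Shawshank Redemption in the final output. The helper name wr_scalar is ours and is introduced only for this illustration.

```python
def wr_scalar(v, R, m, C):
    """Weighted rating for one movie, given plain numbers."""
    return (v / (v + m)) * R + (m / (v + m)) * C

# Values printed by the notebook cells above.
m_value = 957.6000000000004
C_value = 6.092171559442016

# vote_count and vote_average of The Shawshank Redemption from the output table.
score = wr_scalar(v=8205, R=8.5, m=m_value, C=C_value)

print(round(score, 3))  # approximately 8.248, in line with the score column
```

The movie's own rating of 8.5 is pulled slightly towards C because roughly a tenth of the weight goes to the global mean.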

In [10]:
q_movies['score'] = q_movies.apply(weighted_rating, axis=1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  q_movies['score'] = q_movies.apply(weighted_rating, axis=1)
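The message above is pandas' SettingWithCopyWarning: q_movies was created by filtering movies, so pandas cannot tell whether the assignment is meant to affect the original DataFrame as well. One common way to avoid it is to take an explicit .copy() of the filtered frame before adding columns. The sketch below does this on a toy DataFrame with hypothetical values (m_toy, C_toy and all the data are assumptions), and also computes the score with column arithmetic instead of apply, which is an equivalent formulation rather than the notebook's own code.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    'vote_count': [100, 1000, 9000],
    'vote_average': [9.0, 7.0, 8.0],
    'runtime': [90.0, 20.0, 150.0],
})
m_toy, C_toy = 1000, 6.0  # hypothetical threshold and global mean

# Filter on runtime, then copy so that q_toy owns its data.
q_toy = toy[(toy['runtime'] >= 45) & (toy['runtime'] <= 300)].copy()

# Column-wise version of the weighted rating formula.
v = q_toy['vote_count']
R = q_toy['vote_average']
q_toy['score'] = (v / (v + m_toy)) * R + (m_toy / (v + m_toy)) * C_toy

print(q_toy[['vote_count', 'vote_average', 'score']])
```

Because q_toy is an independent copy, the new score column is added without a warning and the original toy frame is left unchanged.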


There is just one step left. We now need to sort our DataFrame
by the score we just computed and output the list of
top movies.

In [12]:
q_movies.sort_values(by='score', ascending=False)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,score
1881,25000000,"[{""id"": 18, ""name"": ""Drama""}, {""id"": 80, ""name...",,278,"[{""id"": 378, ""name"": ""prison""}, {""id"": 417, ""n...",en,The Shawshank Redemption,Framed in the 1940s for the double murder of h...,136.747729,"[{""name"": ""Castle Rock Entertainment"", ""id"": 97}]",...,1994-09-23,28341469,142.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,Fear can hold you prisoner. Hope can set you f...,The Shawshank Redemption,8.5,8205,8.248353
662,63000000,"[{""id"": 18, ""name"": ""Drama""}]",http://www.foxmovies.com/movies/fight-club,550,"[{""id"": 825, ""name"": ""support group""}, {""id"": ...",en,Fight Club,A ticking-time-bomb insomniac and a slippery s...,146.757391,"[{""name"": ""Regency Enterprises"", ""id"": 508}, {...",...,1999-10-15,100853753,139.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,Mischief. Mayhem. Soap.,Fight Club,8.3,9413,8.096134
3337,6000000,"[{""id"": 18, ""name"": ""Drama""}, {""id"": 80, ""name...",http://www.thegodfather.com/,238,"[{""id"": 131, ""name"": ""italy""}, {""id"": 699, ""na...",en,The Godfather,"Spanning the years 1945 to 1955, a chronicle o...",143.659698,"[{""name"": ""Paramount Pictures"", ""id"": 4}, {""na...",...,1972-03-14,245066411,175.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,An offer you can't refuse.,The Godfather,8.4,5893,8.077404
3232,8000000,"[{""id"": 53, ""name"": ""Thriller""}, {""id"": 80, ""n...",,680,"[{""id"": 396, ""name"": ""transporter""}, {""id"": 14...",en,Pulp Fiction,"A burger-loving hit man, his philosophical par...",121.463076,"[{""name"": ""Miramax Films"", ""id"": 14}, {""name"":...",...,1994-10-08,213928762,154.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Just because you are a character doesn't mean ...,Pulp Fiction,8.3,8428,8.074738
65,185000000,"[{""id"": 18, ""name"": ""Drama""}, {""id"": 28, ""name...",http://thedarkknight.warnerbros.com/dvdsite/,155,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight,Batman raises the stakes in his war on crime. ...,187.322927,"[{""name"": ""DC Comics"", ""id"": 429}, {""name"": ""L...",...,2008-07-16,1004558444,152.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Why So Serious?,The Dark Knight,8.2,12002,8.044250
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
303,100000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",,314,"[{""id"": 418, ""name"": ""white russian""}, {""id"": ...",en,Catwoman,Liquidated after discovering a corporate consp...,32.271938,"[{""name"": ""Village Roadshow Pictures"", ""id"": 7...",...,2004-07-22,82102379,104.0,"[{""iso_639_1"": ""es"", ""name"": ""Espa\u00f1ol""}, ...",Released,CATch her in IMAX,Catwoman,4.2,808,5.226248
3746,4000000,"[{""id"": 53, ""name"": ""Thriller""}]",http://www.theboynextdoorfilm.com/,241251,"[{""id"": 255, ""name"": ""male nudity""}, {""id"": 29...",en,The Boy Next Door,A recently cheated on married woman falls for ...,24.161735,"[{""name"": ""Universal Pictures"", ""id"": 33}, {""n...",...,2015-01-23,52425855,91.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,A Moment She Couldn't Resist. An Obsession He ...,The Boy Next Door,4.1,1022,5.063681
1652,100000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",,14164,"[{""id"": 3436, ""name"": ""karate""}, {""id"": 9715, ...",en,Dragonball Evolution,The young warrior Son Goku sets out on a quest...,21.677732,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,2009-04-01,0,85.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,The legend comes to life.,Dragonball Evolution,2.9,462,5.053299
210,125000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",,415,"[{""id"": 848, ""name"": ""double life""}, {""id"": 84...",en,Batman & Robin,Along with crime-fighting partner Robin and ne...,50.073575,"[{""name"": ""PolyGram Filmed Entertainment"", ""id...",...,1997-06-20,238207122,125.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,Strength. Courage. Honor. And loyalty.,Batman & Robin,4.2,1418,4.962731
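The full output above is wide and covers every qualifying movie. To present an actual chart, it can help to keep only a few columns and the first N rows. A minimal sketch on a toy DataFrame follows; the data and the cutoff of 2 are invented for illustration.

```python
import pandas as pd

# Toy scored movies, invented for illustration only.
scored = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'vote_count': [100, 9000, 4000],
    'vote_average': [9.0, 8.0, 7.5],
    'score': [6.3, 7.8, 7.1],
})

# sort_values returns a new DataFrame; scored itself is not reordered.
chart = (
    scored.sort_values(by='score', ascending=False)
          [['title', 'vote_count', 'vote_average', 'score']]
          .head(2)
)

print(chart)
```

Applied to q_movies, the same pattern with a larger N would give a compact top-N chart instead of the full table.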
