In [None]:
# Reference: https://analyticsindiamag.com/how-to-build-your-first-recommender-system-using-python-movielens-dataset/ 

In [464]:
import numpy as np
import pandas as pd
from tabulate import tabulate
import warnings
warnings.filterwarnings("ignore")

## Merge Dataset

In [250]:
# Load the ratings
data = pd.read_csv('data/ml-latest-small/ratings.csv')
# Load the movie titles and genres
movie_titles_genre = pd.read_csv('data/ml-latest-small/movies.csv')
# Attach title and genres to each rating (a left join keeps every rating row)
movie_ratings = data.merge(movie_titles_genre, on='movieId', how='left')
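
The left merge keeps every rating row and attaches the title and genres of its movie. As a minimal sketch on toy data (the frames `ratings_demo` and `movies_demo` are invented for illustration and are not part of the dataset), the `validate` and `indicator` arguments of `merge` can confirm that no rating is duplicated or left without a match:

```python
import pandas as pd

# Toy stand-ins for ratings.csv and movies.csv (hypothetical values)
ratings_demo = pd.DataFrame({
    "userId": [1, 1, 2],
    "movieId": [10, 20, 10],
    "rating": [4.0, 3.5, 5.0],
})
movies_demo = pd.DataFrame({
    "movieId": [10, 20],
    "title": ["Toy Story (1995)", "Jumanji (1995)"],
    "genres": ["Animation|Children", "Adventure|Children"],
})

# validate="many_to_one" raises MergeError if a movieId repeats in movies_demo;
# indicator=True adds a _merge column recording where each row came from
merged_demo = ratings_demo.merge(
    movies_demo, on="movieId", how="left", validate="many_to_one", indicator=True
)

# Rows tagged "left_only" would be ratings with no matching movie
unmatched = int((merged_demo["_merge"] == "left_only").sum())
print(len(merged_demo), unmatched)
```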

## Calculate new features

To make titles easier for the user to type, the movie title is split into two additional columns: a short form of the title and the year the movie was released. I also reformatted the genres so that they print more readably.

In [437]:
# Create short movie title
movie_ratings['title_short'] = movie_ratings['title'].str.split('(', n=1).str[0].str.rstrip()

In [440]:
# Separate year from title
movie_ratings['year'] = movie_ratings['title'].str.split('(', n=1).str[1].str.replace(')', '', regex=False).str[-4:]

In [441]:
# Add spaces to genres
movie_ratings['genres'] = movie_ratings['genres'].str.replace('|', ' | ', regex=False)
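
To see what these three transformations produce, here is a small sketch on sample rows (the frame `demo` is invented for illustration). For the year it uses a regular expression with `str.extract` instead of slicing the last four characters. That is a swap I am making for the example, not the approach used in the cells above, chosen because it tolerates trailing whitespace after the closing parenthesis.

```python
import pandas as pd

# Sample rows in the MovieLens title format (hypothetical frame)
demo = pd.DataFrame({
    "title": [
        "Toy Story (1995)",
        "Jumanji (1995) ",  # note the trailing space
        "Amelie (Fabuleux destin d'Amélie Poulain, Le) (2001)",
    ],
    "genres": ["Animation|Children", "Adventure|Children", "Comedy|Romance"],
})

# Short title: everything before the first "(", with trailing spaces removed
demo["title_short"] = demo["title"].str.split("(", n=1).str[0].str.rstrip()

# Year: a four-digit number in parentheses at the end of the title
demo["year"] = demo["title"].str.extract(r"\((\d{4})\)\s*$", expand=False)

# Genres: pad the separator; regex=False treats "|" as a literal character
demo["genres"] = demo["genres"].str.replace("|", " | ", regex=False)

print(demo[["title_short", "year", "genres"]])
```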

The average rating and the number of ratings per movie are used later to filter and describe the recommendations.

In [442]:
# Calculate average rating score (out of 5)
average_ratings = pd.DataFrame(movie_ratings.groupby(['title','title_short','year', 'genres'])['rating'].mean())

In [443]:
# Calculate total number of ratings
average_ratings['total_ratings'] = movie_ratings.groupby(['title', 'title_short', 'year', 'genres'])['rating'].count()
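
Both statistics can also be computed in a single `groupby` pass using named aggregation. The sketch below runs on a toy frame (`demo` and `stats` are hypothetical names) and is meant as an equivalent formulation, not a change to the cells above.

```python
import pandas as pd

# Toy ratings: two for one movie, one for another (hypothetical values)
demo = pd.DataFrame({
    "title_short": ["Toy Story", "Toy Story", "Jumanji"],
    "rating": [4.0, 5.0, 3.0],
})

# Named aggregation: each keyword becomes an output column
stats = demo.groupby("title_short")["rating"].agg(
    rating="mean",          # average score (out of 5)
    total_ratings="count",  # number of ratings received
)
print(stats)
```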

## Pivot Table of Ratings by User

A pivot table lists each user's rating for each movie, with one row per user and one column per short title. Movies that a user has not rated are left as NaN.

In [444]:
# Pivot table of movie rating by user
movie_user = movie_ratings.pivot_table(index='userId',columns='title_short',values='rating')
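
A toy example may make the shape of this table clearer. The frame `demo` below is invented for illustration. Note that `pivot_table` averages duplicate entries by default, so if two different films shared the same short title, their ratings would land in one column.

```python
import pandas as pd

# Three users, two movies; user 2 never rated Jumanji (hypothetical values)
demo = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "title_short": ["Toy Story", "Jumanji", "Toy Story", "Jumanji"],
    "rating": [4.0, 3.0, 5.0, 2.0],
})

# One row per user, one column per title; missing ratings become NaN
wide = demo.pivot_table(index="userId", columns="title_short", values="rating")
print(wide)
```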

## Recommendation System

A function called recommendMovie() starts the program. It prompts the user to enter a movie name in short-title form. The pivot table is used to compute the correlation between the selected movie's ratings and those of every other movie in the dataset. The average rating and rating count of each movie are then joined to the correlations. I have limited the recommendations to movies with at least 100 ratings, so that each one reflects a wide range of opinions. The results are sorted by correlation, the first row (normally the selected movie itself, since it correlates perfectly with itself) is skipped, and the next 10 are selected. The output is formatted and presented to the user.

In [491]:
def recommendMovie():
    print("The movie recommendation system will allow you to look up recommendations based on a movie title in short form (without year).\n")
    while True:
        try:
            # Get movie name from user
            movie_name = input("Movie Name: ").strip()
            
            # Get correlation to chosen movie name
            correlations = movie_user.corrwith(movie_user[movie_name])

            # Create dataframe with correlations
            recommendation = pd.DataFrame(correlations,columns=['correlation'])
            recommendation.dropna(inplace=True)
            recommendation = recommendation.join(average_ratings)

            # Find recommendations when 100 or more ratings
            recc = recommendation[recommendation['total_ratings']>=100].sort_values('correlation',ascending=False).reset_index()

            # Select top 10
            top_10 = recc[['title_short', 'year', 'rating']][1:11].copy()

            # Round rating to two decimal points
            top_10['rating'] = top_10['rating'].round(2)

            print('\nRecommended movies based on your interest in ' + movie_name + ':')
            print(tabulate(top_10, headers='keys', tablefmt='fancy_grid'))
        # A KeyError is raised when the title is not a column of the pivot table
        except KeyError:
            print('\nNo recommendation could be made. Please try a different title')
            continue
        break
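
For testing without the `input()` prompt, the same steps can be wrapped in a function that takes the title as an argument. This is a sketch under assumptions: `recommend_for`, `min_ratings`, and `top_n` are names I introduce here, the toy ratings are invented, and the rating-count threshold is lowered so that the example has something to return. It also drops the selected movie by name rather than by position.

```python
import pandas as pd

def recommend_for(title, wide, stats, min_ratings=100, top_n=10):
    """Return the top_n titles most correlated with `title` (hypothetical helper)."""
    if title not in wide.columns:
        raise KeyError(f"Unknown title: {title}")
    # Pearson correlation of every movie column with the chosen column
    corr = wide.corrwith(wide[title]).rename("correlation").dropna()
    # Attach average rating and rating count, then filter on the count
    rec = corr.to_frame().join(stats)
    rec = rec[rec["total_ratings"] >= min_ratings]
    # Remove the chosen movie itself before ranking
    rec = rec.drop(index=title, errors="ignore")
    return rec.sort_values("correlation", ascending=False).head(top_n)

# Toy data: four users rate movies "A", "B" and "C" (hypothetical values)
ratings = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "title_short": ["A", "B", "C"] * 4,
    "rating": [5, 4, 1, 4, 4, 2, 2, 1, 5, 1, 2, 4],
})
wide = ratings.pivot_table(index="userId", columns="title_short", values="rating")
stats = ratings.groupby("title_short")["rating"].agg(rating="mean", total_ratings="count")

# "B" tracks "A" closely, while "C" moves in the opposite direction
result = recommend_for("A", wide, stats, min_ratings=4, top_n=2)
print(result)
```

Returning a DataFrame instead of printing keeps the ranking logic separate from the formatting, so the result can still be passed to `tabulate` for display.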

In [492]:
# Run this to utilize the system
recommendMovie()

The movie recommendation system will allow you to look up recommendations based on a movie title in short form (without year).



Movie Name:  Toy Story



Recommended movies based on your interest in Toy Story:
╒════╤════════════════════════════╤════════╤══════════╕
│    │ title_short                │   year │   rating │
╞════╪════════════════════════════╪════════╪══════════╡
│  1 │ Incredibles, The           │   2004 │     3.84 │
├────┼────────────────────────────┼────────┼──────────┤
│  2 │ Finding Nemo               │   2003 │     3.96 │
├────┼────────────────────────────┼────────┼──────────┤
│  3 │ Aladdin                    │   1992 │     3.79 │
├────┼────────────────────────────┼────────┼──────────┤
│  4 │ Monsters, Inc.             │   2001 │     3.87 │
├────┼────────────────────────────┼────────┼──────────┤
│  5 │ Mrs. Doubtfire             │   1993 │     3.39 │
├────┼────────────────────────────┼────────┼──────────┤
│  6 │ Amelie                     │   2001 │     4.18 │
├────┼────────────────────────────┼────────┼──────────┤
│  7 │ American Pie               │   1999 │     3.38 │
├────┼────────────────────────────┼────────┼───