# Model Evaluation & Recommendation

**Haley Lautenbach**<br>
haley.lautenbach@gmail.com

**Research Problem:**
Many movie recommendation systems and streaming services let users set up a profile. The profile tells the system which types of films the user does and doesn't like, so the system can make personalized recommendations. However, there don't seem to be systems that recommend to more than one user at a time, which would let couples, families, or groups of friends receive personalized recommendations that take all of their preferences into account.

This is notebook 3/4. It covers further model evaluation and the creation of functions that make recommendations visible to the user. First, precision and recall are calculated for both models. Judging by the values returned, a hybrid approach seemed appropriate, so the estimated ratings were averaged to create hybrid predictions. These predictions were then used by the functions that recommend films to multiple users. The notebook also covers functions that make recommendations to new users, or users that we do not have predictions for.

In [1]:
# import relevant packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from surprise import Dataset
from surprise.reader import Reader
from surprise import accuracy

from collections import defaultdict
from surprise import dump
import pickle

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Model-Evaluation-&amp;-Recommendation" data-toc-modified-id="Model-Evaluation-&amp;-Recommendation-1">Model Evaluation &amp; Recommendation</a></span></li><li><span><a href="#1.-Import-Relevant-Datasets" data-toc-modified-id="1.-Import-Relevant-Datasets-2">1. Import Relevant Datasets</a></span></li><li><span><a href="#2.-Precision-and-Recall-at-K" data-toc-modified-id="2.-Precision-and-Recall-at-K-3">2. Precision and Recall at K</a></span></li><li><span><a href="#3.-Mix-Ratings-for-Hybrid-Approach" data-toc-modified-id="3.-Mix-Ratings-for-Hybrid-Approach-4">3. Mix Ratings for Hybrid Approach</a></span></li><li><span><a href="#4.-Top-Ratings" data-toc-modified-id="4.-Top-Ratings-5">4. Top Ratings</a></span><ul class="toc-item"><li><span><a href="#4.1-Top-Ratings-for-One" data-toc-modified-id="4.1-Top-Ratings-for-One-5.1">4.1 Top Ratings for One</a></span></li><li><span><a href="#4.2-Top-Ratings-for-Two" data-toc-modified-id="4.2-Top-Ratings-for-Two-5.2">4.2 Top Ratings for Two</a></span></li></ul></li><li><span><a href="#5.-Similar-Films" data-toc-modified-id="5.-Similar-Films-6">5. Similar Films</a></span></li><li><span><a href="#6.-New-Users" data-toc-modified-id="6.-New-Users-7">6. New Users</a></span><ul class="toc-item"><li><span><a href="#6.1-Top-In-Genre" data-toc-modified-id="6.1-Top-In-Genre-7.1">6.1 Top In Genre</a></span></li><li><span><a href="#5.2-Addition-of-New-Ratings" data-toc-modified-id="5.2-Addition-of-New-Ratings-7.2">5.2 Addition of New Ratings</a></span></li></ul></li><li><span><a href="#7.-Next-Steps" data-toc-modified-id="7.-Next-Steps-8">7. Next Steps</a></span></li></ul></div>

# 1. Import Relevant Datasets

In [2]:
# import relevant data
movies_reference = pd.read_csv('My Datasets/movies_reference.csv')
movies_reference

Unnamed: 0,movieId,title,year_of_release,averageRating,numVotes,runtimeMinutes
0,1,Toy Story,1995,8.3,896826.0,81.0
1,2,Jumanji,1995,7.0,312103.0,104.0
2,3,Grumpier Old Men,1995,6.7,24447.0,101.0
3,4,Waiting to Exhale,1995,6.0,9671.0,124.0
4,5,Father of the Bride Part II,1995,6.1,34885.0,106.0
...,...,...,...,...,...,...
27273,131254,Kein Bund für's Leben,2007,5.0,1409.0,85.0
27274,131256,"Feuer, Eis & Dosenbier",2002,3.3,1077.0,83.0
27275,131258,The Pirates,2014,6.6,2957.0,130.0
27276,131260,Rentun Ruusu,2001,6.6,1221.0,102.0


In [3]:
# reading in ratings set
ratings_df = pd.read_csv('Datasets/ml-1m/ratings.dat', sep="::", engine='python', header=None, names=['userId', 'movieId', 'rating', 'timestamp'])
reader = Reader(rating_scale=(1,5))
ratings_utility = Dataset.load_from_df(ratings_df[['userId', 'movieId', 'rating']], reader=reader)

In [4]:
# reading in movie similarities data
movie_similarities = np.load('My Datasets/movie_similarities.npy')

In [5]:
# importing the models, they are tuples (predictions, algorithm)
SVD_test = dump.load('Models/final_svd_test.pkl')
KNN_test = dump.load('Models/final_KNN_test.pkl')

In [6]:
# creating prediction and model variables from the imported SVD and KNN
predictions_SVD_test = SVD_test[0]
predictions_KNN_test = KNN_test[0]
SVD = SVD_test[1]
KNN = KNN_test[1]

In [7]:
# loading models & predictions for anti-testset
SVD_anti = dump.load('Models/final_svd_anti.pkl')
KNN_anti = dump.load('Models/final_KNN_anti.pkl')

In [8]:
# creating separate prediction variables for the anti-testset
predictions_SVD_anti = SVD_anti[0]
predictions_KNN_anti = KNN_anti[0]

# 2. Precision and Recall at K

For computing precision and recall, the surprise package does not have any obvious methods available. However, its documentation provides code for a function that computes precision and recall at "k", where k is the number of top predictions considered for each user. Since the provided function is the most efficient way to compute these metrics, we rely heavily on it.<br>
The resulting precision and recall scores for each user can then be averaged across all users to get a final score for each. The code below was greatly influenced by the [surprise P@K/R@K function](https://surprise.readthedocs.io/en/stable/FAQ.html#where-are-datasets-stored-and-how-to-change-it); however, it has been modified to suit our particular situation.

In [9]:
# help on what a defaultdict is:
# https://stackoverflow.com/questions/5900578/how-does-collections-defaultdict-work

# help on how to calculate precision and recall
# https://surprise.readthedocs.io/en/stable/FAQ.html#where-are-datasets-stored-and-how-to-change-it

def precision_and_recall_at_k(predictions, k=10, threshold=4):
    
    # first using a default dict, we will map the estimated predictions
    # and the actual ratings to each user
    # we use the default dict because it allows you to create the key when
    # you call it, instead of having to instantiate it beforehand
    user_predicted_true_dict = defaultdict(list)
    # uid = user id, iid = item id, r_ui = real rating of user for item, est = pred rating
    for uid, iid, r_ui, est, details in predictions:
        # creating key-value pair for each user, appending list of rating tuples
        user_predicted_true_dict[uid].append((est, r_ui))

    # instantiating the empty dictionaries 
    precision_dict = dict()
    recall_dict = dict()
    
    # now we will be looping through the dictionary 
    for uid, ratings in user_predicted_true_dict.items():
        
        # sort the tuples in the keys by the predicted value
        # lambda function calls x as each tuple, indexes to the first item (est)
        # sorts the tuple list by that estimated rating
        ratings.sort(key=lambda x: x[0], reverse=True)
        
        # getting number of relevant items
        # summing up the number of items with a true rating over the pre-defined threshold
        relevant_items_count = 0
        for (est, r_ui) in ratings:
            if r_ui >= threshold:
                relevant_items_count += 1
        
        # getting number of recommended items in the top k
        # summing up the number of items with an estimated rating over the threshold
        recommended_items_in_top_k = 0
        for (est, r_ui) in ratings[:k]:
            if est >= threshold:
                recommended_items_in_top_k += 1
        
        # getting total number of relevant & recommended in top k
        total_relevant_and_recommended = 0
        for (est, r_ui) in ratings[:k]:
            if est >= threshold and r_ui >= threshold:
                total_relevant_and_recommended += 1
        
        # PRECISION @K: the proportion of recommended items that are actually relevant
        # if there are no recommended items in the top K, then precision will be = 0
        if recommended_items_in_top_k != 0:
            precision_dict[uid] = total_relevant_and_recommended/recommended_items_in_top_k
        else:
            precision_dict[uid] = 0
        
        # RECALL @K: the proportion of relevant items that are actually recommended
        # if there are no relevant items, then recall is =0
        if relevant_items_count != 0:
            recall_dict[uid] = total_relevant_and_recommended/relevant_items_count
        else:
            recall_dict[uid] = 0
    
    return precision_dict, recall_dict

In [10]:
# calling function to create precision and recall dictionaries for SVD
precisions_svd, recalls_svd = precision_and_recall_at_k(predictions_SVD_test, k=20, threshold=4)

In [11]:
print(f'SVD Precision: {sum(precisions_svd.values())/len(precisions_svd)}')
print(f'SVD Recall: {sum(recalls_svd.values())/len(recalls_svd)}')

SVD Precision: 0.8993341421450955
SVD Recall: 0.2608171324026318


For the SVD, the precision score is good, while the recall is not. As a reminder, precision is the proportion of recommended items that are relevant, so roughly 90% of the items we recommended are relevant to the user. Recall is the proportion of relevant items that were recommended, so only about 26% of the items relevant to the user were actually recommended. With recall this low, we are missing a significant portion of relevant items that should be recommended. However, because precision is so high, most of what does get recommended is relevant to the user.
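To make the two definitions concrete, here is a minimal sketch that works through them for a single user. The rating pairs, `k` and `threshold` values are made up for illustration and are not taken from the models above.

```python
# Toy (estimated, true) rating pairs for a single hypothetical user
ratings = [(4.8, 5), (4.5, 3), (4.2, 4), (3.9, 4), (3.1, 2), (2.5, 5)]
k, threshold = 3, 4

# rank by estimated rating, highest first, and keep the top k
ranked = sorted(ratings, key=lambda pair: pair[0], reverse=True)
top_k = ranked[:k]

# every item the user truly rated at or above the threshold
relevant = sum(true >= threshold for _, true in ranked)
# items in the top k whose estimate is at or above the threshold
recommended = sum(est >= threshold for est, _ in top_k)
# items in the top k that are both recommended and relevant
hits = sum(est >= threshold and true >= threshold for est, true in top_k)

precision = hits / recommended if recommended else 0
recall = hits / relevant if relevant else 0
print(precision, recall)
```

Here two of the three recommended films are relevant, giving a precision of 2/3, while only two of the four relevant films were recommended, giving a recall of 0.5.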

In [12]:
# calling function to create precision and recall dictionaries for KNN
precisions_knn, recalls_knn = precision_and_recall_at_k(predictions_KNN_test, k=20, threshold=4)

In [13]:
# precisions and recalls are averaged across all users
print(f'KNN Precision: {sum(precisions_knn.values())/len(precisions_knn)}')
print(f'KNN Recall: {sum(recalls_knn.values())/len(recalls_knn)}')

KNN Precision: 0.8010828998735005
KNN Recall: 0.4170614557349849


The recall for the KNN is better, with an increase of about 16 percentage points, but the precision is worse by about 10 percentage points.

# 3. Mix Ratings for Hybrid Approach

Because of the increased recall of the KNN and the high precision of the SVD, it might be prudent to average the estimated ratings that the two models predicted. This way we will lose a bit of precision but gain a bit of recall, in the hope that recommendations drawing on both models are more relevant than those from either one alone.<br>
To achieve this we will create a function that takes in two sets of predictions and returns a dictionary keyed by user id, where each value holds the averaged predictions for that user's items.
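As a small illustration of the averaging idea, the sketch below mixes two tiny, made-up prediction lists that follow the `(uid, iid, r_ui, est, details)` layout. The names `model_a`, `model_b` and `estimates_b` are hypothetical, and the `(user, item)` lookup is just one possible way to pair up the estimates.

```python
from collections import defaultdict

# Hypothetical predictions in the (uid, iid, r_ui, est, details) layout
model_a = [(1, 10, None, 4.0, {}), (1, 20, None, 3.0, {}), (2, 10, None, 5.0, {})]
model_b = [(1, 10, None, 5.0, {}), (1, 20, None, 4.0, {}), (2, 30, None, 2.0, {})]

# index the second model's estimates by (user, item) for constant-time lookup
estimates_b = {(uid, iid): est for uid, iid, _, est, _ in model_b}

mixed = defaultdict(list)
for uid, iid, _, est_a, _ in model_a:
    est_b = estimates_b.get((uid, iid))
    # only average pairs that both models produced an estimate for
    if est_b is not None:
        mixed[uid].append((iid, (est_a + est_b) / 2))

print(dict(mixed))
```

User 2 does not appear in the result because the two toy models never predicted the same film for that user.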

In [14]:
def mix_ratings(predictions1, predictions2):
    '''Takes in two sets of raw predictions from recommender system models, averages
    the estimated ratings and puts the results into a new dictionary.
    Required incoming format: uid, iid, r_ui, est, ___.
    Where uid = user id, iid = item id, r_ui is the true rating of item by user, est is
    the estimated rating of item by user.
    Output format: dictionary of list of tuples, ex. uid: [(iid, est)]'''
    
    # first using a default dict, we will map the estimated predictions
    # and the item id for each user
    # we use the default dict because it allows you to create the key when
    # you call it, instead of having to instantiate it beforehand
    model1_predictions_dictionary = defaultdict(list)
    model2_predictions_dictionary = defaultdict(list)
    
    # uid = user id, iid = item id, r_ui = real rating of user for item, est = pred rating
    for uid, iid, r_ui, est, details in predictions1:
        # creating key-value pair for each user, appending list of item-rating tuples
        model1_predictions_dictionary[uid].append((iid, est))
        
    for uid, iid, r_ui, est, details in predictions2:
        # creating key-value pair for each user, appending list of item-rating tuples
        model2_predictions_dictionary[uid].append((iid, est))
    
    # instantiate empty mixed dictionary
    ratings_mixed = defaultdict(list)
    
    # for each user id and list of (iid, est) tuples in the first dictionary
    for uid, ratings1 in model1_predictions_dictionary.items():
        # skip users that the second model has no predictions for
        if uid not in model2_predictions_dictionary:
            continue
        # build an iid -> est lookup for the second model, so that each item
        # is matched directly instead of scanning the whole list every time
        ratings2_lookup = dict(model2_predictions_dictionary[uid])
        # go through each iid, rating in the first dictionary
        for iid, rating1 in ratings1:
            # if the second model also rated this item
            if iid in ratings2_lookup:
                # average the ratings
                rating_mix = (rating1 + ratings2_lookup[iid])/2
                # append the new averaged rating to the new mixed dictionary
                ratings_mixed[uid].append((iid, rating_mix))
    
    # return the completed mix rating dictionary
    return ratings_mixed

We can now call the function and pass through our predictions from both the KNN and the SVD on the anti-testset. Once we have the predictions, we can then pickle the file for safekeeping.

In [15]:
# calling function with the anti-testset predictions for both KNN & SVD
KNN_SVD_predictions = mix_ratings(predictions_SVD_anti, predictions_KNN_anti)

In [16]:
# pickle file
mixed_predictions = open('Models/mixed_predictions.pkl', 'wb')
pickle.dump(KNN_SVD_predictions, mixed_predictions)

In [17]:
# close connection once pickling is complete
mixed_predictions.close()

In [18]:
# reopen file at a later date, unnecessary if you haven't killed the kernel yet
mixed_predictions = open('Models/mixed_predictions.pkl', 'rb')
KNN_SVD_predictions = pickle.load(mixed_predictions)

In [19]:
# close connection once the file has been loaded
mixed_predictions.close()
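The save-and-reload steps above can also be written with context managers, which close the file automatically. The sketch below is a minimal illustration under stated assumptions: `toy_predictions` stands in for the mixed predictions dictionary, and the temporary directory is only there so the example runs anywhere.

```python
import os
import pickle
import tempfile

# stand-in for the mixed predictions dictionary
toy_predictions = {1: [(10, 4.5), (20, 3.5)]}

# hypothetical location; a temporary directory keeps the sketch self-contained
path = os.path.join(tempfile.mkdtemp(), 'mixed_predictions.pkl')

# the with-block closes the file automatically, even if pickling raises an error
with open(path, 'wb') as file:
    pickle.dump(toy_predictions, file)

# reload the dictionary from disk
with open(path, 'rb') as file:
    reloaded = pickle.load(file)

print(reloaded == toy_predictions)
```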

# 4. Top Ratings

## 4.1 Top Ratings for One

The next step is to get the top rated films for each user, so we know what films to recommend to the users.
The function created below is heavily influenced by the Surprise function [get_top_n](https://surprise.readthedocs.io/en/stable/FAQ.html). It takes the hybrid ratings from the previous function and sorts them by predicted rating. The resulting dictionary is then returned with only the top N for each user, as specified in the function call. The dictionary is also subject to a rating threshold, so only films with estimated ratings at or above a certain point are returned.

In [25]:
def get_top_N_movies(predictions, n=10, threshold=3.5):
    ''' Takes in dictionary of user ids and associated item id/predicted rating tuples.
    Returns the top N per user, as specified in the function call.
    Required format: dictionary of lists of tuples, ex. uid: [(iid, est)]
    Where uid = user id, iid = item id and est = predicted rating of item by user.
    Arguments: predictions, top N ratings, ratings threshold'''
    
    user_top_n_films = defaultdict(list)
    
    # going through the list of tuples, sorting by the predicted rating
    # slicing out only the top n for each user
    for uid, ratings in predictions.items():
        
        # sort the tuples in the keys by the predicted rating
        # lambda function calls x as each tuple, indexes to the second item (est)
        # sorts the tuple list by that estimated rating
        ratings.sort(key=lambda x: x[1], reverse=True)
        
        # separates off the top n ratings for each user
        user_top_n_films[uid] = ratings[:n]
    
    # keeping only the films whose estimated rating meets the rating threshold
    # (a list comprehension is used because popping from a list while looping
    # over it skips elements, which would leave low ratings behind)
    for uid, ratings in user_top_n_films.items():
        user_top_n_films[uid] = [rating for rating in ratings if rating[1] >= threshold]
        
    return user_top_n_films

In [28]:
# calling function with mixed prediction, for top 100, above a rating of 4
top_dict = get_top_N_movies(KNN_SVD_predictions, n=100, threshold=4)

Now that we have the top films for each user, we can go through and find points of commonality between users.

## 4.2 Top Ratings for Two

The final function for predicting ratings for two users needs to take into account items that appear on both users' top lists. Using each user's top list ensures that all films recommended to the couple are highly rated for each. The thresholds are set in the previous function, where the top films are selected. This function then returns a dataframe of options that would rank highly for both users, along with the averaged rating for the couple.

In [34]:
def ratings_for_two(user1, user2):
    '''Takes in two user ids, returns a dataframe of movies that would be recommended for both'''
    
    # uses top_n function to create list of movies that each user would like
    movies_for_one = top_dict[user1]
    movies_for_two = top_dict[user2]
    
    # instantiate empty list for the combined movies
    movies_for_both = []
    
    # filling movies for both list, averaging the couple rating
    for (iid1, rating1) in movies_for_one:
        for (iid2, rating2) in movies_for_two:
            # IF the movie ids match, append the id and an averaged rating to the list
            if iid1 == iid2:
                movies_for_both.append((iid1, ((rating1+rating2)/2)))
            
    
    # sort the list by the averaged rating
    movies_for_both.sort(key=lambda x: x[1], reverse=True)
    
    # instantiating dataframe for visibility
    top_for_both = pd.DataFrame(columns=['Title', 'Year Of Release', 'IMDB Rating /10', 'Couple Pred Rating /5', 'IMDB Vote Count', 'MovieId'])
    
    # for each movie + rating in the movies for both list
    for i, (iid, rating) in enumerate(movies_for_both):
        # look up the film's row in the movies reference table
        match = movies_reference[movies_reference['movieId'] == iid]
        # skip films that do not have exactly one row in the reference table
        if len(match) != 1:
            continue
        movie = match.iloc[0]
        # populate dataframe with plain values, so titles keep their punctuation
        top_for_both.loc[i] = [movie['title'],
                               int(movie['year_of_release']),
                               float(movie['averageRating']),
                               round(rating, 1),
                               int(movie['numVotes']),
                               int(movie['movieId'])]
    
    top_for_both = top_for_both[top_for_both['IMDB Vote Count'] > 45000]
    top_for_both = top_for_both[top_for_both['Year Of Release'] > 1965]
    
    top_for_both.reset_index(drop=True, inplace=True)
    
    # return the dataframe
    return top_for_both

In [42]:
# calling function with random pair of users (available 1-6000)
couple_movies = ratings_for_two(44,60)
couple_movies

Unnamed: 0,Title,Year Of Release,IMDB Rating /10,Couple Pred Rating /5,IMDB Vote Count,MovieId
0,Saving Private Ryan,1998,8.6,4.5,1249095,2028
1,Toy Story 2,1999,7.9,4.4,531981,3114
2,"Green Mile, The",1999,8.6,4.3,1162106,3147
3,Life Is Beautiful (La Vita è bella),1997,8.6,4.3,630999,2324
4,Apollo 13,1995,7.6,4.3,272113,150
5,Braveheart,1995,8.3,4.2,966758,110
6,October Sky,1999,7.8,4.2,83538,2501
7,Die Hard,1988,8.2,4.2,799366,1036
8,"Hunt for Red October, The",1990,7.6,4.2,182158,1610
9,"Christmas Story, A",1983,7.9,4.2,133465,2804


The final output of the function is a dataframe that includes the film title, the year of release, the IMDB rating, the couple rating, the IMDB vote count and the movieID. The users can then make a decision on what to watch based on the films that were in both of their top lists.
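The core of the couple recommendation is the intersection of two top lists. A minimal sketch of that step, using made-up `(movie id, predicted rating)` pairs and a dictionary lookup (the names `top_user1`, `top_user2` and `ratings_user2` are hypothetical), might look like this:

```python
# Hypothetical top lists of (movie id, predicted rating) for two users
top_user1 = [(2028, 4.75), (3114, 4.5), (110, 4.0)]
top_user2 = [(3114, 4.0), (2028, 4.25), (1036, 4.5)]

# turn the second list into a lookup so each film is checked once
ratings_user2 = dict(top_user2)

# keep films on both lists, paired with the averaged couple rating
shared = [
    (iid, (rating1 + ratings_user2[iid]) / 2)
    for iid, rating1 in top_user1
    if iid in ratings_user2
]

# highest couple rating first
shared.sort(key=lambda pair: pair[1], reverse=True)
print(shared)
```

Films that appear on only one of the two lists (110 and 1036 here) are dropped, which is why couples with little overlap can end up with a short list.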

# 5. Similar Films

It is possible that two users do not have many points of commonality, if they don't share many interests. To cover this case, the function below takes in the same two users and, instead of returning the entire list of their shared top films, uses only the top 5. For each of those 5 films, it returns the 5 films that are most similar to it. That way, we can still offer a decent breadth of recommendations even if the users don't have many points of intersection.

In [39]:
def similar_items_to_top_5(user1, user2):
    ''' Takes in two user ids, returns a dataframe of top 5 similar movies 
    to the top 5 movies in the two users top film lists.
    Requires the use of another function: ratings_for_two().'''
    
    # creates a dataframe of the top films for both users
    df = ratings_for_two(user1, user2)
    
    # instantiate empty list for dataframes
    to_concat_list = []

    # for each item id in the top 5 of the dataframe
    for iid in df['MovieId'].iloc[:5]:
        
        # get the index location of that film from the movie reference table
        movie_index = (movies_reference[movies_reference['movieId'] == iid].index)
        # get the title of that film as a plain string
        similar_to = movies_reference['title'][movies_reference['movieId'] == iid].values[0]
        
        # create dataframe that has the similarity of every film to this one
        similar_films = pd.DataFrame({'Movie':movies_reference['title'],
                          'Year of Release':movies_reference['year_of_release'],
                          'Similarity Score': np.array(movie_similarities[movie_index, :].squeeze()),
                         'Similar To:': similar_to})
        
        # sort by the similarity score in descending order, skipping the first row
        # (the film itself) and keeping the next 5
        similar_films = similar_films.sort_values(by='Similarity Score', ascending=False).iloc[1:6,:]
        
        # add that dataframe to the list 
        to_concat_list.append(similar_films)
    
    # concatenate the dataframes one on top of each other
    similar_df = pd.concat(to_concat_list, axis=0)
    
    # indexes the dataframe by the movies, to avoid ugly repeated fields
    similar_df = similar_df.set_index(['Similar To:', 'Movie'])
    
    return similar_df

In [43]:
# calling function with two user ids
similar_items_to_top_5(44,60)

Similar To:,Movie,Year of Release,Similarity Score
Saving Private Ryan,Schindler's List,1993,0.563621
Saving Private Ryan,"Boot, Das (Boat, The)",1981,0.477214
Saving Private Ryan,"Downfall (Untergang, Der)",2004,0.468293
Saving Private Ryan,Braveheart,1995,0.444878
Saving Private Ryan,Gladiator,2000,0.43069
Toy Story 2,Toy Story,1995,0.629941
Toy Story 2,"Monsters, Inc.",2001,0.521168
Toy Story 2,Pooh's Heffalump Movie,2005,0.512823
Toy Story 2,Toy Story 3,2010,0.501739
Toy Story 2,Pinocchio,1940,0.491354
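The lookup behind this table boils down to ranking one row of the similarity matrix. The sketch below shows that step on a made-up 4x4 matrix; the titles, the values and the `most_similar` helper are all hypothetical, and the film itself is removed explicitly rather than assumed to sort first.

```python
import numpy as np

# Hypothetical similarity matrix; row i holds film i's similarity to every film
titles = ['Film A', 'Film B', 'Film C', 'Film D']
similarities = np.array([
    [1.0, 0.2, 0.7, 0.4],
    [0.2, 1.0, 0.1, 0.9],
    [0.7, 0.1, 1.0, 0.3],
    [0.4, 0.9, 0.3, 1.0],
])

def most_similar(film_index, n=2):
    # positions of all films, ordered from most to least similar
    order = np.argsort(similarities[film_index])[::-1]
    # drop the film itself rather than assuming it always sorts first
    order = [j for j in order if j != film_index]
    return [titles[j] for j in order[:n]]

print(most_similar(0))
```

This relies on the rows of the similarity matrix being in the same order as the reference table, which the function above also assumes.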


# 6. New Users

Recommender systems often suffer from "cold-start" issues. This means that if the system does not have any information on the user, then it would have a hard time predicting what they would like. In these circumstances, it is a good idea to have some other options in place so that users can still be recommended things even if it is not as tailored as predicted ratings.

## 6.1 Top In Genre

Our first option for new users is to have the user specify a genre and have the system return the top N films in that genre. To do so, we will need a reference dataframe with movie ids and the respective genre categories for each film. Using this data, we can sort by some metric to decide which films would be the most popular.

In [52]:
# reading in movie utility matrix 
movies_utility = pd.read_csv('My Datasets/movies_utility.csv', index_col=0)

In [53]:
# taking out the first 20 columns: movieId plus the 19 genre flags
movies_genres_df = movies_utility.iloc[:,:20]

In [54]:
# creation of new reference df, with the movieId & genres 
movies_reference_genre = pd.merge(movies_reference, movies_genres_df, on='movieId', how='left')

In [55]:
movies_genres_df

Unnamed: 0,movieId,Adventure,Animation,Children,Comedy,Fantasy,Romance,Drama,Action,Crime,Thriller,Horror,Mystery,Sci-Fi,IMAX,Documentary,War,Musical,Western,Film-Noir
0,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,2,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,3,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
3,4,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0
4,5,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
27273,131254,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
27274,131256,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
27275,131258,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
27276,131260,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


The function below returns the top 15 films in the genre specified by the user. The top films are sorted by their number of votes on IMDB and then by their average IMDB rating. Sorting by the number of votes first makes sure that we aren't recommending high scoring films that were rated by a very small number of people. Of course, in a real life scenario there would have to be caveats in place for newly released films. These films would not have as many votes as films that had been out for years, but would most likely end up being more popular at the time because they are "new releases". This was not an issue for this dataset, because the data was only collected up until 2015 and the votes and ratings came from newly updated files from the IMDB website, meaning that the films in this dataset had plenty of time to accumulate ratings.<br>
An option for future systems would be to have a "top films of all time" as well as a "new releases" category for each genre.
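To show what "votes first, rating second" means in practice, here is a small sketch on made-up films in plain Python. The tuple key mirrors a two-column sort, where the second column only breaks ties in the first. The `min_votes` variant is a hypothetical alternative, not something this notebook implements.

```python
# Hypothetical films as (title, numVotes, averageRating)
films = [
    ('Film A', 2_000_000, 8.1),
    ('Film B', 1_500_000, 9.2),
    ('Film C', 800, 9.8),
    ('Film D', 2_000_000, 8.6),
]

# votes first, rating second: the rating only breaks ties in the vote count
by_votes = sorted(films, key=lambda film: (film[1], film[2]), reverse=True)
print([title for title, _, _ in by_votes])

# alternative: require a minimum number of votes, then rank by rating
min_votes = 1_000  # illustrative cut-off, not a value from this project
well_voted = [film for film in films if film[1] >= min_votes]
by_rating = sorted(well_voted, key=lambda film: film[2], reverse=True)
print([title for title, _, _ in by_rating])
```

With the first ordering the result is effectively a popularity ranking, so Film B sits below two lower-rated films. The second ordering lets the rating drive the ranking once a film has enough votes to be trusted.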

In [57]:
def top_in_genre():
    ''' Takes in user input of genre, returns the top 15 films in that genre
    by the number of votes and their overall rating on IMDB.'''
    
    print('Available Genre Choices:\n')
    # printing all the columns in the genres dataframe as available choices
    for col in movies_genres_df.drop(columns='movieId').columns:
        print(f'{col}')
    
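    # use user input for which genre, matching column names case-insensitively
    # so that choices such as 'Sci-Fi', 'IMAX' and 'Film-Noir' are also recognised
    genre_lookup = {col.lower(): col for col in movies_genres_df.drop(columns='movieId').columns}
    genre_choice = genre_lookup[input('\nWhat genre do you want to watch? ').strip().lower()]
    
    # returns dataframe that is the top choices
    best_in_genre = movies_reference_genre.loc[movies_reference_genre[genre_choice] == 1,
                                               ['title', 'year_of_release', 'runtimeMinutes', 'numVotes', 'averageRating']]
    best_in_genre = best_in_genre.sort_values(['numVotes', 'averageRating'], ascending=False).head(15)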
    
    return best_in_genre

In [63]:
top_in_genre()

Available Genre Choices:

Adventure
Animation
Children
Comedy
Fantasy
Romance
Drama
Action
Crime
Thriller
Horror
Mystery
Sci-Fi
IMAX
Documentary
War
Musical
Western
Film-Noir

What genre do you want to watch? crime


Unnamed: 0,title,year_of_release,runtimeMinutes,numVotes,averageRating
315,"Shawshank Redemption, The",1994,142.0,2369695.0,9.3
12525,"Dark Knight, The",2008,152.0,2332847.0,9.0
15534,Inception,2010,148.0,2092810.0,8.8
2873,Fight Club,1999,139.0,1875495.0,8.8
293,Pulp Fiction,1994,154.0,1846441.0,8.9
843,"Godfather, The",1972,175.0,1640898.0,9.2
18312,"Dark Knight Rises, The",2012,164.0,1531904.0,8.4
10169,Batman Begins,2005,140.0,1321520.0,8.2
587,"Silence of the Lambs, The",1991,118.0,1286421.0,8.6
22191,"Wolf of Wall Street, The",2013,180.0,1208209.0,8.2


## 5.2 Addition of New Ratings

Another way to circumvent cold-start issues is to have the user rate a handful of films and then score their ratings on the model, so that customized predictions can be made for them. Of course, the user would have to rate a decent number of films: in the dataset used to fit the model, each user had rated a minimum of 20 films. At the moment, this type of threshold is not in place, and the user can specify as many films as they want.
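A minimal sketch of what such a threshold, together with basic rating validation, could look like is given below. Both helpers (`parse_rating` and `has_enough_ratings`) are hypothetical and are not part of the function that follows. The only value taken from the text is the minimum of 20 rated films.

```python
MIN_RATINGS = 20  # minimum number of rated films per user, as described above


def parse_rating(text):
    """Return an int rating between 1 and 5, or None if the text is not valid."""
    try:
        rating = int(text)
    except ValueError:
        return None
    return rating if 1 <= rating <= 5 else None


def has_enough_ratings(ratings, minimum=MIN_RATINGS):
    """True once the new user has rated at least `minimum` distinct films."""
    return len({movie_id for movie_id, _ in ratings}) >= minimum


print(parse_rating('5'), parse_rating('7'), parse_rating('q'))
print(has_enough_ratings([(1210, 5), (7451, 5)]))
```

Returning `None` for anything that is not a whole number from 1 to 5 lets the caller decide whether to ask again or to treat the input as a quit command.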

In [30]:
def add_new_user(userId):
    
    '''Creates a new dataframe of user ratings through user input search of the existing
    reference dataframe. The new dataframe can then be added to the existing ratings
    dataset and the model re-trained.'''
    
    # instantiate new empty dataframe
    new_user_df = pd.DataFrame(columns=['userId', 'movieId', 'rating'])
    
    # to keep while loop running, instantiating a variable to true.
    add_rating = True
    # while the above is true...
    while add_rating == True:
        print('\nPress q to quit')
        # search is the input of a string, keyword search for film title
        search = str(input('Try searching by title, pls capitalize each word: \n'))
        # if the user types 'q', program quits
        if search == 'q':
            break
        print('\nOptions listed below:\n')
        # prints out the options for that search
        print(pd.DataFrame(movies_reference[['movieId', 'title']][movies_reference['title'].str.contains(search)]))
        print('\nPlease choose a movie and rate (1-5):')
        # requires user specifies movie Id, read as text so that 'q' can still quit
        movieid = input("Movie ID: ")
        # quit option
        if movieid == 'q':
            break
        # requires user specifies rating
        new_rating = input("Rating: ")
        # quit option
        if new_rating == 'q':
            break
        # convert both inputs to integers once the quit checks have passed
        movieid, new_rating = int(movieid), int(new_rating)
        # if the rating is out of bounds, have user re-type until it is valid
        while new_rating > 5 or new_rating < 1:
            print('Rating cannot be less than 1 or more than 5')
            new_rating = int(input("Rating: "))
        
        # creation of new row variable with information from user input
        new_row = {'userId': userId, 'movieId': movieid, 'rating': new_rating}
        
        # appending that row onto the user's rating dataframe
        # (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)
        new_user_df = pd.concat([new_user_df, pd.DataFrame([new_row])], ignore_index=True)
    
    return new_user_df

In [31]:
add_new_user('haley')


Press q to quit
Try searching by title, pls capitalize each word: 
Star Wars

Options listed below:

       movieId                                              title
257        260                 Star Wars: Episode IV - A New Hope
1171      1196     Star Wars: Episode V - The Empire Strikes Back
1184      1210         Star Wars: Episode VI - Return of the Jedi
2543      2628          Star Wars: Episode I - The Phantom Menace
5281      5378       Star Wars: Episode II - Attack of the Clones
10117    33493       Star Wars: Episode III - Revenge of the Sith
12926    61160                          Star Wars: The Clone Wars
15507    79006  Empire of Dreams: The Story of the 'Star Wars'...
20393   100089                    Star Wars Uncut: Director's Cut
22977   109713                      Star Wars: Threads of Destiny

Please choose a movie and rate (1-5):
Movie ID: 1210
Rating: 5

Press q to quit
Try searching by title, pls capitalize each word: 
Mean Girls

Options listed below:

     

Unnamed: 0,userId,movieId,rating
0,haley,1210,5
1,haley,7451,5
2,haley,76077,5
3,haley,96616,1


This system of searching for films and adding ratings one by one is very verbose and impractical. However, given the time constraints, it lays a nice base as a prototype for things to come. At this point the function also does not score the new ratings on the model. Another limitation is the lack of processing on the input step. The user has to capitalize every word they type and search by keyword only, because a film whose name is not formatted exactly as it is in the reference dataframe will not show up. This can be dealt with at a later date with some simple formatting workarounds.
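One possible workaround for the capitalization issue is a case-insensitive, plain-text title search. The sketch below uses a tiny stand-in for the reference table, and `search_titles` is a hypothetical helper rather than part of the prototype.

```python
import pandas as pd

# small stand-in for the movies reference table
reference = pd.DataFrame({
    'movieId': [1210, 7451, 2028],
    'title': ['Star Wars: Episode VI - Return of the Jedi',
              'Mean Girls',
              'Saving Private Ryan'],
})

def search_titles(query):
    # case=False ignores capitalization; regex=False treats the query as plain
    # text, so characters such as '(' or '?' cannot break the search
    mask = reference['title'].str.contains(query.strip(), case=False, regex=False)
    return reference.loc[mask, ['movieId', 'title']]

print(search_titles('mean girls'))
```

This still requires the words to appear in the same order as in the stored title, so it only addresses the capitalization part of the problem.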

# 7. Next Steps

As far as model selection goes, given more time, I would like to try SVDpp. This model takes implicit ratings into account, meaning the fact that a user rated a film at all, regardless of the rating given. However, it is very computationally expensive and would take a very long time to run, so it will have to happen outside the time constraints of this project. I also found that the recall scores for both models were much lower than I would have liked. When using the SVD alone, there were a lot of repeated recommendations on users' top lists, no matter the combination of ids. Bringing in the KNN predictions, as well as thresholding for films above a certain vote count, helped to temper this issue. However, the issue is not fixed completely and requires more investigation and most likely more in-depth model optimization. There are still many user combinations that return the same few films, especially once the minimum rating threshold was put in place. This shows just how important recall is in recommender systems. With low recall scores, you end up recommending the same few films to everyone, because their popularity makes them technically relevant to all users. However, you miss out on many of the user-specific recommendations that would set a really great recommender apart: a system that finds the true key differences between users and does not just recommend films because they are generally well liked.<br>
The user interface is non-existent, so I would ideally like to put the system in a website or an app that lets the user interact with something cleaner and less code based. At this point, I imagine only someone familiar with code would understand how to call the functions, which is not ideal. Also, as mentioned, I would like the new ratings input to be cleaner, possibly offering a list of films for the user to rate instead of having them search. This way the films offered could be spread over key points of difference, so the models could judge more accurately from fewer film choices.<br>
The function created to find the top N films for each user is also quite memory intensive, since it stores the top films above the threshold for every user. It would probably be better to have the function calculate and store only the required user information when it is called.
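As a rough sketch of that last idea, a single user's top films could be computed on demand with `heapq.nlargest`, so nothing is stored for users who are never queried. The `top_n_for_user` helper and the toy `mixed_predictions` dictionary below are hypothetical.

```python
import heapq

# Hypothetical mixed predictions: user id -> list of (movie id, estimated rating)
mixed_predictions = {
    44: [(110, 4.2), (150, 3.1), (2028, 4.7), (3114, 4.4), (1036, 3.9)],
    60: [(110, 4.0), (2028, 4.3)],
}

def top_n_for_user(predictions, uid, n=10, threshold=4):
    """Compute one user's top n on demand instead of storing it for every user."""
    # generator, so films below the threshold are never collected into a list
    above_threshold = (pair for pair in predictions.get(uid, []) if pair[1] >= threshold)
    # nlargest keeps at most n items in memory while scanning
    return heapq.nlargest(n, above_threshold, key=lambda pair: pair[1])

print(top_n_for_user(mixed_predictions, 44, n=2))
```

Because the threshold is applied before the top n are selected, the result can contain fewer than n films but never one below the threshold.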

This marks the end of notebook 3. There is a fourth notebook of just the necessary data files and functions to demo all of the functionality of the system. Feel free to continue on and test out the functions with different combinations of user ids.