## Recommendation System
After the EDA and model comparison, this notebook sets up and runs the recommendation system. It uses the Surprise library's SVD algorithm to predict ratings for a user and recommends 5 movies based on the ratings of similar users in the system.

Student name: Amanda Rowe 

Student pace: self paced 

Instructor name: Jeff Herman 

Blog post URL: https://roweyerboat.github.io/the_helpful_library_of_surprise

In [1]:
# Importing Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
from surprise import Reader, Dataset
from surprise.model_selection import cross_validate, train_test_split
from surprise.prediction_algorithms import SVD
from surprise import accuracy

In [2]:
# Importing Data
ratings_df = pd.read_csv('ratings_limited_users.csv', usecols=['userId', 'movieId', 'rating'])
movies_df = pd.read_csv('movies.csv')

In [3]:
# Setting up the model
# Initializing a reader and data class
reader = Reader()
data = Dataset.load_from_df(ratings_df, reader)

# Splitting the data into train and test sets
trainset, testset = train_test_split(data, test_size=.25)

# Using the tuned parameters for the SVD model
svd = SVD(n_factors=100,
          n_epochs=30,
          lr_all=0.01,
          reg_all=0.1)
svd.fit(trainset)
svd_preds = svd.test(testset)
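
For intuition about what the fitted model computes, here is a minimal sketch of the SVD prediction rule as described in the Surprise documentation: the global mean, plus a user bias, plus an item bias, plus the dot product of the user and item factor vectors, clipped to the rating scale. The function name `svd_estimate` and all numbers are made up for illustration; they are not taken from the fitted `svd` object.

```python
def svd_estimate(global_mean, user_bias, item_bias, user_factors, item_factors,
                 low=1.0, high=5.0):
    """Estimate one rating as mu + b_u + b_i + q_i . p_u, clipped to [low, high]."""
    dot = sum(p * q for p, q in zip(user_factors, item_factors))
    raw = global_mean + user_bias + item_bias + dot
    return min(high, max(low, raw))

# Hypothetical values: a slightly generous user and a slightly below-average movie
est = svd_estimate(3.5, 0.2, -0.1, [0.5, -0.3], [0.4, 0.1])

# A raw estimate above the scale is clipped to the top of the scale
clipped = svd_estimate(4.8, 0.3, 0.2, [1.0], [0.5])

# A user the model has never seen contributes no bias and no factors,
# so the estimate falls back to the global mean plus the item bias
cold_start = svd_estimate(3.5, 0.0, 0.4, [0.0, 0.0], [0.4, 0.1])
```

The last call shows why `movie_rater` refits the model after collecting ratings: without any ratings, every new user would get the same ranking.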

In [4]:
# Checking the accuracy of the model
accuracy.rmse(svd_preds)

RMSE: 0.8504


0.8503526381452163
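
To make the metric concrete, the sketch below computes RMSE by hand on three made-up (true rating, estimated rating) pairs. The helper name `rmse` and the example pairs are illustrative only; the value reported above comes from `accuracy.rmse` on the real test set.

```python
import math

def rmse(pairs):
    """Root mean squared error over (true_rating, estimated_rating) pairs."""
    squared_errors = [(true - est) ** 2 for true, est in pairs]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical predictions: off by 0.5, 1.0, and 0.5 stars
example_pairs = [(4.0, 3.5), (2.0, 3.0), (5.0, 4.5)]
example_rmse = rmse(example_pairs)
print(round(example_rmse, 4))
```

Read this way, an RMSE of about 0.85 means the model's predictions are typically off by a little under one star, with larger misses weighted more heavily than small ones.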

In [5]:
# Importing the popular movies dataframe to be used for users to rate
popular_movies_df = pd.read_csv('popular_movies.csv')
popular_movies_df.head()

Unnamed: 0.1,Unnamed: 0,title,rating,num of ratings,movieId,genres
0,18,10 Things I Hate About You (1999),3.527778,54,2572,Comedy|Romance
1,34,12 Angry Men (1957),4.149123,57,1203,Drama
2,74,2001: A Space Odyssey (1968),3.894495,109,924,Adventure|Drama|Sci-Fi
3,89,28 Days Later (2002),3.974138,58,6502,Action|Horror|Sci-Fi
4,104,300 (2007),3.68125,80,51662,Action|Fantasy|War|IMAX
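
The `Unnamed: 0.1` and `Unnamed: 0` columns in the output above look like index columns written out by earlier `to_csv` calls; that is an inference from the column names, not something the notebook states. Under that assumption, a small sketch of dropping them after loading (the inline CSV copies the first two rows shown above):

```python
import io
import pandas as pd

# Two rows copied from the head() output above, used in place of the real file
csv_text = """Unnamed: 0.1,Unnamed: 0,title,rating,num of ratings,movieId,genres
0,18,10 Things I Hate About You (1999),3.527778,54,2572,Comedy|Romance
1,34,12 Angry Men (1957),4.149123,57,1203,Drama
"""

raw_df = pd.read_csv(io.StringIO(csv_text))

# Keep only the columns whose name does not start with 'Unnamed'
clean_df = raw_df.loc[:, ~raw_df.columns.str.startswith('Unnamed')]
print(list(clean_df.columns))
```

Writing the file with `to_csv(..., index=False)` in the first place would avoid creating these columns.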


In [8]:
# Functions needed
# Function to get new users preferences on any movie or a particular genre
def movie_rater(movie_df, num=5, genre=None):
    """Handle a cold start: collect ratings from a new user, then print 5 movie recommendations.

    Relies on the notebook-level `popular_movies_df`, `ratings_df`, and `reader` objects.

    Args:
        movie_df (DataFrame): movies used to look up titles; needs 'movieId' and 'title' columns.
        num (int): number of ratings to collect before recommending. Defaults to 5.
        genre (str): if given, only movies whose 'genres' contain this string are offered. Defaults to None.

    Returns:
        None. Prints the titles of the 5 movies with the highest predicted rating for the new user,
        after refitting the SVD model on the original ratings plus the new user's ratings.
    """
    user_id = 1000  # assumed not to collide with an existing userId in ratings_df
    # Movies offered for rating, optionally restricted to one genre
    candidates = popular_movies_df
    if genre:
        candidates = candidates[candidates['genres'].str.contains(genre, regex=False)]
    rating_list = []
    while num > 0:
        movie = candidates.sample(1)
        print(movie['title'])
        rating = input('How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: \n')
        if rating == 'n':
            continue
        try:
            rating = float(rating)  # input() returns a string; the model needs a number
        except ValueError:
            continue  # not a number, so offer another movie
        if not 1 <= rating <= 5:
            continue  # outside the rating scale, so offer another movie
        rating_list.append({'userId': user_id, 'movieId': movie['movieId'].values[0], 'rating': rating})
        num -= 1
    # Add the new user's ratings to the existing ratings and refit on everything
    # (DataFrame.append was removed in pandas 2.0, so use pd.concat)
    new_ratings_df = pd.concat([ratings_df, pd.DataFrame(rating_list)], ignore_index=True)
    new_data = Dataset.load_from_df(new_ratings_df, reader)
    svd_ = SVD(n_factors=100, n_epochs=30, lr_all=0.01, reg_all=0.1)
    svd_.fit(new_data.build_full_trainset())
    # Predict a rating for every movie the new user has not just rated
    rated_ids = {r['movieId'] for r in rating_list}
    list_of_movies = [(m_id, svd_.predict(user_id, m_id).est)
                      for m_id in ratings_df['movieId'].unique() if m_id not in rated_ids]
    ranked_movies = sorted(list_of_movies, key=lambda x: x[1], reverse=True)
    for idx, (m_id, _) in enumerate(ranked_movies[:5]):
        title = movie_df.loc[movie_df['movieId'] == int(m_id), 'title'].values[0]
        print('------------------------------------------------')
        print('Recommendation #', idx + 1, ':', title, '\n')
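
The ranking step at the end of `movie_rater` can be looked at in isolation. The sketch below is a hypothetical helper, `top_n`, that takes (movie id, predicted rating) pairs, skips movies the user has already rated so they are not recommended back, and keeps the `n` highest predictions. It uses `heapq.nlargest` instead of sorting the full list; the ids and ratings are made up.

```python
import heapq

def top_n(predictions, already_rated, n=5):
    """Return the n (movie_id, estimate) pairs with the highest estimates,
    ignoring any movie_id in already_rated."""
    unseen = [(movie_id, est) for movie_id, est in predictions
              if movie_id not in already_rated]
    return heapq.nlargest(n, unseen, key=lambda pair: pair[1])

# Hypothetical predictions for five movies; the user already rated movie 3
predictions = [(1, 4.2), (2, 3.1), (3, 4.8), (4, 4.5), (5, 2.0)]
best_two = top_n(predictions, already_rated={3}, n=2)
print(best_two)
```

For the few thousand movies in this dataset a full `sorted` call is just as practical; `nlargest` mainly saves work when the candidate list is much larger than `n`.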


In [7]:
# Calling the function to get user input and result in 5 recommendations
movie_rater(movies_df, 5, genre = 'Drama')

276    Mr. Holland's Opus (1995)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
n
69    Braveheart (1995)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
4
114    Dark Knight, The (2008)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
4
251    Love Actually (2003)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
2
414    Vertigo (1958)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
n
405    Truman Show, The (1998)
Name: title, dtype: object
How do you rate this movie on a scale of (low)1-5(high). Press n if you have not seen this movie: 
4
6    A.I. Artificial Intelligence (2001)
Name: title, dtype: object
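
Each answer in the transcript above comes back from `input()` as a string such as `'4'` or `'n'`, so it has to be converted and checked before it is stored as a rating. A minimal sketch of that step, using a hypothetical helper `parse_rating` and assuming the 1 to 5 scale from the prompt:

```python
def parse_rating(raw, low=1.0, high=5.0):
    """Convert raw user input to a float rating in [low, high].

    Returns None when the user skips the movie ('n') or the input is not
    a valid rating, so the caller can simply offer another movie.
    """
    text = raw.strip().lower()
    if text == 'n':
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    return value if low <= value <= high else None

print(parse_rating('4'), parse_rating('n'), parse_rating('9'))
```

Returning `None` for both "not seen" and "invalid" keeps the calling loop simple; a caller that wants to show an error message for invalid input would need to tell the two cases apart.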

In [None]:
movie_rater(movies_df,5)

In [None]:
#movie_rater(movies_df, genre='Comedy')