## Recommendations with MovieTweetings: Collaborative Filtering

One of the most popular methods for making recommendations is **collaborative filtering**.  In collaborative filtering, you use the collective ratings that users have given to items (the user-item interactions) to make new recommendations.  

There are two main methods of performing collaborative filtering:

1. **Neighborhood-Based Collaborative Filtering**, which is based on the idea that we can either correlate items that are similar to provide recommendations or we can correlate users to one another to provide recommendations.

2. **Model-Based Collaborative Filtering**, which is based on the idea that we can use machine learning and other mathematical models to understand the relationships that exist amongst items and users to predict ratings and provide recommendations.


In this notebook, you will be working on performing **neighborhood-based collaborative filtering**.  There are two main methods for performing neighborhood-based collaborative filtering:

1. **User-based collaborative filtering:** In this type of recommendation, users related to the user you would like to make recommendations for are used to create a recommendation.

2. **Item-based collaborative filtering:** In this type of recommendation, first you need to find the items that are most related to each other item (based on similar ratings).  Then you can use the ratings of an individual on those similar items to understand if a user will like the new item.

In this notebook you will be implementing **user-based collaborative filtering**.  However, it is easy to extend this approach to make recommendations using **item-based collaborative filtering**.  First, let's read in our data and necessary libraries.

**NOTE**: Because of the size of the datasets, some of your code cells here will take a while to execute, so be patient!

In [1]:
import numpy as np
import pandas as pd

In [3]:
# Read the datasets
movies = pd.read_csv('movies_clean.csv', index_col='movie_id')
del movies['Unnamed: 0']

reviews = pd.read_csv('reviews_clean.csv', index_col=0)
reviews.head()

Unnamed: 0,user_id,movie_id,rating,timestamp,date,month_1,month_2,month_3,month_4,month_5,...,month_9,month_10,month_11,month_12,year_2013,year_2014,year_2015,year_2016,year_2017,year_2018
0,1,68646,10,1381620027,2013-10-12 23:20:27,0,0,0,0,0,...,0,1,0,0,1,0,0,0,0,0
1,1,113277,10,1379466669,2013-09-18 01:11:09,0,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
2,2,422720,8,1412178746,2014-10-01 15:52:26,0,0,0,0,0,...,0,1,0,0,0,1,0,0,0,0
3,2,454876,8,1394818630,2014-03-14 17:37:10,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
4,2,790636,7,1389963947,2014-01-17 13:05:47,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0


### Measures of Similarity

When using **neighborhood-based** collaborative filtering, it is important to understand how to measure the similarity of users or items to one another.  

There are a number of ways in which we might measure the similarity between two vectors (which might be two users or two items).  In this notebook, we will look specifically at two measures used to compare vectors:

* **Pearson's correlation coefficient**

Pearson's correlation coefficient is a measure of the strength and direction of a linear relationship. The coefficient takes a value between -1 and 1, where -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and values near 0 indicate little or no linear relationship. 

If we have two vectors x and y, we can define the correlation between the vectors as:


$$CORR(x, y) = \frac{\text{COV}(x, y)}{\text{STDEV}(x)\text{ }\text{STDEV}(y)}$$

where 

$$\text{STDEV}(x) = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

and 

$$\text{COV}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$

where n is the length of the vectors, which must be the same for both x and y, and $\bar{x}$ and $\bar{y}$ are the means of the observations in each vector.  

We can use the correlation coefficient to indicate how alike two vectors are to one another, where the closer to 1 the coefficient, the more alike the vectors are to one another.  There are some potential downsides to using this metric as a measure of similarity.  You will see some of these throughout this workbook.
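As a quick check of the formulas above, here is a minimal sketch that computes the coefficient term by term and compares it with NumPy's built-in `np.corrcoef`. The helper name `pearson_corr` and the two rating vectors are made up for illustration and are not part of the dataset.

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson correlation computed directly from the COV and STDEV formulas."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
    stdev_x = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))
    stdev_y = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))
    return cov / (stdev_x * stdev_y)

# Hypothetical ratings from two users on the same four movies.
x = [10, 8, 6, 4]
y = [9, 8, 5, 5]

by_hand = pearson_corr(x, y)
by_numpy = np.corrcoef(x, y)[0, 1]
print(by_hand, by_numpy)  # the two values should agree
```

Both routes should give the same number, because the `n - 1` factors in the numerator and denominator cancel.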


* **Euclidean distance**

Euclidean distance is a measure of the straight-line distance from one vector to another.  Because this is a measure of distance, larger values indicate that two vectors are more different from one another (the opposite of Pearson's correlation coefficient, where larger values indicate more similar vectors).

Specifically, the euclidean distance between two vectors x and y is measured as:

$$ \text{EUCL}(x, y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2}$$

Unlike the correlation coefficient, Euclidean distance has no denominator that standardizes the values.  Therefore, you need to make sure all of your data are on the same scale when using this metric.

**Note:** Because measuring similarity is often based on looking at the distance between vectors, it is important in these cases to scale your data or to have all data be in the same scale.  In this case, we will not need to scale data because they are all on a 10 point scale, but it is always something to keep in mind!
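To make the scale issue concrete, the sketch below computes the Euclidean distance from the formula above and then recomputes it after one vector is expressed on a different scale. The function name `euclidean_dist` and the ratings are hypothetical.

```python
import numpy as np

def euclidean_dist(x, y):
    """Straight-line distance between two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sqrt(np.sum((x - y) ** 2))

# Hypothetical ratings from two users on the same four movies (10-point scale).
a = [8, 6, 9, 7]
b = [8, 6, 5, 4]

same_scale = euclidean_dist(a, b)
print(same_scale)  # sqrt(0 + 0 + 16 + 9) = 5.0

# The same ratings for b, but expressed on a 100-point scale.
b_rescaled = [rating * 10 for rating in b]
mixed_scale = euclidean_dist(a, b_rescaled)
print(mixed_scale)  # much larger, although b's preferences did not change
```

The second distance is dominated by the change of units rather than by any difference in taste, which is why all vectors should share one scale before distances are compared.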

------------

### User-Item Matrix

In order to calculate the similarities, it is common to put values in a matrix.  In this matrix, users are identified by each row, and items are represented by columns.  




In the above matrix, you can see that **User 1** and **User 2** both used **Item 1**, and **User 2**, **User 3**, and **User 4** all used **Item 2**.  However, there are also a large number of missing values in the matrix for users who haven't used a particular item.  A matrix with many missing values (like the one above) is considered **sparse**.

Our first goal for this notebook is to create the above matrix with the **reviews** dataset.  However, instead of 1 values in each cell, you should have the actual rating.  
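As a small illustration of the target layout, the sketch below builds a user-item matrix from a handful of made-up reviews. The `toy_reviews` frame is hypothetical and only mimics the `user_id`, `movie_id`, and `rating` columns; on the full dataset this direct pivot may not fit in memory.

```python
import numpy as np
import pandas as pd

# Hypothetical reviews with the same three columns as the reviews dataframe.
toy_reviews = pd.DataFrame({
    'user_id':  [1, 1, 2, 2, 3],
    'movie_id': [68646, 113277, 68646, 422720, 422720],
    'rating':   [10, 10, 8, 8, 7],
})

# Rows are users, columns are movies, cells hold the rating (NaN if unrated).
toy_user_item = toy_reviews.pivot(index='user_id', columns='movie_id', values='rating')
print(toy_user_item)

# Fraction of cells without a rating: a rough measure of sparsity.
sparsity = toy_user_item.isna().to_numpy().mean()
print(sparsity)
```

Even in this tiny example almost half of the cells are empty; with real rating data the share of empty cells is typically far higher.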

The users will indicate the rows, and the movies will exist across the columns. To create the user-item matrix, we only need the first three columns of the **reviews** dataframe, which you can see by running the cell below.

In [2]:
reviews.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 712337 entries, 0 to 712336
Data columns (total 23 columns):
 #   Column     Non-Null Count   Dtype 
---  ------     --------------   ----- 
 0   user_id    712337 non-null  int64 
 1   movie_id   712337 non-null  int64 
 2   rating     712337 non-null  int64 
 3   timestamp  712337 non-null  int64 
 4   date       712337 non-null  object
 5   month_1    712337 non-null  int64 
 6   month_2    712337 non-null  int64 
 7   month_3    712337 non-null  int64 
 8   month_4    712337 non-null  int64 
 9   month_5    712337 non-null  int64 
 10  month_6    712337 non-null  int64 
 11  month_7    712337 non-null  int64 
 12  month_8    712337 non-null  int64 
 13  month_9    712337 non-null  int64 
 14  month_10   712337 non-null  int64 
 15  month_11   712337 non-null  int64 
 16  month_12   712337 non-null  int64 
 17  year_2013  712337 non-null  int64 
 18  year_2014  712337 non-null  int64 
 19  year_2015  712337 non-null  int64 
 20  year

### Creating the User-Item Matrix

In order to create the user-item matrix (like the one above), I personally started by using a [pivot table](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html). 

However, I quickly ran into a memory error (a common theme throughout this notebook).  I will help you navigate around many of the errors I had, and achieve useful collaborative filtering results! 
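A back-of-the-envelope calculation suggests why a dense matrix is a problem here. The sketch below assumes the matrix shape reported in this notebook (53,968 users by 31,245 movies) and estimates the size of a dense array for a few dtypes. A pivot table that stores NaN for missing ratings needs a floating-point dtype, so the `float64` row is the relevant one for that approach.

```python
import numpy as np

# Shape reported for this dataset: (number of users, number of movies).
n_users, n_movies = 53968, 31245
n_cells = n_users * n_movies

# Estimated size of a dense array for each candidate dtype.
for dtype in (np.float64, np.float32, np.int8):
    gigabytes = n_cells * np.dtype(dtype).itemsize / 1e9
    print(f'{np.dtype(dtype).name:>8}: {gigabytes:6.2f} GB')
```

This only counts the raw array; intermediate copies made while pivoting would add to the total.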

_____

`1.` Create a matrix where the users are the rows, the movies are the columns, and the ratings exist in each cell, or a NaN exists in cells where a user hasn't rated a particular movie. If you get a memory error (like I did), [this link here](https://stackoverflow.com/questions/39648991/pandas-dataframe-pivot-memory-error) might help you!

In [3]:
# Create user-by-item matrix, of shape (#users, #movies), 
matrix_shape = (len(reviews['user_id'].unique()), movies.shape[0]) # (53,968, 31,245) = 1,686,230,160
print('matrix_shape =', matrix_shape)


# The matrix stores integer ratings from 0 to 10. An integer array cannot hold NaN,
# so -1 marks "not rated". dtype=int8 uses 1 byte per cell so the matrix fits into memory.
user_item_matrix = np.full(shape=matrix_shape, fill_value=-1, dtype=np.int8)
groups = reviews.groupby('user_id').indices # type=dict


movies_indices = movies.index.to_numpy()
# Util function that maps (actual indices) to (positional indices) for iloc.
def get_i_idx(values):
    return [np.where(movies_indices == val)[0][0] for val in values]


# fill the user_item_matrix
for uid in groups:
    user_movies = groups[uid].tolist() # list of positional indices of user_movies.
    # Make sure that the list is not empty.
    if user_movies:
        # get movies actual ids.
        movies_ids = reviews.iloc[user_movies, 1].tolist() # 1 is the index of column 'movie_id'
        # fill this user row and movies positional indices in the matrix with ratings values.
        # uid-1 because users ids start from 1 not 0.
        user_item_matrix[uid-1, get_i_idx(movies_ids)] = reviews.iloc[user_movies, 2].tolist()

matrix_shape = (53968, 31245)


In [6]:
def get_user_row(uid, replace_val=np.nan):
    result = user_item_matrix[uid-1].astype(float)
    result[result == -1] = replace_val
    return result

# Create a dictionary with users and corresponding movies seen
def movies_watched(user_id):
    '''
    INPUT:
    user_id - the user_id of an individual as int
    OUTPUT:
    movies - an array of movies the user has watched
    '''
    # Implement your code here
    movies = groups.get(user_id)
    if movies is not None:
        movies = movies.tolist()
        movies = reviews.iloc[movies, 1].tolist()
    else:
        movies = []
    return movies


def create_user_movie_dict():
    '''
    INPUT: None
    OUTPUT: movies_seen - a dictionary where each key is a user_id and the value is an array of movie_ids
    
    Creates the movies_seen dictionary
    '''
    # Implement your code here
    movies_seen = {k:movies_watched(k) for k in groups}
    return movies_seen

movies_seen = create_user_movie_dict()

In [7]:
len(movies_seen)

53968

`3.` Users who have rated 2 or fewer movies are considered "too new".  Create a new dictionary that only contains users who have rated more than 2 movies.  This dictionary will be used for all the final steps of this workbook.

In [8]:
# Remove individuals who have watched 2 or fewer movies - don't have enough data to make recs
def create_movies_to_analyze(movies_seen, lower_bound=2):
    '''
    INPUT:  
    movies_seen - a dictionary where each key is a user_id and the value is an array of movie_ids
    lower_bound - (an int) a user must have more movies seen than the lower bound to be added to the movies_to_analyze dictionary

    OUTPUT: 
    movies_to_analyze - a dictionary where each key is a user_id and the value is an array of movie_ids
    
    The movies_seen and movies_to_analyze dictionaries should be the same except that the output dictionary has removed users who have seen lower_bound or fewer movies
    
    '''
    # Implement your code here
    movies_to_analyze = {k: v for k, v in movies_seen.items() if len(v) > lower_bound}
    return movies_to_analyze

movies_to_analyze = create_movies_to_analyze(movies_seen)

In [9]:
len(movies_to_analyze[2])

23

### Calculating User Similarities

Now that you have set up the **movies_to_analyze** dictionary, it is time to take a closer look at the similarities between users. Below is the pseudocode for how I thought about determining the similarity between users:

```
for user1 in movies_to_analyze
    for user2 in movies_to_analyze
        see how many movies match between the two users
        if more than two movies in common
            pull the overlapping movies
            compute the distance/similarity metric between ratings on the same movies for the two users
            store the users and the distance metric
```

However, this took a very long time to run, and other methods of performing these operations did not fit on the workspace memory!
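To give a sense of scale, here is a minimal sketch of the pseudocode above on toy data, followed by a count of how many user pairs a full run could involve. The `toy_ratings` dictionary and the `pairwise_distances` helper are hypothetical, and the sketch visits each unordered pair once rather than looping over both orderings.

```python
from itertools import combinations
from math import comb

import numpy as np

# Hypothetical ratings: {user_id: {movie_id: rating}}.
toy_ratings = {
    1: {10: 8, 11: 6, 12: 9, 13: 7},
    2: {10: 8, 11: 6, 12: 5, 13: 4},
    3: {12: 9, 14: 3},
}

def pairwise_distances(ratings, min_common=3):
    """Euclidean distance for every pair of users sharing at least min_common movies."""
    results = []
    for u1, u2 in combinations(ratings, 2):
        common = sorted(set(ratings[u1]) & set(ratings[u2]))
        if len(common) >= min_common:  # "more than two movies in common"
            v1 = np.array([ratings[u1][m] for m in common], dtype=float)
            v2 = np.array([ratings[u2][m] for m in common], dtype=float)
            results.append((u1, u2, float(np.linalg.norm(v1 - v2))))
    return results

toy_result = pairwise_distances(toy_ratings)
print(toy_result)  # only users 1 and 2 share enough movies

# Number of unordered pairs if all 53,968 users were compared.
n_pairs = comb(53968, 2)
print(n_pairs)  # roughly 1.46 billion pairs
```

With well over a billion pairs as an upper bound, even a fast per-pair computation adds up, which is consistent with the long run time described above.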

Therefore, rather than creating a dataframe with all possible pairings of users in our data, your task for this question is to look at a few specific examples of the correlation between ratings given by two users.  For this question, you will compute the [correlation](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.corr.html) between users.

`4.` Using the **movies_to_analyze** dictionary and **user_by_movie** dataframe, create a function that computes the correlation between the ratings of similar movies for two users.  Then use your function to compare your results to ours using the tests below.  

In [11]:
def get_common_movies(user1, user2):
    user1_movies = ~np.isnan(get_user_row(user1))
    user2_movies = ~np.isnan(get_user_row(user2))
    common_movies = user1_movies & user2_movies
    return common_movies

def compute_correlation(user1, user2):
    '''
    INPUT
    user1 - int user_id
    user2 - int user_id
    OUTPUT
    the correlation between the matching ratings between the two users
    '''
    # Implement your code here
    common_movies = get_common_movies(user1, user2)
    corr = np.corrcoef(user_item_matrix[user1-1, common_movies], user_item_matrix[user2-1, common_movies])
    return corr[1, 0] #return the correlation

### Why the NaN's?

If the function you wrote passed all of the tests, then you have correctly set up your function to calculate the correlation between any two users.  

`5.` But one question is, why are we still obtaining **NaN** values?  As you can see in the code cell above, users 7 and 8022 have a correlation of **NaN**. Why?

Think and write your ideas here about why these NaNs exist, and use the cells below to do some coding to validate your thoughts. You can check other pairs of users and see that there are actually many NaNs in our data. These NaN's ultimately make the correlation coefficient a less than optimal measure of similarity between two users.

```
In the denominator of the correlation coefficient, we calculate the standard deviation of each user's ratings on the matching movies.  When a user gave the same rating to every one of those movies, that standard deviation is 0 (and so is the covariance in the numerator).  Dividing 0 by 0 produces a **NaN** correlation coefficient.  The same happens when two users share fewer than two movies.  Therefore, a different approach is likely better for this particular situation.
```

In [13]:
# Which movies did both user 2 and user 104 see?
user2_movies = ~np.isnan(get_user_row(2))
user104_movies = ~np.isnan(get_user_row(104))
common_movies = user2_movies & user104_movies

In [14]:
# What were the ratings for each user on those movies?
print(user_item_matrix[1, common_movies])
print(user_item_matrix[103, common_movies])

[8 8 8 8]
[9 7 7 9]
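
The two rating vectors printed above are enough to reproduce the NaN in isolation. The sketch below copies them into standalone arrays (the variable names are chosen here for illustration) and shows that the correlation is undefined while the Euclidean distance is not.

```python
import numpy as np

# Ratings copied from the output above for the four movies both users rated.
user2_ratings = np.array([8, 8, 8, 8], dtype=float)
user104_ratings = np.array([9, 7, 7, 9], dtype=float)

# User 2 gave every common movie the same rating, so the sample
# standard deviation is exactly zero.
user2_stdev = user2_ratings.std(ddof=1)
print(user2_stdev)  # 0.0

# Silence the divide-by-zero warning NumPy would otherwise emit.
with np.errstate(divide='ignore', invalid='ignore'):
    corr = np.corrcoef(user2_ratings, user104_ratings)[0, 1]
print(corr)  # nan

# The Euclidean distance is still well defined for the same vectors.
dist = np.linalg.norm(user2_ratings - user104_ratings)
print(dist)  # sqrt(1 + 1 + 1 + 1) = 2.0
```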


`6.` Because the correlation coefficient proved to be less than optimal for relating user ratings to one another, we could instead calculate the euclidean distance between the ratings.  I found [this post](https://stackoverflow.com/questions/1401712/how-can-the-euclidean-distance-be-calculated-with-numpy) particularly helpful when I was setting up my function.  This function should be very similar to your previous function.  When you feel confident with your function, test it against our results.

In [23]:
def _euclidean_dist(v1, v2):
    return np.linalg.norm(v1 - v2)


def compute_euclidean_dist(user1, user2):
    '''
    INPUT
    user1 - int user_id
    user2 - int user_id
    OUTPUT
    the euclidean distance between user1 and user2
    '''
    # Implement your code here
    common_movies = get_common_movies(user1, user2)
    # make sure that there is at least 1 movie in common between user1 and user2
    if common_movies.sum() > 0:
        dist = _euclidean_dist(user_item_matrix[user1-1, common_movies], user_item_matrix[user2-1, common_movies])
        return dist
    return np.nan  # no movies in common, so the distance is undefined

In [27]:
print(compute_euclidean_dist(35, 51))
cmvs = get_common_movies(35, 51)
print(user_item_matrix[34, cmvs])
print(user_item_matrix[50, cmvs])

nan
[]
[]


In [30]:
print(movies_seen[35])
print(movies_seen[51])

for m in movies_seen[35]:
    if m in movies_seen[51]:
        print(m)

[2172584, 2345737]
[25316, 29843, 31381, 34248, 34583, 37017, 38109, 38787, 39694, 42004, 44706, 44937, 48947, 51658, 57590, 68646, 82766, 105695, 369610, 473075, 816442, 975645, 1024648, 1045658, 1126590, 1195478, 1210819, 1216492, 1232829, 1343092, 1616195, 1623205, 1661199, 1670345, 1907668, 1911644, 1979388, 1980929, 2004420, 2023587, 2083383, 2180411, 2278388, 2294629, 2302755]


In [38]:
# Important when comparing NaNs: NaN != NaN is always True, so test with np.isnan instead
c = np.nan
print(c != np.nan)
print(not np.isnan(c))

True
False


### Using the Nearest Neighbors to Make Recommendations

In the previous question, you read in **df_dists**. Therefore, you have a measure of distance between each user and every other user. This dataframe holds every possible pairing of users, as well as the corresponding euclidean distance.

Because of the **NaN** values that exist within the correlations of the matching ratings for many pairs of users, as we discussed above, we will proceed using **df_dists**. You will want to find the users that are 'nearest' each user.  Then you will want to find the movies the closest neighbors have liked to recommend to each user.

I made use of the following objects:

* df_dists (to obtain the neighbors)
* user_items (to obtain the movies the neighbors and users have rated)
* movies (to obtain the names of the movies)

`7.` Complete the functions below, which allow you to find the recommendations for any user.  There are five functions which you will need:

* **find_closest_neighbors** - this returns a list of user_ids from closest neighbor to farthest neighbor using euclidean distance


* **movies_liked** - returns an array of movie_ids


* **movie_names** - takes the output of movies_liked and returns a list of movie names associated with the movie_ids


* **make_recommendations** - takes a user id and goes through closest neighbors to return a list of movie names as recommendations


* **all_recommendations** - loops through every user and returns a dictionary with the user_id as the key and a list of movie recommendations as the value
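
The ranking step behind **find_closest_neighbors** can be sketched on toy data with the standard library's `heapq.nsmallest`, which returns the k smallest items without sorting everything. This is one possible approach, not necessarily the one used in this notebook; the `toy_dists` dictionary and the `k_nearest` helper are hypothetical.

```python
import heapq

import numpy as np

# Hypothetical distances from one user to five others.
# NaN means the two users have no movies in common.
toy_dists = {11: 3.0, 12: np.nan, 13: 1.0, 14: 0.0, 15: 2.0}

def k_nearest(dists, k):
    """Return up to k user ids ordered from closest to farthest, skipping NaN distances."""
    valid = [(d, uid) for uid, d in dists.items() if not np.isnan(d)]
    return [uid for d, uid in heapq.nsmallest(k, valid)]

print(k_nearest(toy_dists, k=3))  # [14, 13, 15]
```

Ties in distance are broken by the smaller user id here, because the tuples are compared element by element.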

In [52]:
def find_closest_neighbors(user, k=10):
    '''
    INPUT:
        user - (int) the user_id of the individual whose closest users you want to find
        k - (int) the number of candidate neighbors to track
    OUTPUT:
        closest_neighbors - a list of the ids of the users sorted from closest to farthest away
    '''
    
    # nn tracks the ids of the k nearest users found so far (0 marks an empty slot).
    nn    = [0] * k
    # dists holds the matching distances in ascending order; 1000 is a large placeholder distance.
    dists = [1000] * k
    
    n = user_item_matrix.shape[0]
    
    for i in range(1, n+1):
        # uncomment the next line to monitor the progress.
        # print('progress =', i*100 / n, '%')
        
        # calculate the distance and make sure != nan 
        dist = compute_euclidean_dist(user, i)
        if not np.isnan(dist):
            # to keep dists list sorted ascending, just insert when find a smaller distance. 
            j = 0
            while j < len(nn) and dist > dists[j]:
                j += 1
            
            # Decide whether to replace the last dist in dists or insert a new one in the middle 
            if j < len(nn):
                if j == len(nn) - 1:
                    dists[-1] = dist
                    nn[-1] = i
                else:
                    dists.insert(j, dist)
                    dists = dists[:-1]

                    nn.insert(j, i)
                    nn = nn[:-1]
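    print('dists =', dists)   
    print('nn =', nn)
    # Drop the first entry, assuming it is `user` itself at distance 0.
    # Other users tied at distance 0 can take that slot instead.
    return nn[1:k+1]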
    
    
def movies_liked(user_id, min_rating=7):
    '''
    INPUT:
    user_id - the user_id of an individual as int
    min_rating - the minimum rating for a movie to count as a "like" rather than a "dislike"
    OUTPUT:
    movies_liked - a list of movie_ids the user has watched and liked
    '''
    # Use min_rating rather than a hard-coded threshold; unrated cells hold -1.
    movies_liked_idx = user_item_matrix[user_id-1] >= min_rating
    movies_liked = movies_indices[movies_liked_idx]
    return movies_liked.tolist()


def movie_names(movie_ids):
    '''
    INPUT
    movie_ids - a list of movie_ids
    OUTPUT
    movies - a list of movie names associated with the movie_ids    
    '''
    # Implement your code here
    movie_lst = movies.loc[movie_ids, 'movie'].tolist()
    return movie_lst
    
    
def make_recommendations(user, num_recs=10):
    '''
    INPUT:
        user - (int) a user_id of the individual you want to make recommendations for
        num_recs - (int) number of movies to return
    OUTPUT:
        recommendations - a list of movies - if there are "num_recs" recommendations return this many
                          otherwise return the total number of recommendations available for the "user"
                          which may just be an empty list
    '''
    # Implement your code here
    nn = find_closest_neighbors(user, k=num_recs*2)
    user_liked_movies = set(movies_liked(user, min_rating=0))
    
    liked_movies = []
    for uid in nn:    
        liked_movies.extend(movies_liked(uid))

    # Keep only movies that are new to the user, dropping duplicates
    # (several neighbors may like the same movie) while preserving order.
    liked_movies = [movie for movie in dict.fromkeys(liked_movies) if movie not in user_liked_movies]
    recommendations = movie_names(liked_movies)
    return recommendations[:num_recs]


def all_recommendations(num_recs=10):
    '''
    INPUT 
        num_recs (int) the (max) number of recommendations for each user
    OUTPUT
        all_recs - a dictionary where each key is a user_id and the value is an array of recommended movie titles
    '''
    # Implement your code here
    # work on just 10 users as a POC.
    users = list(movies_to_analyze.keys())[:10]
    all_recs = {k: make_recommendations(k, num_recs) for k in users}
    return all_recs

all_recs = all_recommendations(10)

dists = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nn = [53967, 53951, 53937, 53875, 53822, 53793, 53772, 53766, 53751, 53744, 53695, 53591, 53567, 53520, 53458, 53420, 53360, 53331, 53316, 53279]
dists = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nn = [53958, 53835, 53755, 53583, 53271, 53161, 53129, 53069, 53008, 52863, 52823, 52791, 52589, 52586, 52556, 52245, 52161, 52156, 51959, 51948]
dists = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nn = [53881, 53714, 53447, 53180, 53086, 52966, 52883, 52866, 52604, 52575, 52537, 52491, 52388, 52161, 52106, 52100, 52095, 51806, 51755, 51599]
dists = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nn = [53801, 51382, 50210, 48352, 47911, 47796, 47028, 44135, 39111, 39085, 38911, 38352, 37371, 34403, 32568, 28157, 27838, 2

In [49]:
print('user 2 watched movies =', movie_names(movies_to_analyze[2]))
# find_closest_neighbors(2)
make_recommendations(2)

user 2 watched movies = ['Marie Antoinette (2006)', 'Life of Pi (2012)', 'Dallas Buyers Club (2013)', 'World War Z (2013)', 'Lone Survivor (2013)', 'Two Lovers (2008)', 'August: Osage County (2013)', 'In the Heart of the Sea (2015)', 'Straight Outta Compton (2015)', 'Deadpool (2016)', 'Disconnect (2012)', 'Gravity (2013)', 'Captain Phillips (2013)', 'The Intouchables (2011)', 'Her (2013)', 'All Is Lost (2013)', '12 Years a Slave (2013)', 'Frozen (2013)', 'The Intern (2015)', 'Mission: Impossible - Rogue Nation (2015)', 'The Longest Ride (2015)', 'Chef (2014)', 'Spy (2015)']
dists = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
nn = [53967, 53951, 53937, 53875, 53822, 53793, 53772, 53766, 53751, 53744, 53695, 53591, 53567, 53520, 53458, 53420, 53360, 53331, 53316, 53279]


['The Shawshank Redemption (1994)',
 'Good Will Hunting (1997)',
 'This Is the End (2013)',
 'Movie 43 (2013)',
 'Peeples (2013)',
 '21 & Over (2013)',
 'Safety Not Guaranteed (2012)',
 'The Internship (2013)',
 'Beasts of No Nation (2015)',
 'The Judge (2014)']

In [47]:
nn = [53967, 53951, 53937, 53875, 53822, 53793, 53772, 53766, 53751, 53744, 53695, 53591, 53567, 53520, 53458, 53420, 53360, 53331, 53316, 53279]
for n in nn:
    print(f'common movies (2, {n}) =', get_common_movies(2, n).sum())

common movies (2, 53967) = 1
common movies (2, 53951) = 1
common movies (2, 53937) = 1
common movies (2, 53875) = 1
common movies (2, 53822) = 2
common movies (2, 53793) = 3
common movies (2, 53772) = 1
common movies (2, 53766) = 1
common movies (2, 53751) = 2
common movies (2, 53744) = 1
common movies (2, 53695) = 1
common movies (2, 53591) = 1
common movies (2, 53567) = 1
common movies (2, 53520) = 1
common movies (2, 53458) = 1
common movies (2, 53420) = 1
common movies (2, 53360) = 1
common movies (2, 53331) = 1
common movies (2, 53316) = 1
common movies (2, 53279) = 1


In [55]:
print(list(all_recs.keys()))

all_recs[2]

[2, 3, 7, 8, 9, 17, 22, 24, 25, 26]


['The Shawshank Redemption (1994)',
 'Good Will Hunting (1997)',
 'This Is the End (2013)',
 'Movie 43 (2013)',
 'Peeples (2013)',
 '21 & Over (2013)',
 'Safety Not Guaranteed (2012)',
 'The Internship (2013)',
 'Beasts of No Nation (2015)',
 'The Judge (2014)']