# Item-Item Top-N Recommendations

In this exercise we will implement a simple top-N recommender, evaluate the algorithms, and then call algorithms from the Surprise package. In top-N recommendation, the algorithm is asked to produce a list of N items that the user is likely to be interested in. 
In this particular exercise we will work with an escape room dataset.

First, let's load the dataset, which is already split by time into a training set and a test set:

We can take a look at the structure of the dataset:

In [3]:
import pandas as pd
import numpy as np
train_set_path = 'resources//train_numerized_with_anon.csv'
test_set_path = 'resources//test_numerized_with_anon.csv'

train_set = pd.read_csv(train_set_path, parse_dates=[3], index_col='index')
test_set = pd.read_csv(test_set_path, parse_dates=[3], index_col='index')

users_in_train = train_set.userID.unique()
test_set = test_set[test_set.userID.isin(users_in_train)]

In [4]:
train_set.groupby('itemID').count().sort_values(by='userID', ascending=False)

Unnamed: 0_level_0,userID,rating,timestamp
itemID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
66,741,741,741
1,601,601,601
76,548,548,548
69,547,547,547
5,512,512,512
...,...,...,...
359,1,1,1
321,1,1,1
284,1,1,1
302,1,1,1


In [5]:
train_set.head()
d = train_set.groupby('itemID').count()['userID'].sort_values( ascending=False)
d.name = 'itemID'
d.head()



itemID
66    741
1     601
76    548
69    547
5     512
Name: itemID, dtype: int64

## Part 1: Recommend Most Popular Items 

Now we can begin implementing our first algorithm, which recommends to the user the list of most popular items. Although this is not a personalized approach, in many cases it is not a bad idea - popular items are popular because many users choose them, so there is a high likelihood that recommended popular items will indeed be chosen by the user.

In the code below, fill in the missing parts. The algorithm has a training method, where item popularity is computed, and a recommendation method, where the list of popular items that the user has not already chosen is returned.

In [6]:
class MostPopular:

    def __init__(self):
        self.item_ratings_sorted = None
        self.train_set = None

    def learn_model(self, train_set):
        self.train_set = train_set
        #1) Add code to set the item_ratings_sorted to the list of items in the training set, 
        #ordered by decreasing popularity (i.e., the number of users who have chosen an item)
        # group by item, sort by count
        self.item_ratings_sorted = self.train_set.groupby('itemID').count()['userID'].sort_values( ascending=False)
        self.item_ratings_sorted.name = 'itemID'
        

    def get_top_n_recommendations(self, test_set, top_n):
        result = {}
        already_ranked_items_by_user = self.train_set.groupby('userID')['itemID'].apply(list)
        
        #For each user in the test set compute recommendations
        for userID in test_set.userID.unique():
            user_items = already_ranked_items_by_user[userID]
            
            filtered_user_items_idx = np.setdiff1d(self.item_ratings_sorted.index, user_items).tolist()
            result[str(userID)] = self.item_ratings_sorted[filtered_user_items_idx].sort_values(ascending=False).head(top_n).index.to_list()
        return result

    def clone(self):
        pass


Now we can call the most popular algorithm to deliver a list of recommendations. The code below prints the list of top 5 recommended items for the user with ID 431.

In [7]:
popular = MostPopular()
popular.learn_model(train_set)
popular_recs = popular.get_top_n_recommendations(test_set,top_n=5)
print(popular_recs['431'])
assert popular_recs['431'] == [53, 26, 68, 85, 16], 'Wrong computation of popular items'

[53, 26, 68, 85, 16]


## Part 2 - Item-Item Recommendations

We now learn a slightly more sophisticated model, that uses item-item similarities. Given such a similarity score, we can recommend to a user items that are most similar to the items that the user has chosen in the past. One such useful similarity metric is the Jaccard coefficient. For two items i1 and i2, the Jaccard similarity is the number of users who have chosen both i1 and i2, divided by the number of users who have chosen either i1 or i2. That is, given the list of users who have chosen i1 and the list of users who have chosen i2, the Jaccard similarity is the intersection of the lists, divided by the union of the lists.
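As a small illustration of the definition above, here is a minimal sketch that computes the Jaccard similarity of two items from the lists of users who chose them. The function name `jaccard_similarity` and the toy user lists are hypothetical and not part of the exercise code; returning 0 for two items that nobody chose is an assumption.

```python
def jaccard_similarity(users_a, users_b):
    """Jaccard similarity between two collections of user IDs."""
    set_a, set_b = set(users_a), set(users_b)
    union = set_a | set_b
    if not union:
        # Assumption: two items that nobody has chosen get similarity 0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical toy data: the users who chose each item.
users_i1 = [1, 2, 3, 4]
users_i2 = [3, 4, 5]

similarity = jaccard_similarity(users_i1, users_i2)
print(similarity)  # 2 shared users / 5 distinct users = 0.4
```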

In practice, to expedite the recommendation process, and hence reduce online latency, we will compute the item-item co-occurrence matrix in the model learning phase. Then, online, when recommendations are requested, we only need to compute, for each item that the user has chosen in the past, the Jaccard scores for the other items.
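One possible way to obtain the co-occurrence counts for all item pairs at once is a matrix product of the binary user-item matrix with itself. The sketch below is only an illustration on a hypothetical toy matrix (the names `user_item`, `co_occurrence` and `jaccard_matrix` are assumptions), not the implementation used in this notebook.

```python
import numpy as np

# Hypothetical toy user-item matrix: rows are users, columns are items,
# and a 1 means that the user has chosen the item.
user_item = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 0, 0],
])

# co_occurrence[i, j] = number of users who chose both item i and item j.
# The diagonal holds the number of users who chose each item.
co_occurrence = user_item.T @ user_item
item_counts = np.diag(co_occurrence)

# |A or B| = |A| + |B| - |A and B|
union = item_counts[:, None] + item_counts[None, :] - co_occurrence

# Divide only where the union is non-empty; the other entries stay 0.
jaccard_matrix = np.divide(
    co_occurrence,
    union,
    out=np.zeros(co_occurrence.shape, dtype=float),
    where=union > 0,
)
np.fill_diagonal(jaccard_matrix, 0.0)  # never recommend an item for itself
print(jaccard_matrix)
```

With the co-occurrence counts and the per-item counts stored, the Jaccard score of any pair follows from simple arithmetic, which is what keeps the online step cheap.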

As the user has chosen several items in the past, we need to aggregate the Jaccard scores. That is, if the user has previously chosen i1 and i2, item i3 has two scores J(i1,i3) and J(i2,i3), and an aggregation of the scores is needed. There are two popular aggregation functions - sum and max. Empirically, max typically performs better.
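To see how the choice of aggregation can change the ranking, here is a minimal sketch on hypothetical scores. The item names, the score values and the helper `aggregate_scores` are made up for illustration only.

```python
# Hypothetical Jaccard scores J(chosen item, candidate item).
jaccard_scores = {
    ("i1", "i3"): 0.30, ("i2", "i3"): 0.30,
    ("i1", "i4"): 0.50, ("i2", "i4"): 0.05,
}


def aggregate_scores(chosen_items, candidates, scores, aggregate):
    """Combine the per-item scores of every candidate with `aggregate`."""
    return {
        candidate: aggregate(
            scores.get((chosen, candidate), 0.0) for chosen in chosen_items
        )
        for candidate in candidates
    }


chosen = ["i1", "i2"]
candidates = ["i3", "i4"]

by_sum = aggregate_scores(chosen, candidates, jaccard_scores, sum)
by_max = aggregate_scores(chosen, candidates, jaccard_scores, max)

print(max(by_sum, key=by_sum.get))  # i3: 0.30 + 0.30 beats 0.50 + 0.05
print(max(by_max, key=by_max.get))  # i4: 0.50 beats 0.30
```

Sum rewards candidates that are moderately similar to many chosen items, while max rewards a single strong match.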

Fill in the missing parts in the code below.

### Cleaning duplicates

It seems we have an issue with one of the users.<br>
This user chose the same item and gave it different ratings 27 times on the same day.

We will drop the duplicate (userID, itemID) pairs, keeping the first occurrence of each.
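A quick way to spot such repeated pairs is to count the rows per (userID, itemID) combination. The following is a minimal sketch on a hypothetical toy frame; the data and the variable names are assumptions, only the column names match the dataset.

```python
import pandas as pd

# Hypothetical toy interactions; user 7 rated item 3 three times.
ratings = pd.DataFrame({
    "userID": [7, 7, 7, 8, 9],
    "itemID": [3, 3, 3, 3, 5],
    "rating": [10, 6, 8, 9, 7],
})

# Count the rows of each (userID, itemID) pair and keep the repeated ones.
pair_counts = ratings.groupby(["userID", "itemID"]).size()
repeated_pairs = pair_counts[pair_counts > 1]
print(repeated_pairs)

# Keep only the first row of every (userID, itemID) pair.
deduplicated = ratings.drop_duplicates(subset=["userID", "itemID"], keep="first")
print(deduplicated)
```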

In [8]:
train_set_clean = train_set[['userID','itemID']].drop_duplicates()
train_set_clean['rating'] = train_set.loc[train_set_clean.index, 'rating']
train_set_clean

Unnamed: 0_level_0,userID,itemID,rating
index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
0,0,0,10
1,1,0,10
2,2,0,8
3,3,0,10
4,4,0,10
...,...,...,...
40017,18871,307,10
40018,18871,87,10
40019,1834,364,10
40020,20196,87,10


In [9]:
items_OHE = pd.get_dummies(train_set_clean.itemID)
items_OHE['userID'] = train_set_clean.userID

In [10]:
items_OHE = items_OHE.groupby(items_OHE.userID).sum().reset_index()

In [11]:
items_OHE_np = items_OHE[range(375)].to_numpy()

In [12]:
def learn_model(train_set):

    items_OHE = pd.get_dummies(train_set.itemID)
    items_OHE['userID'] = train_set.userID

    # Combine the rows of each user, leaving one row per user with the one-hot encoding of their items
    items_OHE = items_OHE.groupby(items_OHE.userID).sum().reset_index()

    # Item columns as a NumPy array (one row per user, one column per item)
    items_OHE_np = items_OHE[range(375)].to_numpy()

    # Compute the Jaccard score of every item against all other items.
    # After adding item i's column to every column, a value of 2 means the user chose both items.
    cols = []
    for i in range(375):
        col = np.expand_dims(items_OHE[i].to_numpy(), axis=1)
        comb = pd.DataFrame(items_OHE_np + col)

        # dropping the same column we did addition
        # comb.drop(columns=[i], inplace=True)
        comb = comb.apply(pd.Series.value_counts)

        # value_counts may produce no row for 2 (no user chose both items); add a row of zeros
        if not (comb.index == 2).any():
            comb.loc[2] = 0
        
        # Replace NaNs with zero so the division below works
        comb.fillna(0,inplace=True)
        cols.append((comb.loc[2] / (comb.loc[1] + comb.loc[2])))


    jacard = pd.concat(cols, axis=1)
    np.fill_diagonal(jacard.values, 0)
    display(jacard)

In [13]:
# creating One Hot Encoding for each item
items_OHE = pd.get_dummies(train_set.itemID)
items_OHE['userID'] = train_set.userID

# Combining each row for each user so we are left with unique rows for each user with OHE of the items
items_OHE = items_OHE.groupby(items_OHE.userID).sum().reset_index()

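# Item columns as a NumPy array (one row per user, one column per item)
items_OHE_np = items_OHE[range(375)].to_numpy()

# First attempt at the Jaccard table: one column of scores per item.
# Note: comb.loc[2] / comb.loc[1] divides the intersection by the number of users who chose
# exactly one of the two items, not by the union, so these values overestimate the Jaccard
# similarity. The other versions in this notebook divide by the union (the counts of 1 and 2 together).
jacard = pd.DataFrame([], columns=range(374))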
for i in range(375):

    col = np.expand_dims(items_OHE[i].to_numpy(), axis=1)
    comb = pd.DataFrame(items_OHE_np + col)
    comb = comb.apply(pd.Series.value_counts)
    jacard[i] = (comb.loc[2] / comb.loc[1])
jacard



Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,365,366,367,368,369,370,371,372,373,374
0,,0.162098,0.112727,0.124191,0.139767,0.066519,0.040558,0.058912,0.155080,0.102528,...,0.001946,0.003922,0.007782,0.003846,0.005671,0.013699,0.003929,0.001957,,
1,0.162098,,0.195707,0.204000,0.195793,0.087645,0.056604,0.073103,0.202381,0.140351,...,0.003317,0.005008,0.015126,0.006590,0.009772,0.018519,0.005017,0.001661,,
2,0.112727,0.195707,,0.162050,0.211610,0.072316,0.055263,0.054628,0.182331,0.148092,...,0.001980,0.003992,0.014028,0.003914,0.009690,0.018072,0.006024,0.004000,,
3,0.124191,0.204000,0.162050,,0.171053,0.072189,0.051105,0.058728,0.205394,0.140351,...,0.004376,0.004396,0.015453,0.004301,0.010638,0.019912,0.006637,0.002193,,
4,0.139767,0.195793,0.211610,0.171053,,0.072808,0.065789,0.064665,0.246914,0.138947,...,0.003802,0.011673,0.039841,0.011236,0.022059,0.022901,0.011719,0.003846,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
370,0.013699,0.018519,0.018072,0.019912,0.022901,0.005758,0.008523,0.012552,0.026316,0.008403,...,,0.133333,0.294118,0.130435,0.166667,,0.062500,0.062500,,
371,0.003929,0.005017,0.006024,0.006637,0.011719,0.003914,0.008824,0.008734,0.013514,0.008696,...,0.142857,0.200000,0.333333,0.066667,0.083333,0.062500,,,,
372,0.001957,0.001661,0.004000,0.002193,0.003846,,,0.004329,,0.002865,...,,,,,,0.062500,,,,
373,,,,,,,,,,,...,,,,,,,,,,


In [90]:
# from tqdm import tqdm
# import numpy as np
# import operator

# class Jaccard:

#     def __init__(self):
#         self.item_ratings_sorted = None
#         self.train_set = None
#         self.item_item_counts = dict()
#         self.item_counts = None
#         self.jacard = None

#     def learn_model(self, train_set):
#         self.train_set = train_set
#         items_OHE = pd.get_dummies(train_set.itemID)
#         items_OHE['userID'] = train_set.userID

#         # Combining each row for each user so we are left with unique rows for each user with OHE of the items
#         items_OHE = items_OHE.groupby(items_OHE.userID).sum().reset_index()

#         items_OHE_np = items_OHE[range(375)].to_numpy()

#         # calculating Jacard, and we get a table for each item and it's combination Jacard score
#         # a sum of 2 means both items have been watched by user.
#         cols = []
#         for i in range(375):
#             col = np.expand_dims(items_OHE[i].to_numpy(), axis=1)
#             comb = pd.DataFrame(items_OHE_np + col)

#             comb = comb.apply(pd.Series.value_counts)

#             # Sometimes we don't have 2 in our index, if not add row with 0
#             if ~(comb.index == 2).any():
#                 comb.loc[2] = 0
            
#             # Making sure that the calculation work properly we replace nans with zero
#             comb.fillna(0,inplace=True)
#             cols.append((comb.loc[2] / (comb.loc[1] + comb.loc[2])))


#         jacard = pd.concat(cols, axis=1)
#         # The diagonal is where each column has with it's self a similarity score of 1, we replace it with zero.
#         np.fill_diagonal(jacard.values, 0)
#         self.jacard = jacard
        
#         display(self.jacard)
#         return self.jacard
                    
            
            

#     def get_top_n_recommendations(self, test_set, top_n):
#         result = {}
#         already_ranked_items_by_users = self.train_set.groupby('userID')['itemID'].apply(list)
#         res = []
#         for userID in test_set.userID.unique():
#             result[str(userID)] = []
            
#             for itemID in already_ranked_items_by_users[userID]:
#                 if itemID in self.jacard:
#                     res.append(self.jacard[itemID].nlargest(top_n))
#                 else:
#                     res.append([])

#             res_s = pd.concat(res)
#             exclude_idx = res_s.index.isin(already_ranked_items_by_users[userID])
#             res_s = res_s[~exclude_idx]
#             res_s.drop_duplicates(inplace=True)
#             result[str(userID)] = res_s.nlargest(top_n).index
#         return result
            
    
#     def clone(self):
#         pass


### Jaccard y-data

In [167]:
from tqdm import tqdm
import numpy as np
import operator
d = []
class Jaccard:

    def __init__(self):
        self.item_ratings_sorted = None
        self.train_set = None
        self.item_item_counts = dict()
        self.item_counts = None
        self.jacard = None

    def learn_model(self, train_set):
        self.train_set = train_set
        items_OHE = pd.get_dummies(train_set.itemID)
        items_OHE['userID'] = train_set.userID

        # Combining each row for each user so we are left with unique rows for each user with OHE of the items
        items_OHE = items_OHE.groupby(items_OHE.userID).sum().reset_index()

        items_OHE_np = items_OHE[range(375)].to_numpy()

        # Compute the Jaccard score of every item against all other items.
        # After adding item i's column to every column, a value of 2 means the user chose both items.
        dfs = []
        for i in range(375):
            col = np.expand_dims(items_OHE[i].to_numpy(), axis=1)
            df_comb = pd.DataFrame(items_OHE_np + col)
            a = (df_comb == 0).sum()
            b = (df_comb == 1).sum()
            c = (df_comb == 2).sum()

            d = pd.DataFrame([], columns=[0,1,2])
            d[0] = a
            d[1] = b
            d[2] = c

            d = d.T
            
            # Making sure that the calculation work properly we replace nans with zero
            d.fillna(0,inplace=True)
            dfs.append((d.loc[2] / (d.loc[1] + d.loc[2])))

        
        jacard = pd.concat(dfs, axis=1)
        # The diagonal holds each item's similarity with itself (1); replace it with zero.
        np.fill_diagonal(jacard.values, 0)
        self.jacard = jacard
        
        display(self.jacard)
        return self.jacard
        # return dfs
                    
            
            

    def get_top_n_recommendations(self, test_set, top_n):
        result = {}
        already_ranked_items_by_users = self.train_set.groupby('userID')['itemID'].apply(list)
        res = []
        for userID in test_set.userID.unique():
            result[str(userID)] = []
            items_not_in_user = ~self.jacard.columns.isin(already_ranked_items_by_users[userID])
            items_in_user = already_ranked_items_by_users[userID]
            filtered_jacard = self.jacard.loc[items_not_in_user , items_in_user]

            result[str(userID)] = filtered_jacard.max(axis=1).sort_values().tail(top_n).index
        return result
            
    
    def clone(self):
        pass


### Running Jaccard similarity
The code below trains a Jaccard model and generates recommendations. Training will take a while, as we need to iterate over all users, and for each user go over her items in quadratic time.

In [168]:
jaccard = Jaccard()
jacard_res = jaccard.learn_model(train_set_clean)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,365,366,367,368,369,370,371,372,373,374
0,0.000000,0.139487,0.101307,0.110472,0.122628,0.062370,0.038977,0.055635,0.134259,0.092994,...,0.001942,0.003906,0.007722,0.003831,0.005639,0.013514,0.003914,0.001953,0.0,0.0
1,0.139487,0.000000,0.163675,0.169435,0.163735,0.080583,0.053571,0.068123,0.168317,0.123077,...,0.003306,0.004983,0.014901,0.006547,0.009677,0.018182,0.004992,0.001658,0.0,0.0
2,0.101307,0.163675,0.000000,0.139452,0.174652,0.067439,0.052369,0.051799,0.154213,0.128989,...,0.001976,0.003976,0.013834,0.003899,0.009597,0.017751,0.005988,0.003984,0.0,0.0
3,0.110472,0.169435,0.139452,0.000000,0.146067,0.067329,0.048620,0.055470,0.170396,0.123077,...,0.004357,0.004376,0.015217,0.004283,0.010526,0.019523,0.006593,0.002188,0.0,0.0
4,0.122628,0.163735,0.174652,0.146067,0.000000,0.067867,0.061728,0.060738,0.198020,0.121996,...,0.003788,0.011538,0.038314,0.011111,0.021583,0.022388,0.011583,0.003831,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
370,0.013514,0.018182,0.017751,0.019523,0.022388,0.005725,0.008451,0.012397,0.025641,0.008333,...,0.000000,0.117647,0.227273,0.115385,0.142857,0.000000,0.058824,0.058824,0.0,0.0
371,0.003914,0.004992,0.005988,0.006593,0.011583,0.003899,0.008746,0.008658,0.013333,0.008621,...,0.125000,0.166667,0.250000,0.062500,0.076923,0.058824,0.000000,0.000000,0.0,0.0
372,0.001953,0.001658,0.003984,0.002188,0.003831,0.000000,0.000000,0.004310,0.000000,0.002857,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.058824,0.000000,0.000000,0.0,0.0
373,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0


In [170]:
jaccard_recs = jaccard.get_top_n_recommendations(test_set,top_n=5)
jaccard_recs
# assert jaccard_recs['431'] == [296,297,298,299,300], 'Wrong computation of jaccard items'

{'431': Int64Index([297, 296, 300, 304, 303], dtype='int64'),
 '433': Int64Index([14, 44, 40, 43, 152], dtype='int64'),
 '4463': Int64Index([195, 336, 229, 68, 254], dtype='int64'),
 '12663': Int64Index([89, 75, 195, 204, 74], dtype='int64'),
 '4501': Int64Index([4, 8, 203, 54, 75], dtype='int64'),
 '4502': Int64Index([287, 371, 346, 266, 269], dtype='int64'),
 '582': Int64Index([39, 17, 38, 331, 40], dtype='int64'),
 '4516': Int64Index([38, 275, 229, 68, 331], dtype='int64'),
 '660': Int64Index([267, 287, 348, 93, 74], dtype='int64'),
 '671': Int64Index([370, 277, 4, 10, 182], dtype='int64'),
 '681': Int64Index([4, 66, 8, 3, 10], dtype='int64'),
 '697': Int64Index([18, 21, 10, 126, 8], dtype='int64'),
 '765': Int64Index([4, 66, 8, 10, 204], dtype='int64'),
 '4809': Int64Index([120, 4, 126, 2, 242], dtype='int64'),
 '12684': Int64Index([84, 274, 204, 89, 331], dtype='int64'),
 '4957': Int64Index([348, 266, 285, 346, 269], dtype='int64'),
 '4993': Int64Index([4, 8, 149, 204, 68], dtype=

As a side note: because computing the item-item counts takes a while (especially in Python), we use the progress bar from the tqdm package (https://pypi.org/project/tqdm/). You need to install tqdm or remove the progress bar, which is of course not needed for the algorithm to run.

## Part 3 - Comparing the Algorithms 

We now want to compare the recommendation lists to see which one is better. In top-N recommendation it is common to compute the Precision@N metric - the portion of recommended items that were chosen by users in the test set. This is typically a reasonable metric for real systems, where one wants to optimize the number of recommended items that are chosen.

We compute Precision@N by dividing the number of recommendations chosen by the users by the overall number of recommendations.
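As a worked example of this ratio, here is a minimal sketch on hypothetical toy data. The function `precision_at_n` and the dictionaries below are illustrative assumptions and use plain Python containers rather than the DataFrames of this notebook; returning 0 when nothing was recommended is also an assumption.

```python
def precision_at_n(recommendations, chosen_items):
    """Fraction of all recommended items that the users actually chose.

    recommendations: dict mapping user -> list of recommended items
    chosen_items:    dict mapping user -> items the user chose in the test set
    """
    hits = 0
    total_recommended = 0
    for user, recommended in recommendations.items():
        chosen = set(chosen_items.get(user, []))
        hits += sum(1 for item in recommended if item in chosen)
        total_recommended += len(recommended)
    # Assumption: precision is 0 when nothing was recommended at all.
    return hits / total_recommended if total_recommended else 0.0


# Hypothetical toy data with N = 3.
toy_recommendations = {"u1": [10, 11, 12], "u2": [10, 13, 14]}
toy_chosen = {"u1": [11, 99], "u2": [20]}

precision = precision_at_n(toy_recommendations, toy_chosen)
print(precision)  # 1 hit out of 6 recommendations
```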

Fill in the missing parts in the code below:

In [162]:
def compute_precision(test_set, recommendations):
    #hits is the number of items that were recommended and chosen
    hits = 0
    #recs is the total number of recommended items
    recs = 0
    
    for u in test_set.userID.unique():
        user_itemIDS = test_set[test_set.userID == u].itemID
        userRecs = recommendations.get(str(u))
        #5) Compute here the number of hits. Update hits and recs accordingly.
        # np.isin replaces np.in1d, which is deprecated in recent NumPy versions
        hits += np.isin(userRecs,user_itemIDS).sum()
        recs += len(userRecs)

        
    return hits / recs
        
    

In [163]:
p1 = compute_precision(test_set, jaccard_recs)
p2 = compute_precision(test_set, popular_recs)
print("Jaccard=", p1, "  Popularity=", p2)

Jaccard= 0.03612334801762115   Popularity= 0.027312775330396475


The precision values for this dataset may seem pretty low, but this is typical for many top-N problems. It is important not to compute metrics that hide the true values, such as AUC, but to acknowledge the performance of the system in the application.

## Part 4 - Calling Algorithms from the Surprise Package

There are many existing recommendation algorithms available. We will now see how we can call algorithms from the Surprise package. 

The code below adds a wrapper around the algorithm to transform the resulting recommendations into our desired format.

#### NOTE: 
To run the code below you first have to install _surprise_ (http://surpriselib.com/). surprise requires scipy >=1.0, so update if needed.

To install: pip install scikit-surprise or, if you're using anaconda: conda install -c conda-forge scikit-surprise.

In [None]:
import sys, string, os
import pandas as pd
import itertools
from tqdm import tqdm
import numpy as np
import operator
from surprise import Reader
from surprise import Dataset
from surprise.model_selection import PredefinedKFold
from surprise.prediction_algorithms import *



class SurpriseRecMethod():

    #method will be the specific Surprise algorithm that we will call
    def __init__(self, method):
        self.method = method

    def fit(self, train_set):
        self.train_set = train_set


    def get_top_n_recommendations(self, test_set, top_n):
        self.test_set = test_set

        #Surprise requires a slightly different input data format, so we use two different CSVs
        test_path_tmp = "resources//test_file.csv"
        train_path_tmp = "resources//train_file.csv"

        self.train_set.to_csv(train_path_tmp, index=False, header=False)
        self.test_set.to_csv(test_path_tmp, index=False, header=False)

        fold_files = [(train_path_tmp, test_path_tmp)]
        reader = Reader(rating_scale=(1, 10), line_format='user item rating', sep=',')
        data = Dataset.load_from_folds(fold_files, reader=reader)

        for trainset, testset in PredefinedKFold().split(data):
            self.method.fit(trainset)

        already_ranked_items_by_users = self.train_set.groupby('userID')['itemID'].apply(list)

        recommendations = {}
        pbar = tqdm(total=len(self.test_set.userID.unique()))
        for userID in self.test_set.userID.unique():
            pbar.update(1)

            if userID not in self.train_set.userID.unique():
                recommendations[str(userID)] = []
                continue

            items_expected_ranking = {}
            for itemID in self.train_set.itemID.unique():
                if itemID in already_ranked_items_by_users[userID]:
                    continue
                #We call here the specific Surprise method that we use for this model
                #The method predicts a score for a given item
                predicted = self.method.predict(str(userID), str(itemID), clip=False)
                items_expected_ranking[itemID] = predicted[3]
                
            #Now we just sort by decreasing scores and take the top N
            sorted_predictions = sorted(items_expected_ranking.items(), key=operator.itemgetter(1))
            sorted_predictions.reverse()
            sorted_predictions = [x[0] for x in sorted_predictions]
            user_recommendations = sorted_predictions[:top_n]
            recommendations[str(userID)] = user_recommendations
        pbar.close()
        return recommendations


The code below calls the package with the SlopeOne algorithm.

In [None]:
modelSlopeOne = SurpriseRecMethod(SlopeOne())
modelSlopeOne.fit(train_set)
recSlopeOne = modelSlopeOne.get_top_n_recommendations(test_set, 5)
p3 = compute_precision(test_set,recSlopeOne)

The code below calls the package with a nearest neighbor user-item recommendation method.

In [None]:
modelKNNUser = SurpriseRecMethod(KNNBasic(sim_options={'name': 'cosine', 'user_based': True}))
modelKNNUser.fit(train_set)
recKNNUser = modelKNNUser.get_top_n_recommendations(test_set, 5)
p4 = compute_precision(test_set,recKNNUser)

Let us look at the results of all algorithms together:

In [None]:
pd.DataFrame.from_dict({'Jaccard':p1,'Popularity':p2,'SlopeOne':p3,'User KNN':p4}, orient='index',columns=['Precision'])

Try the NMF (non-negative matrix factorization) algorithms from the package - https://surprise.readthedocs.io/en/stable/matrix_factorization.html#surprise.prediction_algorithms.matrix_factorization.NMF.


In [None]:
#Your code here

For this particular dataset, the user nearest neighbor approach worked best. Hence, should we need to choose a method to put online, we should go with this method.