# Recommender Systems Using Collaborative Filtering

> *Recommender Systems*  
> *MSc in Data Science, Department of Informatics*  
> *Athens University of Economics and Business*

---

Find a ***rating-based*** or ***matching-based*** dataset that can be used to inform a recommender system based on ***collaborative filtering***.

Build a Python notebook that:

- Loads the dataset
- Tries at least 2 different recommendation methods based on collaborative filtering (e.g., Count-based, Matrix Factorization, TensorFlow)
- Uses quantitative metrics to evaluate the recommendations of each of the methods that you selected

## Introduction

### *Libraries*

In [1]:
import pandas as pd
import numpy as np
import collections
from itertools import combinations
from collections import defaultdict
import random
import os
import warnings
warnings.filterwarnings("ignore")

### *Data*

- The dataset that will be used can be found in the following link: https://www.kaggle.com/datasets/tamber/steam-video-games
- The data are taken from Steam, which is a gaming platform.
- It is a list of user behaviors, with columns: user-id, game-title, behavior-name, value. 
- The behaviors included are 'purchase' and 'play'. 
- The value indicates the degree to which the behavior was performed - in the case of 'purchase' the value is always 1, and in the case of 'play' the value represents the number of hours the user has played the game.

##### *Read ratings data*

In [2]:
def load_games(path: str):
    
    games_df = pd.read_csv(path, header = None, usecols = [0,1,2,3] ,
                       names = ['userId','Game','Behavior', 'Value'])
    
    # Drop exact duplicate rows, keeping the first occurrence of each
    games_df = games_df.drop_duplicates()
        
    return games_df

In [3]:
games_df = load_games(os.path.join(os.getcwd(), 'steam-200k.csv'))
games_df.head()

| | userId | Game | Behavior | Value |
|---|---|---|---|---|
| 0 | 151603712 | The Elder Scrolls V Skyrim | purchase | 1.0 |
| 1 | 151603712 | The Elder Scrolls V Skyrim | play | 273.0 |
| 2 | 151603712 | Fallout 4 | purchase | 1.0 |
| 3 | 151603712 | Fallout 4 | play | 87.0 |
| 4 | 151603712 | Spore | purchase | 1.0 |


## Recommendations Using User-Based Technique

In [4]:
#distinct users set
distinct_users=set(games_df['userId'])

#distinct games
distinct_games = set(games_df['Game'].unique())

#creating unique games dictionary
games_dict = dict(zip(distinct_games,[i for i in range(len(distinct_games))]))

#replacing games with their id instead of their title
games_df['Game'] = games_df['Game'].map(games_dict)

#reversing keys and values of the games dictionary
games_dict = dict((v,k) for k,v in games_dict.items())

##### *Function to get the ratings of each user*

In [5]:
def get_user_ratings(distinct_users: set
                    , games_df: pd.core.frame.DataFrame):

    '''
    Loads all games purchased and/or played by each user

    Each game is mapped to one of the following two ratings:
        
        P-P -> if the game was purchased and played
    
        P-NP -> if the game was purchased and not played
        
    '''
    
    user_ratings = {}

    for user in distinct_users:

        purchased = games_df[(games_df['userId'] == user) & (games_df['Behavior'] == 'purchase')]['Game'].values.tolist()

        played = games_df[(games_df['userId'] == user) & (games_df['Behavior'] == 'play')]['Game'].values.tolist()

        user_ratings[user] = dict(zip([game for game in purchased],
                                      ['P-P' if game in played else 'P-NP' for game in purchased]))
        
    return user_ratings

In [6]:
# execute
user_ratings = get_user_ratings(distinct_users,games_df)
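
The labelling above filters the full dataframe twice per user, which can be slow on large datasets. A minimal sketch of a single-pass alternative follows, assuming a frame with the same `userId`, `Game`, and `Behavior` columns. The helper name `label_user_games` and the toy data are hypothetical, and unlike `get_user_ratings` this version omits users who have no purchase rows.

```python
import pandas as pd

# Toy frame with the same columns as games_df (all values are made up).
toy_df = pd.DataFrame(
    {
        "userId": [1, 1, 1, 2, 2],
        "Game": ["A", "A", "B", "A", "A"],
        "Behavior": ["purchase", "play", "purchase", "purchase", "play"],
        "Value": [1.0, 12.5, 1.0, 1.0, 3.0],
    }
)

def label_user_games(df: pd.DataFrame) -> dict:
    """Map each user to {game: 'P-P' | 'P-NP'} with one pass over the purchases."""
    purchased = df.loc[df["Behavior"] == "purchase", ["userId", "Game"]]
    # Every (user, game) pair that has at least one 'play' row.
    played_pairs = set(
        df.loc[df["Behavior"] == "play", ["userId", "Game"]].itertuples(index=False, name=None)
    )
    ratings: dict = {}
    for user, game in purchased.itertuples(index=False, name=None):
        label = "P-P" if (user, game) in played_pairs else "P-NP"
        ratings.setdefault(user, {})[game] = label
    return ratings

toy_ratings = label_user_games(toy_df)
print(toy_ratings)
```

User 1 purchased and played game A but only purchased game B, so the two games receive different labels.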

##### *Function to get user neighbors based on the Jaccard Similarity*

In [7]:
def get_user_neighbors(user_ratings:dict, # games purchased/played by each user
                       min_rating_num:int=5 # at least this many games purchased/played are required for a comparison
                      ):
    
    '''
    Compute rating-based similarity between every pair of users 
    
    '''
    
    #get all possible pairs of users
    pairs=list(combinations(list(user_ratings.keys()),2))
    
    usim=defaultdict(dict) # initialize the sim dictionary
    
    for u1,u2 in pairs: # for every user pair 
   
        #get a set with all the discretized values (game, purchased/played values) for u1 and u2
        s1=set([(mid,pol) for mid,pol in user_ratings[u1].items()])
        s2=set([(mid,pol) for mid,pol in user_ratings[u2].items()])

        # check if both users respect the lower bound
        if len(s1)<min_rating_num or len(s2)<min_rating_num: continue
      
        # get the union and intersection for these two users
        union=s1.union(s2)
        inter=s1.intersection(s2)
    
        # compute user sim via the jaccard coeff
        jacc=len(inter)/len(union)

        # remember the sim values
        usim[u1][u2]=jacc
        usim[u2][u1]=jacc
        
    # attach each user to its neighbors, sorted by sim in descending order 
    return {user:sorted(usim[user].items(),key=lambda x:x[1], reverse=True) for user in usim}

In [8]:
# execute
neighbors_u=get_user_neighbors(user_ratings)
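
To make the similarity concrete, here is a small worked example of the Jaccard coefficient over `(game, label)` pairs, which is the quantity `get_user_neighbors` computes for each user pair. The helper `jaccard` and the two toy users are hypothetical and only illustrate the arithmetic.

```python
def jaccard(ratings_a: dict, ratings_b: dict) -> float:
    """Jaccard similarity over (game, label) pairs of two users."""
    s1 = set(ratings_a.items())
    s2 = set(ratings_b.items())
    union = s1 | s2
    # Guard against two empty profiles.
    return len(s1 & s2) / len(union) if union else 0.0

# Hypothetical users; game ids and labels are invented for illustration.
alice = {10: "P-P", 11: "P-P", 12: "P-NP"}
bob = {10: "P-P", 11: "P-NP", 13: "P-P"}

sim = jaccard(alice, bob)
print(sim)  # 1 shared pair out of 5 distinct pairs -> 0.2
```

Both users own game 11, but with different labels, so it does not count towards the intersection. Only game 10 is shared with the same label.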

##### *Function to get precision and recall*

In [9]:
def calculate_precision_recall(actual_ratings, recommendations):
    true_positives = len(set(actual_ratings).intersection(recommendations))
    precision = true_positives / len(recommendations) if len(recommendations) > 0 else 0
    recall = true_positives / len(actual_ratings) if len(actual_ratings) > 0 else 0
    return precision, recall

##### *Recommendations*

The function provides user-based recommendations for a given user by following these steps:

  - Identifies the most similar users to the specified user.
  - Iterates through all games purchased or played by the neighbors.
  - Assigns a score of +2 to each game if a neighbor purchased and played it, and +1 if the neighbor only purchased it.
  - Scales the votes based on user similarity.
  - Sorts the games in descending order based on their scores.
  - Iterates through the sorted list of games. If the user has already rated a game, stores its rating; otherwise, prints the game.
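
The voting and filtering steps above can be sketched on toy data as follows. The helper `weighted_votes`, the neighbor names, and the similarity values are hypothetical; the +2 / +1 weights are the ones described in the list.

```python
from collections import defaultdict

def weighted_votes(top_k, user_ratings):
    """Sum similarity-scaled votes per game: 2 for 'P-P', 1 for 'P-NP'."""
    votes = defaultdict(float)
    for neighbor, sim in top_k:
        for game, label in user_ratings[neighbor].items():
            votes[game] += (2 if label == "P-P" else 1) * sim
    # Highest score first.
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)

# Invented neighbors with their labelled games and similarity to the target user.
toy_ratings = {
    "n1": {"A": "P-P", "B": "P-NP"},
    "n2": {"A": "P-NP", "C": "P-P"},
}
toy_neighbors = [("n1", 0.5), ("n2", 0.25)]

ranked = weighted_votes(toy_neighbors, toy_ratings)

# Skip games the target user already owns and keep the top 2 of the rest.
already_owned = {"A"}
recs = [game for game, _ in ranked if game not in already_owned][:2]
print(ranked[0], recs)
```

Game A collects 2 × 0.5 from the first neighbor and 1 × 0.25 from the second, so it ranks first with a score of 1.25, but it is skipped because the user already owns it.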
  
The evaluation metrics used are the following:
- **Number of Games Before First Valid Recommendation:**
  - Represents the count of games iterated before the first valid recommendation.
  - Indicates how many games were considered before identifying a game that the user has not yet rated.

- **Number of Games per Valid Recommendation:**
  - Signifies the average number of games considered for each valid recommendation.
  - Provides insight into the efficiency of the recommendation process.

- **Recommendations Accuracy:**
  - Presents the accuracy of the recommendations as a percentage.
  - Calculated by dividing the number of already rated games encountered in the ranked list by the total number of games the user has rated.
  - Reflects the proportion of the user's own games that the neighbors' votes ranked highly.

- **Precision:**
  - Precision is the ratio of relevant recommendations to the total recommendations made.
  - In this context, it measures how many of the recommended games were actually liked by the user.

- **Recall:**
  - Recall is the ratio of relevant recommendations to the total number of relevant items.
  - In this context, it measures how many of the user's liked games were successfully recommended.
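
As a worked example of the first three metrics, the sketch below walks a ranked list the way the recommendation function does. The game ids are hypothetical, chosen so the counts match the single-user output shown further down (4 rated games seen first, 14 games walked for 10 recommendations, 6 rated games in total). Two details are assumptions rather than the notebook's exact code: a `None` sentinel marks "no recommendation yet", and, following the notebook's code, accuracy divides by the number of games the user has rated.

```python
def walk_ranked_list(ranked, rated, rec_num):
    """Walk a ranked list, skipping rated games, until rec_num games are recommended."""
    seen_rated, recs, total = [], [], 0
    before_first = None  # rated games seen before the first valid recommendation
    for game in ranked:
        total += 1
        if game in rated:
            seen_rated.append(game)
            continue
        if before_first is None:
            before_first = len(seen_rated)
        recs.append(game)
        if len(recs) == rec_num:
            break
    return before_first, total, seen_rated, recs

# Hypothetical ids: the user has rated games 1-6, and games 1-4 top the ranking.
rated = {1, 2, 3, 4, 5, 6}
ranked = [1, 2, 3, 4] + list(range(100, 110))

before_first, total, seen_rated, recs = walk_ranked_list(ranked, rated, rec_num=10)
games_per_rec = total / 10                # 14 / 10 = 1.4
accuracy = len(seen_rated) / len(rated)   # 4 / 6
print(before_first, games_per_rec, f"{accuracy:.2%}")
```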

In [10]:
def recommend_evaluate_ub(user:int, 
                 games_dict:dict, # games dict  
                 neighbors_u:dict, # neighbors dict
                 user_ratings:dict, # ratings submitted per user 
                 neighbor_num:int, # number of neighbors to consider
                 rec_num:int,# number of games to recommend
                 show_rec: bool = False, #determine whether to show recommendations or not
                 show_eval: bool = False #determine whether to show evaluation metrics
                ):
    
    if user not in neighbors_u or len(neighbors_u[user]) < neighbor_num:
        if show_rec:
            print("Not enough neighbors to provide recommendations.")
        return 0, 0, 0, {} # same shape as the normal return value
    
    top_k=neighbors_u[user][:neighbor_num] # get the top k neighbors of this user
    
    votes=defaultdict(int) # count the values per game
    
    for neighbor,sim_val in top_k: # for each neighbor 

        for mid,pol in user_ratings[neighbor].items(): # for each game purchased/played by this neighbor

            if pol=='P-P': 
                votes[mid]+=2*sim_val
            else: 
                votes[mid]+=1*sim_val

    # sort the games in descending order 
    srt=sorted(votes.items(),key=lambda x:x[1], reverse=True)

    if show_rec: 
        print('\nI suggest the following games because they have received positive ratings from users who tend to like what you like:\n')
        print('='*100)
          
    cnt=0 # count number of recommendations made 
    total = 0 # total number of games before reaching k recommendations
    first_rating_index = 0 # number of recommendations before first unrated game
    
    already_rated={}
    
    for gm, score in srt: # for each game 
    
        total += 1
        
        title=games_dict[gm] # get the title 
        
        rat=user_ratings[user].get(gm,None) # check if the user has already purchased/played the game 
        
        if rat: # game already rated 
            already_rated[title]=rat # store the value
            continue
     
        cnt+=1 # one more recommendation
        if first_rating_index == 0 : first_rating_index = len(already_rated)
        
        if show_rec: print('\n',gm, title) # print 
    
        if cnt==rec_num:break # stop once you 've made enough recommendations
    
    #if show_rec: print('\n',already_rated)
        
    if show_eval:
        precision, recall = calculate_precision_recall(list(user_ratings[user].values()), list(already_rated.values()))
        print("\n")
        print("Evaluation Metrics for Current Recommendation:")
        print('='*100)
        #print("\n")
        print('Number of Games Before First Valid Recommendation: ', first_rating_index)
        print('Number of Games per Valid Recommendation: ', round(total / rec_num,2))
        print('Recommendations Accuracy: ', "{:0.2%}".format(len(already_rated) / len(user_ratings[user]),2))
        print('Precision: ',"{:0.2%}".format(precision))
        print('Recall: ',"{:0.2%}".format(recall))
        
    return first_rating_index, total, len(already_rated), already_rated

In [11]:
recommend_evaluate_ub(91687359, games_dict, neighbors_u, user_ratings, 15, 10, True, True)


I suggest the following games because they have received positive ratings from users who tend to like what you like:


 1375 Dota 2

 4863 Deathmatch Classic

 3821 Day of Defeat

 1005 Counter-Strike Source

 1968 Ricochet

 2052 War Thunder

 3877 No More Room in Hell

 1860 Counter-Strike Nexon Zombies

 1208 Call of Duty Modern Warfare 2 - Multiplayer

 384 Call of Duty Modern Warfare 2


Evaluation Metrics for Current Recommendation:
Number of Games Before First Valid Recommendation:  4
Number of Games per Valid Recommendation:  1.4
Recommendations Accuracy:  66.67%
Precision:  50.00%
Recall:  33.33%


(4,
 14,
 4,
 {'Counter-Strike': 'P-P',
  'Counter-Strike Global Offensive': 'P-P',
  'Counter-Strike Condition Zero Deleted Scenes': 'P-P',
  'Counter-Strike Condition Zero': 'P-NP'})

##### *Function to get the evaluation metrics on a subset of users*

In [12]:
def evaluate_ub_recs(user_subset: list, #list of users that will participate in the evaluation
                    games_dict:dict, # game info  
                    neighbors_u:dict, # neighbors dict
                    user_ratings:dict, # values submitted per user 
                    neighbor_num:int, # number of neighbors to consider
                    rec_num:int,# number of games to recommend
                    show_eval: bool = True
                    ):
    
    indexes = [] #list that holds the count before the first recommendation
    total_recs = [] #list that holds the number of total recommendations / number of games to recommend
    rated = [] #list that holds the already rated / purchase and/or played games
    precision_list = []  # List that holds precision values
    recall_list = []  # List that holds recall values
    
    for user in user_subset:
        
        if user in neighbors_u:
            f, c, s, already_rated =  recommend_evaluate_ub(user, games_dict, neighbors_u, user_ratings, neighbor_num, rec_num, False, False)
            indexes.append(f)
            total_recs.append(c / rec_num)
            rated.append(round(s / len(user_ratings[user]), 2))
            
            # Calculate precision and recall for each user
            precision, recall = calculate_precision_recall(list(user_ratings[user].values()), list(already_rated.values()))
            precision_list.append(precision)
            recall_list.append(recall)

    if show_eval:
        print('Average Number of Games Before First Valid Recommendation: ', round(sum(indexes) / len(indexes),2))
        print('Average Number of Games per Valid Recommendation: ', round(sum(total_recs) / len(total_recs),2))
        print('Recommendations Accuracy: ', "{:0.2%}".format(sum(rated) / len(rated),2))
        print('Average Precision: ', "{:0.2%}".format(sum(precision_list) / len(precision_list), 2))
        print('Average Recall: ', "{:0.2%}".format(sum(recall_list) / len(recall_list), 2))
        
    return indexes,total_recs,rated

In [13]:
# number of users to select (15% of all distinct users)
num_items_to_select = int(len(distinct_users) * 0.15)

# select a random subset of users (random.sample requires a sequence, not a set)
random_subset = random.sample(list(distinct_users), num_items_to_select)

ind, cnt, pr = evaluate_ub_recs(random_subset, games_dict, neighbors_u, user_ratings, 15, 10)

Average Number of Games Before First Valid Recommendation:  5.15
Average Number of Games per Valid Recommendation:  2.02
Recommendations Accuracy:  57.75%
Average Precision:  25.25%
Average Recall:  15.10%


## Recommendations Using Matrix Factorization

### *Libraries*

In [14]:
import surprise
from surprise import Reader, Dataset, SVD
from surprise.model_selection.validation import cross_validate
from sklearn.metrics import mean_squared_error,mean_absolute_error

##### *Scaling Function*

In [15]:
def scale_value(value, in_min, in_max, out_min, out_max):
    """
    Scales a value from one range to another.

    Parameters:
    value (float): the value to be scaled.
    in_min (float): the minimum value of the input range.
    in_max (float): the maximum value of the input range.
    out_min (float): the minimum value of the output range.
    out_max (float): the maximum value of the output range.

    Returns:
    float: the scaled value.
    """
    if in_min == in_max and out_min <= value <= out_max:
        return value
    elif in_min == in_max and value >= out_max:
        in_max += 0.1
        return (value - in_min) * (out_max - out_min) / (in_max - in_min) + out_min
    else:
        return (value - in_min) * (out_max - out_min) / (in_max - in_min) + out_min


### *Data Preprocessing*

Given the original dataset this function does the following:

- Creates a copy of the original dataset and keeps only the games that were purchased by each user
- Merges the dataset with another that contains the corresponding playing time
- Updates the value column, which is now the sum of the purchase value (1) and the playing time
- Scales that value, per user, to the range [1, 5] so that it can be used by the Surprise library
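
The per-user scaling step is a min-max rescaling, $s = 1 + 4\,(v - v_{\min})/(v_{\max} - v_{\min})$, where the minimum and maximum are taken over one user's games. A minimal vectorized sketch with `groupby(...).transform` follows, as an alternative to looping over users. The helper `scale_per_user` and the toy frame are hypothetical. One deliberate difference is an assumption: users whose values are all equal are placed at the lower bound, whereas `scale_value` handles that case with its own rules.

```python
import pandas as pd

# Toy data: user 1 has three games, user 2 has a single game.
toy = pd.DataFrame(
    {
        "userId": [1, 1, 1, 2],
        "Game": [10, 11, 12, 10],
        "Value": [274.0, 88.0, 1.0, 5.0],
    }
)

def scale_per_user(df: pd.DataFrame, out_min: float = 1.0, out_max: float = 5.0) -> pd.Series:
    """Min-max scale 'Value' into [out_min, out_max] separately for each user."""
    grouped = df.groupby("userId")["Value"]
    mn = grouped.transform("min")
    mx = grouped.transform("max")
    span = mx - mn
    # Zero spans become NaN so the division does not produce inf.
    scaled = out_min + (df["Value"] - mn) * (out_max - out_min) / span.where(span > 0)
    # Assumption: users with a single distinct value are mapped to out_min.
    return scaled.fillna(out_min)

toy["Scaled_Value"] = scale_per_user(toy)
print(toy)
```

For user 1, a value of 88 maps to 1 + 4 × 87 / 273 ≈ 2.2747.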

In [16]:
def preprocess_data(games_df: pd.core.frame.DataFrame,  # games dataframe
                    distinct_users: set  # distinct users
                    ):
    
    # Select only purchased games
    games_df_mt = games_df[games_df['Behavior'] == 'purchase'].copy()
    
    # calculate the total playing time of each game per user and then merge with the dataframe above
    games_df_mt = games_df_mt.merge(
        games_df[games_df['Behavior'] == 'play'].groupby(['userId', 'Game'], as_index=False)['Value'].sum(),
        how='left', left_on=['userId', 'Game'], right_on=['userId', 'Game']
    )

    # Fill NaN values with 0 (assign back instead of using a chained in-place fill)
    games_df_mt['Value_y'] = games_df_mt['Value_y'].fillna(0)

    # Create a new 'Value' column by adding playing and purchase
    games_df_mt['Value'] = games_df_mt['Value_x'] + games_df_mt['Value_y']

    # Drop unnecessary columns
    games_df_mt.drop(['Value_x', 'Value_y', 'Behavior'], axis=1, inplace=True)

    # Initialize 'Scaled_Value' with 'Value'
    games_df_mt['Scaled_Value'] = games_df_mt['Value']

    # Scale 'Scaled_Value' for each user within the range [1, 5]
    for user in distinct_users:
        mn = games_df_mt[games_df_mt['userId'] == user].Value.min()
        mx = games_df_mt[games_df_mt['userId'] == user].Value.max()

        games_df_mt.loc[games_df_mt['userId'] == user, 'Scaled_Value'] = games_df_mt[
            games_df_mt['userId'] == user].Scaled_Value.apply(
            lambda x: scale_value(x, mn, mx, 1, 5))

    return games_df_mt

In [17]:
#execute
games_df_mt = preprocess_data(games_df,distinct_users)
games_df_mt.head()

| | userId | Game | Value | Scaled_Value |
|---|---|---|---|---|
| 0 | 151603712 | 3571 | 274.0 | 5.0 |
| 1 | 151603712 | 3031 | 88.0 | 2.274725 |
| 2 | 151603712 | 3980 | 15.9 | 1.218315 |
| 3 | 151603712 | 1839 | 13.1 | 1.177289 |
| 4 | 151603712 | 3505 | 9.9 | 1.130403 |


##### *Recommendations Model Definition*

- Given the dataset with the scaled playing time, this function creates the model with which we are going to make our recommendations.

In [18]:
def define_model(df: pd.core.frame.DataFrame,
                 lower_bound: float, #lower bound of the Surprise rating scale
                 upper_bound: float,  #upper bound of the Surprise rating scale
                 factors_num: int    #number of factors to keep in SVD
                ):
    
    # Load Reader library
    reader = Reader(rating_scale=(lower_bound, upper_bound))

    # Load games dataset with Dataset library
    data = Dataset.load_from_df(df[['userId', 'Game', 'Scaled_Value']], reader)
    
    # Use the SVD algorithm.
    svd = SVD(n_factors = factors_num)

    # Compute the RMSE of the SVD algorithm.
    cross_validate(svd, data, measures=['RMSE'],cv=10,verbose=True)
    
    trainset = data.build_full_trainset()

    svd.fit(trainset)# fit the svd
    
    return svd

In [19]:
# execute
svd = define_model(games_df_mt,1,5,60)

Evaluating RMSE of algorithm SVD on 10 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Fold 6  Fold 7  Fold 8  Fold 9  Fold 10 Mean    Std     
RMSE (testset)    0.8412  0.8650  0.8439  0.8672  0.8614  0.8622  0.8477  0.8699  0.8570  0.8492  0.8565  0.0097  
Fit time          0.93    0.97    0.94    0.98    0.97    0.99    1.00    0.96    1.00    1.04    0.98    0.03    
Test time         0.37    0.06    0.06    0.06    0.06    0.33    0.06    0.07    0.06    0.05    0.12    0.12    


##### *Make Recommendations*

The recommendations function follows the steps below:

- Extract historical ratings for the user.
- Use the collaborative filtering model to predict ratings for all games.
- Identify games that the user has not rated.
- Recommend the top-rated unrated games up to the specified number.
- Compile the recommendations into a DataFrame with Game IDs, Titles, and Predicted Ratings.
- Sort the DataFrame by predicted ratings in descending order.
- Return the sorted DataFrame as the final set of recommendations.
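
The selection steps above can be sketched without the full model. In this hypothetical example, `fake_scores.get` stands in for `model.predict(uid, iid).est`, and the helper `top_n_unrated` is an illustration, not part of the notebook. It uses `heapq.nlargest` so that only the unrated candidates are ranked.

```python
import heapq

def top_n_unrated(predict, all_games, already_rated, n):
    """Return the n highest-scoring games that the user has not rated."""
    candidates = ((predict(game), game) for game in all_games if game not in already_rated)
    best = heapq.nlargest(n, candidates)  # sorted from highest to lowest score
    return [(game, score) for score, game in best]

# Stand-in for model predictions; the scores are invented.
fake_scores = {1: 3.1, 2: 2.7, 3: 4.9, 4: 2.4, 5: 2.9}

recs = top_n_unrated(fake_scores.get, fake_scores.keys(), already_rated={3}, n=3)
print(recs)  # [(1, 3.1), (5, 2.9), (2, 2.7)]
```

Game 3 has the highest predicted score but is excluded because the user has already rated it.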

In [20]:
def recommend_surprise(uid:int,
              ratings_df:pd.core.frame.DataFrame,
              model,
              games_dict:dict,
              rec_num:int
             ):

    #get all the values by this user
    my_ratings=ratings_df[ratings_df.userId==uid]

    #zip the values into a dict
    already_rated=dict(zip(my_ratings.Game,my_ratings.Scaled_Value))

    pred_dict={}# store predicted values

    for game in games_dict: # for every game 

        pred_dict[game]=model.predict(uid = uid,iid = game).est# get the pred for this user
        
    # sort the games by predicted values
    srt=sorted(pred_dict.items(),key=lambda x:x[1],reverse=True)
    
    rec_set=set()# set of games to be recommended

    total = 0 # total number of games before reaching k recommendations
    first_rating_index = 0 # number of recommendations before first non purchased/played game
    
    for mid,pred in srt:
        
        total += 1
        
        if mid not in already_rated: # game has not already been purchased/played
            
            if first_rating_index == 0 : first_rating_index = total
            
            rec_set.add(mid) # add to the set
            
            if len(rec_set)==rec_num:break 
       
    # make a data frame with only the recommended games 
    
    rec_df = pd.DataFrame({'GameId': [game for game in games_dict.keys() if game in rec_set],
              'Title': [games_dict[game] for game in games_dict.keys() if game in rec_set]})
    
    #add the predicted rating as a new column
    rec_df['predicted_rating'] = rec_df['GameId'].map(pred_dict)
    
    #sort the df by the new column
    rec_df=rec_df.sort_values(['predicted_rating'], ascending=False)
    
    return rec_df

In [24]:
recommend_surprise(109205612,
              games_df_mt,
              svd,
              games_dict,
              10
             )

| | GameId | Title | predicted_rating |
|---|---|---|---|
| 0 | 238 | Football Manager 2012 | 3.105558 |
| 2 | 1208 | Call of Duty Modern Warfare 2 - Multiplayer | 2.77049 |
| 7 | 4905 | Total War ROME II - Emperor Edition | 2.412587 |
| 8 | 4969 | Football Manager 2013 | 2.386558 |
| 3 | 1318 | F1 2012 | 2.361092 |
| 1 | 974 | Football Manager 2014 | 2.343981 |
| 5 | 2788 | Counter-Strike Global Offensive | 2.317203 |
| 9 | 5128 | Football Manager 2011 | 2.306468 |
| 6 | 4133 | Football Manager 2010 | 2.186282 |
| 4 | 1884 | Football Manager 2015 | 2.093617 |


##### *Evaluate Recommendations*

The following function evaluates how well the recommendation model predicts the scaled playing time of the games a user already owns. The evaluation measures used are:

- **Root Mean Squared Error (RMSE):**
  - **Definition:** The square root of the average of the squared differences between actual and predicted values.
  - **Interpretation:** Measures the average magnitude of the errors, giving more weight to larger errors.

- **Mean Squared Error (MSE):**
  - **Definition:** The average of the squared differences between actual and predicted values.
  - **Interpretation:** Similar to RMSE but without the square root, providing a measure of the average squared error.

- **Mean Absolute Error (MAE):**
  - **Definition:** The average of the absolute differences between actual and predicted values.
  - **Interpretation:** Measures the average magnitude of the errors without considering their direction.

- **Explanation:**
  - The function calculates predicted playing times for games using a collaborative filtering model.
  - It then compares the predicted playing times with the actual playing times for games the user has already played.
  - RMSE, MSE, and MAE are computed to quantify the accuracy and performance of the recommendation model.
  - The results are printed, and the function returns the computed MAE, RMSE, and MSE values.

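These evaluation metrics provide insights into how well the recommendation model is predicting playing times for games, with lower values indicating better performance. Because the model is fitted on the full dataset before this evaluation, the per-user errors are measured on data the model has already seen and are likely to be lower than the cross-validated RMSE reported above.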
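
The three measures can be computed directly from their definitions. The sketch below uses NumPy on invented actual and predicted values; the helper name `error_metrics` is hypothetical.

```python
import numpy as np

def error_metrics(actual, predicted) -> dict:
    """Compute MAE, MSE and RMSE from two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mse = float(np.mean(errors ** 2))
    return {
        "mae": float(np.mean(np.abs(errors))),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of MSE
    }

# Invented scaled playing times and predictions for three games.
m = error_metrics([1.0, 3.0, 5.0], [1.5, 3.0, 4.0])
print(m)
```

The errors here are -0.5, 0 and 1, so the MAE is 0.5 and the MSE is 1.25 / 3. The single larger error weighs more heavily in the RMSE (about 0.65) than in the MAE.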

In [25]:
def evaluate_recommendations(uid:int,
              playing_time_df:pd.core.frame.DataFrame,
              model: surprise.prediction_algorithms.matrix_factorization.SVD,
              games_dict:dict
             ):

    #get the adjusted playing time for each game by this user
    my_games=playing_time_df[playing_time_df.userId==uid]

    already_played=dict(zip(my_games.Game,my_games.Scaled_Value))

    pred_dict={}# store predicted playing time

    for game in games_dict: # for every game 

        pred_dict[game]=model.predict(uid = uid,iid = game).est
        
    actual,pred=[],[]
    
    for mid in already_played: 
        actual.append(already_played[mid])
        pred.append(pred_dict[mid])
    
    mse = mean_squared_error(actual,pred)
    rmse = np.sqrt(mse) # take the root explicitly; the `squared` argument is deprecated in newer scikit-learn
    mae = mean_absolute_error(actual,pred)
    
    print('Recommendations MAE: ', round(mae,4))
    print('Recommendations RMSE: ', round(rmse,4))
    print('Recommendations MSE: ', round(mse,4))
    
    return mae,rmse,mse

In [26]:
mae,rmse,mse = evaluate_recommendations(109205612,
              games_df_mt,
              svd,
              games_dict
             )

Recommendations MAE:  0.0922
Recommendations RMSE:  0.1756
Recommendations MSE:  0.0308
