# Tuning and Evaluation Notebook

In this notebook we go over how to train a latent factor model using Alternating Least Squares (ALS). We also go over tuning the model's rank and regularization scaling parameter. Due to the size of the full BYU dataset, performing this analysis on the complete data was computationally infeasible at this time. Instead, I took a simple random sample of only 5% and conducted my analysis on this subset. See the file "lines_to_np.py" in the code folder for more details on how this sampling was done. Additionally, see the "Data Exploration" notebook for some exploratory data analysis of the data used here.

## 1. Loading the Data

In [1]:
import numpy as np
import pandas as pd

In [2]:
path = "/Volumes/Samsung_T5/Data/little_array.npy"

df = pd.DataFrame(np.load(path),columns=['steam_id','app_id','interact'])

# Re-number steam_ids by subtracting the minimum: the raw 64-bit ids are too big
# for Spark ALS, which needs user and item ids within the 32-bit integer range
minimum = np.min(df['steam_id'])
df['user_id'] = df['steam_id'] - minimum
df = df[['user_id','app_id','interact']]

# Train/Test Split
from sklearn.model_selection import train_test_split

train, test = train_test_split(df, test_size=0.2)
print(train.shape,test.shape)

(4783694, 3) (1195924, 3)
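The re-numbering step in the cell above relies on the shifted ids fitting in a 32-bit integer. The sketch below shows that check on a few made-up ids, next to an alternative that maps each distinct id to a dense code. The helper names `offset_ids` and `dense_ids` and the sample ids are hypothetical and only for illustration.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold

def offset_ids(raw_ids):
    """Shift ids so the smallest becomes 0 (the approach used in the cell above)."""
    smallest = min(raw_ids)
    return [i - smallest for i in raw_ids]

def dense_ids(raw_ids):
    """Alternative: map each distinct id to 0..n-1 in order of first appearance."""
    mapping = {}
    for i in raw_ids:
        mapping.setdefault(i, len(mapping))
    return [mapping[i] for i in raw_ids], mapping

# Hypothetical 64-bit ids, made up for illustration only
raw = [76561197960265730, 76561197960265735, 76561197960265730, 76561198000000000]

shifted = offset_ids(raw)
codes, mapping = dense_ids(raw)

# True when every shifted id can be stored as a 32-bit integer
fits = max(shifted) <= INT32_MAX
```

Subtracting the minimum keeps the ids easy to map back to the original `steam_id`, while dense codes are the safer choice if the spread between the smallest and largest id could exceed the 32-bit range.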


In [3]:
# Make Dictionary with Keys = user_ids, Values = list of their hidden games
hidden_games = test.groupby('user_id')['app_id'].apply(list)
hidden_games = hidden_games.to_dict()

In [4]:
import pyspark
from pyspark.sql import SparkSession

# Initialize the Spark Session
spark = SparkSession.builder.getOrCreate()

# Create Spark Dataframe
sp_train = spark.createDataFrame(train)

## 2. Define Helper Functions for Evaluating Performance

In [6]:
from pyspark.mllib.evaluation import RankingMetrics

def eval_ALS(model, hidden_games):
    """
    Inputs: model -- a fitted ALS model; hidden_games -- dictionary with keys = user_ids from the test set
                    and values = list of each user's hidden (testing) games
    Output: list of the model's precision at k = 10, 20, 30. This is the average over all users of the fraction of
    games in the model's top k recommendations that belong to the user's hidden games list
    """
    
    # Make Spark Dataframe of users in test set
    users = pd.DataFrame({'user_id': list(hidden_games.keys())})
    sp_users = spark.createDataFrame(users)
    
    # Predict Top 30 Games for each test user
    preds = model.recommendForUserSubset(sp_users, 30).collect()
    
    # Make a list of lists, containing each user's predictions and the games we hid in testing
    recs_list = []
    for user, items in preds:
        pred_items = [item.app_id for item in items]
        recs_list.append((pred_items, hidden_games[user]))
        
    # Compute precision at k = 10, 20, 30, averaged over all test users
    labels = spark.sparkContext.parallelize(recs_list)
    metrics = RankingMetrics(labels)
    return [metrics.precisionAt(k) for k in [10,20,30]]
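To make the metric concrete, here is a minimal pure-Python sketch of precision at k as described in the docstring above. It is not Spark's implementation. The function name `precision_at_k` and the toy data are hypothetical, and dividing by k even when a user has fewer than k predictions is my understanding of how `RankingMetrics.precisionAt` behaves.

```python
def precision_at_k(recs, k):
    """recs: list of (predicted_items, relevant_items) pairs, one per user."""
    if k <= 0:
        raise ValueError("k must be positive")
    total = 0.0
    for predicted, relevant in recs:
        relevant_set = set(relevant)
        # Count how many of the top-k predictions are in the hidden set
        hits = sum(1 for item in predicted[:k] if item in relevant_set)
        total += hits / k  # divide by k even if fewer than k predictions exist
    return total / len(recs)  # average over users

# Toy data: two users, each with four ranked predictions and a hidden list
toy = [
    ([10, 20, 30, 40], [20, 40, 99]),
    ([10, 20, 30, 40], [10]),
]

p2 = precision_at_k(toy, 2)  # user 1: 1/2, user 2: 1/2
p4 = precision_at_k(toy, 4)  # user 1: 2/4, user 2: 1/4
```

The `recs` argument has the same shape as `recs_list` in `eval_ALS`, so the sketch can be used to sanity-check a handful of users by hand.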

## 3. Tuning / Testing

In [8]:
from pyspark.ml.recommendation import ALS

ranks = [10,15,20]
scalers = [0.1,0.5,1,5,10]
dash = '-' * 60

print(dash)
print('{:<6s}{:^12s}{:^20s}{:^12s}{:^12s}'.format('Rank','RegParam', 'Precision at k= 10','k= 20', 'k=30'))
print(dash)

for r in ranks:
    for c in scalers:
        als = ALS(rank=r, maxIter=5, regParam=c,
          userCol="user_id", itemCol="app_id", implicitPrefs=True,
          ratingCol="interact" ,coldStartStrategy="drop")
        model = als.fit(sp_train)
        precision = eval_ALS(model,hidden_games)
        print('{:<10d}{:^10.1f}{:^20.3f}{:^12.3f}{:^12.3f}'.format(r,c,precision[0],precision[1],precision[2]))

------------------------------------------------------------
Rank    RegParam   Precision at k= 10    k= 20        k=30    
------------------------------------------------------------
10           0.1           0.131           0.107       0.089    
10           0.5           0.118           0.097       0.081    
10           1.0           0.098           0.081       0.069    
10           5.0           0.003           0.003       0.003    
10           10.0          0.002           0.002       0.002    
15           0.1           0.134           0.111       0.092    
15           0.5           0.121           0.098       0.082    
15           1.0           0.097           0.082       0.068    
15           5.0           0.030           0.023       0.020    
15           10.0          0.000           0.001       0.001    
20           0.1           0.130           0.112       0.094    
20           0.5           0.122           0.099       0.083    
20           1.0           0.097   

In [9]:
ranks = [10,15]
scalers = [0.01,0.05,0.25]
dash = '-' * 60

print(dash)
print('{:<6s}{:^12s}{:^20s}{:^12s}{:^12s}'.format('Rank','RegParam', 'Precision at k= 10','k= 20', 'k=30'))
print(dash)

for r in ranks:
    for c in scalers:
        als = ALS(rank=r, maxIter=5, regParam=c,
          userCol="user_id", itemCol="app_id", implicitPrefs=True,
          ratingCol="interact" ,coldStartStrategy="drop")
        model = als.fit(sp_train)
        precision = eval_ALS(model,hidden_games)
        # Two decimals for regParam so that 0.01, 0.05 and 0.25 are not rounded to 0.0, 0.1 and 0.2
        print('{:<10d}{:^10.2f}{:^20.3f}{:^12.3f}{:^12.3f}'.format(r,c,precision[0],precision[1],precision[2]))

------------------------------------------------------------
Rank    RegParam   Precision at k= 10    k= 20        k=30    
------------------------------------------------------------
10           0.01          0.128           0.107       0.089    
10           0.05          0.131           0.107       0.089    
10           0.25          0.130           0.107       0.088    
15           0.01          0.124           0.108       0.091    
15           0.05          0.130           0.110       0.092    
15           0.25          0.136           0.111       0.092    


In [10]:
ranks = [12,15,18]
scalers = [0.05,0.1,0.15,0.2,0.25,0.3,0.35]
dash = '-' * 60

print(dash)
print('{:<6s}{:^12s}{:^20s}{:^12s}{:^12s}'.format('Rank','RegParam', 'Precision at k= 10','k= 20', 'k=30'))
print(dash)

for r in ranks:
    for c in scalers:
        als = ALS(rank=r, maxIter=5, regParam=c,
          userCol="user_id", itemCol="app_id", implicitPrefs=True,
          ratingCol="interact" ,coldStartStrategy="drop")
        model = als.fit(sp_train)
        precision = eval_ALS(model,hidden_games)
        print('{:<10d}{:^10.2f}{:^20.3f}{:^12.3f}{:^12.3f}'.format(r,c,precision[0],precision[1],precision[2]))

------------------------------------------------------------
Rank    RegParam   Precision at k= 10    k= 20        k=30    
------------------------------------------------------------
12           0.05          0.132           0.109       0.091    
12           0.10          0.133           0.109       0.091    
12           0.15          0.134           0.109       0.091    
12           0.20          0.135           0.109       0.091    
12           0.25          0.135           0.109       0.090    
12           0.30          0.133           0.108       0.089    
12           0.35          0.131           0.106       0.088    
15           0.05          0.130           0.110       0.092    
15           0.10          0.134           0.111       0.092    
15           0.15          0.136           0.111       0.093    
15           0.20          0.136           0.111       0.092    
15           0.25          0.136           0.111       0.092    
15           0.30          0.135   
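The three tuning cells above repeat the same nested loop with different grids. One way to factor that out is sketched below. `grid_search`, `fit_and_score` and `fake_score` are hypothetical names, and the scorer is a made-up stand-in so the sketch runs without Spark; its values are not real results.

```python
from itertools import product

def grid_search(ranks, reg_params, fit_and_score):
    """Score every (rank, regParam) pair and return rows sorted best-first.

    fit_and_score is a hypothetical callable taking (rank, reg_param) and
    returning [precision@10, precision@20, precision@30].
    """
    rows = []
    for rank, reg in product(ranks, reg_params):
        rows.append((rank, reg, *fit_and_score(rank, reg)))
    # Sort by precision at 10 (index 2), highest first
    return sorted(rows, key=lambda row: row[2], reverse=True)

def fake_score(rank, reg):
    """Stand-in scorer so the sketch runs without Spark; values are made up."""
    base = 1.0 / (1.0 + abs(rank - 15) + abs(reg - 0.2))
    return [base, base / 2, base / 3]

results = grid_search([12, 15, 18], [0.1, 0.2, 0.3], fake_score)
best_rank, best_reg = results[0][0], results[0][1]
```

In this notebook, `fit_and_score` could wrap the body of the loop, for example a function that builds `ALS(rank=r, regParam=c, ...)`, fits it on `sp_train`, and returns `eval_ALS(model, hidden_games)`.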

Using the eval_ALS function I trained and evaluated several models with different ranks and regularization parameters. Recall that the rank is the dimension of the feature/preference vector for each item/user and the regularization parameter is the value we called $\lambda$ in the previous notebook, which was used to scale the $\sum_{i} ||U_i||^2 +  \sum_j ||V_j||^2$ term.
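For reference, a common form of the implicit-feedback ALS objective is sketched below. Only $\lambda$, $U_i$ and $V_j$ come from the text above. The binary preference $p_{ij}$ and the confidence weight $c_{ij}$ are assumptions taken from the standard formulation and may not match the notation of the previous notebook.

$$
\min_{U,V} \; \sum_{i,j} c_{ij}\left(p_{ij} - U_i^{\top} V_j\right)^2 + \lambda\left(\sum_{i} ||U_i||^2 + \sum_{j} ||V_j||^2\right)
$$

A larger $\lambda$ pulls the factor vectors toward zero, which is consistent with the precision collapsing at regParam values of 5 and 10 in the first table.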

By testing and adjusting, I determined that the best choice of parameters was rank = 18 and regParam = 0.25, which gave a precision at $k = 10$ of $13.8\%$, i.e. about 1 in 7 of our top-10 recommendations was actually among the user's hidden test games. This is pretty good when you consider the sparsity of the matrix. For a baseline comparison, let's see how we would do if we just recommended the 10 most popular games to every user. We computed these games in the "Data Exploration" notebook.

In [11]:
# Recommend Popular Games Baseline

top_ten_apps = [340,240,320,220,400,10,550,223530,30,40]

top_ten_recs = [(top_ten_apps, hidden_games[user]) for user in list(hidden_games.keys())]
labels = spark.sparkContext.parallelize(top_ten_recs)

metrics = RankingMetrics(labels)

print(metrics.precisionAt(10))

0.0887993750305153


We find that our model clearly outperforms the baseline, which just recommends the 10 most popular games to every user and reaches a precision at 10 of only about $8.9\%$, compared with $13.8\%$ for the tuned ALS model.
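The hard-coded `top_ten_apps` list above came from the "Data Exploration" notebook. As a rough sketch of how such a list might be derived from the training rows, the function below counts distinct users per app. The name `top_n_items`, the toy rows, and the definition of popularity as number of distinct users are all assumptions; the other notebook may define popularity differently.

```python
from collections import Counter

def top_n_items(interactions, n=10):
    """interactions: iterable of (user_id, app_id) pairs from the training set.

    Popularity here means the number of distinct users per app, which is an
    assumption; the Data Exploration notebook may define it differently.
    """
    distinct_pairs = set(interactions)  # count each user at most once per app
    counts = Counter(app for _, app in distinct_pairs)
    return [app for app, _ in counts.most_common(n)]

# Toy rows, made up for illustration: (user_id, app_id)
toy_rows = [(1, 340), (2, 340), (3, 340), (1, 240), (2, 240), (3, 10), (3, 10)]
top_two = top_n_items(toy_rows, n=2)
```

With the DataFrame from this notebook, the input could be built as `zip(train['user_id'], train['app_id'])`, which also keeps the baseline from peeking at the test rows.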