# Recipe Recommender Results

This notebook runs the recommender models on the cleaned user-recipe interactions dataset and evaluates the resulting recommendations with several metrics.

### Start Spark Session

In [1]:
# Code from https://spark.apache.org/docs/2.2.0/ml-collaborative-filtering.html
import pyspark
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS
from pyspark.sql import Row
from pyspark.sql import SparkSession
from pyspark.sql.functions import col


conf = pyspark.SparkConf().setAll([('spark.master', 'local[2]'),
                                   ('spark.app.name', 'Recommender Results')])
spark = SparkSession.builder.config(conf=conf).getOrCreate()

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2022-05-26 17:37:49,116 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


### Read in cleaned user-recipes interactions data frame

In [2]:
file_path = 'file:///home/work/data/interactions_train_cleaned.csv'
ratings = spark.read.csv(file_path, inferSchema = True, header = True)
ratings.show()

                                                                                

+-------+---------+------+
|user_id|recipe_id|rating|
+-------+---------+------+
|  38094|    40893|     4|
|1293707|    40893|     5|
| 190375|   134728|     5|
|1171894|   134728|     5|
| 217118|   200236|     5|
| 202555|   225241|     5|
| 684460|   225241|     5|
| 135017|   254596|     5|
| 224088|   254596|     4|
| 582223|   254596|     5|
| 935485|   321038|     5|
| 102602|    20930|     5|
| 172467|    29093|     5|
|  58332|    41090|     4|
| 160497|    41090|     5|
| 183565|    79222|     5|
| 226989|    79222|     4|
| 868654|    79222|     5|
| 302867|    79222|     5|
| 930021|    79222|     5|
+-------+---------+------+
only showing top 20 rows



### Random split and normalize training and testing

In [3]:
(unnorm_training, unnorm_test) = ratings.randomSplit([0.8, 0.2])
mean = unnorm_training.agg({'rating': 'mean'}).collect()[0][0]
std = unnorm_training.agg({'rating': 'std'}).collect()[0][0]
print(mean, std)
training = unnorm_training.withColumn("rating", (col("rating") - mean) / std)
test = unnorm_test.withColumn("rating", (col("rating") - mean) / std)



4.670459606468806 0.7133942544421343
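The split above standardizes ratings with the training mean and standard deviation, and the evaluation cell further down inverts that transform before computing the error. A minimal pure-Python sketch of the round trip follows, using rounded copies of the printed statistics as constants; the helper names `normalize` and `denormalize` are illustrative and not part of the notebook.

```python
# Training-set statistics printed above, rounded for readability.
MEAN = 4.6705
STD = 0.7134

def normalize(rating, mean=MEAN, std=STD):
    """Map a raw rating to a z-score using training statistics."""
    return (rating - mean) / std

def denormalize(z, mean=MEAN, std=STD):
    """Map a z-score (for example a model prediction) back to the rating scale."""
    return z * std + mean

for stars in (2, 4, 5):
    z = normalize(stars)
    # The round trip should reproduce the original rating up to float error.
    print(stars, round(z, 3), round(denormalize(z), 3))
```

Because the transform is linear, an error measured in z-score units is the rating-scale error divided by `STD`, so the MSE on the rating scale equals the normalized MSE multiplied by `STD ** 2`.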


                                                                                

## (1) Modeling With the ALS Collaborative Filtering, pyspark

This was the first model we tried for the recommender system. It takes the user-recipe rating data and builds an ALS collaborative filtering model to predict the rating a user will give a recipe. The predicted ratings can then be used to choose recommendations.
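To make the last step concrete, here is a small NumPy sketch of how predicted ratings from a factorization model can be turned into recommendations. The factor matrices below are made-up toy values (rank 2 rather than the rank used in this notebook), and the masking of already-rated recipes is an assumption about how one would typically use the scores, not something the notebook does.

```python
import numpy as np

# Hypothetical learned factors for 3 users and 4 recipes with rank 2.
user_factors = np.array([[1.0, 0.2], [0.1, 0.9], [0.5, 0.5]])
recipe_factors = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.6], [0.0, 1.0]])

# The predicted (normalized) rating for every user-recipe pair is a dot product.
scores = user_factors @ recipe_factors.T

# Mask recipes a user has already rated so they are not recommended again.
already_rated = np.zeros_like(scores, dtype=bool)
already_rated[0, 0] = True  # toy assumption: user 0 has rated recipe 0
masked = np.where(already_rated, -np.inf, scores)

# Column indices of the top-2 unrated recipes per user, best first.
top_n = np.argsort(-masked, axis=1)[:, :2]
print(top_n)
```

In Spark, the fitted `ALSModel` offers `recommendForAllUsers(numItems)` for the same purpose, which avoids materializing the full score matrix on the driver.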

### Generate recipe recommendations with the collaborative filtering model and evaluate with MSE

Fit collaborative filtering model

In [4]:
# Setting cold start strategy to 'drop' to ensure we don't get NaN evaluation metrics
als = ALS(rank=200, maxIter=20, regParam=0.125, userCol="user_id", itemCol="recipe_id", ratingCol="rating",
          coldStartStrategy="drop")
model = als.fit(training)

2022-05-26 17:40:11,756 WARN netlib.InstanceBuilder$NativeBLAS: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
2022-05-26 17:40:11,760 WARN netlib.InstanceBuilder$NativeBLAS: Failed to load implementation from:dev.ludovic.netlib.blas.ForeignLinkerBLAS
2022-05-26 17:40:12,505 WARN netlib.InstanceBuilder$NativeLAPACK: Failed to load implementation from:dev.ludovic.netlib.lapack.JNILAPACK
                                                                                

Evaluate model with MSE

In [5]:
normalized_predictions = model.transform(test)
predictions = normalized_predictions.withColumn(
    "rating",col("rating") * std + mean
).withColumn(
    "prediction",col("prediction") * std + mean
)
evaluator = RegressionEvaluator(metricName="mse", labelCol="rating",
                                predictionCol="prediction")

mse = evaluator.evaluate(predictions)
print("The MSE of the recommender model is", mse)
predictions.show()

                                                                                

The MSE of the recommender model is 0.4734603465860362




+-------+---------+------+------------------+
|user_id|recipe_id|rating|        prediction|
+-------+---------+------+------------------+
| 269153|       81|   5.0| 4.733374731108822|
| 194829|       91|   5.0| 4.768165041634855|
| 851190|       92|   5.0| 4.842012929601356|
|  79361|       93|   5.0|4.7234469979523865|
| 983811|       93|   5.0| 4.282348661035171|
| 531768|       94|   5.0| 4.712682822183593|
|  20160|      112|   5.0|  4.35241798296385|
| 719181|      112|   2.0| 4.726094430667179|
| 315301|      142|   5.0| 4.648969488503619|
| 513784|      142|   5.0| 4.725606861928464|
|1386579|      142|   5.0|  4.74752338793877|
| 175405|      175|   4.0| 4.705301482421126|
| 603083|      185|   5.0| 4.578521558135922|
|   1773|      190|   3.0| 5.024686632039058|
| 177567|      190|   5.0| 4.694129449988864|
|   3794|      191|   5.0|4.8129496635299365|
| 680440|      192|   5.0| 4.770042689677924|
| 862233|      192|   5.0| 4.688824982647968|
|1537761|      192|   5.0| 4.76760

                                                                                

## Evaluation Metrics

We define the RecEvalMetrics class and use it to evaluate the recipe recommender system with several evaluation metrics.

In [6]:
pip install rbo

Collecting rbo
  Downloading rbo-0.1.2-py3-none-any.whl (7.5 kB)
Installing collected packages: rbo
Successfully installed rbo-0.1.2
Note: you may need to restart the kernel to use updated packages.


In [23]:
# !pip install rbo

from functools import reduce
from scipy.stats import kendalltau
from scipy.stats import spearmanr
from sklearn.metrics import mean_squared_error, ndcg_score
import pandas as pd
import numpy as np
import rbo

class RecEvalMetrics(object):


    @staticmethod
    def top_k_evaluation(predictions, k = 20):
        """Return the mean squared error over each user's top k predicted ratings.

        Parameters:
            predictions: Dataframe of true and predicted ratings
            k: Top k predicted ratings per user to evaluate with MSE, default 20
        """
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])
        top_k_predictions = []
        
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            top_k_user_ratings =  user_ratings.sort_values(by = ['prediction'], ascending = False).head(k)
            top_k_predictions.append(top_k_user_ratings)
        top_k_predictions_df = pd.concat(top_k_predictions, ignore_index = True)
        
        k_mse = mean_squared_error(list(top_k_predictions_df['rating']), list(top_k_predictions_df['prediction']))
        
        return(k_mse) 


    @staticmethod
    def percent_in_top_ratings(predictions, k = 20):
        """Return the fraction of recipes that appear in at least one user's top k.

        A larger value means more personalization.

        Parameters:
            predictions: Dataframe of true and predicted ratings
            k: Top k recipes per user to count, default 20
        """
        total_recipes = len(predictions.drop_duplicates(subset = ['recipe_id']))
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])

        top_k_predictions = set()
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            user_pred_ordered = list(user_ratings.sort_values(by = ['prediction'], ascending = False)['recipe_id'])
            top_k_user_recipes = user_pred_ordered[:k]
            top_k_predictions.update(top_k_user_recipes)

        top_recipes_count = len(top_k_predictions)

        return(top_recipes_count/total_recipes)


    @staticmethod
    def rbo_evaluation(predictions, k = 20):
        """Return summary statistics of the per-user rank-biased overlap between
        the top k recipes by predicted rating and the top k by actual rating.

        Refer to: https://github.com/changyaochen/rbo

        Parameters:
            predictions: Dataframe of true and predicted ratings
            k: Number of recipes in each ranked list to evaluate with RBO, default 20
        """
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])

        rbos = []
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            user_actual_ordered = list(user_ratings.sort_values(by = ['rating'], ascending = False)['recipe_id'])
            user_pred_ordered = list(user_ratings.sort_values(by = ['prediction'], ascending = False)['recipe_id'])
            top_k_user_actual = user_actual_ordered[:k]
            top_k_user_pred = user_pred_ordered[:k]
            user_rbo = rbo.RankingSimilarity(top_k_user_actual, top_k_user_pred).rbo()
            rbos.append(user_rbo)
        
        rbos_describe = pd.DataFrame(rbos)
        return(rbos_describe.describe())


    @staticmethod
    def kendalls_tau(predictions):
        """Return summary statistics of the per-user Kendall's tau between the
        recipe ordering by actual rating and the ordering by predicted rating.

        Parameters:
            predictions: Dataframe of true and predicted ratings
        """
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])

        tau = []
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            # Kendall's Tau will not work with list of size 1
            if (len(user_ratings) > 1):
                user_actual_ordered = list(user_ratings.sort_values(by = ['rating'], ascending = False)['recipe_id'])
                user_pred_ordered = list(user_ratings.sort_values(by = ['prediction'], ascending = False)['recipe_id'])
                user_tau, user_p_value = kendalltau(user_actual_ordered, user_pred_ordered)
                tau.append(user_tau)
    
        tau_describe = pd.DataFrame(tau)
        return(tau_describe.describe())


    @staticmethod
    def nDCG_evaluation(predictions, k = None):
        """Return summary statistics of the per-user normalized discounted
        cumulative gain between actual ratings and predicted ratings.

        Parameters:
            predictions: Dataframe of true and predicted ratings
            k: Number of recipes in the ranked list to evaluate, default None (all)
        """
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])

        ndcg = []
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            if (len(user_ratings) > 1):
                relevance = np.asarray([list(user_ratings['rating'])])
                preds = np.asarray([list(user_ratings['prediction'])])
                score = ndcg_score(relevance, preds, k=k)
                ndcg.append(score)
        
        ndcg_describe = pd.DataFrame(ndcg)
        return(ndcg_describe.describe())
    
    @staticmethod
    def spearman_evaluation(predictions):
        """Return summary statistics of the per-user Spearman rank correlation
        between actual ratings and predicted ratings.

        Parameters:
            predictions: Dataframe of true and predicted ratings
        """
        users = list(predictions.drop_duplicates(subset = ['user_id'])['user_id'])

        rhos = []
        for user in users:
            user_ratings = predictions[(predictions['user_id'] == user)]
            if (len(user_ratings) > 1):
                ratings = np.asarray(user_ratings['rating'])
                preds = np.asarray(user_ratings['prediction'])
                rho, pval = spearmanr(ratings, preds)
                rhos.append(rho)
        
        rho_describe = pd.DataFrame(rhos)
        return(rho_describe.describe())

Convert the predictions PySpark dataframe to a pandas dataframe for feeding into the metric evaluations. NOTE: We do this to keep the RecEvalMetrics class general, so that it can also accept dask results after conversion.

In [8]:
predictions_df = predictions.toPandas()

                                                                                

### Top 10 MSE

In [15]:
k_mse = RecEvalMetrics.top_k_evaluation(predictions_df, 10)
print("The top 10 MSE:", k_mse)

The top 10 MSE: 0.4750850644602118
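The per-user loop in `top_k_evaluation` filters the whole dataframe once per user, which can be slow with many users. A hedged sketch of an equivalent vectorized computation with pandas `groupby` follows; the function name `top_k_mse` and the toy data are illustrative, and ties in `prediction` may be broken differently than in the loop version.

```python
import pandas as pd

def top_k_mse(predictions: pd.DataFrame, k: int = 20) -> float:
    """MSE over each user's k highest predicted ratings, without a Python loop."""
    top_k = (
        predictions.sort_values("prediction", ascending=False)
        .groupby("user_id", sort=False)
        .head(k)  # keep the first k rows of each user after sorting
    )
    return float(((top_k["rating"] - top_k["prediction"]) ** 2).mean())

toy = pd.DataFrame({
    "user_id":    [1, 1, 1, 2],
    "recipe_id":  [10, 11, 12, 10],
    "rating":     [5.0, 4.0, 1.0, 5.0],
    "prediction": [4.5, 4.0, 3.0, 4.0],
})
print(top_k_mse(toy, k=2))
```

With `k=2` the badly predicted third recipe of user 1 is excluded, so the top-k error is lower than the error over all rows of the toy frame.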


### Percent in top 10 (Personalization Assessment)

In [16]:
percent_in_top = RecEvalMetrics.percent_in_top_ratings(predictions_df, 10)
print("The percent of recipes that are in some users top 10:", percent_in_top)

The percent of recipes that are in some users top 10: 0.9696149098004858
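The value above is a catalogue-coverage style measure: the share of distinct recipes that land in at least one user's top k. The following pure-Python sketch shows the computation on a toy list of `(user_id, recipe_id, prediction)` tuples; the function name `coverage_at_k` and the data are illustrative.

```python
from collections import defaultdict

def coverage_at_k(rows, k):
    """Fraction of distinct recipes appearing in at least one user's top-k list.

    rows: iterable of (user_id, recipe_id, prediction) tuples.
    """
    by_user = defaultdict(list)
    all_recipes = set()
    for user_id, recipe_id, prediction in rows:
        by_user[user_id].append((prediction, recipe_id))
        all_recipes.add(recipe_id)

    covered = set()
    for scored in by_user.values():
        scored.sort(reverse=True)  # highest prediction first
        covered.update(recipe_id for _, recipe_id in scored[:k])
    return len(covered) / len(all_recipes)

rows = [
    ("u1", "a", 4.9), ("u1", "b", 4.1), ("u1", "c", 3.0),
    ("u2", "a", 4.8), ("u2", "d", 4.7),
]
print(coverage_at_k(rows, k=1))
print(coverage_at_k(rows, k=2))
```

One caveat worth keeping in mind: when most users have fewer than k rows in the evaluated frame, every one of their recipes is in their top k by construction, which pushes the measure toward 1 regardless of how personalized the model is.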


### Rank-Biased Overlap, top 10

In [18]:
rbo_summary = RecEvalMetrics.rbo_evaluation(predictions_df, 10)
print(rbo_summary)

                  0
count  19037.000000
mean       0.790631
std        0.236962
min        0.108175
25%        0.500000
50%        1.000000
75%        1.000000
max        1.000000
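To make the numbers above easier to read, here is a minimal sketch of the idea behind rank-biased overlap: at each depth d, measure what fraction of the two top-d prefixes agree, then combine those agreements with weights. The two functions below are illustrative and self-contained; whether `rbo.RankingSimilarity(...).rbo()` with its defaults matches either of them exactly is an assumption to check against the package documentation.

```python
def average_overlap(ranked_a, ranked_b):
    """Unweighted mean of the prefix agreement at depths 1..min(len)."""
    depth = min(len(ranked_a), len(ranked_b))
    if depth == 0:
        return 0.0
    total = 0.0
    for d in range(1, depth + 1):
        # Agreement at depth d: shared items in the two top-d prefixes, over d.
        total += len(set(ranked_a[:d]) & set(ranked_b[:d])) / d
    return total / depth

def truncated_rbo(ranked_a, ranked_b, p=0.9):
    """Truncated RBO: (1 - p) * sum over d of p**(d - 1) * agreement at depth d.

    Stopping at the list length gives a lower bound on the full RBO score.
    """
    depth = min(len(ranked_a), len(ranked_b))
    total = 0.0
    for d in range(1, depth + 1):
        agreement = len(set(ranked_a[:d]) & set(ranked_b[:d])) / d
        total += p ** (d - 1) * agreement
    return (1 - p) * total

print(average_overlap(["a", "b"], ["b", "a"]))
print(average_overlap(["a"], ["a"]))
```

Under the unweighted variant, a user with a single recipe always scores 1.0 and a user whose two recipes are swapped scores 0.5, so short per-user lists can dominate the summary statistics.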


### Kendall's Tau

In [20]:
kendalls_tau_summary = RecEvalMetrics.kendalls_tau(predictions_df)
print(kendalls_tau_summary)

                  0
count  11834.000000
mean       0.021153
std        0.715416
min       -1.000000
25%       -0.466667
50%        0.000000
75%        0.666667
max        1.000000
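The `kendalls_tau` method compares two lists of recipe IDs, one ordered by actual rating and one by predicted rating. A more conventional alternative, sketched here as an assumption rather than as the notebook's method, is to correlate each user's actual ratings with the predicted ratings directly. The pure-Python function below computes Kendall's tau-b, which corrects for ties; `kendall_tau_b` and the toy data are illustrative.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length numeric sequences."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0:
                ties_x += 1  # pair tied in x (possibly also in y)
            if dy == 0:
                ties_y += 1  # pair tied in y (possibly also in x)
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    total_pairs = n * (n - 1) // 2
    denom = math.sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    # Undefined when one sequence is constant, for example all ratings equal.
    return (concordant - discordant) / denom if denom > 0 else math.nan

actual = [5, 4, 5]
predicted = [4.8, 4.2, 4.5]
print(kendall_tau_b(actual, predicted))
```

With integer ratings concentrated at 4 and 5, ties in the actual ratings are common, so the tie correction (and the undefined case for constant ratings) matters in practice.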


### Normalized Discounted Cumulative Gain

In [21]:
ndcg_summary = RecEvalMetrics.nDCG_evaluation(predictions_df, 10)
print(ndcg_summary)

                  0
count  11834.000000
mean       0.981945
std        0.038286
min        0.650069
25%        0.983007
50%        1.000000
75%        1.000000
max        1.000000
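A short sketch of how nDCG is computed may help in interpreting the high values above. It uses linear gain and a log2 position discount and ignores tie handling, so it is a simplified illustration rather than a reimplementation of `sklearn.metrics.ndcg_score`; the function names are illustrative.

```python
import math

def dcg_at_k(relevances, k=None):
    """Discounted cumulative gain with linear gain and a log2 position discount."""
    top = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(top))

def ndcg_at_k(true_ratings, predicted_scores, k=None):
    """DCG of the predicted ordering divided by the DCG of the ideal ordering."""
    order = sorted(range(len(true_ratings)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_ratings[i] for i in order]
    ideal = sorted(true_ratings, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Worst possible ordering of a 5-star and a 4-star recipe.
print(ndcg_at_k([5, 4], [0.1, 0.9]))
```

Even the worst ordering of a 5-star and a 4-star recipe scores about 0.95 here, because the gains are close in magnitude. When almost all ratings are 4 or 5, nDCG therefore stays near 1 and discriminates weakly between good and bad rankings.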


### Spearman Rank Correlation

In [24]:
spearman_summary = RecEvalMetrics.spearman_evaluation(predictions_df)
print(spearman_summary)



                 0
count  5710.000000
mean      0.040791
std       0.671445
min      -1.000000
25%      -0.500000
50%       0.000000
75%       0.670820
max       1.000000
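The count here (5710) is lower than the 11834 users who have more than one rating in the evaluated frame. A likely explanation, stated as an assumption, is that Spearman's rho is undefined when a user's actual ratings are all identical: `spearmanr` returns NaN in that case and `describe()` excludes NaN values from its count. The pure-Python sketch below, with illustrative function names, shows where the undefined case comes from.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the average ranks; NaN when either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return math.nan  # zero rank variance: correlation is undefined
    return cov / math.sqrt(var_x * var_y)

print(spearman_rho([5, 5, 5], [4.7, 4.2, 4.9]))
```

If this explanation holds, the summary above describes only users with at least two distinct actual ratings, which is worth remembering when comparing it with the other per-user metrics.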


## (2) Modeling with the Similarity Scorer and Averaging, dask

This uses an alternative approach to predicting user ratings through similarity scoring metrics and different types of rating averaging.  This part is done in dask.

In [5]:
pip install sparse

Collecting sparse
  Downloading sparse-0.13.0-py2.py3-none-any.whl (77 kB)
Installing collected packages: sparse
Successfully installed sparse-0.13.0
Note: you may need to restart the kernel to use updated packages.


In [1]:
import dask
import dask.array as da
import dask.dataframe as dd
import sparse
import dask_ml
import time

import numpy as np
import pandas as pd

from dask.distributed import Client
client = Client(memory_limit='6GB')

2022-05-26 00:20:47,486 - distributed.diskutils - INFO - Found stale lock file and directory '/home/work/dask-worker-space/worker-1k2guxok', purging
2022-05-26 00:20:47,496 - distributed.diskutils - INFO - Found stale lock file and directory '/home/work/dask-worker-space/worker-ab10wnq0', purging
2022-05-26 00:20:47,509 - distributed.diskutils - INFO - Found stale lock file and directory '/home/work/dask-worker-space/worker-k8nzx1qj', purging


In [2]:
seed = 25
ddf = dd.read_csv("data/interactions_train_cleaned.csv")
train, val = dask_ml.model_selection.train_test_split(ddf, test_size=0.1, train_size=0.9,shuffle=True,random_state=seed)
print(len(train), len(val))

308015 34001


### Using Averages as Predicted Ratings:

Computing rating average and rating standard deviation ...

In [3]:
rating_avg = train.rating.mean().compute()
rating_std = train.rating.std().compute()
print(rating_avg, rating_std)

4.671889356037855 0.7121078429179404


### Baseline 1: Predict using global average rating among all users

How well would a model perform with just rating averages as predictions?

In [4]:
val["prediction"] = rating_avg
print(dask_ml.metrics.mean_squared_error(val.rating.to_dask_array(), val.prediction.to_dask_array()))

0.5169555594737777


### Baseline 2: Predict using average rating for each user

In [5]:
user_avgs = train.groupby("user_id").rating.mean().compute()
val["prediction"] = val.user_id.apply(
    lambda x: user_avgs[x] if x in user_avgs else rating_avg, 
    meta=('prediction', 'float64')  # predictions are float averages, not integer IDs
)
print(dask_ml.metrics.mean_squared_error(val.rating.to_dask_array(), val.prediction.to_dask_array()))

0.4964676076844779


### Baseline 3: Predict using average rating for each user, Bayesian style

In [6]:
user_avgs = train.groupby("user_id").rating.mean().compute()
user_counts = train.groupby("user_id").rating.count().compute()
# k acts as the number of pseudo-ratings at the global average added to each user
k = 6
val["personal_rating"] = val.user_id.apply(
    lambda x: (rating_avg * k + user_avgs[x] * user_counts[x]) / (user_counts[x] + k) if x in user_avgs else rating_avg, 
    meta=('personal_rating', 'float64')
)
err = dask_ml.metrics.mean_squared_error(val.rating.to_dask_array(), val.personal_rating.to_dask_array())
# DataFrame.append is deprecated (removed in pandas 2.0), so build the results frame directly
bayesian_df = pd.DataFrame([{"k": float(k), "err": err}])
print(bayesian_df)

     k       err
0  6.0  0.470208
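The "Bayesian style" average shrinks each user's mean rating toward the global mean, as if every user started with k ratings at the global average. A small sketch of the formula used in the cell above follows; the function name `shrunk_user_mean` and the rounded global mean are illustrative.

```python
def shrunk_user_mean(user_mean, user_count, global_mean, k=6):
    """Blend a user's mean rating with the global mean using k pseudo-ratings."""
    return (global_mean * k + user_mean * user_count) / (user_count + k)

GLOBAL_MEAN = 4.67  # rounded from the training average printed earlier

# A user whose own average is 3.0, observed with increasing numbers of ratings.
for count in (1, 6, 60):
    print(count, round(shrunk_user_mean(3.0, count, GLOBAL_MEAN), 3))
```

With one rating the estimate stays close to the global mean, with six ratings it sits halfway between the two means, and with many ratings it approaches the user's own average. This reduces the variance of per-user averages for users with few ratings.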




### Similarity Scorer Recommender Modeling

In [7]:
train["personal_rating"] = train.user_id.apply(
    lambda x: (rating_avg * k + user_avgs[x] * user_counts[x]) / (user_counts[x] + k) if x in user_avgs else rating_avg, 
    meta=('personal_rating', 'float32')
)
train["person_normalized_rating"] = train.rating - train.personal_rating

class SimilarityScorer:
    def __init__(self, interactions):
        start = time.time()
        print("Generating indices")
        user_codes, self.user_idx_to_id = pd.factorize(interactions.user_id.compute())
        recipe_codes, self.recipe_idx_to_id = pd.factorize(interactions.recipe_id.compute())
        self.user_id_to_idx = {user: idx for idx, user in enumerate(self.user_idx_to_id)}
        self.recipe_id_to_idx = {recipe: idx for idx, recipe in enumerate(self.recipe_idx_to_id)}
        
        print("Creating sparse matrix", time.time() - start)
        s = sparse.COO(
            [user_codes, recipe_codes],
            interactions.person_normalized_rating.compute(),
            shape=(len(self.user_idx_to_id), len(self.recipe_idx_to_id)),
            fill_value=0
        )
        self.sparse_mat = da.from_array(s, chunks=(5000, 5000))
        print("Generating dot products", time.time() - start)

        dot_product_similarities = (self.sparse_mat @ self.sparse_mat.T).compute()
        print("Raw similarities computed", time.time() - start)
        
        dense_similarities = dot_product_similarities.todense()
        sims_summed = dense_similarities.sum(axis=1) + 1e-20
        self.similarities = dense_similarities / sims_summed.reshape(-1, 1)
        self.sparse_mat = self.sparse_mat.compute()
        print("Similarities normalized!", time.time() - start)

    def predict_topk_for_user(self, user_id, k):
        user_idx = self.user_id_to_idx[user_id]
        recs = self.similarities[user_idx] @ self.sparse_mat
        rec_values = recs.topk(k)
        rec_idxs = recs.argtopk(k)
        recs_ids = [self.recipe_idx_to_id[idx] for idx in rec_idxs]
        return recs_ids, rec_values
    
    def predict_pair(self, user_id, recipe_id):
        if recipe_id not in self.recipe_id_to_idx or user_id not in self.user_id_to_idx:
            return 0
        user_idx = self.user_id_to_idx[user_id]
        recipe_idx = self.recipe_id_to_idx[recipe_id]
        predicted_score = self.similarities[user_idx] @ self.sparse_mat[:, recipe_idx]
        return predicted_score

scorer = SimilarityScorer(train)

Generating indices
Creating sparse matrix 12.197404861450195
Generating dot products 20.847816228866577


AttributeError: 'COO' object has no attribute 'squeeze'
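The computation that `SimilarityScorer` sets up can be sketched at toy scale with dense NumPy arrays: user-user similarities are dot products of mean-centred rating rows, each row of similarities is normalized to sum to 1, and predictions are similarity-weighted sums of ratings. The matrix below is made-up data, and the dense arrays stand in for the sparse and dask structures used in the class.

```python
import numpy as np

# Toy user x recipe matrix of mean-centred ratings; 0 means "not rated".
ratings = np.array([
    [0.5, -0.5,  0.0],
    [0.4,  0.0, -0.4],
    [0.0,  0.3,  0.3],
])

# Raw user-user similarity: dot product of rating rows.
raw = ratings @ ratings.T

# Row-normalize so each user's similarity weights sum to 1.
# The small epsilon mirrors the class and avoids division by exactly zero.
weights = raw / (raw.sum(axis=1, keepdims=True) + 1e-20)

# Predicted centred score for every user-recipe pair.
scores = weights @ ratings
print(np.round(scores, 3))
```

One thing to watch with this normalization: dot products of centred ratings can be negative, so a row sum can be negative or close to zero (the third user here has a negative row sum), which flips the sign of the weights or inflates them. Normalizing by the sum of absolute similarities is a common alternative.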