# Modeling <sup>[1]</sup>

### Evaluation: performance criteria

Performance criteria for evaluating recommendation systems include:

- RMSE: $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat y_i - y_i)^2}$
- Precision / Recall / F-scores
- ROC curves
- Cost curves
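
The classification-style criteria in this list need binary labels, so ratings have to be thresholded first. The following is a minimal sketch, assuming a rating of 4 or higher counts as "liked". The threshold, the function name `precision_recall`, and the toy ratings are illustrative assumptions and are not part of this notebook.

```python
import numpy as np

def precision_recall(y_pred, y_true, threshold=4.0):
    """Precision and recall after binarizing ratings at a hypothetical threshold."""
    pred_pos = np.asarray(y_pred, dtype=float) >= threshold
    true_pos = np.asarray(y_true, dtype=float) >= threshold
    tp = np.sum(pred_pos & true_pos)  # predicted "liked" and actually "liked"
    precision = tp / pred_pos.sum() if pred_pos.sum() else 0.0
    recall = tp / true_pos.sum() if true_pos.sum() else 0.0
    return float(precision), float(recall)

# Made-up ratings, for illustration only
precision, recall = precision_recall([5, 4, 2, 4], [5, 3, 1, 5])
```

Here two of the three predicted "liked" items are truly liked (precision 2/3), and both truly liked items are found (recall 1.0).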

## Imports

In [1]:
import numpy as np
import pandas as pd

## Load Data

Only a subset of the original data set is loaded, because this is an educational project.

In [2]:
# The 80/20 train/test split was done in an earlier step
df_train = pd.read_csv('../Data/training_data_subset.csv')
df_test = pd.read_csv('../Data/testing_data_subset.csv')

In [3]:
df_train.head(2)

Unnamed: 0,category,description,title,also_buy,brand,rank,also_view,main_cat,price,asin,details,overall,verified,reviewerID,reviewText,summary,vote,style,for_testing
0,"['Grocery & Gourmet Food', 'Sauces, Gravies & ...",['Sriracha chili sauce made from sun ripened c...,"Huy Fong Sriracha Chili Sauce, 28 Ounce Bottle...","['B001E5DZZM', 'B003NROMC4', 'B00U9VTL5U', 'B0...",Huy Fong,"145,292 in Grocery & Gourmet Food (","['B001E5DZZM', 'B008AV5HLS', 'B00U9VTL5U', 'B0...",Grocery,,B00BT7C9R0,"{'Shipping Weight:': '11.4 pounds', 'ASIN: ': ...",5.0,True,A3FYXMWYC9KUCK,I have been using Sriracha for several years n...,This stuff is great!,,,False
1,"['Grocery & Gourmet Food', 'Breakfast Foods', ...",['belVita Chocolate Breakfast Biscuits are lig...,"belVita Chocolate Breakfast Biscuits, 5 Count ...","['B00QF27JL0', 'B01BNIN5ZO', 'B01FLPFPOY', 'B0...",Belvita,"19,427 in Grocery & Gourmet Food (","['B01COWTO4O', 'B01FLPFPOY', 'B00QF27JL0', 'B0...",Grocery,,B00IO2DO2W,"{'Shipping Weight:': '4.1 pounds', 'Domestic S...",5.0,True,A2OWR2PL3DLWS4,My daughter is a Belvita addict. She likes al...,Delciious,,,False


In [4]:
df_test.head(2)

Unnamed: 0,category,description,title,also_buy,brand,rank,also_view,main_cat,price,asin,details,overall,verified,reviewerID,reviewText,summary,vote,style,for_testing
0,"['Grocery & Gourmet Food', 'Produce', 'Fresh V...","['<div class=""aplus""> <div class=""three-fourth...","Organic Green Cabbage, 1 Head",,produce aisle,,,Grocery,,B000P6H29Q,{'\n Product Dimensions: \n ': '7.5 x 6....,5.0,True,A1NKRXSU63EA4M,Hugh and delicious,Five Stars,,,True
1,"['Grocery & Gourmet Food', 'Cooking & Baking',...",['Light & Fluffy. Just add water. Made with re...,"Krusteaz Complete Pancake Mix, Buttermilk, 32 oz","['B000R32RJC', 'B07CX6LN8T', 'B000PXZZQG', 'B0...",Krusteaz,,"['B00DXGGSBI', 'B00CEMP2Z0', 'B00BP2RY42', 'B0...",Grocery,,B000QCLEB6,{'\n Product Dimensions: \n ': '6.1 x 2....,5.0,True,A3TR0FIT13SSVN,Great flavor and surprisingly fluffy out of th...,Surprisingly good :),6.0,,True


### RMSE

In [5]:
def compute_rmse(y_pred, y_true):
    """ Compute Root Mean Squared Error. """
    
    return np.sqrt(np.mean(np.power(y_pred - y_true, 2)))
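
As a quick sanity check of the formula, here is a small worked example on made-up ratings (the values are illustrative and not taken from the data set). The function is repeated so that the snippet runs on its own.

```python
import numpy as np

def compute_rmse(y_pred, y_true):
    """ Compute Root Mean Squared Error. """
    return np.sqrt(np.mean(np.power(y_pred - y_true, 2)))

y_pred = np.array([3.0, 3.0, 3.0, 3.0])  # a constant prediction of 3
y_true = np.array([5.0, 5.0, 4.0, 1.0])  # made-up actual ratings

# Errors: -2, -2, -1, 2 -> squared: 4, 4, 1, 4 -> mean: 3.25
rmse = compute_rmse(y_pred, y_true)  # sqrt(3.25), roughly 1.80
```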

### Evaluation method

In [6]:
def evaluate(estimate_f):
    """ RMSE-based predictive performance evaluation with pandas. """
    
    ids_to_estimate = zip(df_test.reviewerID, df_test.asin)
    estimated = np.array([estimate_f(u,i) for (u,i) in ids_to_estimate])
    real = df_test.overall.values
    return compute_rmse(estimated, real)
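
The function above reads the global `df_test`. A possible variant, sketched here under the assumption that the frame has the same `reviewerID`, `asin`, and `overall` columns, takes the data frame as an argument so that it can be tried on small samples. The name `evaluate_on` and the toy frame are hypothetical.

```python
import numpy as np
import pandas as pd

def evaluate_on(df, estimate_f):
    """RMSE of estimate_f(user_id, product_id) against the ratings in df."""
    estimated = np.array([estimate_f(u, i) for u, i in zip(df.reviewerID, df.asin)])
    real = df.overall.to_numpy(dtype=float)
    return float(np.sqrt(np.mean((estimated - real) ** 2)))

# Tiny made-up frame with the same column names as df_test
toy = pd.DataFrame({
    "reviewerID": ["u1", "u2", "u3"],
    "asin": ["p1", "p2", "p3"],
    "overall": [5.0, 4.0, 3.0],
})

# Errors for a constant prediction of 4: -1, 0, 1 -> RMSE = sqrt(2/3)
toy_rmse = evaluate_on(toy, lambda user_id, product_id: 4)
```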

### Baseline function

In [7]:
# Baseline that predicts 3, the midpoint of the 1-5 rating scale, for every user/product pair
def baseline_function(user_id, product_id):
    return 3

In [8]:
print('RMSE for baseline function: %s' % evaluate(baseline_function))

RMSE for baseline function: 1.801249566273369


All later models should improve on this RMSE of 1.8012.  
An RMSE of 0 means there is no error: every predicted rating matches the actual rating.
On the 1 to 5 rating scale, a single prediction can be off by at most 4 (5 - 1), so the RMSE can never exceed 4.

In [9]:
def hard_coded_5_function(user_id, product_id):
    return 5

In [10]:
print('RMSE for hard coded most common rating: %s' % evaluate(hard_coded_5_function))

RMSE for hard coded most common rating: 1.1603878661895772


This is lower than the previous baseline, which makes sense because the majority of the reviews are 5s.

In [11]:
def hard_coded_4_function(user_id, product_id):
    return 4

In [12]:
print('RMSE for hard coded 4: %s' % evaluate(hard_coded_4_function))

RMSE for hard coded 4: 1.1382003338604325


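A hard-coded 4 is the best so far. This makes sense because the reviews cover the full range of ratings, and RMSE penalizes large errors heavily, so the best constant prediction is the one closest to the mean rating rather than the most common one.
Although 5 is the most common rating, that does not make it the value closest to the average rating.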
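
For a constant prediction $c$, the squared RMSE equals the variance of the ratings plus $(\bar y - c)^2$, so on any fixed set of ratings the constant with the lowest RMSE is their mean $\bar y$. The sketch below illustrates this on made-up ratings. In this notebook the mean would presumably be taken from `df_train.overall`, in which case it is only approximately optimal on the test set. The helpers `make_mean_baseline` and `rmse_of_constant` are hypothetical.

```python
import numpy as np

def make_mean_baseline(train_ratings):
    """Build an estimate function that always returns the mean training rating."""
    mean_rating = float(np.mean(train_ratings))

    def mean_baseline(user_id, product_id):
        return mean_rating

    return mean_baseline

def rmse_of_constant(c, ratings):
    """RMSE obtained when every rating is predicted as the constant c."""
    ratings = np.asarray(ratings, dtype=float)
    return float(np.sqrt(np.mean((c - ratings) ** 2)))

# Made-up ratings skewed toward 5, for illustration only
ratings = np.array([5, 5, 5, 4, 1, 5, 5, 3], dtype=float)

baseline = make_mean_baseline(ratings)
mean_rating = baseline("any_user", "any_product")  # 4.125

# Compare the mean against the hard-coded constants used above
rmse_by_constant = {c: rmse_of_constant(c, ratings) for c in (3.0, 4.0, 5.0, mean_rating)}
best_constant = min(rmse_by_constant, key=rmse_by_constant.get)
```

On these toy ratings `best_constant` is the mean, even though 5 is the most common value.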

## Summary
- A subset of the data set is used going forward, for processing speed and easier testing in this educational project.
- In most scenarios this would still be the first step, because it gives faster feedback from several models.
- The best-performing models would then be analyzed on the full data sets to account for more scenarios.
- Root Mean Squared Error (RMSE) measures how well a model performs, i.e. how close its predictions are to the actual ratings.
- The baselines simply predict the same hard-coded value for every recommendation, with a different value in each scenario.
- All future models should improve on these baselines.
- `hard_coded_4_function` had the best result, with an RMSE of 1.1382. A score of 0 is perfect.

## References
1) Unata 2015 [Hands-on with PyData: How to Build a Minimal Recommendation Engine](https://www.youtube.com/watch?v=F6gWjOc1FUs).  