1. As in your previous assignments, compare the accuracy of at least two
recommender system algorithms against your offline data.
2. Implement support for at least one business or user experience goal such as
increased serendipity, novelty, or diversity.
3. Compare and report on any change in accuracy before and after you’ve made the
change in #2.
4. As part of your textual conclusion, discuss one or more additional experiments that
could be performed and/or metrics that could be evaluated only if online evaluation
was possible. Also, briefly propose how you would design a reasonable online
evaluation environment. 

This project, uses data from Institut für Informatik and contains user ratings for various books: http://www2.informatik.uni-freiburg.de/~cziegler/BX/

The recommender systems are similar to ones used in previous assignments

In [2]:
!pip install surprise



In [0]:
#load libraries
import pandas as pd
import numpy as np
import random
from sklearn.neighbors import NearestNeighbors
from scipy.sparse import csr_matrix 
import csv
import surprise
from surprise import SVD
from surprise import Dataset, Reader
from surprise.model_selection import cross_validate

#import data
reviews = []
openfile =open('BX-Book-Ratings.csv', 'r', newline='', encoding='ISO-8859-1')
r = csv.reader(openfile, delimiter = ';')
for i in r:
    reviews.append(i)
openfile.close()

reviews_df = pd.DataFrame(reviews)
header = reviews_df.iloc[0]
reviews_df = reviews_df.rename(columns = header)

#there are 36,000 books in the dataset. Going to only include the top 500 and bottom 200 rated books 
a = reviews_df['ISBN'].value_counts()
top_rated_books = a[0:500]
least_rated_books = a[36281:36281]
books = top_rated_books + least_rated_books
books = books.index 
reviews_df = reviews_df[reviews_df['ISBN'].isin(books)]

#there are 29,000 users in the dataset. Going to only include the top 5000 and bottom 4000 common users 
u = reviews_df['User-ID'].value_counts()
top_rated_users = u[0:5000]
least_rated_users = u[25202:29202]
users = top_rated_users + least_rated_users
users = users.index 
reviews_df = reviews_df[reviews_df['User-ID'].isin(users)]

In [0]:
#this dataset has a lot of null values, more than others I have been working with in this class. Therefore, it could be challening to obtain a good recommendation
user_matrix = (reviews_df.pivot_table(index=['User-ID'], 
                      columns=['ISBN'], 
                      values=['Book-Rating'],
                      aggfunc='first'))
user_matrix = user_matrix.astype(float)
user_matrix_na = user_matrix.copy()
user_matrix = user_matrix.fillna(1)

In [5]:
#Using SVD

from scipy.sparse.linalg import svds
from scipy.sparse import csr_matrix

R = user_matrix.as_matrix()
user_ratings_mean = np.mean(R, axis = 1)
R_demeaned = R - user_ratings_mean.reshape(-1, 1)

U, s, Vt = svds(R_demeaned, k = 50)
sigma = np.diag(s)
all_user_predicted_ratings = np.dot(np.dot(U, sigma), Vt) + user_ratings_mean.reshape(-1, 1)
preds_df = pd.DataFrame(all_user_predicted_ratings, columns = user_matrix.columns, index = user_matrix.index)

preds_df.head(5)

  """


Unnamed: 0_level_0,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating
ISBN,002542730X,006016848X,0060173289,0060175400,0060199652,0060391626,0060392452,0060502258,0060915544,0060921145,0060928336,0060930535,0060934417,0060938455,0060958022,0060959037,0060976845,0060977493,0060987103,0060987529,0060987561,0061009059,006101351X,0061015725,0061097101,0061097314,0062502182,0064400557,0064407667,0066214122,0070212570,0099771519,0140067477,0140119906,014023313X,0140244824,014025448X,014028009X,0140293248,014029628X,...,0743467523,0749397543,0767900383,0767902521,0767905180,0767905385,0786817070,0786867647,0786868716,0786881852,0786885688,0802130208,080410526X,0804106304,080410753X,080411109X,0804111359,0804114986,080411868X,0805063897,0812511816,0812550706,0842329129,0842329218,0842342702,0871136791,089480829X,0971880107,140003065X,1400031354,1400031362,1400034779,155874262X,1558743669,1558744150,1558745157,1559029838,1573225789,1573229326,1878424319
User-ID,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2,Unnamed: 23_level_2,Unnamed: 24_level_2,Unnamed: 25_level_2,Unnamed: 26_level_2,Unnamed: 27_level_2,Unnamed: 28_level_2,Unnamed: 29_level_2,Unnamed: 30_level_2,Unnamed: 31_level_2,Unnamed: 32_level_2,Unnamed: 33_level_2,Unnamed: 34_level_2,Unnamed: 35_level_2,Unnamed: 36_level_2,Unnamed: 37_level_2,Unnamed: 38_level_2,Unnamed: 39_level_2,Unnamed: 40_level_2,Unnamed: 41_level_2,Unnamed: 42_level_2,Unnamed: 43_level_2,Unnamed: 44_level_2,Unnamed: 45_level_2,Unnamed: 46_level_2,Unnamed: 47_level_2,Unnamed: 48_level_2,Unnamed: 49_level_2,Unnamed: 50_level_2,Unnamed: 51_level_2,Unnamed: 52_level_2,Unnamed: 53_level_2,Unnamed: 54_level_2,Unnamed: 55_level_2,Unnamed: 56_level_2,Unnamed: 57_level_2,Unnamed: 58_level_2,Unnamed: 59_level_2,Unnamed: 60_level_2,Unnamed: 61_level_2,Unnamed: 62_level_2,Unnamed: 63_level_2,Unnamed: 64_level_2,Unnamed: 65_level_2,Unnamed: 66_level_2,Unnamed: 67_level_2,Unnamed: 68_level_2,Unnamed: 69_level_2,Unnamed: 70_level_2,Unnamed: 71_level_2,Unnamed: 72_level_2,Unnamed: 73_level_2,Unnamed: 74_level_2,Unnamed: 75_level_2,Unnamed: 76_level_2,Unnamed: 77_level_2,Unnamed: 78_level_2,Unnamed: 79_level_2,Unnamed: 80_level_2,Unnamed: 81_level_2
100004,0.968942,1.043124,1.120346,0.989396,0.898873,1.015123,0.790131,1.19536,1.243414,1.010119,0.906969,1.446407,1.505791,0.867416,1.003705,1.158392,1.542883,1.105817,1.047067,0.925089,1.235036,0.819786,1.342627,1.043212,1.132979,1.045049,0.933875,1.130529,1.075291,1.032806,1.069508,1.0434,1.068774,1.054972,0.850359,0.99541,1.00341,1.070137,0.811821,0.933385,...,0.931961,0.95373,0.939224,1.123219,1.100143,1.026471,0.931155,0.90505,0.47,0.996819,0.871051,1.119363,0.95843,1.394647,0.821235,0.990139,1.17405,0.858471,1.017038,1.023658,1.047418,0.993149,1.953902,1.5233,1.046779,1.020568,1.020547,0.985345,1.063826,0.729549,0.795651,0.696616,0.841065,1.284217,1.12928,1.049514,0.947717,0.916396,0.991961,1.021588
100009,0.999877,1.147274,1.083304,1.154993,1.133223,1.149889,2.403191,2.466775,1.122589,1.116944,1.042394,1.127524,0.797378,1.952217,0.887337,0.823908,1.661108,1.066024,0.323846,0.904421,0.964353,0.740571,1.029518,1.051739,0.913478,1.242654,0.898615,0.858542,0.608388,1.171737,1.072171,0.96151,1.171772,1.065762,1.213658,0.770137,1.082368,0.789984,1.170182,1.079296,...,1.198148,1.057318,1.270484,1.336444,0.980153,1.344806,1.004131,0.880698,0.687234,0.966718,1.018315,1.254606,1.177263,2.073611,1.468229,1.604088,1.108732,1.273864,0.977891,0.82932,1.051416,1.458653,1.160029,1.296937,1.120479,1.10657,1.009914,0.787179,1.004752,1.157365,0.834327,1.358702,0.802707,0.433402,0.814146,0.854235,1.124533,0.945633,0.739723,1.187031
10001,1.010174,0.997165,0.98817,0.986986,1.020312,0.992522,0.958357,1.000543,0.963449,1.012243,1.027039,0.991843,1.010962,1.0293,1.012855,0.959047,0.884368,0.973254,1.02501,1.026204,0.918857,1.013972,0.997901,0.995466,0.984339,0.976685,1.041925,0.987711,1.028168,1.012727,1.053321,0.974894,0.99695,1.006785,0.994523,1.010347,0.985512,0.974856,0.996945,0.996218,...,1.011446,1.002415,1.014177,1.009291,0.976695,1.006244,1.016322,1.001974,0.996451,1.019045,1.03096,1.004514,1.014322,0.973468,0.989515,0.992907,0.991076,0.969325,0.968626,0.985821,0.998285,1.026666,1.015921,0.992382,1.001211,1.027988,0.996248,1.008566,1.005562,0.991921,0.995929,1.008272,1.014998,1.031246,1.010273,1.012892,0.97653,1.00037,1.0153,0.997773
100030,0.99536,0.99687,0.997332,0.995241,0.994682,0.994866,0.996704,1.000982,0.99606,0.995396,1.015365,1.010296,1.00868,0.987295,0.994507,0.998828,0.998152,0.996576,1.000486,0.996923,0.994463,0.998987,0.997083,0.996446,0.998681,0.996694,0.997149,0.9927,0.996297,0.994679,1.001463,0.996534,0.997572,0.99683,0.994153,0.993676,0.994514,0.999803,1.000688,0.999634,...,0.997391,0.995331,0.997337,0.997462,0.996854,0.995566,0.996809,0.998788,1.000237,0.998215,0.995916,0.995537,0.996109,1.011623,1.001121,0.999147,0.998044,0.99513,0.995869,0.997174,0.994902,0.998848,1.006552,1.001556,0.996644,0.996967,0.995963,0.990185,0.997127,0.997785,0.995073,0.993956,0.995978,0.996653,0.997191,0.994825,0.997549,0.992878,0.997476,0.996337
100053,1.02084,1.019857,1.041798,0.922934,0.827943,0.859502,0.888098,1.204609,0.833419,0.943418,0.970415,1.010169,0.92067,0.296179,1.011261,1.019948,1.38205,1.249814,1.375052,0.951067,0.704477,0.810482,0.820813,0.910489,0.973929,1.288283,1.135127,0.887628,1.297509,0.92729,0.948839,1.218234,0.977551,1.146264,0.919351,0.998336,0.728711,0.897955,1.20439,0.999476,...,0.979076,1.114496,1.018742,0.878606,1.088778,1.019201,1.210045,1.303963,0.791811,1.084813,1.157097,1.063301,1.096882,1.159837,1.17333,1.09246,1.410157,0.726143,0.909102,0.706938,0.871617,0.992696,1.24289,1.098378,0.962912,0.964291,1.155123,0.925039,1.063866,0.686713,0.92988,1.170384,1.087334,1.176084,1.053713,0.851184,1.083864,0.832059,1.007886,1.067784


In [6]:
"""Calculate the RMSE"""
SE = (preds_df-user_matrix)*(preds_df-user_matrix)
MSE = SE.mean().mean()
RMSE = MSE ** (1/2)
"RMSE is " + str(RMSE)

'RMSE is 0.48518449738930386'

In [8]:
from surprise.model_selection import train_test_split
from surprise import accuracy

reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(reviews_df[['User-ID', 'ISBN', 'Book-Rating']], reader)

# sample random trainset and testset
trainset, testset = train_test_split(data, test_size=.25)

# We'll use the famous SVD algorithm.
algo = SVD()

# Train the algorithm on the trainset, and predict ratings for the testset
algo.fit(trainset)
predictions = algo.test(testset)

# Then compute RMSE
accuracy.rmse(predictions)

RMSE: 3.7607


3.7606617968896465

In [35]:
#See if better than Random

df_rand = 5 - pd.DataFrame(np.random.randn(user_rating_na.shape[0],user_rating_na.shape[1]), columns= user_rating_na.columns, index = user_rating_na.index )

"""Calculate the RMSE"""
SE = (df_rand-user_matrix)*(df_rand-user_matrix)
MSE = SE.mean().mean()
RMSE = MSE ** (1/2)
"RMSE is " + str(RMSE)

'RMSE is 4.131758822433176'

In [0]:
#Increased Serendipity By Adding Random Data Variables to Skew dataset

user_rating_na = user_matrix.mask(np.random.random(user_matrix.shape) < .1)
df_rand = 5 - pd.DataFrame(np.random.randn(user_rating_na.shape[0],user_rating_na.shape[1]), columns= user_rating_na.columns, index = user_rating_na.index )
user_rating_rand = user_rating_na.fillna(df_rand)

In [37]:
#Using SVD

from scipy.sparse.linalg import svds
from scipy.sparse import csr_matrix

R = user_rating_rand.as_matrix()
user_ratings_mean = np.mean(R, axis = 1)
R_demeaned = R - user_ratings_mean.reshape(-1, 1)

U, s, Vt = svds(R_demeaned, k = 50)
sigma = np.diag(s)
all_user_predicted_ratings = np.dot(np.dot(U, sigma), Vt) + user_ratings_mean.reshape(-1, 1)
preds_df = pd.DataFrame(all_user_predicted_ratings, columns = user_matrix.columns, index = user_matrix.index)

preds_df.head(5)

  """


Unnamed: 0_level_0,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating,Book-Rating
ISBN,002542730X,006016848X,0060173289,0060175400,0060199652,0060391626,0060392452,0060502258,0060915544,0060921145,0060928336,0060930535,0060934417,0060938455,0060958022,0060959037,0060976845,0060977493,0060987103,0060987529,0060987561,0061009059,006101351X,0061015725,0061097101,0061097314,0062502182,0064400557,0064407667,0066214122,0070212570,0099771519,0140067477,0140119906,014023313X,0140244824,014025448X,014028009X,0140293248,014029628X,...,0743467523,0749397543,0767900383,0767902521,0767905180,0767905385,0786817070,0786867647,0786868716,0786881852,0786885688,0802130208,080410526X,0804106304,080410753X,080411109X,0804111359,0804114986,080411868X,0805063897,0812511816,0812550706,0842329129,0842329218,0842342702,0871136791,089480829X,0971880107,140003065X,1400031354,1400031362,1400034779,155874262X,1558743669,1558744150,1558745157,1559029838,1573225789,1573229326,1878424319
User-ID,Unnamed: 1_level_2,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2,Unnamed: 5_level_2,Unnamed: 6_level_2,Unnamed: 7_level_2,Unnamed: 8_level_2,Unnamed: 9_level_2,Unnamed: 10_level_2,Unnamed: 11_level_2,Unnamed: 12_level_2,Unnamed: 13_level_2,Unnamed: 14_level_2,Unnamed: 15_level_2,Unnamed: 16_level_2,Unnamed: 17_level_2,Unnamed: 18_level_2,Unnamed: 19_level_2,Unnamed: 20_level_2,Unnamed: 21_level_2,Unnamed: 22_level_2,Unnamed: 23_level_2,Unnamed: 24_level_2,Unnamed: 25_level_2,Unnamed: 26_level_2,Unnamed: 27_level_2,Unnamed: 28_level_2,Unnamed: 29_level_2,Unnamed: 30_level_2,Unnamed: 31_level_2,Unnamed: 32_level_2,Unnamed: 33_level_2,Unnamed: 34_level_2,Unnamed: 35_level_2,Unnamed: 36_level_2,Unnamed: 37_level_2,Unnamed: 38_level_2,Unnamed: 39_level_2,Unnamed: 40_level_2,Unnamed: 41_level_2,Unnamed: 42_level_2,Unnamed: 43_level_2,Unnamed: 44_level_2,Unnamed: 45_level_2,Unnamed: 46_level_2,Unnamed: 47_level_2,Unnamed: 48_level_2,Unnamed: 49_level_2,Unnamed: 50_level_2,Unnamed: 51_level_2,Unnamed: 52_level_2,Unnamed: 53_level_2,Unnamed: 54_level_2,Unnamed: 55_level_2,Unnamed: 56_level_2,Unnamed: 57_level_2,Unnamed: 58_level_2,Unnamed: 59_level_2,Unnamed: 60_level_2,Unnamed: 61_level_2,Unnamed: 62_level_2,Unnamed: 63_level_2,Unnamed: 64_level_2,Unnamed: 65_level_2,Unnamed: 66_level_2,Unnamed: 67_level_2,Unnamed: 68_level_2,Unnamed: 69_level_2,Unnamed: 70_level_2,Unnamed: 71_level_2,Unnamed: 72_level_2,Unnamed: 73_level_2,Unnamed: 74_level_2,Unnamed: 75_level_2,Unnamed: 76_level_2,Unnamed: 77_level_2,Unnamed: 78_level_2,Unnamed: 79_level_2,Unnamed: 80_level_2,Unnamed: 81_level_2
100004,1.574741,1.835344,2.411033,1.362584,1.506742,1.991462,2.087952,0.776185,2.768995,0.943156,0.21852,-0.014693,1.189851,2.712521,1.683083,2.138504,1.153459,1.416966,1.089875,1.534944,2.337214,0.578676,1.56498,1.646998,2.143334,2.030951,0.096361,1.997129,1.456112,1.149601,0.130323,1.388619,1.051378,1.420784,1.32864,2.095519,1.793257,3.701816,1.457968,1.405926,...,1.310834,1.377249,1.848165,1.376847,1.154584,2.032582,2.273664,1.46306,0.710696,0.774242,0.865403,1.269667,1.251622,3.869008,2.064368,1.890169,1.715411,1.522866,1.199617,0.990809,2.313276,0.587046,2.639452,1.249328,1.34331,1.174857,1.692289,1.32869,1.810384,1.073353,0.957041,1.847322,1.683121,1.733215,1.512619,1.986462,1.152912,1.44937,1.775769,1.866581
100009,1.086284,1.148694,1.839725,1.624072,1.194914,0.515459,2.936816,2.760968,1.140947,1.948025,0.481963,2.153661,2.1034,2.078296,0.890754,0.838015,1.499856,1.253516,1.790018,2.010039,1.316124,0.950662,1.184602,1.392642,1.68386,1.328177,1.734486,1.937382,1.050682,1.919504,2.367163,0.979115,1.557319,1.510856,1.740022,1.160751,1.616452,1.720358,2.110639,2.222124,...,1.689757,1.529281,1.454953,1.179065,1.470126,1.685409,1.657162,1.415685,0.05073,0.717435,1.437931,1.784656,1.65766,0.600771,0.964339,1.012059,1.895191,0.806985,1.85626,1.212679,1.046356,2.103674,1.137711,1.786255,1.207577,1.256661,1.240424,1.718666,1.622782,1.109449,1.707729,1.164247,0.67619,0.684026,1.747478,1.078342,1.481928,1.591033,1.622336,1.763124
10001,1.696376,1.09433,2.196722,1.068573,1.656641,1.306207,1.678207,1.146555,0.891565,1.606092,1.430985,1.618157,0.820548,1.77245,0.880915,1.007754,0.540355,1.169589,0.408173,1.313844,1.906946,1.441375,0.786978,1.375756,1.297829,1.747493,1.021401,0.928944,1.22357,1.952789,0.590185,1.565778,1.49682,0.842161,1.342258,1.587125,2.152425,1.453728,0.552396,1.19808,...,1.144997,0.841317,1.468916,1.37067,1.772204,1.541205,1.311486,1.601719,1.09194,1.29231,1.32727,1.729272,1.019917,-0.043889,0.784745,1.340993,1.262662,1.959998,1.393075,0.503297,1.425327,1.474537,1.365307,0.961261,0.811309,0.779381,1.200053,2.200177,1.777461,1.372069,1.30625,1.315247,1.886485,1.662343,1.723777,1.696743,1.244237,1.626798,1.536803,1.611266
100030,1.41062,1.007396,0.990442,1.803547,1.230168,1.521471,1.239843,1.813385,0.772248,1.888658,1.400696,1.027067,1.642843,3.121225,1.172349,1.710653,1.033369,1.218649,1.173822,1.368835,1.720443,2.562945,1.119241,1.426921,1.65565,1.954339,2.102779,1.335347,0.848662,1.149778,1.83196,1.245048,1.365076,1.059795,1.559577,1.650216,1.221667,2.220362,1.143284,0.806585,...,1.631507,1.679467,1.104537,0.705455,1.085684,1.235311,1.646303,2.062532,1.182639,1.311512,1.30934,1.533494,1.010845,0.229888,0.853755,0.451895,1.236498,1.096592,1.865809,2.144803,1.479641,1.190611,1.020127,1.190349,1.468128,1.696167,1.678601,1.058891,1.244806,0.537361,0.909732,0.271894,1.837819,1.676387,1.621978,1.579835,1.645776,1.894345,1.258396,1.333742
100053,0.921277,1.801306,1.434233,1.132929,1.229347,2.044096,1.547148,0.968947,1.452863,1.666717,3.070808,1.025595,1.066971,2.230992,1.59477,1.207742,1.912937,1.289747,1.288118,1.643591,1.623209,2.396743,1.393546,1.37043,1.608816,0.74285,1.611083,0.900601,0.838317,1.167007,0.364078,1.175744,1.595416,0.837636,1.382953,1.228812,1.677865,1.669447,1.118618,1.380318,...,1.504198,1.178889,1.216698,0.970699,1.781754,1.461838,1.836421,1.945517,1.535284,1.193617,1.233593,1.337184,0.647472,0.771145,0.783121,1.740812,1.281039,1.814284,1.3966,1.02352,1.418558,1.55492,1.337263,1.52855,1.204639,1.386536,1.289307,1.309001,1.457951,1.01357,1.248722,2.297279,1.525803,1.85568,1.395296,1.693444,0.825496,1.373822,1.554621,1.459173


In [38]:
"""Calculate the RMSE"""
SE = (preds_df-user_matrix)*(preds_df-user_matrix)
MSE = SE.mean().mean()
RMSE = MSE ** (1/2)
"RMSE is " + str(RMSE)

'RMSE is 0.7852547716849505'

The RMSE has increased by adding in random datapoints, however, it is still a good model. Adding 10% random data could help with serendipity, getting users to try books that are sightly out of their comfort zone, but might still be something that they are interested in reading.

As part of your textual conclusion, discuss one or more additional experiments that could be performed and/or metrics that could be evaluated only if online evaluation was possible. Also, briefly propose how you would design a reasonable online evaluation environment.

- One test might be A/B testing, where users are divided into groups and given recommendations from a variety of the recommendation systems here. If one group tends to click on or purchase more recommended books than another, it would be a good test to see which recommender system works best.