# Music Recommendation System

In this project, we build a music recommendation system based on users' listening records and their interactions with one another. The goal is to give each user a ranked list of recommended artists that they might like. We consider two methods:
<ul>
  <li> <b>Collaborative Filtering:</b> Recommend the favorite artists of the user's friends. This assumes that users who are connected on the music platform share similar musical tastes. No advanced algorithms are used. Accuracy is measured by how similar the list of recommended artists is to the list of artists the user has actually listened to.
  <li> <b>Content Based:</b> Use information about a user's favorite artists to recommend similar artists. We split each user's favorite artists into a training set and a test set. The algorithm is matrix decomposition fitted with Stochastic Gradient Descent (SGD). Accuracy is measured by whether the recommended artists match the artists held out in the test set for each user.
</ul>
</div>
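
The friend-based method can be sketched in a few lines. This is a minimal illustration on made-up data, not the project's implementation: the dictionaries `friends` and `top_artists`, the function names `recommend_from_friends` and `overlap_accuracy`, and the choice to rank artists by how many friends list them are all assumptions.

```python
from collections import Counter

# Hypothetical toy data: who is friends with whom, and each user's favorite artists.
friends = {1: [2, 3], 2: [1], 3: [1]}
top_artists = {1: ["A", "B"], 2: ["B", "C"], 3: ["C", "D"]}

def recommend_from_friends(user, friends, top_artists, n=3):
    """Rank artists by how many of the user's friends list them as favorites."""
    counts = Counter()
    for friend in friends.get(user, []):
        counts.update(top_artists.get(friend, []))
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    return ranked[:n]

def overlap_accuracy(recommended, listened):
    """Fraction of recommended artists that the user has actually listened to."""
    if not recommended:
        return 0.0
    return len(set(recommended) & set(listened)) / len(recommended)

recs = recommend_from_friends(1, friends, top_artists)
score = overlap_accuracy(recs, top_artists[1])
print(recs, score)
```

Ranking by friend count is only one possible scoring rule; weighting each friend's artists by listening counts would be a natural alternative.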


In [1]:
from google.colab import drive 
drive.mount('/content/gdrive')

Mounted at /content/gdrive


## Content-Based Recommendation

Import relevant modules and read datasets

In [38]:
import gc
gc.collect()

2049

In [0]:
import random
random.seed(42)

In [68]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import time
user_artists = pd.read_table('gdrive/My Drive/Project2/hetrec2011-lastfm-2k/user_artists.dat', sep='\t', header=None, skiprows=1)
user_artists = user_artists.rename(index = int, columns={0: "UserID", 1: "ArtistID", 2: "Rate"})
user_artists.Rate = user_artists.Rate/100
user_artists.head()

Unnamed: 0,UserID,ArtistID,Rate
0,2,51,138.83
1,2,52,116.9
2,2,53,113.51
3,2,54,103.0
4,2,55,89.83


Store the user list and the artist list  

In [0]:
user_list = user_artists.UserID.unique()
artist_list = user_artists.ArtistID.unique()

Store pairs of (UserID, ArtistID) in a new column called UA 

In [70]:
user_artists_no_rate = user_artists.drop(columns=['Rate'])
user_artists['UA']= list(zip(user_artists_no_rate.UserID, user_artists_no_rate.ArtistID))
user_artists.head()

Unnamed: 0,UserID,ArtistID,Rate,UA
0,2,51,138.83,"(2, 51)"
1,2,52,116.9,"(2, 52)"
2,2,53,113.51,"(2, 53)"
3,2,54,103.0,"(2, 54)"
4,2,55,89.83,"(2, 55)"


A simplified version of the user_artists dataframe: a list of ((UserID, ArtistID), Rate) pairs

In [0]:
zip_list = list(zip(user_artists.UA, user_artists.Rate))

Initialize P and Q

In [0]:
np.random.seed(42)
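n_factors = 10  # number of latent factors
# One column of latent factors per user (P) and per artist (Q);
# sizes come from the data instead of being hard-coded
P = np.random.rand(n_factors,len(user_list))*np.sqrt(10)
P = pd.DataFrame(data=P,dtype=float,columns=user_list)
Q = np.random.rand(n_factors,len(artist_list))*np.sqrt(10)
Q = pd.DataFrame(data=Q,dtype=float,columns=artist_list)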
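
The factor `np.sqrt(10)` sets the scale of the initial predictions. Each entry of P and Q is uniform on [0, √10) with mean √10/2, so the product of two independent entries has mean 10/4 = 2.5, and a dot product over 10 factors has mean about 25. The sketch below checks this numerically on smaller random matrices; the names `P_demo`, `Q_demo` and the toy sizes are assumptions standing in for the real 1892 users and 17632 artists.

```python
import numpy as np

rng = np.random.default_rng(42)
n_factors, n_users, n_artists = 10, 500, 800  # toy sizes, not the real dataset

P_demo = rng.random((n_factors, n_users)) * np.sqrt(10)
Q_demo = rng.random((n_factors, n_artists)) * np.sqrt(10)

# Initial predictions: one row per user, one column per artist.
R_demo = P_demo.T @ Q_demo
mean_pred = R_demo.mean()
print(R_demo.shape, round(float(mean_pred), 2))
```

If the observed ratings sit on a very different scale from 25, the starting error will be dominated by that offset rather than by the structure of the data.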

In [76]:
P

Unnamed: 0,2,3,4,5,6,7,8,9,10,11,...,2090,2091,2092,2093,2094,2095,2096,2097,2099,2100
0,1.1844,3.006423,2.314768,1.893124,0.493374,0.493298,0.183677,2.739089,1.900893,2.239122,...,2.01213,2.406878,0.506191,1.459573,0.029509,0.780067,2.297274,3.136378,0.313629,1.269637
1,2.530047,0.645217,1.755333,2.318175,1.947917,0.594586,1.123825,2.478567,1.752618,0.016537,...,2.08287,2.945835,2.59617,1.791763,2.077713,2.840947,1.263389,1.033416,0.034188,2.613784
2,2.532923,0.330314,1.824183,1.467239,0.375391,3.100829,0.678928,0.212613,1.881073,2.336502,...,2.223958,2.638789,0.177764,2.363868,2.690633,0.904838,2.025272,0.993468,3.029371,0.105695
3,1.833466,0.52653,2.252238,0.556214,0.744347,1.563338,2.892962,0.6695,0.456127,2.400095,...,1.67153,2.563354,2.325921,2.099093,0.358561,2.278789,1.582104,2.827391,2.908887,0.17442
4,2.595096,0.11921,1.161092,1.714903,0.468336,2.360051,0.02642,0.533157,1.836641,2.259092,...,1.418416,1.98036,0.462872,0.597449,2.717771,0.810349,1.202282,2.755204,2.097058,1.080588
5,2.162048,0.062244,2.326306,1.54008,2.571508,3.018829,2.9276,2.875349,1.619624,1.138869,...,1.21136,2.756294,1.148314,3.009235,2.992491,0.45821,0.560707,2.728015,1.242019,0.013832
6,0.866059,1.889869,2.527501,0.421812,0.349562,2.915913,2.37534,1.865047,2.432382,1.356157,...,2.135561,2.948811,0.749232,2.193466,3.083543,3.04299,0.742911,0.317275,0.468911,2.220032
7,2.295051,2.132123,1.062773,0.536882,1.836528,1.529532,2.304776,1.395703,1.070873,0.928061,...,2.024956,1.480898,2.713277,1.082706,0.843252,3.11442,2.881264,0.091683,2.437258,0.348909
8,0.086916,0.865501,0.851746,2.590913,2.247327,3.104122,2.250001,1.32264,2.210203,3.160567,...,2.23748,2.321125,0.902697,0.941321,1.48277,0.068903,2.496059,1.002723,2.699836,2.53814
9,2.091537,1.027345,1.632736,1.406674,1.071639,2.682042,1.528984,2.633302,0.33273,0.44596,...,1.470739,1.782583,3.161871,2.03707,2.195458,1.554114,1.120994,2.570783,1.565113,3.145486


In [77]:
Q

Unnamed: 0,51,52,53,54,55,56,57,58,59,60,...,18723,18724,10894,13978,18725,18726,18727,18728,18729,18730
0,1.445542,0.215036,2.158502,0.147493,0.792156,1.118619,2.338452,1.562911,0.090586,1.950651,...,2.401213,0.447044,0.097995,2.167052,2.704638,2.384698,0.641307,1.846036,2.395001,0.809001
1,1.6688,0.994411,0.713332,3.066111,2.885838,1.409325,0.43253,2.301845,0.808598,0.5329,...,1.174118,0.369529,0.779511,0.014449,1.109845,2.518549,0.841672,2.034445,0.31071,0.038848
2,1.576068,1.459341,0.605637,0.906261,0.096992,2.522865,0.95274,0.693982,2.544551,0.124194,...,0.854305,1.908841,1.393638,0.686571,0.974566,0.883756,2.880559,1.650564,2.063879,2.955888
3,0.576223,2.962718,1.706483,2.707587,0.531249,0.863099,0.614023,2.066301,0.445127,2.506176,...,1.462942,2.521764,0.617952,0.644145,0.572468,1.920904,0.091551,1.858222,2.610048,2.474963
4,2.108706,0.694096,2.581738,3.027771,0.581171,2.878711,2.697388,3.096756,1.009639,2.953794,...,0.357288,0.127874,1.935101,2.356903,2.559766,0.667645,0.601214,2.651808,1.067375,2.086054
5,0.476907,2.532969,3.038579,3.072626,1.641261,0.75672,0.445909,0.777078,1.965776,1.378365,...,0.149111,2.423181,0.841979,2.937642,2.096708,1.051864,2.58788,2.584653,0.510894,1.713496
6,0.03207,1.088078,2.195869,1.043847,0.423636,1.825644,3.040535,1.305843,2.902978,2.863589,...,2.712557,1.609139,2.498411,1.066066,0.95946,2.532913,2.872345,2.303781,1.071906,2.972328
7,2.361207,1.93672,1.641615,2.00598,0.133584,2.059861,0.910063,1.286652,1.832184,0.950186,...,3.032417,3.016969,2.781653,0.867939,2.801385,0.196773,2.060996,2.051517,0.347761,1.826394
8,0.421077,3.034071,1.486929,2.558617,1.558791,0.595766,0.368915,1.088777,2.969423,0.124896,...,2.332684,0.753988,2.904343,0.045265,0.761017,2.141264,1.027389,0.3755,0.330763,0.134014
9,1.255982,2.414395,2.098341,1.175562,0.165711,0.918364,0.514883,0.840242,0.78296,2.861768,...,0.452826,2.99043,1.369604,1.544507,0.561005,1.399621,0.346653,3.053469,1.173651,0.677585


Save copies

In [0]:
P_Copy = P.copy()
Q_Copy = Q.copy()

The initial recommendation matrix: the product of P transposed and Q, with one row per user and one column per artist

In [81]:
# Predicted rating for every (user, artist) pair, indexed by UserID
R0 = pd.DataFrame(data=np.dot(P_Copy.T,Q_Copy),dtype=float,
                  index=pd.Index(user_list, name='UserID'),columns=artist_list)
R0.head()

UserID,51,52,53,54,55,56,57,58,59,60,...,18723,18724,10894,13978,18725,18726,18727,18728,18729,18730
2,25.596571,29.877396,32.480852,37.881349,15.67146,30.25101,21.197933,28.873539,24.899355,29.870757,...,22.368932,31.131636,25.258226,24.140543,29.205239,24.722389,25.540496,40.513895,21.720293,30.394158
3,13.277468,14.862563,19.637778,14.370827,7.331554,15.251637,16.831603,14.935423,14.878751,18.492673,...,23.156982,16.905196,16.43963,13.045354,18.927527,18.740391,14.771534,21.053132,13.875149,15.415622
4,19.005877,29.611243,33.250317,33.007067,15.576526,25.522433,23.269378,25.948568,26.110623,30.984355,...,26.039208,29.42892,23.715664,23.492455,26.092033,29.821626,26.032311,37.4405,23.437252,29.989416
5,17.72778,24.348731,25.296593,29.947195,16.673855,20.379976,15.932809,22.214278,21.78896,19.759837,...,19.630142,18.316082,20.995064,17.273259,21.729824,24.173679,17.926775,27.535792,15.498295,17.086181
6,13.838269,24.977957,22.347624,28.778935,15.010857,14.924972,9.335958,16.668766,20.326107,14.933887,...,17.677086,20.827481,19.597997,14.216035,18.670437,18.420405,17.511223,23.45395,9.332502,14.230438


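Squared error for the initial recommendation matrix. The code below sums the squared errors over all observed (UserID, ArtistID) pairs (np.sum rather than np.mean), so the value it prints as MSE is a sum of squared errors, not a mean.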

In [82]:
t = time.time()
MSE0 = np.sum([(r_ui - R0.loc[[uid],[artid]].values[0][0])**2 for (uid,artid), r_ui in zip_list])
elapsed = time.time() - t
print("The MSE for the initial recommendation matrix: %.4f" % MSE0)
print("Processed time: %.1f s" % elapsed)

The MSE for the initial recommendation matrix: 164325360.5192
Processed time: 305.0 s
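
The loop above performs one `.loc` lookup per observed pair, which is slow (about 305 s here). A vectorized alternative, sketched on a small made-up dataset, maps IDs to column positions once and computes all predictions in a single NumPy expression. The toy frames `ratings`, `P_toy` and `Q_toy` are hypothetical stand-ins for `user_artists`, `P` and `Q`; the sketch reports both the sum and the mean of the squared errors.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for user_artists, P and Q.
ratings = pd.DataFrame({"UserID": [2, 2, 3, 5],
                        "ArtistID": [51, 52, 51, 53],
                        "Rate": [1.0, 2.0, 0.5, 3.0]})
rng = np.random.default_rng(0)
P_toy = pd.DataFrame(rng.random((4, 3)), columns=[2, 3, 5])     # factors x users
Q_toy = pd.DataFrame(rng.random((4, 3)), columns=[51, 52, 53])  # factors x artists

# Translate IDs into column positions once, instead of one .loc call per pair.
u_pos = P_toy.columns.get_indexer(ratings["UserID"])
i_pos = Q_toy.columns.get_indexer(ratings["ArtistID"])

# Predicted rating for each observed pair: dot product of the matching columns.
pred = np.sum(P_toy.to_numpy()[:, u_pos] * Q_toy.to_numpy()[:, i_pos], axis=0)
sq_err = (ratings["Rate"].to_numpy() - pred) ** 2

sse = sq_err.sum()   # sum of squared errors, as in the loop above
mse = sq_err.mean()  # mean squared error proper
print(sse, mse)
```

Because only the observed pairs are evaluated, this also avoids building the full users-by-artists prediction matrix just to score it.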


Apply SGD to P and Q
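
For reference, the standard matrix-factorization setup that SGD is usually applied to can be written as follows. Here $p_u$ and $q_i$ are the columns of P and Q for user $u$ and artist $i$; the symbols $\eta$ (learning rate), $e_{ui}$ (residual) and $\mathcal{K}$ (set of observed pairs) are notation introduced for this sketch, and no regularization term is included because the notebook does not use one.

$$
\hat r_{ui} = p_u^\top q_i, \qquad
\min_{P,Q} \sum_{(u,i)\in\mathcal{K}} \left(r_{ui} - p_u^\top q_i\right)^2
$$

For a single observed pair, the residual $e_{ui} = r_{ui} - p_u^\top q_i$ is a scalar, and the gradient of $e_{ui}^2$ with respect to $p_u$ is $-2\,e_{ui}\,q_i$ (symmetrically $-2\,e_{ui}\,p_u$ for $q_i$). Absorbing the factor 2 into $\eta$, one SGD step is

$$
p_u \leftarrow p_u + \eta\, e_{ui}\, q_i, \qquad
q_i \leftarrow q_i + \eta\, e_{ui}\, p_u ,
$$

where both right-hand sides use the values of $p_u$ and $q_i$ from before the step.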

In [63]:
#t = time.time()
#for _ in range(1):
#    for (uid,artid), r_ui in list(zip(user_artists.UA, user_artists.Rate)):            
#      p_u = P_Copy[uid]
#      q_i = Q_Copy[artid]
#      err = r_ui-q_i*p_u#This is the negative of the scaled gradient
#      # Update vectors p_u and q_i
#      P[uid] = p_u+0.00001*err*q_i
#      Q[artid] = q_i+0.00001*err*p_u
#      # Save the update
#      P_Copy = P.copy()
#      Q_Copy = Q.copy()
#elapsed = time.time() - t
#print("Processed time: %.1f s" % elapsed)

Processed time: 230.7 s


In [83]:
t = time.time()
for _ in range(1):
    for (uid,artid), r_ui in list(zip(user_artists.UA, user_artists.Rate)):                  
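      # NOTE: Q[artid]*P[uid] is an element-wise product, so err is a length-10
      # vector, not the scalar residual r_ui - (p_u . q_i) used by standard SGD
      err = r_ui-Q[artid]*P[uid]
      # Update vectors p_u and q_i (Q is updated with the already-updated P[uid])
      P[uid] = P[uid]+0.00001*err*Q[artid]
      Q[artid] = Q[artid]+0.00001*err*P[uid]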

elapsed = time.time() - t
print("Processed time: %.1f s" % elapsed)

Processed time: 92.9 s


In [84]:
P

Unnamed: 0,2,3,4,5,6,7,8,9,10,11,...,2090,2091,2092,2093,2094,2095,2096,2097,2099,2100
0,1.206591,3.005678,2.314459,1.892286,0.492697,0.508659,0.18856,2.738033,1.902062,2.241965,...,2.013357,2.403542,0.512224,1.458669,0.049784,0.779623,2.301951,3.133388,0.314714,1.272707
1,2.552551,0.646103,1.757619,2.316527,1.94525,0.615535,1.128358,2.47837,1.75421,0.023118,...,2.083543,2.940101,2.596629,1.790351,2.089571,2.836579,1.271773,1.034106,0.035769,2.614502
2,2.551538,0.33135,1.825535,1.466733,0.374834,3.116129,0.684114,0.216379,1.883086,2.339705,...,2.223939,2.634285,0.184327,2.362099,2.704009,0.904099,2.031165,0.99431,3.025539,0.110167
3,1.856875,0.530752,2.253563,0.557304,0.743183,1.575072,2.89326,0.67235,0.459448,2.402439,...,1.67281,2.55934,2.327717,2.096555,0.372297,2.275459,1.588216,2.825164,2.905748,0.178955
4,2.618506,0.124196,1.163187,1.713982,0.467714,2.382816,0.031837,0.536411,1.838094,2.261519,...,1.42045,1.977307,0.468261,0.597573,2.72916,0.809596,1.210586,2.752918,2.095089,1.084136
5,2.185818,0.066515,2.327788,1.539456,2.566694,3.035965,2.928785,2.875162,1.621196,1.144012,...,1.213509,2.752577,1.152919,3.005111,3.007103,0.458277,0.569718,2.725766,1.241431,0.018733
6,0.889763,1.891792,2.527028,0.423114,0.349233,2.936944,2.377225,1.866407,2.432789,1.361483,...,2.136236,2.943259,0.753447,2.190923,3.093921,3.039443,0.750866,0.318729,0.469859,2.220821
7,2.318201,2.132247,1.065672,0.538127,1.833219,1.552404,2.306103,1.397555,1.073628,0.93322,...,2.025924,1.478826,2.715693,1.082029,0.85823,3.110962,2.88825,0.094032,2.434379,0.354097
8,0.112776,0.869172,0.854212,2.588223,2.243568,3.11592,2.251243,1.32496,2.211089,3.163317,...,2.238218,2.317311,0.907846,0.94086,1.495244,0.069556,2.499703,1.003156,2.696945,2.539199
9,2.113947,1.029464,1.634811,1.406311,1.069807,2.695526,1.53301,2.633432,0.33716,0.45105,...,1.472694,1.779978,3.163065,2.035356,2.205785,1.552179,1.131037,2.569014,1.564199,3.145222


In [85]:
Q

Unnamed: 0,51,52,53,54,55,56,57,58,59,60,...,18723,18724,10894,13978,18725,18726,18727,18728,18729,18730
0,1.50097,0.217219,2.160406,0.149347,0.853895,1.150456,2.339262,1.567405,0.099688,1.951092,...,2.401213,0.447046,0.098083,2.167085,2.704691,2.384702,0.641334,1.846042,2.394998,0.809022
1,1.72835,0.99759,0.719574,3.067932,2.934878,1.440269,0.435741,2.30526,0.81646,0.533917,...,1.174119,0.36953,0.779642,0.014588,1.109967,2.518465,0.841692,2.03438,0.310762,0.038914
2,1.628021,1.462426,0.612407,0.909348,0.166222,2.524722,0.955331,0.70089,2.546385,0.125244,...,0.854253,1.908692,1.393646,0.686577,0.974575,0.883759,2.880562,1.650567,2.063882,2.95589
3,0.609766,2.964158,1.709845,2.708777,0.590447,0.888255,0.615989,2.068522,0.45462,2.506856,...,1.462844,2.521575,0.617964,0.644154,0.572482,1.920909,0.091556,1.858226,2.610053,2.474967
4,2.149283,0.697646,2.582913,3.029599,0.641087,2.898276,2.699156,3.098603,1.016975,2.954672,...,0.35729,0.127886,1.935155,2.356933,2.559819,0.667674,0.601239,2.651808,1.067393,2.086058
5,0.525821,2.534317,3.038726,3.073983,1.697082,0.772054,0.44857,0.784651,1.969569,1.379187,...,0.14912,2.423154,0.841981,2.937643,2.096709,1.051865,2.587881,2.584653,0.510894,1.713497
6,0.079037,1.089498,2.196529,1.044751,0.482144,1.854413,3.040638,1.309786,2.902528,2.863912,...,2.712555,1.60914,2.498445,1.066133,0.959581,2.532863,2.872269,2.30373,1.071915,2.97224
7,2.405014,1.938612,1.644881,2.008376,0.205634,2.084959,0.912183,1.291911,1.837535,0.951074,...,3.032259,3.016811,2.781674,0.867957,2.801408,0.196785,2.061004,2.051525,0.34777,1.826401
8,0.476159,3.033414,1.488857,2.558437,1.62035,0.622099,0.37005,1.093841,2.969923,0.124941,...,2.332538,0.753956,2.904335,0.045398,0.761161,2.141211,1.027398,0.375547,0.330813,0.134072
9,1.31091,2.416329,2.100828,1.178032,0.235231,0.934547,0.517492,0.846449,0.791087,2.862417,...,0.452829,2.99037,1.36969,1.544523,0.561188,1.399589,0.346712,3.053256,1.173623,0.677601


In [86]:
# Predicted rating for every (user, artist) pair after the SGD pass
R = pd.DataFrame(data=np.dot(P.T,Q),dtype=float,
                 index=pd.Index(user_list, name='UserID'),columns=artist_list)
R.head()

UserID,51,52,53,54,55,56,57,58,59,60,...,18723,18724,10894,13978,18725,18726,18727,18728,18729,18730
2,26.756757,30.323923,32.963001,38.377771,17.016137,30.961598,21.520164,29.306678,25.363373,30.261057,...,22.715001,31.498278,25.61397,24.423622,29.552512,25.085168,25.855403,40.978005,21.985566,30.747093
3,13.834115,14.929244,19.711204,14.446092,8.02483,15.564385,16.872778,15.022329,14.97432,18.541233,...,23.180294,16.940805,16.476814,13.077575,18.958662,18.773232,14.799553,21.099629,13.900188,15.454176
4,19.894404,29.677137,33.323894,33.073626,16.671119,25.940499,23.31392,26.056043,26.229652,31.016391,...,26.058826,29.454421,23.743087,23.508933,26.116186,29.840605,26.051083,37.470262,23.450366,30.009107
5,18.46864,24.369597,25.33366,29.959175,17.53809,20.698668,15.960258,22.276753,21.859626,19.769475,...,19.62723,18.317398,20.990473,17.270413,21.725868,24.164855,17.924972,27.531134,15.497039,17.089535
6,14.427108,24.954844,22.342358,28.749358,15.719005,15.177501,9.347799,16.705209,20.352404,14.91804,...,17.647997,20.790941,19.566108,14.192251,18.640185,18.391477,17.481881,23.415303,9.318059,14.20722


MSE for the recommendation matrix after one SGD pass (computed, as before, as a sum of squared errors)

In [87]:
t = time.time()
MSE = np.sum([(r_ui - R.loc[[uid],[artid]].values[0][0])**2 for (uid,artid), r_ui in zip_list])
elapsed = time.time() - t
print("The MSE for the improved recommendation matrix is %.4f" % MSE)
print("Processed time: %.1f s" % elapsed)

The MSE for the improved recommendation matrix is 164347752.6441
Processed time: 304.8 s


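The squared error is larger than before, even with a learning rate of 0.00001, which is not what we expect from Stochastic Gradient Descent. A likely cause lies in the update cell: `Q[artid]*P[uid]` multiplies the two factor vectors element-wise, so `err` is a length-10 vector instead of the scalar residual between `r_ui` and the dot product of `P[uid]` and `Q[artid]`. The step is therefore not taken along the gradient of the squared prediction error.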
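
For comparison, here is a minimal sketch of SGD passes in which the error is the scalar residual between the rating and the dot product of the two factor vectors. It is an illustration under stated assumptions, not a fix verified on the Last.fm data: the sizes, the learning rate `lr`, the number of epochs and the synthetic ratings are all made up, and plain NumPy arrays replace the DataFrames.

```python
import numpy as np

rng = np.random.default_rng(42)
n_factors, n_users, n_items = 3, 20, 30  # toy sizes (assumed)

# Synthetic ratings generated from hidden low-rank factors.
true_P = rng.random((n_factors, n_users))
true_Q = rng.random((n_factors, n_items))
ratings = [(u, i, true_P[:, u] @ true_Q[:, i])
           for u in range(n_users) for i in range(n_items)]

P = rng.random((n_factors, n_users)) * 0.5
Q = rng.random((n_factors, n_items)) * 0.5

def sse(P, Q):
    """Sum of squared errors over all observed (user, item) pairs."""
    return sum((r - P[:, u] @ Q[:, i]) ** 2 for u, i, r in ratings)

lr = 0.01  # learning rate (assumed)
history = [sse(P, Q)]
for epoch in range(20):
    # Visit the observed pairs in a fresh random order each epoch.
    for idx in rng.permutation(len(ratings)):
        u, i, r = ratings[idx]
        p_u, q_i = P[:, u].copy(), Q[:, i].copy()
        err = r - p_u @ q_i              # scalar residual (dot product)
        P[:, u] = p_u + lr * err * q_i
        Q[:, i] = q_i + lr * err * p_u   # uses the old p_u, not the updated one
    history.append(sse(P, Q))

print("SSE before: %.3f, after: %.3f" % (history[0], history[-1]))
```

Tracking the error after every epoch, as `history` does here, makes it easy to see whether a given learning rate is helping before committing to a long run on the full dataset.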