### An example of Matrix Factorization used for recommendation.
<UL>
    <li><b>Data set:</b> Modified Joke Ratings Data from Jester</li>
    <li><b>Size:</b> 1000 Users x 100 Jokes</li>
</UL>

In [2]:
import numpy as np
import pandas as pd

In [11]:
pd.set_option('display.max_colwidth', 120)

jokes = pd.read_csv("http://facweb.cs.depaul.edu/mobasher/classes/csc478/data/jokes.csv", usecols=[1], header=None)
jokes.head(10)


Unnamed: 0,1
0,"A man visits the doctor. The doctor says ""I have bad news for you.You have cancer and Alzheimer's disease"". The man ..."
1,This couple had an excellent relationship going until one day he came home from work to find his girlfriend packing....
2,Q. What's 200 feet long and has 4 teeth? A. The front row at a Willie Nelson Concert.
3,Q. What's the difference between a man and a toilet? A. A toilet doesn't follow you around after you use it.
4,Q. What's O. J. Simpson's Internet address? A.\tSlash slash backslash slash slash escape.
5,Bill & Hillary are on a trip back to Arkansas. They're almost out of gas so Bill pulls into a service station on the...
6,How many feminists does it take to screw in a light bulb?That's not funny.
7,Q. Did you hear about the dyslexic devil worshipper? A. He sold his soul to Santa.
8,A country guy goes into a city bar that has a dress code and the maitred' demands he wear a tie. Discouraged the guy...
9,"Two cannibals are eating a clown one turns to other and says: ""Does this taste funny to you?"


In [34]:
def get_joke_text(jokes, joke_id):
    return np.array(jokes)[joke_id]

In [41]:
print(get_joke_text(jokes, 99))

["Q: What's the difference between greeting a Queen and greeting thePresident of the United  States?A: You only have to get on one knee to greet the queen."]


#### The rating matrix contains the ratings of 100 jokes by 1000 users (each row is a user profile). The ratings have been normalized to be between 1 and 21 (a 20-point scale), with 1 being the lowest rating. A zero indicates a missing rating.

In [25]:
dataMat = pd.read_csv("http://facweb.cs.depaul.edu/mobasher/classes/csc478/data/modified_jester_data.csv", header=None)

dataMat.shape

(1000, 100)

In [26]:
pd.set_option('display.max_colwidth', 40)

dataMat.head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,90,91,92,93,94,95,96,97,98,99
0,3.18,19.79,1.34,2.84,3.48,2.5,1.15,15.17,2.02,6.24,...,13.82,0.0,0.0,0.0,0.0,0.0,5.37,0.0,0.0,0.0
1,15.08,10.71,17.36,15.37,8.62,1.34,10.27,5.66,19.88,20.22,...,13.82,6.05,10.71,18.86,10.81,8.86,14.06,11.34,6.68,12.07
2,0.0,0.0,0.0,0.0,20.03,20.27,20.03,20.27,0.0,0.0,...,0.0,0.0,0.0,20.08,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,19.35,0.0,0.0,12.8,19.16,8.18,17.21,0.0,12.84,...,0.0,0.0,0.0,11.53,0.0,0.0,0.0,0.0,0.0,0.0
4,19.5,15.61,6.83,5.61,12.36,12.6,18.04,15.61,10.56,16.73,...,16.19,16.58,15.27,16.19,16.73,12.55,14.11,17.55,12.8,12.6
5,4.83,7.46,11.44,2.5,3.91,6.68,2.31,10.13,4.35,9.2,...,7.46,4.11,10.32,8.04,8.82,7.65,11.05,1.92,5.95,7.55
6,0.0,0.0,0.0,0.0,19.59,1.15,18.72,19.79,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,13.33,0.0,0.0,0.0,0.0
7,17.84,14.16,20.17,4.79,2.84,9.3,20.27,12.41,5.81,6.58,...,18.23,9.88,10.9,5.32,7.84,7.65,13.14,10.95,12.31,11.0
8,7.21,7.46,1.58,4.11,2.26,10.71,5.71,2.07,3.14,9.4,...,15.37,10.71,15.17,10.71,10.71,10.71,10.71,10.71,7.6,6.05
9,14.01,16.15,16.15,14.01,17.41,16.15,19.93,13.52,14.01,19.16,...,0.0,15.47,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
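
Because a zero marks a missing rating rather than a real score, any statistic on this matrix should be computed over the non-zero entries only. The following is a minimal sketch of that idea on a small made-up array; the name `toy` and its values are illustrative assumptions, not rows taken from the data set.

```python
import numpy as np

# Hypothetical 3 users x 4 jokes matrix; 0 marks a missing rating
toy = np.array([
    [3.18, 19.79, 0.0, 2.84],
    [0.0, 0.0, 20.03, 0.0],
    [15.08, 10.71, 17.36, 0.0],
])

known = toy > 0                        # boolean mask of observed ratings
density = known.mean()                 # fraction of cells that hold a rating
ratings_per_user = known.sum(axis=1)   # number of rated jokes per user
mean_known = toy[known].mean()         # average over observed ratings only

print("density = %0.2f" % density)
print("ratings per user:", ratings_per_user)
print("mean of known ratings = %0.2f" % mean_known)
```

The same mask applied to `np.array(dataMat)` would give the corresponding figures for the full 1000 x 100 matrix.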


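#### The following function uses gradient descent optimization, applied to one known rating at a time, to factorize a rating matrix R into matrices P (the user feature matrix) and Q (the item feature matrix), so that the product of P and Q.T approximates the known entries of R.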

In [65]:
from time import time

def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):

    ### R = The user x item rating matrix (m x n); zeros mark missing ratings
    ### P = Initial user-factor matrix (m x k)
    ### Q = Initial item-factor matrix (n x k)
    ### K = The number of latent factors (features)
    ### steps = The maximum number of epochs in gradient descent
    ### alpha = The learning rate for gradient descent
    ### beta = The regularization coefficient
    ### Note: P and Q are updated in place (Q.T below is a view of Q)

    Q = Q.T
    for step in range(steps):
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    eij = R[i][j] - np.dot(P[i,:],Q[:,j])
                    for k in range(K):
                        ### update P and Q based on the partial derivatives
                        ### (the Q update uses the already-updated P[i][k])
                        P[i][k] = P[i][k] + alpha * (2 * eij * Q[k][j] - beta * P[i][k])
                        Q[k][j] = Q[k][j] + alpha * (2 * eij * P[i][k] - beta * Q[k][j])
        ### regularized squared error over the known ratings
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    e = e + pow(R[i][j] - np.dot(P[i,:],Q[:,j]), 2)
                    for k in range(K):
                        e = e + (beta/2) * ( pow(P[i][k],2) + pow(Q[k][j],2) )
        if e < 0.001:
            break
        print("Step %d of %d; Error: %0.5f; Time: %0.2f" %(step+1, steps, e, time()))
    return P, Q.T
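
For reference, the update lines in the function can be read off from a per-rating loss. This is a restatement of what the code computes, with $q_{kj}$ indexing the transposed item matrix as in the loops. For each known rating $r_{ij} > 0$ the prediction error is

$$e_{ij} = r_{ij} - \sum_{k=1}^{K} p_{ik}\, q_{kj}$$

and the regularized loss accumulated into `e` is

$$E_{ij} = e_{ij}^2 + \frac{\beta}{2} \sum_{k=1}^{K} \left( p_{ik}^2 + q_{kj}^2 \right)$$

Its partial derivatives are

$$\frac{\partial E_{ij}}{\partial p_{ik}} = -2\, e_{ij}\, q_{kj} + \beta\, p_{ik}, \qquad \frac{\partial E_{ij}}{\partial q_{kj}} = -2\, e_{ij}\, p_{ik} + \beta\, q_{kj}$$

so a step of size $\alpha$ against the gradient gives the two updates used in the inner loop:

$$p_{ik} \leftarrow p_{ik} + \alpha \left( 2\, e_{ij}\, q_{kj} - \beta\, p_{ik} \right), \qquad q_{kj} \leftarrow q_{kj} + \alpha \left( 2\, e_{ij}\, p_{ik} - \beta\, q_{kj} \right)$$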

In [52]:
M = dataMat.shape[0]
N = dataMat.shape[1]
Ratings = np.array(dataMat)
K = 5
steps = 3000

In [46]:
### Initialize P and Q to random values
P = np.random.rand(M,K)
Q = np.random.rand(N,K)

#### Now let's factorize the Ratings matrix

In [56]:
from time import time
t0 = time()
fP, fQ = matrix_factorization(Ratings, P, Q, K, steps=steps)
print("done in %0.3fs." % (time() - t0))

Step 1 of 3000; Error: 1509014.94007; Time: 1582530880.53
Step 2 of 3000; Error: 1387903.04661; Time: 1582530883.50
Step 3 of 3000; Error: 1349386.37153; Time: 1582530886.62
Step 4 of 3000; Error: 1335925.49228; Time: 1582530889.64
Step 5 of 3000; Error: 1330679.31247; Time: 1582530892.62
Step 6 of 3000; Error: 1328296.47516; Time: 1582530895.61
Step 7 of 3000; Error: 1326941.22290; Time: 1582530898.70
Step 8 of 3000; Error: 1325948.79235; Time: 1582530901.71
Step 9 of 3000; Error: 1325066.48880; Time: 1582530904.75
Step 10 of 3000; Error: 1324190.80349; Time: 1582530907.90
Step 11 of 3000; Error: 1323272.93564; Time: 1582530910.86
Step 12 of 3000; Error: 1322283.93006; Time: 1582530913.85
Step 13 of 3000; Error: 1321201.46808; Time: 1582530916.83
Step 14 of 3000; Error: 1320004.68289; Time: 1582530919.83
Step 15 of 3000; Error: 1318672.01137; Time: 1582530922.97
Step 16 of 3000; Error: 1317180.23639; Time: 1582530926.09
Step 17 of 3000; Error: 1315504.03846; Time: 1582530929.29
...
Step 140 of 3000; Error: 1057665.45414; Time: 1582531315.41
Step 141 of 3000; Error: 1057488.25039; Time: 1582531318.49
Step 142 of 3000; Error: 1057315.52723; Time: 1582531321.88
Step 143 of 3000; Error: 1057147.15708; Time: 1582531325.61
Step 144 of 3000; Error: 1056983.01593; Time: 1582531328.73
Step 145 of 3000; Error: 1056822.98314; Time: 1582531331.76
Step 146 of 3000; Error: 1056666.94142; Time: 1582531334.78
Step 147 of 3000; Error: 1056514.77673; Time: 1582531337.79
Step 148 of 3000; Error: 1056366.37814; Time: 1582531340.79
Step 149 of 3000; Error: 1056221.63781; Time: 1582531343.99
Step 150 of 3000; Error: 1056080.45087; Time: 1582531347.11
Step 151 of 3000; Error: 1055942.71536; Time: 1582531350.20
Step 152 of 3000; Error: 1055808.33216; Time: 1582531353.19
Step 153 of 3000; Error: 1055677.20488; Time: 1582531356.35
Step 154 of 3000; Error: 1055549.23984; Time: 1582531359.42
Step 155 of 3000; Error: 1055424.34600; Time: 1582531362.61
Step 156 of 3000; Error: 1055302.43483
...
Step 277 of 3000; Error: 1049567.43520; Time: 1582531738.78
Step 278 of 3000; Error: 1049547.33702; Time: 1582531741.88
Step 279 of 3000; Error: 1049527.38527; Time: 1582531745.05
Step 280 of 3000; Error: 1049507.57756; Time: 1582531748.35
Step 281 of 3000; Error: 1049487.91151; Time: 1582531751.32
Step 282 of 3000; Error: 1049468.38483; Time: 1582531754.47
Step 283 of 3000; Error: 1049448.99527; Time: 1582531757.42
Step 284 of 3000; Error: 1049429.74064; Time: 1582531760.51
Step 285 of 3000; Error: 1049410.61880; Time: 1582531763.58
Step 286 of 3000; Error: 1049391.62765; Time: 1582531766.65
Step 287 of 3000; Error: 1049372.76516; Time: 1582531769.68
Step 288 of 3000; Error: 1049354.02933; Time: 1582531772.80
Step 289 of 3000; Error: 1049335.41822; Time: 1582531775.76
Step 290 of 3000; Error: 1049316.92993; Time: 1582531778.88
Step 291 of 3000; Error: 1049298.56260; Time: 1582531781.90
Step 292 of 3000; Error: 1049280.31442; Time: 1582531784.88
Step 293 of 3000; Error: 1049262.18362
...
Step 414 of 3000; Error: 1047625.10738; Time: 1582532162.97
Step 415 of 3000; Error: 1047614.69219; Time: 1582532165.93
Step 416 of 3000; Error: 1047604.31361; Time: 1582532168.87
Step 417 of 3000; Error: 1047593.97142; Time: 1582532171.85
Step 418 of 3000; Error: 1047583.66541; Time: 1582532174.91
Step 419 of 3000; Error: 1047573.39537; Time: 1582532177.91
Step 420 of 3000; Error: 1047563.16107; Time: 1582532181.01
Step 421 of 3000; Error: 1047552.96233; Time: 1582532184.02
Step 422 of 3000; Error: 1047542.79893; Time: 1582532187.11
Step 423 of 3000; Error: 1047532.67067; Time: 1582532190.20
Step 424 of 3000; Error: 1047522.57735; Time: 1582532193.20
Step 425 of 3000; Error: 1047512.51877; Time: 1582532196.23
Step 426 of 3000; Error: 1047502.49473; Time: 1582532199.26
Step 427 of 3000; Error: 1047492.50504; Time: 1582532202.54
Step 428 of 3000; Error: 1047482.54951; Time: 1582532205.60
Step 429 of 3000; Error: 1047472.62794; Time: 1582532208.74
Step 430 of 3000; Error: 1047462.74015
...
Step 551 of 3000; Error: 1046471.67498; Time: 1582532588.59
Step 552 of 3000; Error: 1046464.89585; Time: 1582532591.63
Step 553 of 3000; Error: 1046458.13611; Time: 1582532594.61
Step 554 of 3000; Error: 1046451.39568; Time: 1582532597.83
Step 555 of 3000; Error: 1046444.67448; Time: 1582532600.86
Step 556 of 3000; Error: 1046437.97244; Time: 1582532604.02
Step 557 of 3000; Error: 1046431.28949; Time: 1582532607.17
Step 558 of 3000; Error: 1046424.62556; Time: 1582532610.37
Step 559 of 3000; Error: 1046417.98056; Time: 1582532613.42
Step 560 of 3000; Error: 1046411.35444; Time: 1582532616.38
Step 561 of 3000; Error: 1046404.74711; Time: 1582532619.50
Step 562 of 3000; Error: 1046398.15851; Time: 1582532622.46
Step 563 of 3000; Error: 1046391.58857; Time: 1582532625.43
Step 564 of 3000; Error: 1046385.03721; Time: 1582532628.45
Step 565 of 3000; Error: 1046378.50436; Time: 1582532631.71
Step 566 of 3000; Error: 1046371.98996; Time: 1582532634.71
Step 567 of 3000; Error: 1046365.49393
...
Step 688 of 3000; Error: 1045696.95288; Time: 1582533018.18
Step 689 of 3000; Error: 1045692.26902; Time: 1582533021.48
Step 690 of 3000; Error: 1045687.59712; Time: 1582533024.72
Step 691 of 3000; Error: 1045682.93713; Time: 1582533027.83
Step 692 of 3000; Error: 1045678.28901; Time: 1582533030.99
Step 693 of 3000; Error: 1045673.65272; Time: 1582533034.11
Step 694 of 3000; Error: 1045669.02822; Time: 1582533037.40
Step 695 of 3000; Error: 1045664.41548; Time: 1582533040.56
Step 696 of 3000; Error: 1045659.81446; Time: 1582533043.79
Step 697 of 3000; Error: 1045655.22512; Time: 1582533046.87
Step 698 of 3000; Error: 1045650.64741; Time: 1582533050.07
Step 699 of 3000; Error: 1045646.08131; Time: 1582533053.33
Step 700 of 3000; Error: 1045641.52678; Time: 1582533056.70
Step 701 of 3000; Error: 1045636.98378; Time: 1582533059.81
Step 702 of 3000; Error: 1045632.45226; Time: 1582533063.02
Step 703 of 3000; Error: 1045627.93220; Time: 1582533066.40
Step 704 of 3000; Error: 1045623.42356
...
Step 825 of 3000; Error: 1045152.21266; Time: 1582533457.05
Step 826 of 3000; Error: 1045148.85986; Time: 1582533460.20
Step 827 of 3000; Error: 1045145.51487; Time: 1582533463.29
Step 828 of 3000; Error: 1045142.17768; Time: 1582533466.40
Step 829 of 3000; Error: 1045138.84826; Time: 1582533469.76
Step 830 of 3000; Error: 1045135.52659; Time: 1582533472.90
Step 831 of 3000; Error: 1045132.21264; Time: 1582533476.35
Step 832 of 3000; Error: 1045128.90640; Time: 1582533479.54
Step 833 of 3000; Error: 1045125.60784; Time: 1582533482.78
Step 834 of 3000; Error: 1045122.31694; Time: 1582533485.96
Step 835 of 3000; Error: 1045119.03367; Time: 1582533489.02
Step 836 of 3000; Error: 1045115.75802; Time: 1582533492.17
Step 837 of 3000; Error: 1045112.48996; Time: 1582533495.33
Step 838 of 3000; Error: 1045109.22947; Time: 1582533498.47
Step 839 of 3000; Error: 1045105.97653; Time: 1582533501.65
Step 840 of 3000; Error: 1045102.73111; Time: 1582533504.94
Step 841 of 3000; Error: 1045099.49320
...
Step 962 of 3000; Error: 1044757.13708; Time: 1582533874.84
Step 963 of 3000; Error: 1044754.67198; Time: 1582533877.80
Step 964 of 3000; Error: 1044752.21220; Time: 1582533880.85
Step 965 of 3000; Error: 1044749.75773; Time: 1582533883.86
Step 966 of 3000; Error: 1044747.30855; Time: 1582533886.90
Step 967 of 3000; Error: 1044744.86464; Time: 1582533890.00
Step 968 of 3000; Error: 1044742.42600; Time: 1582533892.97
Step 969 of 3000; Error: 1044739.99261; Time: 1582533896.06
Step 970 of 3000; Error: 1044737.56446; Time: 1582533898.99
Step 971 of 3000; Error: 1044735.14152; Time: 1582533901.95
Step 972 of 3000; Error: 1044732.72379; Time: 1582533905.01
Step 973 of 3000; Error: 1044730.31126; Time: 1582533908.15
Step 974 of 3000; Error: 1044727.90391; Time: 1582533911.08
Step 975 of 3000; Error: 1044725.50172; Time: 1582533914.01
Step 976 of 3000; Error: 1044723.10468; Time: 1582533916.96
Step 977 of 3000; Error: 1044720.71278; Time: 1582533920.19
Step 978 of 3000; Error: 1044718.32601
...
Step 1097 of 3000; Error: 1044467.30703; Time: 1582534291.65
Step 1098 of 3000; Error: 1044465.44764; Time: 1582534294.82
Step 1099 of 3000; Error: 1044463.59200; Time: 1582534298.02
Step 1100 of 3000; Error: 1044461.74011; Time: 1582534301.56
Step 1101 of 3000; Error: 1044459.89195; Time: 1582534304.92
Step 1102 of 3000; Error: 1044458.04752; Time: 1582534308.30
Step 1103 of 3000; Error: 1044456.20681; Time: 1582534311.66
Step 1104 of 3000; Error: 1044454.36980; Time: 1582534315.06
Step 1105 of 3000; Error: 1044452.53648; Time: 1582534318.48
Step 1106 of 3000; Error: 1044450.70686; Time: 1582534321.92
Step 1107 of 3000; Error: 1044448.88091; Time: 1582534325.26
Step 1108 of 3000; Error: 1044447.05864; Time: 1582534328.52
Step 1109 of 3000; Error: 1044445.24002; Time: 1582534331.69
Step 1110 of 3000; Error: 1044443.42506; Time: 1582534334.82
Step 1111 of 3000; Error: 1044441.61374; Time: 1582534338.04
Step 1112 of 3000; Error: 1044439.80605; Time: 1582534341.27
...
Step 1232 of 3000; Error: 1044246.83519; Time: 1582534728.26
Step 1233 of 3000; Error: 1044245.40818; Time: 1582534731.49
Step 1234 of 3000; Error: 1044243.98388; Time: 1582534734.81
Step 1235 of 3000; Error: 1044242.56229; Time: 1582534738.15
Step 1236 of 3000; Error: 1044241.14341; Time: 1582534741.29
Step 1237 of 3000; Error: 1044239.72721; Time: 1582534744.44
Step 1238 of 3000; Error: 1044238.31371; Time: 1582534747.52
Step 1239 of 3000; Error: 1044236.90288; Time: 1582534750.79
Step 1240 of 3000; Error: 1044235.49474; Time: 1582534754.05
Step 1241 of 3000; Error: 1044234.08926; Time: 1582534757.17
Step 1242 of 3000; Error: 1044232.68645; Time: 1582534760.53
Step 1243 of 3000; Error: 1044231.28629; Time: 1582534763.63
Step 1244 of 3000; Error: 1044229.88878; Time: 1582534766.78
Step 1245 of 3000; Error: 1044228.49392; Time: 1582534769.99
Step 1246 of 3000; Error: 1044227.10170; Time: 1582534773.17
Step 1247 of 3000; Error: 1044225.71211; Time: 1582534776.37
...
Step 1367 of 3000; Error: 1044076.42368; Time: 1582535153.49
Step 1368 of 3000; Error: 1044075.31245; Time: 1582535156.60
Step 1369 of 3000; Error: 1044074.20323; Time: 1582535159.69
Step 1370 of 3000; Error: 1044073.09601; Time: 1582535162.76
Step 1371 of 3000; Error: 1044071.99078; Time: 1582535165.95
Step 1372 of 3000; Error: 1044070.88754; Time: 1582535169.32
Step 1373 of 3000; Error: 1044069.78629; Time: 1582535172.61
Step 1374 of 3000; Error: 1044068.68702; Time: 1582535175.72
Step 1375 of 3000; Error: 1044067.58973; Time: 1582535178.97
Step 1376 of 3000; Error: 1044066.49441; Time: 1582535182.02
Step 1377 of 3000; Error: 1044065.40105; Time: 1582535185.14
Step 1378 of 3000; Error: 1044064.30967; Time: 1582535188.17
Step 1379 of 3000; Error: 1044063.22024; Time: 1582535191.38
Step 1380 of 3000; Error: 1044062.13277; Time: 1582535194.42
Step 1381 of 3000; Error: 1044061.04725; Time: 1582535197.48
Step 1382 of 3000; Error: 1044059.96367; Time: 1582535200.55
...
Step 1502 of 3000; Error: 1043942.91925; Time: 1582535575.52
Step 1503 of 3000; Error: 1043942.04319; Time: 1582535578.63
Step 1504 of 3000; Error: 1043941.16864; Time: 1582535581.71
Step 1505 of 3000; Error: 1043940.29559; Time: 1582535584.78
Step 1506 of 3000; Error: 1043939.42403; Time: 1582535587.93
Step 1507 of 3000; Error: 1043938.55398; Time: 1582535591.28
Step 1508 of 3000; Error: 1043937.68542; Time: 1582535594.56
Step 1509 of 3000; Error: 1043936.81834; Time: 1582535597.62
Step 1510 of 3000; Error: 1043935.95276; Time: 1582535600.82
Step 1511 of 3000; Error: 1043935.08866; Time: 1582535604.01
Step 1512 of 3000; Error: 1043934.22604; Time: 1582535607.11
Step 1513 of 3000; Error: 1043933.36490; Time: 1582535610.37
Step 1514 of 3000; Error: 1043932.50523; Time: 1582535613.58
Step 1515 of 3000; Error: 1043931.64704; Time: 1582535616.84
Step 1516 of 3000; Error: 1043930.79031; Time: 1582535620.05
Step 1517 of 3000; Error: 1043929.93505; Time: 1582535623.26
...
Step 1637 of 3000; Error: 1043837.12405; Time: 1582536011.82
Step 1638 of 3000; Error: 1043836.42608; Time: 1582536015.09
Step 1639 of 3000; Error: 1043835.72927; Time: 1582536018.14
Step 1640 of 3000; Error: 1043835.03360; Time: 1582536021.33
Step 1641 of 3000; Error: 1043834.33907; Time: 1582536024.69
Step 1642 of 3000; Error: 1043833.64569; Time: 1582536027.76
Step 1643 of 3000; Error: 1043832.95345; Time: 1582536030.90
Step 1644 of 3000; Error: 1043832.26235; Time: 1582536034.08
Step 1645 of 3000; Error: 1043831.57238; Time: 1582536037.23
Step 1646 of 3000; Error: 1043830.88355; Time: 1582536040.49
Step 1647 of 3000; Error: 1043830.19585; Time: 1582536043.75
Step 1648 of 3000; Error: 1043829.50928; Time: 1582536047.02
Step 1649 of 3000; Error: 1043828.82384; Time: 1582536050.31
Step 1650 of 3000; Error: 1043828.13952; Time: 1582536053.44
Step 1651 of 3000; Error: 1043827.45632; Time: 1582536056.52
Step 1652 of 3000; Error: 1043826.77425; Time: 1582536059.70
...
Step 1772 of 3000; Error: 1043752.46280; Time: 1582536435.30
Step 1773 of 3000; Error: 1043751.90169; Time: 1582536438.39
Step 1774 of 3000; Error: 1043751.34148; Time: 1582536441.48
Step 1775 of 3000; Error: 1043750.78215; Time: 1582536444.60
Step 1776 of 3000; Error: 1043750.22371; Time: 1582536447.74
Step 1777 of 3000; Error: 1043749.66615; Time: 1582536451.05
Step 1778 of 3000; Error: 1043749.10948; Time: 1582536454.27
Step 1779 of 3000; Error: 1043748.55368; Time: 1582536457.45
Step 1780 of 3000; Error: 1043747.99877; Time: 1582536460.80
Step 1781 of 3000; Error: 1043747.44473; Time: 1582536463.91
Step 1782 of 3000; Error: 1043746.89157; Time: 1582536467.05
Step 1783 of 3000; Error: 1043746.33929; Time: 1582536470.18
Step 1784 of 3000; Error: 1043745.78788; Time: 1582536473.46
Step 1785 of 3000; Error: 1043745.23734; Time: 1582536476.66
Step 1786 of 3000; Error: 1043744.68767; Time: 1582536479.88
Step 1787 of 3000; Error: 1043744.13887; Time: 1582536483.11
...
Step 1907 of 3000; Error: 1043684.14359; Time: 1582536871.17
Step 1908 of 3000; Error: 1043683.68901; Time: 1582536874.50
Step 1909 of 3000; Error: 1043683.23513; Time: 1582536877.67
Step 1910 of 3000; Error: 1043682.78194; Time: 1582536880.72
Step 1911 of 3000; Error: 1043682.32945; Time: 1582536883.85
Step 1912 of 3000; Error: 1043681.87765; Time: 1582536887.02
Step 1913 of 3000; Error: 1043681.42654; Time: 1582536890.36
Step 1914 of 3000; Error: 1043680.97612; Time: 1582536893.56
Step 1915 of 3000; Error: 1043680.52639; Time: 1582536896.65
Step 1916 of 3000; Error: 1043680.07735; Time: 1582536899.88
Step 1917 of 3000; Error: 1043679.62900; Time: 1582536903.09
Step 1918 of 3000; Error: 1043679.18133; Time: 1582536906.24
Step 1919 of 3000; Error: 1043678.73434; Time: 1582536909.50
Step 1920 of 3000; Error: 1043678.28804; Time: 1582536912.80
Step 1921 of 3000; Error: 1043677.84242; Time: 1582536916.04
Step 1922 of 3000; Error: 1043677.39749; Time: 1582536919.23
...
Step 2042 of 3000; Error: 1043628.61410; Time: 1582537302.89
Step 2043 of 3000; Error: 1043628.24337; Time: 1582537306.02
Step 2044 of 3000; Error: 1043627.87319; Time: 1582537309.26
Step 2045 of 3000; Error: 1043627.50357; Time: 1582537312.48
Step 2046 of 3000; Error: 1043627.13449; Time: 1582537315.62
Step 2047 of 3000; Error: 1043626.76597; Time: 1582537319.02
Step 2048 of 3000; Error: 1043626.39799; Time: 1582537322.22
Step 2049 of 3000; Error: 1043626.03055; Time: 1582537325.50
Step 2050 of 3000; Error: 1043625.66367; Time: 1582537328.72
Step 2051 of 3000; Error: 1043625.29732; Time: 1582537331.95
Step 2052 of 3000; Error: 1043624.93152; Time: 1582537335.15
Step 2053 of 3000; Error: 1043624.56627; Time: 1582537338.28
Step 2054 of 3000; Error: 1043624.20155; Time: 1582537341.37
Step 2055 of 3000; Error: 1043623.83738; Time: 1582537344.66
Step 2056 of 3000; Error: 1043623.47375; Time: 1582537347.99
Step 2057 of 3000; Error: 1043623.11066; Time: 1582537351.14
...
Step 2177 of 3000; Error: 1043583.20056; Time: 1582537731.62
Step 2178 of 3000; Error: 1043582.89649; Time: 1582537734.78
Step 2179 of 3000; Error: 1043582.59286; Time: 1582537737.89
Step 2180 of 3000; Error: 1043582.28967; Time: 1582537740.91
Step 2181 of 3000; Error: 1043581.98692; Time: 1582537744.15
Step 2182 of 3000; Error: 1043581.68461; Time: 1582537747.30
Step 2183 of 3000; Error: 1043581.38274; Time: 1582537750.33
Step 2184 of 3000; Error: 1043581.08130; Time: 1582537753.47
Step 2185 of 3000; Error: 1043580.78030; Time: 1582537756.46
Step 2186 of 3000; Error: 1043580.47973; Time: 1582537759.51
Step 2187 of 3000; Error: 1043580.17960; Time: 1582537762.74
Step 2188 of 3000; Error: 1043579.87990; Time: 1582537765.84
Step 2189 of 3000; Error: 1043579.58064; Time: 1582537768.98
Step 2190 of 3000; Error: 1043579.28181; Time: 1582537772.05
Step 2191 of 3000; Error: 1043578.98341; Time: 1582537775.04
Step 2192 of 3000; Error: 1043578.68544; Time: 1582537778.16
...
Step 2312 of 3000; Error: 1043545.86278; Time: 1582538151.18
Step 2313 of 3000; Error: 1043545.61216; Time: 1582538154.23
Step 2314 of 3000; Error: 1043545.36190; Time: 1582538157.29
Step 2315 of 3000; Error: 1043545.11199; Time: 1582538160.31
Step 2316 of 3000; Error: 1043544.86243; Time: 1582538163.44
Step 2317 of 3000; Error: 1043544.61323; Time: 1582538166.51
Step 2318 of 3000; Error: 1043544.36438; Time: 1582538169.51
Step 2319 of 3000; Error: 1043544.11588; Time: 1582538172.61
Step 2320 of 3000; Error: 1043543.86773; Time: 1582538175.69
Step 2321 of 3000; Error: 1043543.61994; Time: 1582538178.76
Step 2322 of 3000; Error: 1043543.37249; Time: 1582538181.85
Step 2323 of 3000; Error: 1043543.12539; Time: 1582538184.99
Step 2324 of 3000; Error: 1043542.87865; Time: 1582538188.23
Step 2325 of 3000; Error: 1043542.63225; Time: 1582538191.24
Step 2326 of 3000; Error: 1043542.38619; Time: 1582538194.34
Step 2327 of 3000; Error: 1043542.14049; Time: 1582538197.66
...
Step 2447 of 3000; Error: 1043515.02475; Time: 1582538582.56
Step 2448 of 3000; Error: 1043514.81732; Time: 1582538585.81
Step 2449 of 3000; Error: 1043514.61018; Time: 1582538589.03
Step 2450 of 3000; Error: 1043514.40332; Time: 1582538592.13
Step 2451 of 3000; Error: 1043514.19675; Time: 1582538595.44
Step 2452 of 3000; Error: 1043513.99047; Time: 1582538598.65
Step 2453 of 3000; Error: 1043513.78447; Time: 1582538601.87
Step 2454 of 3000; Error: 1043513.57876; Time: 1582538604.95
Step 2455 of 3000; Error: 1043513.37333; Time: 1582538608.24
Step 2456 of 3000; Error: 1043513.16819; Time: 1582538611.40
Step 2457 of 3000; Error: 1043512.96333; Time: 1582538614.57
Step 2458 of 3000; Error: 1043512.75875; Time: 1582538617.85
Step 2459 of 3000; Error: 1043512.55446; Time: 1582538621.04
Step 2460 of 3000; Error: 1043512.35045; Time: 1582538624.31
Step 2461 of 3000; Error: 1043512.14672; Time: 1582538627.48
Step 2462 of 3000; Error: 1043511.94328; Time: 1582538630.78
...
Step 2582 of 3000; Error: 1043489.45546; Time: 1582539009.20
Step 2583 of 3000; Error: 1043489.28316; Time: 1582539012.25
Step 2584 of 3000; Error: 1043489.11109; Time: 1582539015.29
Step 2585 of 3000; Error: 1043488.93925; Time: 1582539018.44
Step 2586 of 3000; Error: 1043488.76765; Time: 1582539021.57
Step 2587 of 3000; Error: 1043488.59628; Time: 1582539024.64
Step 2588 of 3000; Error: 1043488.42515; Time: 1582539027.67
Step 2589 of 3000; Error: 1043488.25424; Time: 1582539030.71
Step 2590 of 3000; Error: 1043488.08358; Time: 1582539033.81
Step 2591 of 3000; Error: 1043487.91314; Time: 1582539036.83
Step 2592 of 3000; Error: 1043487.74293; Time: 1582539040.16
Step 2593 of 3000; Error: 1043487.57296; Time: 1582539043.37
Step 2594 of 3000; Error: 1043487.40322; Time: 1582539046.42
Step 2595 of 3000; Error: 1043487.23370; Time: 1582539049.58
Step 2596 of 3000; Error: 1043487.06442; Time: 1582539052.66
Step 2597 of 3000; Error: 1043486.89537; Time: 1582539055.80
...
Step 2717 of 3000; Error: 1043468.18375; Time: 1582539441.39
Step 2718 of 3000; Error: 1043468.04018; Time: 1582539444.67
Step 2719 of 3000; Error: 1043467.89680; Time: 1582539447.99
Step 2720 of 3000; Error: 1043467.75362; Time: 1582539451.24
Step 2721 of 3000; Error: 1043467.61063; Time: 1582539454.72
Step 2722 of 3000; Error: 1043467.46783; Time: 1582539457.87
Step 2723 of 3000; Error: 1043467.32522; Time: 1582539461.22
Step 2724 of 3000; Error: 1043467.18280; Time: 1582539464.53
Step 2725 of 3000; Error: 1043467.04058; Time: 1582539467.74
Step 2726 of 3000; Error: 1043466.89854; Time: 1582539471.15
Step 2727 of 3000; Error: 1043466.75669; Time: 1582539474.29
Step 2728 of 3000; Error: 1043466.61504; Time: 1582539477.57
Step 2729 of 3000; Error: 1043466.47357; Time: 1582539480.89
Step 2730 of 3000; Error: 1043466.33230; Time: 1582539484.20
Step 2731 of 3000; Error: 1043466.19121; Time: 1582539487.42
Step 2732 of 3000; Error: 1043466.05031; Time: 1582539490.54
...
Step 2852 of 3000; Error: 1043450.43655; Time: 1582539871.80
Step 2853 of 3000; Error: 1043450.31661; Time: 1582539874.87
Step 2854 of 3000; Error: 1043450.19683; Time: 1582539878.11
Step 2855 of 3000; Error: 1043450.07720; Time: 1582539881.44
Step 2856 of 3000; Error: 1043449.95774; Time: 1582539884.55
Step 2857 of 3000; Error: 1043449.83843; Time: 1582539887.59
Step 2858 of 3000; Error: 1043449.71928; Time: 1582539890.91
Step 2859 of 3000; Error: 1043449.60029; Time: 1582539894.03
Step 2860 of 3000; Error: 1043449.48146; Time: 1582539897.12
Step 2861 of 3000; Error: 1043449.36278; Time: 1582539900.12
Step 2862 of 3000; Error: 1043449.24426; Time: 1582539903.24
Step 2863 of 3000; Error: 1043449.12589; Time: 1582539906.25
Step 2864 of 3000; Error: 1043449.00769; Time: 1582539909.34
Step 2865 of 3000; Error: 1043448.88963; Time: 1582539912.46
Step 2866 of 3000; Error: 1043448.77174; Time: 1582539915.54
Step 2867 of 3000; Error: 1043448.65400; Time: 1582539918.51
...
Step 2987 of 3000; Error: 1043435.59355; Time: 1582540300.75
Step 2988 of 3000; Error: 1043435.49312; Time: 1582540303.96
Step 2989 of 3000; Error: 1043435.39283; Time: 1582540307.25
Step 2990 of 3000; Error: 1043435.29266; Time: 1582540310.56
Step 2991 of 3000; Error: 1043435.19263; Time: 1582540313.70
Step 2992 of 3000; Error: 1043435.09272; Time: 1582540316.84
Step 2993 of 3000; Error: 1043434.99295; Time: 1582540320.26
Step 2994 of 3000; Error: 1043434.89331; Time: 1582540323.44
Step 2995 of 3000; Error: 1043434.79380; Time: 1582540326.59
Step 2996 of 3000; Error: 1043434.69442; Time: 1582540329.77
Step 2997 of 3000; Error: 1043434.59516; Time: 1582540332.96
Step 2998 of 3000; Error: 1043434.49604; Time: 1582540336.24
Step 2999 of 3000; Error: 1043434.39705; Time: 1582540339.41
Step 3000 of 3000; Error: 1043434.29818; Time: 1582540342.58
done in 9465.043s.
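
Most of the running time above goes into Python-level loops, including the loops that recompute the error after every epoch. The sketch below shows how the same regularized error could be computed with array operations, and what one full-batch gradient step might look like. The function names `regularized_error` and `batch_step` are hypothetical, and the batch step is a different scheme from the per-rating updates in `matrix_factorization`, so it would not reproduce the numbers in the log. Here `Q` is the n x k item matrix, not its transpose.

```python
import numpy as np

def regularized_error(R, P, Q, beta=0.02):
    """Same quantity as the `e` accumulated by the loops in matrix_factorization."""
    mask = R > 0                                # known ratings only
    resid = np.where(mask, R - P @ Q.T, 0.0)    # residuals, zero where unknown
    # each known (i, j) contributes (beta/2) * (|P[i]|^2 + |Q[j]|^2)
    reg = (mask.sum(axis=1) * (P ** 2).sum(axis=1)).sum() \
        + (mask.sum(axis=0) * (Q ** 2).sum(axis=1)).sum()
    return (resid ** 2).sum() + (beta / 2) * reg

def batch_step(R, P, Q, alpha=0.0002, beta=0.02):
    """One full-batch gradient step (assumption: not the per-rating update used above)."""
    mask = R > 0
    resid = np.where(mask, R - P @ Q.T, 0.0)
    step_P = 2 * resid @ Q - beta * mask.sum(axis=1, keepdims=True) * P
    step_Q = 2 * resid.T @ P - beta * mask.sum(axis=0)[:, None] * Q
    return P + alpha * step_P, Q + alpha * step_Q

# Toy data (made up): 6 users x 5 items, ratings 1..5, about 30% missing
rng = np.random.default_rng(0)
R_toy = rng.integers(1, 6, size=(6, 5)).astype(float)
R_toy[rng.random(R_toy.shape) < 0.3] = 0.0
P_toy, Q_toy = rng.random((6, 2)), rng.random((5, 2))

before = regularized_error(R_toy, P_toy, Q_toy)
for _ in range(200):
    P_toy, Q_toy = batch_step(R_toy, P_toy, Q_toy, alpha=0.002)
after = regularized_error(R_toy, P_toy, Q_toy)
print("error before: %0.3f, after: %0.3f" % (before, after))
```

A batch step sums the gradient over all of a user's (or item's) ratings, so a learning rate that works for the per-rating updates may need to be reduced on the full matrix.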


#### We can write the P and Q factor matrices to disk for later use.

In [58]:
### np.savetxt accepts a file name and closes the file when it is done
np.savetxt("jokes_p.csv", fP, delimiter=',', fmt='%1.4f')
np.savetxt("jokes_q.csv", fQ, delimiter=',', fmt='%1.4f')
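
To reuse the saved factors in a later session, the files can be read back with `np.loadtxt`. A minimal round-trip sketch follows; `demo_P` is a small stand-in for `fP`, and the temporary directory and file name are assumptions made only so the example is self-contained.

```python
import os
import tempfile
import numpy as np

# Stand-in for fP: a small hypothetical factor matrix
demo_P = np.random.rand(4, 5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_p.csv")    # hypothetical file name
    np.savetxt(path, demo_P, delimiter=',', fmt='%1.4f')
    loaded_P = np.loadtxt(path, delimiter=',')

# fmt='%1.4f' keeps four decimals, so values agree only to about 1e-4
print(loaded_P.shape, np.abs(loaded_P - demo_P).max())
```

Predictions computed from reloaded factors can therefore differ slightly from those computed in the original session; a format with more decimals would reduce that difference.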

#### An individual prediction for a given user-item pair can now be obtained by computing the dot product of the user's row in the P matrix and the item's row in the Q matrix (equivalently, the item's column in Q.T).

In [66]:
### Compute the predicted rating for user 979 and joke 9
print(np.dot(fP[979], fQ[9]))

11.672180288756982


#### We can compute all the predictions by multiplying P and Q.T. These predictions can also be saved for later use.

In [59]:
Preds = np.dot(fP,fQ.T)

In [60]:
np.savetxt("jokes_predictions.csv", Preds, delimiter=',', fmt='%1.4f')
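
With the full prediction matrix available, one way to turn it into recommendations is to rank, for a given user, the jokes that user has not rated yet. The helper below is a sketch under that assumption; `recommend_top_n` is a hypothetical name, and the toy arrays stand in for `Preds` and `Ratings`.

```python
import numpy as np

def recommend_top_n(preds, ratings, user, n=5):
    """Return indices of the n unrated items with the highest predicted rating."""
    scores = preds[user].astype(float)        # astype returns a copy
    scores[ratings[user] > 0] = -np.inf       # exclude already-rated items
    n = min(n, int((ratings[user] == 0).sum()))
    order = np.argsort(-scores)               # highest predictions first
    return order[:n]

# Toy data (made up): one user, five items, 0 marks a missing rating
toy_ratings = np.array([[5.0, 0.0, 0.0, 3.0, 0.0]])
toy_preds = np.array([[4.8, 2.1, 6.5, 3.2, 4.0]])
print(recommend_top_n(toy_preds, toy_ratings, user=0, n=2))
```

On the notebook's data this would be called as `recommend_top_n(Preds, Ratings, 979)`, and each returned index could be passed to `get_joke_text` to show the joke itself.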

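#### To evaluate the performance of the algorithm, we will measure the Mean Absolute Error (MAE) by comparing the known ratings in the Ratings matrix with the predicted ratings from matrix factorization. Because these are the same ratings that were used to fit P and Q, this measures training error rather than error on held-out ratings.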

In [63]:
totCount = 0
totError = 0
for u in range(M):
    err_u = 0
    rateCount_u = 0
    for j in range(N):
        if Ratings[u,j] > 0: ### Only use known ratings when computing the error
            rateCount_u += 1
            err_u += abs(np.dot(fP[u],fQ[j]) - Ratings[u,j])
    print("Mean Absolute Error for User %d = %0.3f" %(u, err_u/rateCount_u))
    totCount += rateCount_u
    totError += err_u
print()
print("Overall Mean Absolute Error = %0.3f" %(totError/totCount))

Mean Absolute Error for User 0 = 3.725
Mean Absolute Error for User 1 = 3.146
Mean Absolute Error for User 2 = 1.996
Mean Absolute Error for User 3 = 3.171
Mean Absolute Error for User 4 = 2.826
Mean Absolute Error for User 5 = 2.322
Mean Absolute Error for User 6 = 2.833
Mean Absolute Error for User 7 = 3.700
Mean Absolute Error for User 8 = 2.340
Mean Absolute Error for User 9 = 1.665
Mean Absolute Error for User 10 = 2.202
Mean Absolute Error for User 11 = 2.087
Mean Absolute Error for User 12 = 1.569
Mean Absolute Error for User 13 = 3.489
Mean Absolute Error for User 14 = 4.860
Mean Absolute Error for User 15 = 2.716
Mean Absolute Error for User 16 = 3.771
Mean Absolute Error for User 17 = 2.168
Mean Absolute Error for User 18 = 1.846
Mean Absolute Error for User 19 = 3.894
Mean Absolute Error for User 20 = 1.665
Mean Absolute Error for User 21 = 2.765
Mean Absolute Error for User 22 = 3.072
Mean Absolute Error for User 23 = 2.920
Mean Absolute Error for User 24 = 2.948
...
Mean Absolute Error for User 468 = 1.742
Mean Absolute Error for User 469 = 2.736
Mean Absolute Error for User 470 = 3.567
Mean Absolute Error for User 471 = 2.018
Mean Absolute Error for User 472 = 2.703
Mean Absolute Error for User 473 = 2.667
Mean Absolute Error for User 474 = 4.135
Mean Absolute Error for User 475 = 0.841
Mean Absolute Error for User 476 = 2.984
Mean Absolute Error for User 477 = 1.742
Mean Absolute Error for User 478 = 3.099
Mean Absolute Error for User 479 = 3.011
Mean Absolute Error for User 480 = 3.340
Mean Absolute Error for User 481 = 1.306
Mean Absolute Error for User 482 = 3.219
Mean Absolute Error for User 483 = 1.050
Mean Absolute Error for User 484 = 2.199
Mean Absolute Error for User 485 = 1.329
Mean Absolute Error for User 486 = 3.512
Mean Absolute Error for User 487 = 4.117
Mean Absolute Error for User 488 = 5.061
Mean Absolute Error for User 489 = 2.461
Mean Absolute Error for User 490 = 2.344
Mean Absolute Error for User 491 = 2.807
...
Mean Absolute Error for User 930 = 2.970
Mean Absolute Error for User 931 = 2.078
Mean Absolute Error for User 932 = 4.204
Mean Absolute Error for User 933 = 2.262
Mean Absolute Error for User 934 = 2.049
Mean Absolute Error for User 935 = 1.391
Mean Absolute Error for User 936 = 2.622
Mean Absolute Error for User 937 = 2.463
Mean Absolute Error for User 938 = 2.075
Mean Absolute Error for User 939 = 1.689
Mean Absolute Error for User 940 = 2.247
Mean Absolute Error for User 941 = 4.975
Mean Absolute Error for User 942 = 2.861
Mean Absolute Error for User 943 = 1.365
Mean Absolute Error for User 944 = 1.849
Mean Absolute Error for User 945 = 2.922
Mean Absolute Error for User 946 = 3.400
Mean Absolute Error for User 947 = 1.838
Mean Absolute Error for User 948 = 2.571
Mean Absolute Error for User 949 = 3.443
Mean Absolute Error for User 950 = 3.558
Mean Absolute Error for User 951 = 3.463
Mean Absolute Error for User 952 = 3.287
Mean Absolute Error for User 953 = 3.203
Mean Absolute Er