# Movie Recommendation System: Model-Based Collaborative Filtering Model (Matrix Factorization)

**Background**:

An accurate, personalized recommendation system can improve business and sales and build customer satisfaction.


**Types of Recommendation Engines**:
1. **Popularity Model (see "popularity_movie_recommendation" notebook)**
2. **Recommendation algorithms**
    - **Content-based filtering (see "content_based_movie_recommendation" notebook)**
    - **Memory-based collaborative filtering (see "memory_based_collaborative_filtering_movie_recommendation" notebook)**
    - **Model-based collaborative filtering**
        - **Uses parametric machine learning to find user ratings of unrated items**
        - Examples: PCA, SVD, Neural Nets, Matrix Factorization
        - Learns parameters using gradient descent or other optimization algorithms
        - Applies dimensionality reduction techniques to capture more signal from large datasets
        - **Advantages**:
            - Dimensionality reduction handles sparse/missing data
            - Avoids computational issues with large numbers of predictors
            - Easier to visualize the data
            - Easier to data mine with fewer predictors
            - Uncovers hidden correlations/features in the raw data
            - Removes redundant and noisy features
            - Makes data storage and processing easier
        - **Disadvantages**:
            - Inference is intractable because factors are hidden/latent
        - Types of model-based collaborative filtering algorithms:
            1. **Matrix Factorization Based**
                - Unsupervised 
                - Deals better with scalability and sparsity than memory-based CF
                - Learns the latent preferences of users and the latent attributes of items from known ratings, then predicts unknown ratings as the dot product of the user and item latent feature vectors
                - Restructures the user-item matrix into a low-rank structure, representing it as the product of two low-rank matrices whose rows contain the latent vectors
                - Multiplies the low-rank matrices together to approximate the original matrix and fill in any missing entries
                1. **Singular Value Decomposition (SVD) (see "model_based_cf_matrix_factorization_movie_recommendation" notebook)**
                    - Algorithm that decomposes a matrix A into two unitary matrices and a diagonal matrix: A = U D V^T
                    - A: input data matrix (i.e., users' ratings)
                    - U: left singular vectors (user "features" matrix), representing how much each user likes each feature
                    - D: diagonal matrix of singular values (weights/strengths of each concept)
                    - V^T: right singular vectors (movie "features" matrix), representing how relevant each feature is to each movie
                    - Reduces the dimension of the dataset
                    - Gives low-rank approximation of user tastes and preferences
                    - User preferences are assumed to be determined by a small number of hidden factors called **embeddings**
                    - **Embeddings** are low dimensional hidden factors for items and users
                    - Python packages for implementation: surprise
                2. **Probabilistic Matrix Factorization**
                    - Python packages for implementation: fastai
                3. **Non-negative Matrix Factorization (NMF)**
                    - Python packages for implementation: surprise
            2. **Deep learning/neural network (see "model_based_cf_deep_learning_movie_recommendation_model" notebook)**
                
3. **Using a classifier to make recommendations**
    - Classifiers are parametric solutions that require some parameters of the user and item to be defined first
    - Pros:
        - Incorporates personalization
        - Works even if the user's past history is short or not available
    - Cons:
        - Features might not be available or sufficient to create a good classifier
        - Making a good classifier becomes exponentially more difficult as the number of users and items grows
        
(https://medium.com/@james_aka_yale/the-4-recommendation-engines-that-can-predict-your-movie-tastes-bbec857b8223)
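
The SVD description above can be illustrated on a toy matrix. This is a minimal sketch, not taken from the notebook: the 4 × 5 matrix `A`, the helper `low_rank`, and the choice `k = 2` are made-up for illustration, and it uses `numpy.linalg.svd` rather than the `surprise` package mentioned in the list.

```python
import numpy as np

# Hypothetical 4 users x 5 movies ratings matrix (made-up values).
A = np.array([
    [5.0, 4.0, 1.0, 1.0, 2.0],
    [4.0, 5.0, 1.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 5.0, 4.0],
])

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

def low_rank(k):
    """Rank-k approximation built from the k largest singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

k = 2  # number of latent factors kept (an arbitrary choice for this toy case)
A_k = low_rank(k)

# The Frobenius-norm reconstruction error shrinks as more factors are kept.
errors = [np.linalg.norm(A - low_rank(r)) for r in range(1, len(s) + 1)]
print(np.round(A_k, 2))
print(np.round(errors, 3))
```

Keeping every singular value reproduces `A` exactly; keeping only a few gives the low-rank approximation that the recommender relies on.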

**Data**:

We will use the MovieLens 100K dataset, collected from the MovieLens online movie recommender website. The data were collected over several periods of time.
Users were selected at random for inclusion, and every user has rated at least 20 movies.

The data includes:
- 100K ratings (1-5) from 943 users on 1,682 movies
- Each user has rated 20+ movies
- Simple demographic information for the users, such as gender, age, occupation, zip, etc.
- Genre information of movies

(https://grouplens.org/datasets/movielens/10m/)

In [1]:
import pandas as pd
import numpy as np
import scipy as sc
import pickle

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('fivethirtyeight')

import warnings
warnings.filterwarnings('ignore')

pd.set_option('display.max_columns', 500)
pd.set_option('display.max_rows', 500)

# Data

## Users Data

In [2]:
users = pd.read_pickle('users.pickle')
users.head()

|   | user_id | age | sex | occupation | zip_code |
|---|---|---|---|---|---|
| 0 | 1 | 24 | M | technician | 85711 |
| 1 | 2 | 53 | F | other | 94043 |
| 2 | 3 | 23 | M | writer | 32067 |
| 3 | 4 | 24 | M | technician | 43537 |
| 4 | 5 | 33 | F | other | 15213 |


## Ratings Data

In [3]:
ratings = pd.read_pickle('ratings.pickle')
ratings.head()

|   | user_id | movie_id | rating | timestamp |
|---|---|---|---|---|
| 0 | 196 | 242 | 3 | 881250949 |
| 1 | 186 | 302 | 3 | 891717742 |
| 2 | 22 | 377 | 1 | 878887116 |
| 3 | 244 | 51 | 2 | 880606923 |
| 4 | 166 | 346 | 1 | 886397596 |


## Movies Data

In [4]:
movies = pd.read_pickle('movies.pickle')
movies.drop(movies.columns[2:24], axis=1, inplace=True)
movies.head()

|   | movie_id | movie_title | genres |
|---|---|---|---|
| 0 | 1 | Toy Story (1995) | animation\|childrens\|comedy |
| 1 | 2 | GoldenEye (1995) | action\|adventure\|thriller |
| 2 | 3 | Four Rooms (1995) | thriller |
| 3 | 4 | Get Shorty (1995) | action\|comedy\|drama |
| 4 | 5 | Copycat (1995) | crime\|drama\|thriller |


# Model-Based Collaborative Filtering Movie Recommendation Model: Matrix Factorization with SVD

(https://medium.com/@james_aka_yale/the-4-recommendation-engines-that-can-predict-your-movie-tastes-bbec857b8223)

**Let's count the number of unique users and movies.**

In [5]:
num_users = ratings['user_id'].unique().shape[0]
num_movies = ratings['movie_id'].unique().shape[0]

print('Number of users:', num_users)
print('Number of movies:', num_movies)

Number of users: 943
Number of movies: 1682


**We need to format the ratings matrix to be one row per user and one column per movie.**

In [6]:
ratings_pivot = ratings.pivot(index='user_id', columns='movie_id', values='rating').fillna(0)
ratings_pivot.head()

| user_id | 1 | 2 | 3 | 4 | 5 | ... | 1682 |
|---|---|---|---|---|---|---|---|
| 1 | 5.0 | 3.0 | 4.0 | 3.0 | 3.0 | ... | 0.0 |
| 2 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 |
| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 |
| 5 | 4.0 | 3.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 |

(5 rows × 1682 columns; column labels are `movie_id` values, abbreviated here)
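
The reshaping step can be seen more easily on a few rows. The sketch below uses a made-up `toy_ratings` frame (not the MovieLens data) with the same three column names, and shows how `pivot` followed by `fillna(0)` turns long-format ratings into a user-by-movie matrix.

```python
import pandas as pd

# Hypothetical long-format ratings (made-up values), same columns as `ratings`.
toy_ratings = pd.DataFrame({
    'user_id':  [1, 1, 2, 3, 3],
    'movie_id': [10, 20, 10, 20, 30],
    'rating':   [5, 3, 4, 2, 1],
})

# One row per user, one column per movie; unrated pairs become NaN, then 0.
toy_pivot = toy_ratings.pivot(index='user_id', columns='movie_id', values='rating').fillna(0)
print(toy_pivot)
```

Note that `pivot` raises a `ValueError` if the same (user, movie) pair appears more than once, so this approach assumes each user rated each movie at most once.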


**Next, we need to normalize the data by subtracting each user's mean rating and convert it from a dataframe to a numpy array.**

In [7]:
# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() returns the same array
ratings_pivot_matrix = ratings_pivot.to_numpy()
user_ratings_mean = np.mean(ratings_pivot_matrix, axis=1)
ratings_normalized = ratings_pivot_matrix - user_ratings_mean.reshape(-1, 1)
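
A small worked example may help show what this centering does. In the sketch below the 2 × 4 matrix `R` is made-up. The first part mirrors the cell above, where the row mean is taken over all columns and so counts the zero-filled unrated entries. The second part is an alternative that the notebook does not use, averaging only the rated entries, included purely for comparison.

```python
import numpy as np

# Hypothetical 2 users x 4 movies matrix; 0 marks an unrated movie.
R = np.array([
    [5.0, 3.0, 0.0, 0.0],
    [4.0, 0.0, 0.0, 2.0],
])

# Notebook-style centering: the row mean counts the zeros as ratings.
row_mean_all = R.mean(axis=1)                    # one mean per user
centered_all = R - row_mean_all.reshape(-1, 1)   # broadcast each mean across its row

# Alternative (not what the notebook does): average only the rated entries
# and leave unrated entries at 0 after centering.
rated = R > 0
row_mean_rated = R.sum(axis=1) / rated.sum(axis=1)
centered_rated = np.where(rated, R - row_mean_rated.reshape(-1, 1), 0.0)

print(row_mean_all, row_mean_rated)
```

With a matrix this sparse, the two means differ a lot: the all-columns mean is pulled toward zero by the unrated entries.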

**Let's check the sparsity of the ratings dataset. When the user-item matrix is very sparse and high-dimensional, matrix factorization can restructure it into a low-rank form that approximates the original matrix.**

In [8]:
sparsity = round(1 - len(ratings)/float(num_users * num_movies), 3)
print('The sparsity level of MovieLens100K dataset is {:.1f} %.'.format(sparsity * 100))

The sparsity level of MovieLens100K dataset is 93.7 %.
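
Sparsity here is the fraction of user-movie cells with no rating. Assuming `len(ratings)` is the 100,000 ratings listed in the data description, the figure above is 1 − 100000 / (943 × 1682) ≈ 0.937. The sketch below computes the same quantity on a made-up 3 × 4 matrix, counting non-zero cells directly instead of using the length of the ratings table.

```python
import numpy as np

# Hypothetical 3 users x 4 movies matrix; 0 marks an unrated movie.
R = np.array([
    [5, 0, 0, 1],
    [0, 0, 3, 0],
    [0, 4, 0, 0],
])

num_rated = np.count_nonzero(R)        # observed ratings
sparsity = 1 - num_rated / R.size      # share of empty cells
print(round(sparsity, 3))
```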


**We will use the SciPy function `svds` because it allows us to choose how many latent factors to use to approximate the original ratings matrix.**

In [9]:
from scipy.sparse.linalg import svds
U, diags, Vt = svds(ratings_normalized, k=50)

In [10]:
D = np.diag(diags)
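
`svds` returns the singular values as a 1-D array, which is why `np.diag` is needed before multiplying the factors back together. The sketch below checks this on a made-up random matrix `M` standing in for `ratings_normalized`, and compares the result against the full decomposition from `numpy.linalg.svd`. The values are sorted before comparing, as a precaution against relying on the order in which `svds` returns them.

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
M = rng.normal(size=(8, 6))   # made-up dense matrix standing in for ratings_normalized

k = 3                                   # must be smaller than min(M.shape)
U_k, s_k, Vt_k = svds(M, k=k)           # s_k is a 1-D array of length k
D_k = np.diag(s_k)                      # k x k diagonal matrix
M_k = U_k @ D_k @ Vt_k                  # rank-k approximation, same shape as M

# Reference: the k largest singular values from the full SVD (descending order).
s_full = np.linalg.svd(M, compute_uv=False)
print(np.sort(s_k)[::-1])
print(s_full[:k])
```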

**We can start making movie predictions for every user from the decomposed matrices. But first we need to add the user means back to get the actual star ratings prediction.**

In [11]:
all_user_pred_ratings = np.dot(np.dot(U, D), Vt) + user_ratings_mean.reshape(-1, 1)
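
One possible way to turn a matrix of predicted ratings into recommendations is to rank, for a given user, the movies they have not yet rated. The sketch below is an illustration under stated assumptions, not the notebook's own method: the `observed` and `predicted` frames hold made-up values, and `recommend_top_n` is a hypothetical helper name.

```python
import pandas as pd

# Made-up example: 2 users x 4 movies. 0 in `observed` marks an unrated movie.
observed = pd.DataFrame(
    [[5.0, 0.0, 0.0, 3.0],
     [0.0, 4.0, 0.0, 0.0]],
    index=[1, 2], columns=[10, 20, 30, 40],
)
predicted = pd.DataFrame(
    [[4.8, 2.1, 3.9, 3.2],
     [3.5, 4.1, 1.0, 2.2]],
    index=observed.index, columns=observed.columns,
)

def recommend_top_n(user_id, n=2):
    """Hypothetical helper: highest predicted ratings among unrated movies."""
    unrated = observed.loc[user_id] == 0
    return predicted.loc[user_id][unrated].sort_values(ascending=False).head(n)

print(recommend_top_n(1))
```

Both toy frames are indexed by user ID here. A predictions frame built without an explicit index gets 0-based row labels instead, so the lookup would need to map user IDs to row positions.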

In [12]:
preds = pd.DataFrame(all_user_pred_ratings, columns=ratings_pivot.columns)
preds.head()

movie_id,1,2,3,4,5,...,1678,1679,1680,1681,1682
0,6.488436,2.959503,1.634987,3.024467,1.656526,...,-0.035641,-0.039227,-0.037434,-0.025552,0.023513
1,2.347262,0.129689,-0.098917,0.328828,0.159517,...,-0.015244,-0.008277,-0.01176,0.011639,-0.046924
2,0.291905,-0.26383,-0.151454,-0.179289,0.013462,...,0.002282,0.032363,0.017322,-0.006644,-0.00948
3,0.36641,-0.443535,0.041151,-0.007616,0.055373,...,0.023562,0.036405,0.029984,0.015612,-0.008713
4,4.263488,1.937122,0.052529,1.04935,0.652765,...,0.010338,0.004869,0.007603,-0.020575,0.00333


**We now create a function that returns the movies with the highest predicted ratings that the specified user hasn't already rated.**

In [13]:
def rec_movies(predictions, user_id, movies, original_ratings, num_rec):

    # Get and sort the user's predictions (user IDs are 1-based, rows are 0-based)
    user_row_num = user_id - 1
    sorted_user_preds = predictions.iloc[user_row_num].sort_values(ascending=False)

    # Get the user's data and merge in the movie info
    user_data = original_ratings[original_ratings['user_id'] == user_id]
    user_full = user_data.merge(movies, how='left', on='movie_id').sort_values(['rating'], ascending=False)
    print('User {0} has already rated {1} movies.'.format(user_id, user_full.shape[0]))
    print('Recommending highest {0} predicted rating movies not already rated.'.format(num_rec))

    # Recommend the highest predicted rating movies that the user hasn't rated yet.
    # The final iloc drops the 'Predictions' column, so only the movie info is returned.
    recs = (movies[~movies['movie_id'].isin(user_full['movie_id'])]
            .merge(pd.DataFrame(sorted_user_preds).reset_index(), how='left', on='movie_id')
            .rename(columns={user_row_num: 'Predictions'})
            .sort_values('Predictions', ascending=False)
            .iloc[:num_rec, :-1])

    return user_full, recs
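
**The core step of the function above (drop the movies a user has already rated, then keep the highest predictions) can be sketched on a tiny example. The data and the helper name `top_n_unrated` are hypothetical and only illustrate the idea; they are not part of the notebook's pipeline.**

```python
import pandas as pd

# Hypothetical toy predictions: rows are users (0-based), columns are movie_id
toy_preds = pd.DataFrame(
    [[4.8, 2.1, 3.9, 4.5],
     [1.2, 4.9, 3.3, 2.0]],
    columns=pd.Index([1, 2, 3, 4], name='movie_id'),
)

# Hypothetical ratings the users have actually given
toy_ratings = pd.DataFrame({
    'user_id':  [1, 1, 2],
    'movie_id': [1, 3, 2],
    'rating':   [5, 4, 5],
})

def top_n_unrated(preds_df, ratings_df, user_id, n):
    """Return the n highest predictions for movies that user_id has not rated."""
    user_preds = preds_df.iloc[user_id - 1]  # 1-based user IDs, 0-based rows
    seen = ratings_df.loc[ratings_df['user_id'] == user_id, 'movie_id'].tolist()
    unseen_preds = user_preds.drop(labels=seen)  # remove already-rated movies
    return unseen_preds.sort_values(ascending=False).head(n)

# User 1 has rated movies 1 and 3, so only movies 2 and 4 are candidates
top_for_user_1 = top_n_unrated(toy_preds, toy_ratings, user_id=1, n=2)
print(top_for_user_1)
```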

**Let's recommend 20 movies to user id 210.**

In [14]:
already_rated, predictions = rec_movies(preds, 210, movies, ratings, 20)

User 210 has already rated 132 movies.
Recommending highest 20 predicted rating movies not already rated.


**Here are the top 20 movies user 210 has rated.**

In [15]:
already_rated.head(20)

Unnamed: 0,user_id,movie_id,rating,timestamp,movie_title,genres
53,210,152,5,887730676,Sleeper (1973),comedy|sci_fi
44,210,182,5,887736232,GoodFellas (1990),crime|drama
29,210,73,5,891035837,Maverick (1994),action|comedy|western
82,210,483,5,887736482,Casablanca (1942),drama|romance|war
34,210,50,5,887731014,Star Wars (1977),action|adventure|romance|sci_fi|war
81,210,181,5,887731082,Return of the Jedi (1983),action|adventure|romance|sci_fi|war
125,210,168,5,887730342,Monty Python and the Holy Grail (1974),comedy
79,210,482,5,887736739,Some Like It Hot (1959),comedy|crime
124,210,302,5,890059415,L.A. Confidential (1997),crime|film_noir|mystery|thriller
41,210,257,5,887730789,Men in Black (1997),action|adventure|comedy|sci_fi


**Here are the top 20 movies that our model-based CF model using matrix factorization recommends.**

In [16]:
predictions

Unnamed: 0,movie_id,movie_title,genres
135,183,Alien (1979),action|horror|sci_fi|thriller
142,194,"Sting, The (1973)",comedy|crime
63,82,Jurassic Park (1993),action|adventure|sci_fi
111,143,"Sound of Music, The (1965)",musical
112,144,Die Hard (1988),action|thriller
103,133,Gone with the Wind (1939),drama|romance|war
404,504,Bonnie and Clyde (1967),crime|drama
52,66,While You Were Sleeping (1995),comedy|romance
171,239,Sneakers (1992),crime|drama|sci_fi
91,118,Twister (1996),action|adventure|thriller


**Although we didn't use movie genre as a feature, the latent features from the truncated matrix factorization appear to have picked up on the user's underlying tastes: the genres of the recommended movies (action, sci-fi, comedy, crime) overlap heavily with those of the user's top-rated movies.**
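
**A rough way to check this overlap is to count genres in both tables. The sketch below uses only the ten rows of each table displayed above for user 210, so it describes that sample and not the user's full history.**

```python
from collections import Counter

# Genre strings copied from the displayed rows of the top-rated table
rated_genres = [
    'comedy|sci_fi', 'crime|drama', 'action|comedy|western', 'drama|romance|war',
    'action|adventure|romance|sci_fi|war', 'action|adventure|romance|sci_fi|war',
    'comedy', 'comedy|crime', 'crime|film_noir|mystery|thriller',
    'action|adventure|comedy|sci_fi',
]

# Genre strings copied from the displayed rows of the recommendations table
recommended_genres = [
    'action|horror|sci_fi|thriller', 'comedy|crime', 'action|adventure|sci_fi',
    'musical', 'action|thriller', 'drama|romance|war', 'crime|drama',
    'comedy|romance', 'crime|drama|sci_fi', 'action|adventure|thriller',
]

def genre_counts(genre_strings):
    """Count how often each genre appears across pipe-separated genre strings."""
    return Counter(g for s in genre_strings for g in s.split('|'))

rated_counts = genre_counts(rated_genres)
rec_counts = genre_counts(recommended_genres)

shared = sorted(set(rated_counts) & set(rec_counts))
only_recommended = sorted(set(rec_counts) - set(rated_counts))

print('Shared genres:', shared)
print('Genres only in the recommendations:', only_recommended)
```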

# Evaluating the Model-Based CF Model Using Matrix Factorization

**We will use the `surprise` library to evaluate the model's RMSE, because it provides powerful, ready-to-use prediction algorithms, including SVD. `surprise` is a Python scikit for building and analyzing recommender systems.**

In [20]:
from surprise import Reader, Dataset, SVD, evaluate

In [23]:
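# Create a Reader with its default settings
reader = Reader()

# Build a Surprise Dataset from the ratings DataFrame
data = Dataset.load_from_df(ratings[['user_id', 'movie_id', 'rating']], reader)

# Split the dataset for 5-fold evaluation
# (split() and evaluate() belong to older Surprise releases; newer ones provide surprise.model_selection.cross_validate instead)
data.split(n_folds=5)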

In [24]:
svd = SVD()
evaluate(svd, data, measures=['RMSE'])

Evaluating RMSE of algorithm SVD.

------------
Fold 1
RMSE: 0.9375
------------
Fold 2
RMSE: 0.9369
------------
Fold 3
RMSE: 0.9323
------------
Fold 4
RMSE: 0.9363
------------
Fold 5
RMSE: 0.9351
------------
------------
Mean RMSE: 0.9356
------------
------------


CaseInsensitiveDefaultDict(list,
                           {'rmse': [0.93754135160943108,
                             0.93685829553426336,
                             0.93228252758368224,
                             0.93629478980283276,
                             0.93509856910237676]})
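
**For reference, RMSE is the square root of the mean squared difference between actual and predicted ratings. The sketch below defines it in plain Python (the function name `rmse` and the three sample ratings are illustrative assumptions) and then averages the five per-fold values from the output above.**

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings, used only to illustrate the formula
print(rmse([5, 3, 4], [4, 3, 2]))

# Per-fold RMSE values reported in the evaluation output above
fold_rmse = [
    0.93754135160943108,
    0.93685829553426336,
    0.93228252758368224,
    0.93629478980283276,
    0.93509856910237676,
]
mean_rmse = sum(fold_rmse) / len(fold_rmse)
print(round(mean_rmse, 4))
```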

**The mean RMSE is 0.9356, so a typical prediction is off by a little under one star on the 1-5 rating scale, which is pretty good. Let's now train on the full dataset and get the predictions.**

In [25]:
trainset = data.build_full_trainset()
svd.train(trainset)

<surprise.prediction_algorithms.matrix_factorization.SVD at 0x11e7fedd0>
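
**Despite its name, the `SVD` algorithm in `surprise` is usually described as a biased matrix factorization fitted with stochastic gradient descent, not a literal singular value decomposition. The following is a minimal sketch of that idea on hypothetical toy data. The function names, hyperparameters, and ratings are assumptions for illustration, not the library's implementation.**

```python
import numpy as np

def fit_biased_mf(ratings, n_users, n_items, n_factors=2, n_epochs=200,
                  lr=0.01, reg=0.02, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i with plain SGD."""
    rng = np.random.default_rng(seed)
    mu = np.mean([r for _, _, r in ratings])       # global mean rating
    bu = np.zeros(n_users)                         # user biases
    bi = np.zeros(n_items)                         # item biases
    P = rng.normal(0, 0.1, (n_users, n_factors))   # user latent factors
    Q = rng.normal(0, 0.1, (n_items, n_factors))   # item latent factors
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu_old = P[u].copy()                   # use the pre-update p_u for q_i
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu_old - reg * Q[i])
    return mu, bu, bi, P, Q

def train_rmse(ratings, mu, bu, bi, P, Q):
    """RMSE of the model on the ratings it was fitted to."""
    errs = [r - (mu + bu[u] + bi[i] + P[u] @ Q[i]) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Hypothetical (user, item, rating) triples with 0-based IDs
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]

rmse_before = train_rmse(toy, *fit_biased_mf(toy, 3, 3, n_epochs=0))
rmse_after = train_rmse(toy, *fit_biased_mf(toy, 3, 3, n_epochs=200))
print(rmse_before, rmse_after)
```

**The training error drops as the biases and factors are updated. On real data, held-out folds, as in the evaluation above, are what guard against overfitting.**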

**Let's use the sample user 210 again.**

In [26]:
ratings[ratings['user_id']==210]

Unnamed: 0,user_id,movie_id,rating,timestamp
13,210,40,3,891035994
1018,210,204,5,887730676
1155,210,70,4,887730589
1354,210,97,5,887736454
1417,210,527,5,887736232
1435,210,274,5,887730676
1850,210,357,5,887736206
2081,210,161,5,887736393
2392,210,380,4,887736482
2626,210,58,4,887730177


**Let's use SVD to predict the rating that user 210 will give to an arbitrary movie, e.g., movie ID 835.**

In [27]:
svd.predict(210, 835)

Prediction(uid=210, iid=835, r_ui=None, est=4.2458539630091527, details={u'was_impossible': False})

**The estimated rating is 4.246. The recommender system works purely on the basis of an assigned movie ID and tries to predict the rating based on how other users have rated the movie.**
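
**In a biased matrix factorization model of this kind, an estimate is commonly written as the global mean plus a user bias, an item bias, and the dot product of the user and item factor vectors, clipped to the rating scale. The sketch below shows that arithmetic with made-up parameter values. None of these numbers come from the trained model, and `predict_rating` is a hypothetical helper.**

```python
import numpy as np

def predict_rating(mu, b_u, b_i, p_u, q_i, rating_scale=(1, 5)):
    """Estimate = mu + b_u + b_i + q_i . p_u, clipped to the rating scale."""
    est = mu + b_u + b_i + float(np.dot(q_i, p_u))
    low, high = rating_scale
    return min(high, max(low, est))

# Hypothetical learned parameters for one user and one movie
mu = 3.5                      # global mean rating
b_u = 0.3                     # this user rates a bit above average
b_i = 0.2                     # this movie is rated a bit above average
p_u = np.array([0.5, -0.2])   # user latent factors
q_i = np.array([0.4, 0.1])    # movie latent factors

est = predict_rating(mu, b_u, b_i, p_u, q_i)
print(round(est, 2))

# An estimate above the scale is clipped to its upper bound
clipped = predict_rating(4.5, 0.5, 0.5, p_u, q_i)
print(clipped)
```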

**Matrix factorization (a low-rank approximation) captures the latent features driving the raw data, scales better to large datasets, and makes recommendations based on user tastes, but it discards some meaningful signal. It also has an interpretability problem: each singular vector is a linear combination of all input columns or rows, and the singular vectors are typically dense rather than sparse. In addition, the SVD method is limited to linear projections.**
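
**Both points, the information lost by truncation and the density of the singular vectors, can be seen on a small example. The rating matrix below is hypothetical, with 0 standing for "not rated"; the sketch uses NumPy's `np.linalg.svd` and is not the notebook's own factorization code.**

```python
import numpy as np

# Hypothetical 4 users x 5 movies rating matrix (0 = not rated)
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 0, 0, 1],
    [0, 1, 5, 4, 0],
    [1, 0, 4, 5, 0],
], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)

def rank_k_approx(k):
    """Reconstruct R from its k largest singular values and vectors."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Reconstruction error (Frobenius norm) for each rank k
errors = [float(np.linalg.norm(R - rank_k_approx(k))) for k in range(1, len(s) + 1)]
print(np.round(errors, 3))

# The raw matrix is sparse, but the leading right singular vector is dense:
# it assigns a nonzero weight to every movie.
zero_fraction_R = float(np.mean(R == 0))
nonzero_in_v1 = int(np.count_nonzero(np.abs(Vt[0]) > 1e-12))
print(zero_fraction_R, nonzero_in_v1)
```

**A lower rank gives a more compact model but a larger reconstruction error, and because every latent factor mixes all movies, no single factor maps cleanly onto one genre or attribute.**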