# Recommendations Engine For Movies

The movie rating data used in this exercise is the MovieLens `ml-latest-small` dataset from [GroupLens](https://grouplens.org/datasets/movielens/).

## Importing Data

In [136]:
import pandas as pd
import numpy as np
ratings_df = pd.read_csv("./ml-latest-small/ratings.csv")
movies_df = pd.read_csv("./ml-latest-small/movies.csv")

In [137]:
movies_df.head(2)

|   | movieId | title | genres |
|---|---------|-------|--------|
| 0 | 1 | Toy Story (1995) | Adventure\|Animation\|Children\|Comedy\|Fantasy |
| 1 | 2 | Jumanji (1995) | Adventure\|Children\|Fantasy |


In [138]:
ratings_df.head(2)

|   | userId | movieId | rating | timestamp |
|---|--------|---------|--------|-----------|
| 0 | 1 | 31 | 2.5 | 1260759144 |
| 1 | 1 | 1029 | 3.0 | 1260759179 |


## Creating the User Item Table

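To create the user-item matrix, let's use a pivot table: one row per user, one column per movie, and the rating as the cell value. Movies a user has not rated are filled with 0.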

In [139]:
R_df = ratings_df.pivot(index="userId", columns="movieId", values="rating").fillna(0)
R_df.head(2)

| userId / movieId | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ... | 161084 | 161155 | 161594 | 161830 | 161918 | 161944 | 162376 | 162542 | 162672 | 163949 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
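
To make the reshaping step concrete, here is a minimal sketch on a toy ratings table. The rating values and the `toy_` names are made up for illustration; only the `pivot(...).fillna(0)` call mirrors the cell above.

```python
import pandas as pd

# Toy ratings in the same long format as ratings.csv (values are invented).
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating":  [4.0, 3.5, 5.0, 2.0],
})

# One row per user, one column per movie; unrated pairs become 0.
toy_R = toy_ratings.pivot(index="userId", columns="movieId", values="rating").fillna(0)
print(toy_R)
```

Note that `pivot` raises a `ValueError` if the same (userId, movieId) pair appears more than once. If that can happen in your data, `pivot_table` with an aggregation function is one alternative.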


## Singular Value Decomposition

In [193]:
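from scipy.sparse.linalg import svds
R = R_df.to_numpy()  # as_matrix() was removed in pandas 1.0
# Keep at most 8 latent factors; svds requires k < min(R.shape).
k = min(min(R.shape) - 1, 8)
U, sigma, Vt = svds(R, k=k)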
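
As a rough illustration of what `svds` returns, the sketch below runs it on a small random matrix and checks the shapes of the three factors. The matrix, the seed, and the choice of 2 factors are assumptions made for the example, not values from this project.

```python
import numpy as np
from scipy.sparse.linalg import svds

# Small made-up "ratings" matrix: 6 users x 5 movies, values 0..5.
rng = np.random.default_rng(0)
toy_R = rng.integers(0, 6, size=(6, 5)).astype(float)

# Same rule as above: k must stay below the smaller matrix dimension.
k = min(min(toy_R.shape) - 1, 2)
U, sigma, Vt = svds(toy_R, k=k)

print(U.shape, sigma.shape, Vt.shape)  # (6, 2) (2,) (2, 5)

# Multiplying the factors back gives a rank-k approximation of toy_R.
approx = U @ np.diag(sigma) @ Vt
print(approx.shape)  # (6, 5)
```

`sigma` comes back as a 1-D array of singular values, which is why the prediction step wraps it in `np.diag` before multiplying.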

## Making Predictions

In [194]:
all_users_predicted_ratings = np.dot(np.dot(U,np.diag(sigma)),Vt)

In [196]:
all_users_predicted_ratings

array([[  1.16508645e-02,  -3.69107499e-03,   1.51857682e-02, ...,
         -3.16552788e-04,  -1.89931673e-04,   1.50938488e-03],
       [  1.64924370e+00,   1.55426966e+00,   6.10894118e-01, ...,
          1.56953603e-03,   9.41721616e-04,   6.51877834e-02],
       [  1.03236874e+00,   4.27261247e-01,   1.05487502e-01, ...,
          1.46437781e-03,   8.78626683e-04,  -3.43021502e-02],
       ..., 
       [  2.09072274e-01,   4.47633373e-02,   3.46432779e-02, ...,
         -4.41212343e-04,  -2.64727406e-04,  -2.78438712e-02],
       [  7.48372680e-01,   3.42115840e-01,   1.21867871e-01, ...,
          1.70906435e-04,   1.02543861e-04,  -1.84469764e-02],
       [  2.26192041e+00,   3.95873315e-01,   4.23334198e-02, ...,
          4.44739385e-03,   2.66843631e-03,  -1.13711249e-01]])
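
One way to turn the predicted ratings into recommendations is to rank, for a given user, the movies that user has not rated yet. The sketch below assumes the predictions have been wrapped in a DataFrame indexed by `userId` with `movieId` columns. The `recommend` helper, the toy scores, and the movie titles are hypothetical and are not taken from `recommender.py`.

```python
import pandas as pd

def recommend(preds_df, ratings_df, movies_df, user_id, n=3):
    """Return the n highest-scoring movies the user has not rated yet."""
    # Movies this user already rated are excluded from the ranking.
    seen = ratings_df.loc[ratings_df["userId"] == user_id, "movieId"].tolist()
    scores = preds_df.loc[user_id].drop(labels=seen, errors="ignore")
    top = scores.sort_values(ascending=False).head(n)
    top = top.rename("predicted").rename_axis("movieId").reset_index()
    # Attach titles; a left merge keeps the ranking order.
    return top.merge(movies_df, on="movieId", how="left")

# Toy inputs (all values invented for the example).
preds_df = pd.DataFrame(
    [[4.1, 0.2, 3.3, 2.0], [1.0, 4.5, 0.3, 3.9]],
    index=pd.Index([1, 2], name="userId"),
    columns=pd.Index([10, 20, 30, 40], name="movieId"),
)
toy_ratings = pd.DataFrame({"userId": [1], "movieId": [10], "rating": [4.0]})
toy_movies = pd.DataFrame({
    "movieId": [10, 20, 30, 40],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
})

result = recommend(preds_df, toy_ratings, toy_movies, user_id=1, n=2)
print(result)
```

With the real data, `preds_df` could be built as `pd.DataFrame(all_users_predicted_ratings, index=R_df.index, columns=R_df.columns)` so that rows and columns carry the original ids.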

## How to execute it

The command below runs the movie recommendation engine for the user with id 2, making 3 recommendations and showing the user's top 8 historical ratings.
```bash
python recommender.py '{"user id":2, "Recommendation limit": 3, "Historical limit":8}'
```
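
The script itself is not shown here, so the following is only a sketch of how such a JSON argument could be read. The `parse_request` function is hypothetical; the three keys are the ones used in the command above.

```python
import json

def parse_request(raw):
    """Parse the JSON argument into (user_id, n_recommendations, n_history)."""
    request = json.loads(raw)
    return (
        int(request["user id"]),
        int(request["Recommendation limit"]),
        int(request["Historical limit"]),
    )

# The same JSON string that the command above passes on the command line.
example = '{"user id":2, "Recommendation limit": 3, "Historical limit":8}'
print(parse_request(example))  # (2, 3, 8)
```

In a command-line script, the raw string would typically come from `sys.argv[1]`.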

## References

1. [Overview of movie recommendation system algorithms.](https://blog.statsbot.co/recommendation-system-algorithms-ba67f39ac9a3)
2. [Matrix factorization recommender.](https://beckernick.github.io/matrix-factorization-recommender/)
3. [A movie recommendation system implemented on Spark.](https://www.packtpub.com/books/content/building-recommendation-engine-spark)
4. [About the Netflix recommendation system.](https://medium.com/netflix-techblog/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429)
5. [Performance metrics.](https://en.wikipedia.org/wiki/Information_retrieval#Precision_at_K)
6. [Movie ratings dataset.](https://grouplens.org/datasets/movielens/)