##### User-based collaborative filtering
* Recommend products to a user that other, similar users have liked.

<img src="img/reco5.jpg">
<img src="img/reco6.jpg">
<img src="img/reco7.jpg">
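The idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the toy ratings, the names `cosine_similarity` and `recommend_user_based`, and the "liked" threshold of 4.0 are all invented here, and similarity is computed only over the items both users have rated.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. All values are made up for illustration.
ratings = {
    "alice": {"A": 5.0, "B": 4.0, "C": 1.0},
    "bob":   {"A": 4.0, "B": 5.0, "D": 5.0},
    "carol": {"A": 1.0, "C": 5.0, "E": 4.0},
}

def cosine_similarity(u, v):
    """Cosine similarity between two rating dicts over their shared keys."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[k] * v[k] for k in common)
    norm_u = sqrt(sum(u[k] ** 2 for k in common))
    norm_v = sqrt(sum(v[k] ** 2 for k in common))
    return dot / (norm_u * norm_v)

def recommend_user_based(target, ratings, min_rating=4.0):
    """Recommend items the most similar user liked and the target has not rated."""
    others = [u for u in ratings if u != target]
    nearest = max(others, key=lambda u: cosine_similarity(ratings[target], ratings[u]))
    return sorted(
        item for item, r in ratings[nearest].items()
        if r >= min_rating and item not in ratings[target]
    )

print(recommend_user_based("alice", ratings))  # ['D']
```

Here bob is the nearest neighbor of alice, so the item bob rated highly that alice has not seen yet is recommended. A real system would usually combine several neighbors rather than a single one.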

##### Item-based collaborative filtering
* Instead of looking for similar users, we look for similar items: two items are similar when the same users rate them alike.

<img src="img/reco8.jpg">
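A rough sketch of the item-based variant, again with invented toy data and hypothetical helper names (`item_vectors`, `most_similar_item`). The only real change from the user-based idea is that the rating table is inverted, so that each item is described by the users who rated it.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Values are invented for illustration.
ratings = {
    "alice": {"A": 5.0, "B": 4.0},
    "bob":   {"A": 4.0, "B": 5.0, "C": 1.0},
    "carol": {"A": 1.0, "B": 2.0, "C": 5.0},
}

def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, r in user_ratings.items():
            items.setdefault(item, {})[user] = r
    return items

def cosine_similarity(a, b):
    """Cosine similarity between two rating dicts over their shared keys."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sqrt(sum(a[k] ** 2 for k in common))
    norm_b = sqrt(sum(b[k] ** 2 for k in common))
    return dot / (norm_a * norm_b)

def most_similar_item(item, ratings):
    """Return the item whose user ratings look most like those of `item`."""
    items = item_vectors(ratings)
    others = [i for i in items if i != item]
    return max(others, key=lambda i: cosine_similarity(items[item], items[i]))

print(most_similar_item("A", ratings))  # B
```

Item similarities tend to change more slowly than user similarities, so in practice they are often precomputed offline.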


#### High Throughput and Low Latency

One of the main issues in recommender systems is scalability.

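If there are m users and n products, neighborhood-based collaborative filtering takes O(m*n) time in the worst case, because every user may have to be compared across every product.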

The user-item rating matrix is also sparse: each user rates only a small fraction of the products.
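To make the sparsity point concrete, here is a small sketch that stores only the observed ratings and measures how much of the full user-by-item grid is actually filled. The ratings and the helper name `density` are assumptions made for this example.

```python
# Observed ratings only, keyed by (user_id, item_id). Values are invented.
observed = {
    (1, 31): 2.5,
    (1, 1029): 3.0,
    (2, 31): 4.0,
    (3, 17): 5.0,
}

def density(observed):
    """Fraction of the full user x item grid that holds a rating."""
    users = {u for u, _ in observed}
    items = {i for _, i in observed}
    return len(observed) / (len(users) * len(items))

print(f"{density(observed):.3f}")  # 0.444
```

Keeping only the observed entries means memory grows with the number of ratings rather than with m*n.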

Recommendation is an optimization problem.

### Solution: Singular Value Decomposition
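In recommender systems, "SVD" usually refers to a matrix factorization learned by stochastic gradient descent on the known ratings rather than a classical SVD of the full matrix. The sketch below is a simplified illustration of that idea, not the Surprise implementation: it leaves out the bias terms, and the function names, hyperparameters, and toy ratings are all assumptions.

```python
import random
from math import sqrt

def factorize(ratings, n_factors=2, n_epochs=500, lr=0.01, reg=0.02, seed=0):
    """Learn user factors p and item factors q so that p[u] . q[i] approximates r."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - sum(pf * qf for pf, qf in zip(p[u], q[i]))
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                # Gradient step on squared error with L2 regularization
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q

def predict(p, q, user, item):
    """Predicted rating is the dot product of the user and item factors."""
    return sum(pf * qf for pf, qf in zip(p[user], q[item]))

def train_rmse(p, q, ratings):
    """Root mean squared error on the ratings used for training."""
    sq = [(r - predict(p, q, u, i)) ** 2 for u, i, r in ratings]
    return sqrt(sum(sq) / len(sq))

# Toy (user, item, rating) triples, invented for illustration.
toy = [
    ("u1", "A", 5.0), ("u1", "B", 4.0),
    ("u2", "A", 4.0), ("u2", "C", 2.0),
    ("u3", "B", 5.0), ("u3", "C", 1.0),
]

p0, q0 = factorize(toy, n_epochs=0)   # random initial factors only
p, q = factorize(toy)                 # trained factors
print(train_rmse(p0, q0, toy), train_rmse(p, q, toy))
```

Because only the observed ratings are visited, the cost of one epoch is proportional to the number of ratings, not to m*n.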

In [4]:
import pandas as pd



In [1]:
# !conda install -c conda-forge scikit-surprise
# !pip install scikit-surprise

In [8]:
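from surprise import Reader, Dataset, SVD

from surprise.model_selection import cross_validate

# Reader() assumes a rating scale of 1 to 5 unless rating_scale is given
reader = Reader()

# Columns: userId, movieId, rating, timestamp
df = pd.read_csv('datasets/ratings_small.csv')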

In [10]:
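# Surprise expects the columns in the order user, item, rating
data = Dataset.load_from_df(df[['userId', 'movieId', 'rating']], reader)

# Matrix-factorization model trained with stochastic gradient descent
algo = SVD()

# 5-fold cross-validation, reporting RMSE and MAE on each held-out fold
cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)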

Evaluating RMSE, MAE of algorithm SVD on 5 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Mean    Std     
RMSE (testset)    0.8917  0.8904  0.9005  0.9024  0.9043  0.8979  0.0057  
MAE (testset)     0.6882  0.6860  0.6944  0.6974  0.6958  0.6924  0.0045  
Fit time          4.06    4.23    4.36    4.34    4.38    4.27    0.12    
Test time         0.12    0.13    0.24    0.12    0.12    0.14    0.05    


{'test_rmse': array([0.89174637, 0.89042325, 0.9005333 , 0.90238528, 0.90425513]),
 'test_mae': array([0.68823892, 0.68595795, 0.69441764, 0.69738215, 0.69578159]),
 'fit_time': (4.058589935302734,
  4.227385997772217,
  4.3574299812316895,
  4.338274717330933,
  4.376881837844849),
 'test_time': (0.12016415596008301,
  0.12646913528442383,
  0.2389509677886963,
  0.12193632125854492,
  0.11616683006286621)}
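For reference, the two error measures reported above can be written out by hand. This is a minimal sketch with invented ratings and predictions; the function names `rmse` and `mae` are chosen here and are not Surprise calls.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: average size of the miss, in rating units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Invented true ratings and predictions for four user-item pairs.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

print(rmse(actual, predicted))  # 0.75
print(mae(actual, predicted))   # 0.625
```

Read this way, the mean MAE of about 0.69 in the table means the predictions are off by roughly 0.7 of a rating point on average.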

In [11]:
df.columns

Index(['userId', 'movieId', 'rating', 'timestamp'], dtype='object')

In [12]:
df[df['userId']==1]

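| | userId | movieId | rating | timestamp |
|---|---|---|---|---|
| 0 | 1 | 31 | 2.5 | 1260759144 |
| 1 | 1 | 1029 | 3.0 | 1260759179 |
| 2 | 1 | 1061 | 3.0 | 1260759182 |
| 3 | 1 | 1129 | 2.0 | 1260759185 |
| 4 | 1 | 1172 | 4.0 | 1260759205 |
| 5 | 1 | 1263 | 2.0 | 1260759151 |
| 6 | 1 | 1287 | 2.0 | 1260759187 |
| 7 | 1 | 1293 | 2.0 | 1260759148 |
| 8 | 1 | 1339 | 3.5 | 1260759125 |
| 9 | 1 | 1343 | 2.0 | 1260759131 |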


In [13]:
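# cross_validate only fitted algo on training folds; for final predictions,
# consider algo.fit(data.build_full_trainset()) first.
# Estimate the rating user 1 would give to movie 302
algo.predict(1, 302, verbose=True)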

user: 1          item: 302        r_ui = None   est = 2.78   {'was_impossible': False}


Prediction(uid=1, iid=302, r_ui=None, est=2.7791730229265523, details={'was_impossible': False})

In [16]:
algo.predict(1,17,verbose=True)

user: 1          item: 17         r_ui = None   est = 3.19   {'was_impossible': False}


Prediction(uid=1, iid=17, r_ui=None, est=3.1860024646150156, details={'was_impossible': False})
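Single predictions like the two above become recommendations once unseen items are ranked by their estimated rating. The sketch below shows that step with a stand-in predictor: `top_n`, `fake_predict`, and the scores for items 31 and 50 are invented for illustration. In the notebook the predictor could be something like `lambda u, i: algo.predict(u, i).est`.

```python
def top_n(predict, user, candidates, seen, n=3):
    """Rank unseen candidate items by predicted rating, highest first."""
    scored = [(item, predict(user, item)) for item in candidates if item not in seen]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# Stand-in scores: 302 and 17 echo the estimates above, the rest are made up.
fake_scores = {302: 2.78, 17: 3.19, 31: 2.5, 50: 4.1}

def fake_predict(user, item):
    """Hypothetical predictor that looks the score up in a table."""
    return fake_scores[item]

print(top_n(fake_predict, user=1, candidates=[302, 17, 31, 50], seen={31}, n=2))
# [(50, 4.1), (17, 3.19)]
```

Items the user has already rated are filtered out before ranking, so only new titles are suggested.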

### Conclusions

1. There are three types of recommendation engines:
    * Demography based
    * Content based
    * Collaborative filtering based


2. The demography-based approach is fairly elementary.


3. Content-based recommendations can be improved by adding more metadata.


4. Collaborative filtering can be user based (UBCF) or item based (IBCF).

# Great Job!