In [156]:
import numpy as np
import pandas as pd
from scipy import stats


## How Matrix Factorization works

When we factorise a number, we might say that 15 factors into 3 and 5. Matrix factorisation is quite similar, but instead of finding two numbers that multiply together, we find two matrices that multiply together. Recommendation systems love matrix factorisation: by factoring the user x item matrix into a user x feature matrix and an item x feature matrix, we uncover underlying features that connect users and items.

This may make more sense with an example. Take Netflix recommending movies: the underlying features may be things like whether a title is a comedy or whether it has Will Smith in it. Likewise, users may have a preference for comedy or a preference for Will Smith. Therefore, if we can break the items down into features, and break our users down into how much they like those features, we can recommend items with similar features to those the user likes. For example, if we know from a user's past ratings that they like Will Smith and comedies, we can recommend The Fresh Prince of Bel-Air.
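
To make the analogy concrete, here is a minimal sketch with made-up numbers. The two labelled features ("comedy" and "Will Smith"), the users, the titles, and every value are hypothetical and chosen only for illustration; a real factorisation learns unlabelled features from the ratings themselves.

```python
import numpy as np

# Hypothetical user x feature matrix: how much each user likes
# [comedy, Will Smith]. All values are invented for illustration.
user_features = np.array([
    [1.0, 2.0],   # user A: likes comedy a bit, Will Smith a lot
    [2.0, 0.0],   # user B: likes comedy, indifferent to Will Smith
])

# Hypothetical item x feature matrix: how much of
# [comedy, Will Smith] each title contains.
item_features = np.array([
    [1.0, 1.0],   # title X: a comedy starring Will Smith
    [1.0, 0.0],   # title Y: a comedy without him
    [0.0, 1.0],   # title Z: a Will Smith drama
])

# Multiplying the two factors gives a user x item score matrix,
# just as 3 * 5 gives back 15.
scores = user_features @ item_features.T
print(scores)

# A single entry is the dot product of one user row and one item row.
print(np.dot(user_features[0], item_features[0]))
```

User A scores title X highest because it matches both of their preferences, while user B scores the two comedies equally.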

Once we have both the item feature matrix and the user feature matrix, we can take the dot product of an individual user's row and an individual item's row to get a predicted rating. Below is a simple illustration of this.

![factorization](https://i.ytimg.com/vi/ZspR5PZemcs/maxresdefault.jpg)

I decided to write a basic matrix factorisation class myself to better understand what is going on behind the scenes.

In [2]:
# example user-item matrix (rows = users, columns = items, 0 = not rated yet)
ratings = np.array([
    [2, 5, 3, 0],
    [5, 0, 2, 1],
    [3, 0, 1, 2],
    [3, 3, 0, 4],
    [0, 1, 5, 4],
])
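
Since a 0 in this matrix stands for "not rated yet", only the non-zero entries are real observations. As a rough sketch (the variable names `users`, `items`, `observed`, `samples` and `global_mean` are my own, not part of the original notebook), the observed ratings can be pulled out with `np.nonzero`:

```python
import numpy as np

ratings = np.array([
    [2, 5, 3, 0],
    [5, 0, 2, 1],
    [3, 0, 1, 2],
    [3, 3, 0, 4],
    [0, 1, 5, 4],
])

# Row (user) and column (item) indices of every observed, non-zero rating.
users, items = np.nonzero(ratings)
observed = ratings[users, items]

# (user, item, rating) triples: the samples a model would train on.
samples = list(zip(users.tolist(), items.tolist(), observed.tolist()))

# Mean over observed ratings only, so the zeros do not drag it down.
global_mean = observed.mean()

print(len(samples))
print(global_mean)
```

Each of the five users has left exactly one item unrated, so 15 of the 20 cells are usable as training samples.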

In [3]:
class rec_sys():
    def __init__(self, data, learning_rate=0.01, regularization=0.1, n_factors=10, runs=10):
        '''Matrix factorisation recommender trained with stochastic gradient descent.

        data: 2-D array of ratings (rows = users, columns = items, 0 = not rated)
        learning_rate: step size of each gradient update
        regularization: penalty on large biases and factors, to limit overfitting
        n_factors: how many features to learn; using fewer also
                   acts a bit like regularisation
        runs: how many passes to make over the observed ratings'''

        self.data = data
        self.num_users = data.shape[0]
        self.num_items = data.shape[1]
        # step size of the gradient descent:
        # too high and the updates overshoot and swing around the optimum
        self.learning_rate = learning_rate
        self.n_factors = n_factors
        # how many times we run through the data
        self.runs = runs
        # regularization so that we avoid overfitting
        self.regularization = regularization
        # mean of the rated entries only (0 means not rated), computed once
        self.global_mean = np.mean(data[data != 0])

    def fit(self):
        '''Learn the user and item factor vectors and biases
        from the observed (non-zero) ratings.'''

        # init user and item factors with small random values
        self.p = np.random.normal(0, 0.1, (self.num_users, self.n_factors))
        self.q = np.random.normal(0, 0.1, (self.num_items, self.n_factors))

        # init user and item biases
        self.user_bias = np.zeros(self.num_users)
        self.item_bias = np.zeros(self.num_items)

        # collect the (user, item, rating) training samples
        self.samples = []
        for user in range(self.num_users):
            for item in range(self.num_items):
                # 0 means not rated yet, so we leave it out of training
                if self.data[user, item] > 0:
                    self.samples.append((user, item, self.data[user, item]))

        # stochastic gradient descent: one update per observed rating
        for n in range(self.runs):
            for user, item, rating in self.samples:
                # comparing actual rating to predicted
                prediction = self.predict_rating(user, item)
                # getting error
                err = rating - prediction

                # adjusting biases based on error
                self.user_bias[user] += self.learning_rate * (
                    err - self.regularization * self.user_bias[user])
                self.item_bias[item] += self.learning_rate * (
                    err - self.regularization * self.item_bias[item])

                # updating user-feature and item-feature relationships based on error.
                # Copy the user's factors first: slicing returns a view, so without
                # .copy() the item update would see the already-updated user factors.
                prev_p = self.p[user, :].copy()
                self.p[user, :] += self.learning_rate * (
                    err * self.q[item, :] - self.regularization * self.p[user, :])
                self.q[item, :] += self.learning_rate * (
                    err * prev_p - self.regularization * self.q[item, :])

    def predict_rating(self, user, item):
        # global mean + user bias + item bias + dot product of the factor vectors
        prediction = (self.global_mean + self.user_bias[user] + self.item_bias[item]
                      + np.dot(self.p[user, :], self.q[item, :]))
        return prediction
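
Written out as equations, the `fit` loop is built around the prediction and update rules below. The symbols are notation introduced here for readability, not names from the code: $\mu$ is the mean of the observed ratings, $b_u$ and $b_i$ are the user and item biases, $p_u$ and $q_i$ are the user and item factor vectors, $\eta$ is the learning rate and $\lambda$ is the regularization strength.

$$
\hat{r}_{ui} = \mu + b_u + b_i + p_u \cdot q_i, \qquad e_{ui} = r_{ui} - \hat{r}_{ui}
$$

$$
\begin{aligned}
b_u &\leftarrow b_u + \eta \,(e_{ui} - \lambda\, b_u) \\
b_i &\leftarrow b_i + \eta \,(e_{ui} - \lambda\, b_i) \\
p_u &\leftarrow p_u + \eta \,(e_{ui}\, q_i - \lambda\, p_u) \\
q_i &\leftarrow q_i + \eta \,(e_{ui}\, p_u - \lambda\, q_i)
\end{aligned}
$$

Each step moves the parameters along the negative gradient of the regularized squared error for a single rating, $\tfrac{1}{2}\left[e_{ui}^2 + \lambda\left(b_u^2 + b_i^2 + \lVert p_u \rVert^2 + \lVert q_i \rVert^2\right)\right]$.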
    
    

In [4]:
# creating an instance of the system
reccomend = rec_sys(data=ratings, learning_rate=0.01, regularization=0.1, n_factors=3, runs=1000)
# fitting the system
reccomend.fit()
# predicting an existing rating (user 0 gave item 0 a rating of 2)
print(f'existing rating item 0 user 0 : {reccomend.predict_rating(0,0)}')
# predicting a new rating: index -1 is the last (fifth) user, who has not rated item 0
print(f'new rating item 0 user 5 : {reccomend.predict_rating(-1,0)}')

existing rating item 0 user 0 : 2.1574050664099715
new rating item 0 user 5 : 3.2724078324264783
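
Predicted ratings become recommendations once we rank the items a user has not rated yet. The sketch below is one possible way to do that. The helper `top_unrated_items` is hypothetical, and the `ratings` and `predicted` arrays are made-up illustrative values, not output from the model above. In practice each predicted entry would come from a fitted model, for example via `predict_rating(user, item)`.

```python
import numpy as np

def top_unrated_items(ratings, predicted, user, k=2):
    """Return the indices of the k unrated items with the highest predicted rating."""
    # Items this user has not rated yet (0 means not rated).
    unrated = np.flatnonzero(ratings[user] == 0)
    # Sort those items by predicted rating, highest first.
    order = np.argsort(predicted[user, unrated])[::-1]
    return unrated[order][:k].tolist()

# Made-up data for illustration: one user who has rated items 0 and 2 only.
ratings = np.array([[2, 0, 3, 0, 0]])
predicted = np.array([[2.1, 3.5, 3.0, 4.2, 1.0]])

print(top_unrated_items(ratings, predicted, user=0))  # → [3, 1]
```

Already-rated items are excluded before sorting, so a high prediction for something the user has seen never crowds out a new suggestion.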
