## Matrix Factorization

#### Matrix Factorization is a simple embedding model

#### Given a rating matrix R of dimension n x m, where n = no. of users and m = no. of items, the model learns: 1. User embedding U of dimension n x k 2. Item embedding V of dimension m x k, where k = embedding dimension. The product of U and the transpose of V approximates R.
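
The objective that the training code in this notebook works on can be written out as below. This is a sketch read off from the loop rather than a formal derivation. The symbol $\Omega$ is an assumption introduced here for the set of observed (non-zero) entries of R; $u_i$ is the embedding of user $i$, $v_j$ is the embedding of item $j$, $\alpha$ is the learning rate and $\beta$ is the regularization parameter.

$$
\min_{U, V} \sum_{(i,j) \in \Omega} \left[ \left(r_{ij} - u_i^\top v_j\right)^2 + \frac{\beta}{2} \left(\lVert u_i \rVert^2 + \lVert v_j \rVert^2\right) \right]
$$

With $e_{ij} = r_{ij} - u_i^\top v_j$, one gradient step for a single observed rating is:

$$
u_{ik} \leftarrow u_{ik} + \alpha \left(2 e_{ij} v_{jk} - \beta u_{ik}\right), \qquad
v_{jk} \leftarrow v_{jk} + \alpha \left(2 e_{ij} u_{ik} - \beta v_{jk}\right)
$$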

In [1]:
import numpy  # the cells below call numpy.dot, numpy.array, etc. by full name

In [9]:
def matrix_factorization(R, U, V, K, steps = 5000, alpha = 0.002, beta = 0.02):
    '''
    R : rating matrix
    U : user embedding matrix 
    V : item embedding matrix
    K : number of features / embedding dimension
    steps : iterations
    alpha : learning rate
    beta : regularization parameter
    '''
    V = V.T
    
    for step in range(steps):
        for i in range(len(R)):
            for j in range(len(R[0])):
                # if rating is present
                if R[i][j] > 0:
                    # calculate error
                    eij = R[i][j] - numpy.dot(U[i,:], V[:,j])
                    
                    for k in range(K):
                        # gradient step with learning rate alpha and L2 penalty beta;
                        # keep the old U[i][k] so V's update uses the pre-update value
                        u_ik = U[i][k]
                        U[i][k] = u_ik + alpha * (2 * eij * V[k][j] - beta * u_ik)
                        V[k][j] = V[k][j] + alpha * (2 * eij * u_ik - beta * V[k][j])
        
        e = 0
        
        # total regularized squared error over the observed ratings
        # (a sum, not a mean, so it grows with the number of ratings)
        for i in range(len(R)):
            for j in range(len(R[0])):
                if R[i][j] > 0:
                    e += numpy.square(R[i][j] - numpy.dot(U[i,:],V[:,j]))
                    for k in range(K):
                        e += (beta/2) * (pow(U[i][k],2) + pow(V[k][j],2))
        if e < 0.001:
            break
            
    return U, V.T
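
The error computed at the end of each step can also be written without the nested loops. The following is a minimal sketch, not part of the original notebook: `regularized_error` is a hypothetical helper name, and it assumes U has shape (users, K), V has shape (items, K) as returned by `matrix_factorization`, and that a rating of 0 means "missing".

```python
import numpy

def regularized_error(R, U, V, beta=0.02):
    """Squared error on observed ratings plus the L2 penalty (hypothetical helper)."""
    R = numpy.asarray(R, dtype=float)
    mask = R > 0                                # True where a rating is present
    residual = (R - U @ V.T) * mask             # zero out the missing entries
    squared_error = numpy.sum(residual ** 2)
    # every observed (i, j) contributes (beta / 2) * (|U[i]|^2 + |V[j]|^2)
    user_norms = (U ** 2).sum(axis=1)[:, None]  # shape (users, 1)
    item_norms = (V ** 2).sum(axis=1)[None, :]  # shape (1, items)
    penalty = (beta / 2) * numpy.sum(mask * (user_norms + item_norms))
    return squared_error + penalty
```

Because the penalty term is always positive, this value tends to stay well above a small threshold such as 0.001 even when the fit on observed ratings is close.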

In [2]:
# input rating matrix
R = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
    [2, 1, 3, 0],
]

In [11]:
R = numpy.array(R)
# N : number of users
N = len(R)
# M : number of items
M = len(R[0])
# K : number of features (embedding dimension)
K = 3

# Initializing user and item embedding matrices
U = numpy.random.rand(N,K)
V = numpy.random.rand(M,K)

# trained user and item embeddings
nU, nV = matrix_factorization(R, U, V, K)
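
Since U and V start from random values, the trained embeddings differ slightly from run to run. A minimal sketch of a reproducible initialization is shown below; the seed value 0 and the names `rng`, `U0` and `V0` are arbitrary choices for illustration and do not appear in the original notebook.

```python
import numpy

# a seeded generator gives the same starting matrices on every run
rng = numpy.random.default_rng(seed=0)

N, M, K = 6, 4, 3          # users, items, embedding dimension (as above)
U0 = rng.random((N, K))    # uniform values in [0, 1), like numpy.random.rand
V0 = rng.random((M, K))
```

`U0` and `V0` could then be passed to `matrix_factorization` in place of `U` and `V`.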

In [12]:
# nR : predicted rating matrix
nR = numpy.dot(nU, nV.T)

In [13]:
nR

array([[4.97869381, 2.9735533 , 3.31500006, 1.00214929],
       [3.98041539, 2.26333595, 2.84224295, 0.99963324],
       [1.00791549, 0.98454008, 5.87308563, 4.96961038],
       [0.99899871, 0.69524851, 4.75333885, 3.98276579],
       [1.44561732, 1.0074433 , 4.97863307, 3.99768097],
       [1.9810794 , 1.02096781, 2.98938699, 1.97950478]])
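
One way to read this output is to split it into the entries that had a rating (where the prediction should match R) and the entries that were 0 (where the prediction is the model's estimate). The sketch below copies R and the printed nR from above; the names `observed` and `rmse` are introduced here for illustration.

```python
import numpy

R = numpy.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
    [2, 1, 3, 0],
])

# predicted ratings, copied from the output printed above
nR = numpy.array([
    [4.97869381, 2.9735533, 3.31500006, 1.00214929],
    [3.98041539, 2.26333595, 2.84224295, 0.99963324],
    [1.00791549, 0.98454008, 5.87308563, 4.96961038],
    [0.99899871, 0.69524851, 4.75333885, 3.98276579],
    [1.44561732, 1.0074433, 4.97863307, 3.99768097],
    [1.9810794, 1.02096781, 2.98938699, 1.97950478],
])

observed = R > 0  # True where the user actually rated the item

# root mean squared error on the observed ratings only
rmse = numpy.sqrt(numpy.mean((R[observed] - nR[observed]) ** 2))
print("RMSE on observed ratings:", rmse)

# predictions for the user-item pairs that had no rating
for user, item in zip(*numpy.nonzero(~observed)):
    print(f"user {user}, item {item}: predicted {nR[user, item]:.2f}")
```

Note that this measures fit on the same ratings used for training, so it says nothing about how well the unrated entries are predicted. Predictions can also fall outside the 1 to 5 range, as the value 5.87 for user 2 and item 2 shows.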