In [11]:
import pandas as pd
import numpy as np
import scipy as sp
import os
import gc
import collections

<h2> Set up Custom Gradient Descent </h2>

In [12]:
intmatrix = np.load("./data/movie-matrix.npz")["data"]
# Define 512-dimensional embeddings for users (rows of Usr) and items (columns of Itm);
# a higher user-item inner product is read as a stronger preference
latentdim = 512
np.random.seed(1)
Usr = np.random.normal(loc=0, scale=1, size=(intmatrix.shape[0], latentdim))
Itm = np.random.normal(loc=0, scale=1, size=(latentdim, intmatrix.shape[1]))

In [13]:
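# Per-user mean rating, computed over rated (non-zero) entries only
mean = np.sum(intmatrix, axis=1)/(intmatrix!=0).sum(axis=1)
# Remember which entries were unrated so they stay 0 after centering
zmask = intmatrix==0
intmatrix = intmatrix-mean.reshape(-1,1)
intmatrix[zmask]=0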
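
As a small illustration of the centering step above, the following sketch applies the same idea to a toy matrix. The values and the names `ratings`, `rated`, `user_mean`, and `centered` are made up for this example and are not part of the notebook's data.

```python
import numpy as np

# Toy ratings matrix (hypothetical values); 0 means "not rated"
ratings = np.array([[5.0, 3.0, 0.0],
                    [0.0, 2.0, 4.0]])

rated = ratings != 0
# Per-user mean over rated items only
user_mean = ratings.sum(axis=1) / rated.sum(axis=1)
# Subtract each user's mean from rated entries; unrated entries stay at 0
centered = np.where(rated, ratings - user_mean[:, None], 0.0)

print(user_mean)
print(centered)
```

One consequence worth keeping in mind: a rating exactly equal to the user's mean also becomes 0 after centering, so the loss defined next treats it the same as an unrated entry.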

In [14]:
# Define the loss.
# Sum the inner products of all user-item pairs rated above the user's mean and subtract the sum for pairs
# rated below the mean. We want to maximize this quantity, so the loss is its negation.
def embed_loss(Usr, Itm, intmatrix, wbl=0):
    mat = np.matmul(Usr, Itm)
    posquant = mat[intmatrix>0].sum()
    negquant = mat[intmatrix<0].sum()
    return negquant-posquant

In [15]:
embed_loss(Usr, Itm, intmatrix)

-4792.202913091733

To define the gradient used at each epoch, we realise that, for our embed_loss function $L$, the user matrix $P$, and the item matrix $Q$: <br/>
$\frac{dL}{dP} = D \cdot Q^T$ <br/>
$\frac{dL}{dQ} = P^T \cdot D$ <br/>
where $D$ is a matrix with the same dimensions as the intmatrix $W$, $943 \times 1682$, and for $1\leq i \leq 943, 1\leq j \leq 1682,$ <br/> $D_{i,j} = -1$ if $W_{i,j}>0$, $1$ if $W_{i,j}<0$, else $0$. Hence $D_{i,j}$ is just the sign with which the inner product of user $i$ and item $j$ enters embed_loss, that is, $L = \sum_{i,j} D_{i,j} (PQ)_{i,j}$. Since embed_loss is already the negation of the quantity we want to maximize, we minimize it by subtracting the gradient, scaled by a learning rate, at each epoch. Specifically, for epoch $t$, $1 \leq t \leq n$, we define $P_{t}$ and $Q_{t}$, with $P_{0}$ and $Q_{0}$ being the random initializations: <br/>
$P_{t} = P_{t-1}-\alpha \cdot \frac{dL_{t-1}}{dP_{t-1}}$ <br/>
$Q_{t} = Q_{t-1}-\alpha \cdot \frac{dL_{t-1}}{dQ_{t-1}}$ <br/>
for a learning rate $\alpha$.
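
The two gradient formulas can be sanity-checked numerically. The sketch below is not part of the original notebook: it uses small random matrices (sizes, seed, and the helper `numerical_grad` are assumptions chosen for illustration) and compares $D \cdot Q^T$ and $P^T \cdot D$ against central finite differences of the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, dim = 4, 5, 3  # toy sizes, chosen only for illustration
P = rng.normal(size=(n_users, dim))
Q = rng.normal(size=(dim, n_items))
# Stand-in for the centered rating matrix W, with entries in {-1, 0, 1}
W = rng.integers(-1, 2, size=(n_users, n_items)).astype(float)

# -1 where W > 0, +1 where W < 0, 0 elsewhere
D = -np.sign(W)

def loss(P, Q):
    # Same quantity as embed_loss: sum over W < 0 minus sum over W > 0
    return np.sum(D * (P @ Q))

# Analytic gradients from the formulas above
grad_P = D @ Q.T
grad_Q = P.T @ D

def numerical_grad(f, X, eps=1e-6):
    # Central finite differences, one entry of X at a time
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        Xp = X.copy()
        Xp[idx] += eps
        Xm = X.copy()
        Xm[idx] -= eps
        G[idx] = (f(Xp) - f(Xm)) / (2 * eps)
    return G

num_P = numerical_grad(lambda X: loss(X, Q), P)
num_Q = numerical_grad(lambda X: loss(P, X), Q)

print(np.allclose(grad_P, num_P, atol=1e-6))
print(np.allclose(grad_Q, num_Q, atol=1e-6))
```

Because the loss is linear in $P$ for fixed $Q$ (and vice versa), the finite-difference estimate should agree with the analytic gradient up to floating-point error.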

In [17]:
# Build the sign matrix D: -1 where the centered rating is positive, +1 where it is negative, 0 elsewhere
D = np.zeros(intmatrix.shape)
D[intmatrix>0] = -1
D[intmatrix<0] = 1

In [28]:
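def epoch_updater(P, Q, D, alpha):
    # Both gradients are evaluated at the current P and Q before either matrix is updated
    gradP = np.matmul(D, Q.T)  # dL/dP, same shape as P
    gradQ = np.matmul(P.T, D)  # dL/dQ, same shape as Q
    return P-alpha*gradP, Q-alpha*gradQ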
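
A quick way to check that the update moves in the intended direction is to apply a single step on toy data and compare the loss before and after. This is only a sketch under assumed toy sizes, seed, and learning rate; `embed_loss` and `epoch_updater` are restated here (without the unused `wbl` argument) so the block runs on its own.

```python
import numpy as np

def embed_loss(Usr, Itm, intmatrix):
    # Negated objective: below-mean inner products minus above-mean ones
    mat = np.matmul(Usr, Itm)
    return mat[intmatrix < 0].sum() - mat[intmatrix > 0].sum()

def epoch_updater(P, Q, D, alpha):
    # One simultaneous gradient step on P and Q
    gradP = np.matmul(D, Q.T)
    gradQ = np.matmul(P.T, D)
    return P - alpha * gradP, Q - alpha * gradQ

rng = np.random.default_rng(0)
# Toy stand-in for the centered rating matrix, entries in {-1, 0, 1}
W = rng.integers(-1, 2, size=(6, 8)).astype(float)
D = -np.sign(W)
P = rng.normal(size=(6, 4))
Q = rng.normal(size=(4, 8))

before = embed_loss(P, Q, W)
P_new, Q_new = epoch_updater(P, Q, D, alpha=1e-4)
after = embed_loss(P_new, Q_new, W)

print(before, after)
```

With a small enough learning rate, the loss after the step should be lower than before it.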

<h2> Perform Gradient Descent to Create Recommendation Matrix </h2>