# Recommender Systems

##### By Bourhan DERNAYKA
___

## 1. Presentation of the model

___
##### Question 1.1

In [1]:
from movielens_utils import *

path = "./ml-100k/u.data"
data, mask = load_movielens(path, False)
R = data
print(R.shape)
print(mask.shape)

(943, 1682)
(943, 1682)


##### Answer:
The score matrix R has size 943x1682, and its mask has the same shape.

The minidata option restricts the study to a small part of the full data. For instance, with <code>minidata = True</code> the matrix R and its mask have size 100x200, so the study covers only 100 users and 200 movies.

In [2]:
data, small_mask = load_movielens(path, True)
R=data
mask = small_mask
print(data.shape)

(100, 200)
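
To make the roles of R and its mask concrete, here is a minimal sketch of how such a pair could be built from (user, movie, rating) triplets. This is not the code of `load_movielens`; the helper name `build_ratings` and the toy triplets are assumptions made for illustration only.

```python
import numpy as np

def build_ratings(triplets, n_users, n_movies):
    """Hypothetical helper: build a dense rating matrix and its boolean mask."""
    R = np.zeros((n_users, n_movies))
    mask = np.zeros((n_users, n_movies), dtype=bool)
    for user, movie, rating in triplets:
        R[user, movie] = rating      # store the observed rating
        mask[user, movie] = True     # mark the entry as observed
    return R, mask

# Toy data: (user index, movie index, rating), all indices 0-based
triplets = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (2, 2, 1)]
R_toy, mask_toy = build_ratings(triplets, n_users=3, n_movies=3)
print(R_toy.shape, mask_toy.sum())
```

Entries of `R_toy` that were never rated stay at 0, which is why the mask is needed to tell a missing rating from a stored value.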


___
##### Question 1.2

From the output of Question 1.1, the first dimension of R is 943, so the full dataset covers 943 users.

The second dimension is 1682, so these users rated 1682 distinct movies.

The cell below is run after loading the mini dataset, so it reports 100 users and 200 movies.

In [3]:
nUser, nMovie = R.shape
print("There are", nUser, "users and", nMovie, "movies.")

There are 100 users and 200 movies.


The total number of grades is obtained by summing all elements of the mask, not of R. The mask is a binary matrix: each entry indicates whether a user has graded a movie.

Here is the total:

In [4]:
print("The total number of grades is", int(mask.sum()))

The total number of grades is 3571
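
The same count also gives the fill rate of the matrix. With 3571 grades in a 100x200 matrix, about 18% of the entries are observed (3571 / 20000 = 0.17855). A minimal sketch on a toy mask (the array `mask_toy` is illustrative, not the real data):

```python
import numpy as np

# Toy boolean mask: True where a user has graded a movie
mask_toy = np.array([[1, 0, 1],
                     [0, 1, 0]], dtype=bool)

n_ratings = int(mask_toy.sum())        # number of observed entries
density = n_ratings / mask_toy.size    # fraction of the matrix that is filled
print(n_ratings, density)
```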


___
## 2. Finding P when Q$^0$ is fixed
___

##### Question 2.1

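In the library movielens_utils.py, we modify the function objective.
The gradient of g with respect to P is:

$$ \nabla g(P) = - (Q^0)^\top (1_k \circ (R - Q^0 P)) + \rho P$$

Here $1_k$ denotes the binary mask of observed grades and $\circ$ the element-wise product. The transpose of $Q^0$ is needed for the dimensions to match: the gradient has the same shape as P.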
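
The gradient can be sanity-checked numerically on a small random problem. The sketch below assumes that the objective has the form $g(P) = \frac{1}{2}\|1_k \circ (R - Q^0 P)\|_F^2 + \frac{\rho}{2}\|Q^0\|_F^2 + \frac{\rho}{2}\|P\|_F^2$; this form and the name `objective_sketch` are assumptions, not the contents of movielens_utils.py. Note the transpose of `Q0`, which gives the gradient the same shape as `P`.

```python
import numpy as np
from scipy.optimize import check_grad

def objective_sketch(P, Q0, R, mask, rho):
    """Assumed objective and its gradient with respect to P (not the library code)."""
    residual = mask * (R - Q0 @ P)                 # only observed entries contribute
    val = 0.5 * np.sum(residual ** 2) \
        + 0.5 * rho * (np.sum(Q0 ** 2) + np.sum(P ** 2))
    grad_P = -Q0.T @ residual + rho * P            # shape (k, n_movies), same as P
    return val, grad_P

# Small random problem (toy sizes, illustrative only)
rng = np.random.default_rng(0)
n_users, n_movies, k, rho = 6, 5, 2, 0.3
R = rng.integers(1, 6, size=(n_users, n_movies)).astype(float)
mask = rng.random((n_users, n_movies)) < 0.5
Q0 = rng.standard_normal((n_users, k))
P = rng.standard_normal((k, n_movies))

val, grad_P = objective_sketch(P, Q0, R, mask, rho)

# check_grad works on flat vectors, so reshape inside the wrappers
err = check_grad(
    lambda v: objective_sketch(v.reshape(P.shape), Q0, R, mask, rho)[0],
    lambda v: objective_sketch(v.reshape(P.shape), Q0, R, mask, rho)[1].ravel(),
    P.ravel(),
)
print(grad_P.shape, err)
```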

___
##### Question 2.2

In [5]:
from scipy.sparse.linalg import svds
Q0, singular, P0 = svds(R, k=4)

In [6]:
rho = 0.3
val, gradP = objective(P0, Q0, R, mask, rho)

In [7]:
def G(Pvec):
    P = np.reshape(Pvec, P0.shape)
    return objective(P, Q0, R, mask, rho)[0]
def gradG(Pvec):
    P = np.reshape(Pvec, P0.shape)
    return objective(P, Q0, R, mask, rho)[1].ravel()

In [9]:
from scipy.optimize import check_grad
check_grad(G, gradG, P0.ravel())

0.004347925326378847

We call the objective function to compute the value of g at the chosen initial point, together with its gradient.

We check that the dimensions of the gradient comply with the definition of g, where P is the variable: the gradient must have the same shape as P.

We then check the correctness of the computed gradient with the check_grad() function of scipy.optimize. It returns the 2-norm of the difference between the supplied gradient and a finite-difference approximation, here about 0.0043.
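
To illustrate how the value returned by check_grad() can be read, here is a minimal sketch on a toy least-squares function. All names (`f`, `grad_ok`, `grad_wrong`, `A`, `b`) are illustrative and unrelated to the MovieLens data. A correct gradient gives a value close to zero, while a gradient with a missing term gives a clearly larger one.

```python
import numpy as np
from scipy.optimize import check_grad

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2)

def grad_ok(x):
    return A.T @ (A @ x - b)      # exact gradient of f

def grad_wrong(x):
    return A.T @ (A @ x)          # deliberately drops the -A.T @ b term

x0 = rng.standard_normal(3)
err_ok = check_grad(f, grad_ok, x0)        # small: only finite-difference error
err_wrong = check_grad(f, grad_wrong, x0)  # close to the norm of A.T @ b
print(err_ok, err_wrong)
```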

___
##### Question 2.3

In [60]:
import math

In [67]:
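def gradient(g, P0, gamma, epsilon):
    # Gradient descent with constant step size gamma, starting from P0.
    # Note: g is unused here; the objective is evaluated through objective().
    P = P0
    grad = objective(P, Q0, R, mask, rho)[1]
    err = np.linalg.norm(grad)  # Frobenius norm of the gradient
    while err > epsilon:
        P = P - gamma * grad
        grad = objective(P, Q0, R, mask, rho)[1]
        err = np.linalg.norm(grad)
    return P, err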
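
As a self-contained illustration of constant-step gradient descent, the sketch below runs the loop on a small random problem. The names `gradient_descent` and `grad_fn`, the iteration cap `max_iter`, and the form of the gradient (the one given in Question 2.1, with the transpose of Q0) are assumptions for this example, not the library code.

```python
import numpy as np

def gradient_descent(grad_fn, P_init, step, epsilon, max_iter=10_000):
    """Constant-step gradient descent; stops when the gradient norm is <= epsilon."""
    P = P_init.copy()
    grad = grad_fn(P)
    n_iter = 0
    while np.linalg.norm(grad) > epsilon and n_iter < max_iter:
        P = P - step * grad
        grad = grad_fn(P)
        n_iter += 1
    return P, np.linalg.norm(grad), n_iter

# Small random problem (toy sizes, illustrative only)
rng = np.random.default_rng(0)
n_users, n_movies, k, rho = 8, 6, 2, 0.3
R = rng.integers(1, 6, size=(n_users, n_movies)).astype(float)
mask = rng.random((n_users, n_movies)) < 0.5
Q0, _ = np.linalg.qr(rng.standard_normal((n_users, k)))  # orthonormal columns

def grad_fn(P):
    # Gradient of the assumed regularized masked least-squares objective
    return -Q0.T @ (mask * (R - Q0 @ P)) + rho * P

L = rho + np.linalg.norm(Q0.T @ Q0)   # upper bound on the Lipschitz constant
P_hat, err, n_iter = gradient_descent(grad_fn, np.zeros((k, n_movies)), 1.0 / L, 1e-6)
print(n_iter, err)
```

A loop is used instead of recursion so that the result is always returned and the number of iterations is not limited by the recursion depth.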

___
##### Question 2.4

In [70]:
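# Constant step size 1/L, where L = rho + ||Q0^T Q0||_F bounds the Lipschitz constant of the gradient
L = rho + np.sqrt(np.sum(Q0.T.dot(Q0) ** 2))
gamma = 1 / L
epsilon = 1
gradient(1, P0, gamma, epsilon)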


In [69]:
rho + np.sqrt(np.sum(Q0.T.dot(Q0)**2))

2.3
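
The value 2.3 is what one expects when Q0 has orthonormal columns, as is the case for the left singular vectors returned by svds: Q0^T Q0 is then the 4x4 identity, whose Frobenius norm is sqrt(4) = 2, and 0.3 + 2 = 2.3. A minimal sketch on a random matrix (the toy matrix `R_toy` is illustrative, not the MovieLens data):

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
R_toy = rng.random((20, 30))
Q_toy, s, P_toy = svds(R_toy, k=4)

gram = Q_toy.T @ Q_toy                 # close to the 4x4 identity
fro = np.linalg.norm(gram)             # Frobenius norm, sqrt(4) = 2 for an exact identity
rho = 0.3
L = rho + fro                          # the quantity computed in the cell above
print(np.allclose(gram, np.eye(4)), fro, L, 1.0 / L)
```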

In [31]:
np.sum(Q0.T.dot(Q0))

3.9999999999999996

In [45]:
grad = objective(P0, Q0, R, mask, rho)[1]
np.sqrt(sum(sum(grad.T.dot(grad) ** 2)))

27129.026700739243

In [41]:
a = np.array(([1, 2], [3,3]))

In [51]:
rho + np.sqrt(sum(sum(Q0.T.dot(Q0) ** 2)))

2.3

In [43]:
a**2

array([[1, 4],
       [9, 9]], dtype=int32)

In [48]:
(sum(sum(a ** 2)))

23
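
On this small array, the double sum of squares is the squared Frobenius norm of `a`. It is a different quantity from the Frobenius norm of `a.T @ a`, which is what the expression sqrt(sum(sum(a.T.dot(a) ** 2))) computes. A minimal sketch comparing the three, using only NumPy:

```python
import numpy as np

a = np.array([[1, 2], [3, 3]])

sq_fro = np.sum(a ** 2)                   # squared Frobenius norm of a
fro = np.linalg.norm(a)                   # Frobenius norm of a (default for 2-D arrays)
fro_gram = np.linalg.norm(a.T @ a)        # Frobenius norm of a.T @ a, a different quantity

print(sq_fro, fro, fro_gram)
```

When the stopping test is meant to use the norm of the gradient itself, `np.linalg.norm(grad)` is the direct way to compute it.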

In [59]:
import math
type(math.sqrt(sum(sum(Q0.T.dot(Q0) ** 2))))

float