# Recommender systems
In this notebook we apply a **collaborative filtering** algorithm to a movie rating dataset in order to recommend movies to users. Ratings range from 1 to 5. The dataset has $n_u = 943$ users and $n_m = 1682$ movies.

## Movie rating dataset
The dataset *ex8_movies.mat* contains the variables $Y$ and $R$. $Y$ is a matrix of size $n_m \times n_u$ that stores the ratings $y^{(i,j)}$, from 1 to 5. $R$ is a binary indicator matrix where $R(i,j) = 1$ if user $j$ rated movie $i$ and $R(i,j) = 0$ otherwise. Let's compute the average rating for the first movie (Toy Story), using only the entries where $R(i,j) = 1$.

In [1]:
import numpy as np
from scipy.io import loadmat

In [71]:
#load movie ratings
data = loadmat("ex8_movies.mat")
Y = data['Y']
R = data['R']

#load movie names
movies = np.empty((Y.shape[0]), dtype='object')
with open("movie_ids.txt") as movie_ids:
    lines = movie_ids.readlines()
    for index, line in enumerate(lines):
        movies[index] = line.strip('\n').split(sep=' ', maxsplit=1)[1]

In [73]:
#print average rating for first movie
print("Average rating {0}: {1:.1f}".format(movies[0], np.mean(Y[0, :][R[0, :] == 1])))

Average rating Toy Story (1995): 3.9
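
The same masking idea extends to all movies at once. Below is a minimal sketch on a small made-up ratings matrix; the names `Y_toy`, `R_toy`, `counts` and `mean_ratings` are illustrative and not part of the exercise files, and the sketch assumes unrated entries are stored as 0.

```python
import numpy as np

# Toy data (hypothetical): 3 movies x 4 users, 0 means "not rated".
Y_toy = np.array([[5, 4, 0, 0],
                  [0, 0, 3, 0],
                  [1, 0, 0, 2]])
R_toy = (Y_toy != 0).astype(int)

# Number of ratings each movie received.
counts = R_toy.sum(axis=1)

# Sum only the rated entries, then divide by the number of ratings.
# np.maximum guards against division by zero for movies with no ratings.
mean_ratings = (Y_toy * R_toy).sum(axis=1) / np.maximum(counts, 1)
print(mean_ratings)
```

Dividing by the per-movie count rather than by $n_u$ keeps unrated entries from dragging the average towards zero.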


We will also be working with the matrices $X$ and $\Theta$. The $i$-th row of $X$ corresponds to the feature vector $x^{(i)}$ of the $i$-th movie, and the $j$-th row of $\Theta$ corresponds to the parameter vector $\theta^{(j)}$ of the $j$-th user. In this exercise we will use smaller versions of $X$ and $\Theta$, so that $x^{(i)} \in \mathbb{R}^{10}$ and $\theta^{(j)} \in \mathbb{R}^{10}$. Correspondingly, $X$ is an $n_m \times 10$ matrix and $\Theta$ is an $n_u \times 10$ matrix.


In [75]:
params = loadmat("ex8_movieParams.mat")
X = params['X']
Theta = params['Theta']

## Collaborative filtering learning algorithm
First, we will implement the cost function. The collaborative filtering learning algorithm considers a set of $n$-dimensional parameter vectors $x^{(1)}, \dots, x^{(n_m)}$ and $\theta^{(1)}, \dots, \theta^{(n_u)}$. The model predicts the rating of movie $i$ by user $j$ as $y^{(i,j)} = (\theta^{(j)})^T x^{(i)}$. Given a dataset of ratings that users gave to movies, we wish to learn the parameter matrices $X$ and $\Theta$ that produce the best fit, as measured by the squared error over the rated entries.
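
To make the prediction and the squared error concrete, here is a small sketch on made-up numbers. It assumes the usual unregularized form with the conventional factor $\frac{1}{2}$, $J = \frac{1}{2} \sum_{(i,j):R(i,j)=1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2$. The regularization that the `lamb` argument of `cofi_cost` suggests is left out, and all `*_toy` names are illustrative only.

```python
import numpy as np

# Toy problem (hypothetical): 3 movies, 2 users, 2 features.
X_toy = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])      # one row per movie
Theta_toy = np.array([[4.0, 1.0],
                      [2.0, 3.0]])  # one row per user
Y_toy = np.array([[5, 0],
                  [0, 3],
                  [4, 0]])          # ratings, 0 where unrated
R_toy = np.array([[1, 0],
                  [0, 1],
                  [1, 0]])          # 1 where a rating exists

# Entry (i, j) of pred is theta^(j) . x^(i), so pred has shape (movies, users).
pred = X_toy @ Theta_toy.T

# Multiplying by R_toy zeroes the error on unrated entries before squaring.
squared_error = 0.5 * np.sum(((pred - Y_toy) * R_toy) ** 2)
print(pred.shape, squared_error)
```

Here the three rated entries have errors of $-1$, $0$ and $1$, so the sum of squares is 2 and the cost is 1.0.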

In [None]:
#cost function
def cofi_cost(X, Theta, Y, R, lamb):
    