# AiDM assignment 1
## Recommender system --- Matrix factorization 

## Read data
We load the ratings data into a matrix with 4 columns. The first column gives the user id. The second column gives the movie id. The third column gives the rating, which can only be an integer from 1 to 5. The fourth column gives the timestamp in seconds.

In [30]:
import numpy as np
data = np.genfromtxt('ml-1m/ratings.dat', delimiter= '::')
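
To make the column layout concrete, here is a small sketch that parses a few made-up lines in the same `::`-separated format. The sample values and the names `sample` and `sample_data` are invented for illustration and are not taken from ratings.dat.

```python
import io
import numpy as np

# Three made-up ratings in the user::movie::rating::timestamp layout
# (illustrative values, not real rows from ratings.dat)
sample = io.StringIO("1::10::5::1000\n1::20::3::1060\n2::10::4::2000\n")
sample_data = np.genfromtxt(sample, delimiter='::')

print(sample_data.shape)   # one row per rating, four columns
print(sample_data[:, 2])   # the ratings column
```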

## Gradient descent
Next we create a list containing the unique user and movie ids, as well as a vector of 20 weights for each. We will train the network so that the dot product of a given user vector with a movie vector will be a prediction of the user's rating for that movie. In this way, the weights of a movie might reflect certain qualities whereas the weights of a user might reflect which qualities that particular user prefers.

In [None]:
user_list = np.unique(data[:,0])
movie_list = np.unique(data[:,1])
user_vector = np.random.uniform(size=(len(user_list),20))
movie_vector = np.random.uniform(size=(len(movie_list),20))
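
As a small illustration of the prediction rule, the sketch below uses hand-picked weight vectors of length 3 rather than 20. The names `toy_user` and `toy_movie` and their values are made up.

```python
import numpy as np

# Hypothetical weights: this movie scores high on quality 0,
# and this user cares most about quality 0
toy_movie = np.array([2.0, 0.5, 0.0])
toy_user = np.array([1.5, 1.0, 1.0])

# The predicted rating is the dot product of the two weight vectors
toy_prediction = np.dot(toy_user, toy_movie)
print(toy_prediction)
```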

Next we create vectors that hold, for every rating in ratings.dat, the index of the corresponding user and movie weight vector. This is important because the index of a movie's weight vector is not the same as its movie id: some movies have not been rated in the data, and it would be unhelpful to have random, unconstrained weight vectors for movies not rated in the data.

In [None]:
# user_list and movie_list are sorted (they come from np.unique), so
# searchsorted returns the position of each id, i.e. the row of its weight vector
user_locator = np.searchsorted(user_list, data[:,0])
movie_locator = np.searchsorted(movie_list, data[:,1])
print(user_locator)
print(movie_locator)
print(data[:,1]) # Not the same! There are some unrated movies.
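
The id-to-index mapping is easier to see on a toy example. This sketch uses `np.unique` with `return_inverse=True`, which is another way to obtain the same kind of locator. The ids below are made up.

```python
import numpy as np

# Made-up movie ids for five ratings; ids 2 and 4 are never rated
toy_movie_ids = np.array([5, 1, 3, 1, 5])

# toy_list holds the sorted unique ids; toy_locator holds, for each rating,
# the index of its movie within toy_list
toy_list, toy_locator = np.unique(toy_movie_ids, return_inverse=True)

print(toy_list)
print(toy_locator)
```

Indexing `toy_list` with `toy_locator` recovers the original ids, which is a quick check that the mapping is consistent.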

Next we train the weight vectors! Note that we force predictions to be between 1 and 5, and include a lambda regularization factor to counteract overfitting with large weights.

In [None]:
lrate = 0.001      # learning rate
lamb = 0.01        # regularization strength (lambda)
iterations = 10    # number of passes over the data
total_error = np.zeros(iterations)  # summed absolute error per pass
for count in range(iterations):
    for i in range(len(data)):
        u = int(user_locator[i])    # row of this rating's user weights
        m = int(movie_locator[i])   # row of this rating's movie weights
        # Predict the rating and force it into the valid range [1, 5]
        est_rating = np.clip(np.dot(user_vector[u], movie_vector[m]), 1, 5)
        error = data[i,2] - est_rating
        # Regularized gradient step: user weights first, then movie weights
        user_vector[u] += lrate * (error * movie_vector[m] - lamb * user_vector[u])
        movie_vector[m] += lrate * (error * user_vector[u] - lamb * movie_vector[m])
        total_error[count] += abs(error)

Now that we have trained our network, let's take a look at how it predicts a few random entries.

In [None]:
for i in np.random.randint(len(data), size=10):
    # Print the row index, the predicted rating, and the actual rating
    print(i, np.dot(user_vector[int(user_locator[i])],movie_vector[int(movie_locator[i])]), data[i,2])

Not bad! Now let's see how the error changes over the course of fitting.

In [None]:
total_error
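
Since `total_error` sums the absolute error over all ratings in one pass, dividing by the number of ratings gives a mean absolute error per rating, which is easier to interpret. A minimal sketch with made-up numbers (the names and values below are illustrative only):

```python
import numpy as np

# Made-up summed absolute errors for three passes over 1000 ratings
toy_total_error = np.array([900.0, 800.0, 750.0])
n_ratings = 1000

# Mean absolute error per rating for each pass
toy_mae = toy_total_error / n_ratings
print(toy_mae)
```

For the arrays in this notebook the same quantity would be `total_error / len(data)`.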

Now we would like to test changing the length of our weight vectors. However, adding more weights might just make our network better at overfitting. We need to randomly separate our data into a training set and a testing set. We can even split the data into N parts, train a model N times with a different part held out for testing each time, and average the results. This is N-fold cross-validation.
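
One way to build such splits is to shuffle the row indices and cut them into N folds. The sketch below is an assumption about how this could be done, not a prescribed method. The function name `make_folds`, the seed, and the toy size of 10 rows are made up.

```python
import numpy as np

def make_folds(n_rows, n_folds, seed=0):
    """Shuffle row indices and split them into n_folds disjoint test sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    return np.array_split(shuffled, n_folds)

folds = make_folds(n_rows=10, n_folds=5)
for k, test_idx in enumerate(folds):
    # Train on every fold except the k-th, test on the k-th
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
    print(k, len(train_idx), len(test_idx))
```

With the notebook's data, `data[train_idx]` and `data[test_idx]` would give the training and testing rows for fold `k`.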