## E Navaneet Kumar  HW 7

# Matrix Factorization for Recommendation Systems

To predict user ratings for movies. The goal is to fill in the missing entries in a user-item ratings matrix based on observed ratings. We can do this by decomposing the original ratings matrix into two lower-dimensional matrices, representing latent user preferences and item attributes.

We start with an initial set of user ratings for a selection of movies. Not all users have rated all movies, resulting in a sparse matrix. Matrix factorization algorithm will predict ratings for the movies that each user hasn't rated, providing personalized recommendations for each user.


In [1]:
import numpy as np

def matrix_factorization(r, p, q, k, steps=5000, alpha=0.0002, beta=0.02):
    samples = [
        (i, j, r[i, j])
        for i in range(r.shape[0])
        for j in range(r.shape[1])
        if r[i, j] > 0
    ]

    for step in range(steps):
        for i, j, rating in samples:
            prediction = np.dot(p[i, :], q[j, :].T)
            e = (rating - prediction)

            p[i, :] += alpha * (e * q[j, :] - beta * p[i, :])
            q[j, :] += alpha * (e * p[i, :] - beta * q[j, :])

    return p, q


## Initial Ratings Matrix and Model Training

We initialize our ratings matrix `r` with user ratings for four movies. The matrix contains zeros where a user has not rated a movie. We then train our matrix factorization model with 5000 steps to predict these missing ratings.


In [2]:
# Initial Ratings Matrix
r = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4]
])

num_row_r, num_col_r = r.shape
k = 2  # Number of latent factors
p = np.random.rand(num_row_r, k)
q = np.random.rand(num_col_r, k)

# Train the matrix factorization model
new_p, new_q = matrix_factorization(r, p, q, k, steps=5000, alpha=0.0002, beta=0.02)

# Predicted ratings after 5000 steps
r_predicted = np.dot(new_p, new_q.T)
print("Predicted Ratings with 5000 Steps:")
print(r_predicted.round(2))


Predicted Ratings with 5000 Steps:
[[5.   2.86 2.96 1.  ]
 [3.91 2.24 2.51 0.98]
 [1.04 0.87 5.09 4.86]
 [1.03 0.8  4.12 3.86]
 [1.74 1.22 4.76 4.23]]


## Analysis of Predictions after 5000 Steps

Above, we can see the predicted ratings matrix alongside the actual ratings matrix. The non-zero values in the actual ratings matrix closely match the corresponding values in the predicted ratings matrix, indicating that the function has performed well. For zero values in the actual ratings matrix, the predicted ratings matrix now includes values. These new values represent the model's predictions for the movies that a user has not yet rated.

To further improve the accuracy of our predictions, we extend the training with an additional 5000 steps, making a total of 10000 steps.


In [3]:
# Continue training the model with an additional 5000 steps
new_p, new_q = matrix_factorization(r, new_p, new_q, k, steps=10000, alpha=0.0002, beta=0.02)

# Predicted ratings after 10000 steps
r_predicted = np.dot(new_p, new_q.T)
print("\nPredicted Ratings with 10000 Steps:")
print(r_predicted.round(2))



Predicted Ratings with 10000 Steps:
[[4.95 2.96 3.18 1.  ]
 [3.96 2.39 2.76 1.  ]
 [1.02 0.96 5.9  4.94]
 [0.99 0.88 4.81 3.97]
 [1.27 1.04 4.96 3.99]]


## Conclusion after 10000 Steps

The non-zero values in the actual ratings matrix now match even more closely to the corresponding values in the predicted ratings matrix. By extending the training to 10000 steps, we have further improved the accuracy of the predictions. This demonstrates the effectiveness of matrix factorization in predicting missing ratings.
