# Collaborative Filtering Recommendation System

## Collaborative Filtering Overview

### How It Works
Collaborative filtering generates user parameter vectors and movie feature vectors by learning from existing ratings. The process involves the following steps:

1. **Data Representation**:
   - The ratings are stored in a matrix 𝑌, where each element 𝑦(𝑖,𝑗) represents the rating given by user j to movie i. If the movie has not been rated, the element is 0.
   - The matrix 𝐑 is a binary indicator matrix where each element 𝑟(𝑖,𝑗) is 1 if user j has rated movie i, and 0 otherwise.

2. **Vector Learning**:
   - Each user has a parameter vector 𝐰(𝑗) and a bias term 𝑏(𝑗).
   - Each movie has a feature vector 𝐱(𝑖).
   - These vectors are learned by using the existing ratings in 𝑌 as training data.

3. **Prediction**:
   - Once the vectors are learned, the predicted rating for user j on movie i is given by the dot product of 𝐰(𝑗) and 𝐱(𝑖), plus the bias term 𝑏(𝑗):
     \[
     \text{Predicted Rating} = 𝐰(𝑗) \cdot 𝐱(𝑖) + 𝑏(𝑗)
     \]
   - This allows the system to predict ratings for unrated movies and recommend movies to users based on these predictions.

### Training the Model
The training process involves optimizing the parameters to minimize the difference between the predicted and actual ratings. This is achieved using gradient descent, where the cost function includes a regularization term to prevent overfitting.

### Implementation in TensorFlow
The collaborative filtering algorithm is implemented in TensorFlow, utilizing a custom training loop to optimize the parameters. The key components include:
- **Cost Function**: Computes the error between predicted and actual ratings, including a regularization term.
- **Gradient Descent**: Iteratively updates the parameters to minimize the cost function.

By following these steps, the collaborative filtering recommendation system can effectively learn from user ratings and provide personalized movie recommendations.

---

In [52]:
import pandas as pd
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np

In [53]:
# Read the csv
rating=pd.read_csv('/kaggle/input/anime-recommendation-database-2020/rating_complete.csv')


Due to the large size of the data and the limitations of my computing memory, I decided to sample 10,000 rows to create the pivot table for ratings.

In [54]:
# Sample the data
rating = rating.sample(n=10000,random_state=44)

In [55]:
# Pivot the data
rating_pivot = rating.pivot(index='anime_id', columns='user_id', values='rating')


In [56]:
rating_pivot

user_id,25,53,64,119,162,169,216,331,372,409,...,353012,353023,353040,353055,353098,353134,353139,353181,353232,353294
anime_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,,,,,,,,,,,...,,,,,,,,,,
5,,,,,,,,,,,...,,,,,,,,,,
6,,,,,,,,,,,...,,,,,,,,,,
7,,,,,,,,,,,...,,,,,,,,,,
15,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
42958,,,,,,,,,,,...,,,,,,,,,,
42984,,,,,,,,,,,...,,,,,,,,,,
43467,,,,,,,,,,,...,,,,,,,,,,
43690,,,,,,,,,,,...,,,,,,,,,,


In [57]:
# Create a pivot that flagged the not null values in rating_pivot
binary_pivot = rating_pivot.notna().astype(int)

In [58]:
# Define the variable
Y=np.array(rating_pivot)
R=np.array(binary_pivot)

In [59]:
# Set dimensions
num_users,num_movies = Y.shape
num_features = 20

# Initialize X, W, and b with random values
X = np.random.rand(num_movies, num_features)
W = np.random.rand(num_users, num_features)
b = np.random.rand(1, num_users)

# Display shapes to confirm the correct initialization
print("Shape of Y:", Y.shape)
print("Shape of R:", R.shape)
print("Shape of X:", X.shape)
print("Shape of W:", W.shape)
print("Shape of b:", b.shape)

Shape of Y: (3395, 9588)
Shape of R: (3395, 9588)
Shape of X: (9588, 20)
Shape of W: (3395, 20)
Shape of b: (1, 3395)


Now that we have defined all the necessary variables, we will compute the average rating for movie 1.

In [60]:
#  From the matrix, we can compute statistics like average rating.
tsmean =  np.mean(Y[0, R[0, :].astype(bool)])
print(f"Average rating for movie 1 : {tsmean:0.3f} / 10" )

Average rating for movie 1 : 9.143 / 10


### Create Cost Function

In [61]:
def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the cost for the content-based filtering
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users)            : vector of user parameters
      Y (ndarray (num_movies,num_users)    : matrix of user ratings of movies
      R (ndarray (num_movies,num_users)    : matrix, where R(i, j) = 1 if the i-th movies was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    nm, nu = Y.shape
    J = 0
    
    for i in range(nm):
        for j in range(nu):
            if R[i, j] == 1:
                pred_rating = np.dot(W[j, :], X[i, :]) + b[0, j]
                
                error = pred_rating - Y[i, j]
                
                J +=  error**2
    
    reg_X = lambda_ / 2 * np.sum(X**2)
    reg_W = lambda_ / 2 * np.sum(W**2)
    
    J =J/2
    J +=reg_X + reg_W     

    return J

In [62]:
# Reduce the data set size so that this runs faster
num_users_r = 1910
num_movies_r = 5 
num_features_r = 3

X_r = X[:num_movies_r, :num_features_r]
W_r = W[:num_users_r,  :num_features_r]
b_r = b[0, :num_users_r].reshape(1,-1)
Y_r = Y[:num_movies_r, :num_users_r]
Y_r = tf.where(tf.math.is_nan(Y_r), tf.zeros_like(Y_r), Y_r)
R_r = R[:num_movies_r, :num_users_r]

# Evaluate cost function
J = cofi_cost_func(X_r, W_r, b_r, Y_r, R_r, 0);
print(f"Cost: {J:0.2f}")

Cost: 192.95


In [63]:
# Evaluate cost function with regularization 
J = cofi_cost_func(X_r, W_r, b_r, Y_r, R_r, 1.5);
print(f"Cost (with regularization): {J:0.2f}")

Cost (with regularization): 1620.25


In [64]:
def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    """
    Returns the cost for the content-based filtering
    Vectorized for speed. Uses tensorflow operations to be compatible with custom training loop.
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users)            : vector of user parameters
      Y (ndarray (num_movies,num_users)    : matrix of user ratings of movies
      R (ndarray (num_movies,num_users)    : matrix, where R(i, j) = 1 if the i-th movies was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    j = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
    J = 0.5 * tf.reduce_sum(j**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
    return J

In [65]:
# Evaluate cost function
J = cofi_cost_func_v(X_r, W_r, b_r, Y_r, R_r, 0);
print(f"Cost: {J:0.2f}")

# Evaluate cost function with regularization 
J = cofi_cost_func_v(X_r, W_r, b_r, Y_r, R_r, 1.5);
print(f"Cost (with regularization): {J:0.2f}")

Cost: 192.95
Cost (with regularization): 1620.25


## Add my rating

In [66]:
my_ratings = np.zeros(num_movies)          #  Initialize my ratings

my_ratings[241] = 5 
my_ratings[2609] = 2;
my_ratings[929]  = 9 
my_ratings[246]  = 8   
my_ratings[216] = 3  
my_ratings[110] = 6  
my_ratings[382]  = 7  
my_ratings[366]  = 4  
my_ratings[622]  = 2  
my_ratings[988]  = 1  
my_ratings[263] = 10  
my_ratings[293] = 9  
my_ratings[793]  = 7   
my_rated = [i for i in range(len(my_ratings)) if my_ratings[i] > 0]

In [67]:
print('Y:',Y.T.shape)
print('my_ratings:',my_ratings.shape)

Y: (9588, 3395)
my_ratings: (9588,)


In [68]:
# Concatenate my_ratings with Y
Y = np.c_[my_ratings, Y.T]

# Update the indicator matrix R accordingly
R_new_user = (my_ratings != 0).astype(int)
R = np.c_[R_new_user, R.T]

Y=Y.T
R=R.T

Now that we have defined all the necessary variables, we will compute the average rating for movie 1.

---

To address the cold start problem for new users who have not provided any ratings, we can utilize rating normalization to generate a baseline prediction for users with no ratings.

### Why Normalize Ratings in Collaborative Filtering?
Normalization helps in collaborative filtering by ensuring that the ratings are adjusted to a common scale, reducing bias and improving the accuracy of predictions. Here's why it's important:
1. **Consistent Scale**: Normalizing ratings ensures that all user ratings are on a consistent scale, allowing the model to make fair comparisons across different users and movies.
2. **Bias Reduction**: It reduces the bias introduced by users who rate movies more generously or harshly compared to others.
3. **Cold Start Problem**: For new users with no ratings, normalization provides a baseline prediction by adjusting for the average rating, helping to make initial recommendations.

By normalizing ratings, we can create a more accurate and fair collaborative filtering model, enhancing its ability to make predictions for all users, including new ones.


In [69]:
def normalizeRatings(Y, R):
    """
    Normalize Y so that each movie has a mean rating of 0, and return the mean rating in Ymean.
    Args:
    Y -- (num_movies, num_users) matrix of movie ratings
    R -- (num_movies, num_users) matrix, where R(i, j) = 1 if and only if user j gave a rating to movie i

    Returns:
    Ynorm -- normalized Y matrix
    Ymean -- mean rating for each movie
    """
    num_movies = Y.shape[0]
    Ymean = np.zeros((num_movies, 1))
    Ynorm = np.zeros_like(Y)

    for i in range(num_movies):
        idx = np.where(R[i] == 1)[0]
        if len(idx) > 0:
            Ymean[i] = np.mean(Y[i, idx])
            Ynorm[i, idx] = Y[i, idx] - Ymean[i]
        else:
            Ymean[i] = 0
            Ynorm[i, :] = 0

    return Ynorm, Ymean

In [70]:
# Normalize the Dataset
Ynorm, Ymean = normalizeRatings(Y, R)

## Create a custom function using tensorflow

In [71]:
#  Useful Values
num_movies, num_users = Y.shape
num_features = 100

# Set Initial Parameters (W, X), use tf.Variable to track these variables
tf.random.set_seed(1234) # for consistent results
W = tf.Variable(tf.random.normal((num_users,  num_features),dtype=tf.float64),  name='W')
X = tf.Variable(tf.random.normal((num_movies, num_features),dtype=tf.float64),  name='X')
b = tf.Variable(tf.random.normal((1,          num_users),   dtype=tf.float64),  name='b')

# Instantiate an optimizer.
optimizer = keras.optimizers.Adam(learning_rate=1e-1)

In [72]:
iterations = 450
lambda_ = 0.5
for iter in range(iterations):
    with tf.GradientTape() as tape:

        # Compute the cost
        cost_value = cofi_cost_func_v(X, W, b, Ynorm, R, lambda_)

    # Use the gradient tape to automatically retrieve
    # the gradients of the trainable variables with respect to the loss
    grads = tape.gradient( cost_value, [X,W,b] )

    # Run one step of gradient descent by updating
    # the value of the variables to minimize the loss.
    optimizer.apply_gradients( zip(grads, [X,W,b]) )

    # Log periodically.
    if iter % 20 == 0:
        print(f"Training loss at iteration {iter}: {cost_value:0.1f}")

Training loss at iteration 0: 835064.5
Training loss at iteration 20: 155965.0
Training loss at iteration 40: 67725.2
Training loss at iteration 60: 33735.1
Training loss at iteration 80: 18275.8
Training loss at iteration 100: 10549.5
Training loss at iteration 120: 6458.7
Training loss at iteration 140: 4173.1
Training loss at iteration 160: 2815.6
Training loss at iteration 180: 1959.3
Training loss at iteration 200: 1393.6
Training loss at iteration 220: 1008.5
Training loss at iteration 240: 741.6
Training loss at iteration 260: 554.7
Training loss at iteration 280: 422.7
Training loss at iteration 300: 329.1
Training loss at iteration 320: 262.4
Training loss at iteration 340: 214.8
Training loss at iteration 360: 180.7
Training loss at iteration 380: 156.4
Training loss at iteration 400: 139.3
Training loss at iteration 420: 127.2
Training loss at iteration 440: 118.9


In [73]:
# Save the model variables
np.savez('cofi_model8.npz', W=W.numpy(), X=X.numpy(), b=b.numpy(), Ymean=Ymean)

## Test the model

In [76]:
import numpy as np
import pandas as pd

# Make a prediction using trained weights and biases
p = np.matmul(X, np.transpose(W)) + b

# Restore the mean by adding Ymean (broadcasting correctly)
# Since p is (2633, 5855) and Ymean is (2633, 1), broadcasting will automatically work
pm = p + Ymean

my_predictions = pm[:, 0]

# Sort predictions
ix = np.argsort(my_predictions)[::-1]  

# Sample movieList for demonstration
movieList = [f"Movie {i}" for i in range(num_movies)]

# Assume my_rated contains the indices of movies rated by the user
my_rated = np.where(my_ratings > 0)[0]

print('\nTop Predictions for New User:\n')
for i in range(17):
    j = ix[i]
    if j not in my_rated:
        print(f'Predicting rating {my_predictions[j]:0.2f} for movie {movieList[j]}')

print('\n\nOriginal vs Predicted ratings:\n')
for i in range(len(my_ratings)):
    if my_ratings[i] > 0:
        print(f'Original {my_ratings[i]}, Predicted {my_predictions[i]:0.2f} for {movieList[i]}')



Top Predictions for New User:

Predicting rating 8.67 for movie Movie 472
Predicting rating 8.67 for movie Movie 2771
Predicting rating 8.67 for movie Movie 1170
Predicting rating 8.67 for movie Movie 30
Predicting rating 8.67 for movie Movie 3132
Predicting rating 8.67 for movie Movie 701
Predicting rating 8.67 for movie Movie 2478
Predicting rating 8.67 for movie Movie 4
Predicting rating 8.67 for movie Movie 1708
Predicting rating 8.67 for movie Movie 2810
Predicting rating 8.67 for movie Movie 3337
Predicting rating 8.67 for movie Movie 3078
Predicting rating 8.67 for movie Movie 460
Predicting rating 8.67 for movie Movie 1496
Predicting rating 8.67 for movie Movie 1703
Predicting rating 8.67 for movie Movie 1342
Predicting rating 8.67 for movie Movie 2377


Original vs Predicted ratings:

Original 6.0, Predicted 5.33 for Movie 110
Original 3.0, Predicted 5.77 for Movie 216
Original 5.0, Predicted 7.13 for Movie 241
Original 8.0, Predicted 7.67 for Movie 246
Original 10.0, Predict

### Conclusion
This notebook demonstrates the implementation of a collaborative filtering recommendation system using TensorFlow. It covers data preprocessing, defining the cost function, training the model, and making predictions. Adjust the parameters and input data as needed for your specific use case.