# Collaborative Filtering

A collaborative filtering recommender system learns two kinds of vectors: for each user, a 'parameter vector' that captures that user's tastes, and for each item, a feature vector of the same size that describes the item. The dot product of the two vectors, plus the user's bias term, gives an estimate of the rating that user would give to that item.

* Existing ratings are provided in matrix form as Y.
* R is a matrix of the same shape, with a 1 where an item has been rated by a user and a 0 otherwise.
* In both matrices, items are in rows and users are in columns.
* Each user has a parameter vector W and bias b. Each item has a feature vector X. These vectors are simultaneously learned by using the existing user/item ratings as training data.
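
As a minimal sketch of the prediction rule described above, the snippet below builds small random `X`, `W` and `b` arrays and computes the estimated rating for one item/user pair and for all pairs at once. The sizes and random values are hypothetical and chosen only for illustration; they are not the values learned later in this notebook.

```python
import numpy as np

# Toy sizes chosen only for illustration (hypothetical values).
num_items, num_users, num_features = 3, 2, 4
rng = np.random.default_rng(0)

X = rng.normal(size=(num_items, num_features))  # one feature vector per item
W = rng.normal(size=(num_users, num_features))  # one parameter vector per user
b = rng.normal(size=(1, num_users))             # one bias per user

# Estimated rating of item i by user j: dot product plus that user's bias.
i, j = 1, 0
single_pred = X[i] @ W[j] + b[0, j]

# The same estimate for every (item, user) pair; b broadcasts across the rows.
all_preds = X @ W.T + b
```

The result `all_preds` has items in rows and users in columns, the same layout as Y and R.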

In [1]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

import tensorflow as tf
from tensorflow import keras

from tqdm.auto import trange

## Data

In [2]:
df = pd.read_csv('../input/book-recommendation-dataset/Ratings.csv')
df.head()

Unnamed: 0,User-ID,ISBN,Book-Rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


**Sampling**

In [3]:
isbn_counts = df['ISBN'].value_counts()
isbns = isbn_counts[isbn_counts > 200].index.values

user_counts = df['User-ID'].value_counts()
users = user_counts[user_counts > 200].index.values

len(isbns), len(users)

(193, 899)

In [4]:
df = df[df['ISBN'].isin(isbns) & df['User-ID'].isin(users)].drop_duplicates(subset=['ISBN', 'User-ID']).reset_index(drop=True)
ratings_scaler = MinMaxScaler(feature_range=(0.1, 1)).fit(df[['Book-Rating']])
df[['Book-Rating']] = ratings_scaler.transform(df[['Book-Rating']])
len(df), df['ISBN'].nunique(), df['User-ID'].nunique()

(21242, 193, 870)

**Preparing Y and R**

In [5]:
Y_df = pd.crosstab(index=df['ISBN'], columns=df['User-ID'], values=df['Book-Rating'], aggfunc=np.mean)
R_df = (~Y_df.isna()).astype(int)
Y_df = Y_df.fillna(0)

Y = Y_df.values
R = R_df.values
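
To make the layout of Y and R concrete, here is a small sketch that applies the same crosstab steps to a hypothetical four-row ratings table. The ISBNs, user IDs and ratings are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings: three books, two users, one missing pair per user.
toy = pd.DataFrame({
    'ISBN':        ['A', 'A', 'B', 'C'],
    'User-ID':     [1, 2, 1, 2],
    'Book-Rating': [0.1, 0.55, 1.0, 0.73],
})

# Items in rows, users in columns; pairs with no rating come out as NaN.
toy_Y_df = pd.crosstab(index=toy['ISBN'], columns=toy['User-ID'],
                       values=toy['Book-Rating'], aggfunc='mean')

toy_R = toy_Y_df.notna().astype(int).values  # 1 where a rating exists
toy_Y = toy_Y_df.fillna(0).values            # 0 is only a placeholder here
```

Because unrated entries in Y are filled with 0, R is what distinguishes a missing rating from a real one.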

## Modelling

**Cost Function**

In [6]:
def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the regularized cost for collaborative filtering
    Args:
      X (ndarray (num_items,num_features)) : matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users))           : vector of user biases
      Y (ndarray (num_items,num_users))    : matrix of user ratings of items
      R (ndarray (num_items,num_users))    : matrix, where R(i, j) = 1 if the i-th item was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    j = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
    J = 0.5 * tf.reduce_sum(j**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
    return J
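
Written out, the quantity computed by `cofi_cost_func` is

$$J = \frac{1}{2}\sum_{(i,j):R(i,j)=1}\left(\mathbf{w}^{(j)}\cdot\mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)}\right)^2 + \frac{\lambda}{2}\left(\sum_{i}\lVert\mathbf{x}^{(i)}\rVert^2 + \sum_{j}\lVert\mathbf{w}^{(j)}\rVert^2\right)$$

The loop-based NumPy version below is an illustrative sketch of the same formula, useful mainly as a readable cross-check of the vectorized code. The function name `cofi_cost_loop` and the tiny inputs are hypothetical.

```python
import numpy as np

def cofi_cost_loop(X, W, b, Y, R, lambda_):
    """Loop-based reference for the regularized squared-error cost (illustrative)."""
    num_items, num_users = Y.shape
    J = 0.0
    for i in range(num_items):
        for j in range(num_users):
            if R[i, j] == 1:  # only rated entries contribute to the error term
                err = X[i] @ W[j] + b[0, j] - Y[i, j]
                J += 0.5 * err ** 2
    # L2 penalty on item features and user parameters; the bias is not penalized.
    J += (lambda_ / 2) * (np.sum(X ** 2) + np.sum(W ** 2))
    return J

# Tiny hypothetical inputs: two items, one user, two features.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.array([[1.0, 1.0]])
b = np.array([[0.5]])
Y = np.array([[1.0], [0.0]])
R = np.array([[1], [0]])

cost = cofi_cost_loop(X, W, b, Y, R, lambda_=1.0)

# Vectorized form with the same structure as cofi_cost_func, in NumPy.
cost_vec = 0.5 * np.sum(((X @ W.T + b - Y) * R) ** 2) \
    + 0.5 * (np.sum(X ** 2) + np.sum(W ** 2))
```

Multiplying the error by R zeroes out the unrated entries, so the placeholder zeros in Y never affect the cost.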

**Constants, Variables and Optimizers**

In [7]:
num_items, num_users = Y.shape
num_features = 100

tf.random.set_seed(19)
W = tf.Variable(tf.random.normal((num_users, num_features), dtype=tf.float64), name='W')
X = tf.Variable(tf.random.normal((num_items, num_features), dtype=tf.float64), name='X')
b = tf.Variable(tf.random.normal((1,         num_users),    dtype=tf.float64), name='b')

optimizer = keras.optimizers.Adam(learning_rate=1e-1)



**Training**

In [8]:
iterations = 1_500
lambda_ = 1
for step in trange(iterations):
    # Record the forward pass so gradients of the cost can be computed
    with tf.GradientTape() as tape:
        cost_value = cofi_cost_func(X, W, b, Y, R, lambda_)

    # Gradients of the cost with respect to all trainable variables
    grads = tape.gradient(cost_value, [X, W, b])
    optimizer.apply_gradients(zip(grads, [X, W, b]))

    if (step + 1) % 100 == 0:
        print(f"Training loss after iteration {step+1}: {cost_value:0.1f}")


Training loss after iteration 100: 4339.4
Training loss after iteration 200: 994.0
Training loss after iteration 300: 525.7
Training loss after iteration 400: 437.5
Training loss after iteration 500: 414.2
Training loss after iteration 600: 406.1
Training loss after iteration 700: 402.7
Training loss after iteration 800: 401.2
Training loss after iteration 900: 400.4
Training loss after iteration 1000: 400.0
Training loss after iteration 1100: 399.8
Training loss after iteration 1200: 399.7
Training loss after iteration 1300: 399.7
Training loss after iteration 1400: 399.7
Training loss after iteration 1500: 399.6


**Predictions**

In [9]:
preds = np.matmul(X.numpy(), np.transpose(W.numpy())) + b.numpy()
preds_df = pd.DataFrame(preds, columns=Y_df.columns, index=Y_df.index)
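
The predictions are on the scaled 0.1 to 1 range produced by `ratings_scaler`. A possible way to map them back to the original rating scale is sketched below. It uses a stand-in scaler fitted on ratings assumed to span 0 to 10 and a hypothetical prediction array, so `demo_scaler` and `demo_preds` are illustrative names rather than objects from this notebook.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for ratings_scaler, assuming ratings span 0 to 10.
demo_scaler = MinMaxScaler(feature_range=(0.1, 1)).fit(np.array([[0.0], [10.0]]))

# Hypothetical scaled predictions, shaped (num_items, num_users) like preds.
demo_preds = np.array([[0.1, 0.55], [1.0, 0.73]])

# The scaler was fitted on a single column, so flatten, invert, then reshape.
demo_original = demo_scaler.inverse_transform(
    demo_preds.reshape(-1, 1)
).reshape(demo_preds.shape)
```

Predictions that fall outside the 0.1 to 1 range would map outside the original rating range, so clipping may be worth considering before presenting them.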

In [10]:
coords_1 =  np.argwhere(R == 1).tolist()
np.random.shuffle(coords_1)

print('Predictions for known ratings')
for r, c in coords_1[:5]:
    print(f'Actual: {Y[r, c]:.3}, Predicted: {preds[r, c]:.3}')

Predictions for known ratings
Actual: 0.1, Predicted: 0.0987
Actual: 0.1, Predicted: 0.169
Actual: 0.73, Predicted: 0.685
Actual: 0.1, Predicted: 0.122
Actual: 0.1, Predicted: 0.0788
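
Spot checks like the ones above can be summarized with a single number. The sketch below computes a root-mean-square error restricted to rated entries. The helper name `masked_rmse` and the small arrays are hypothetical, and because the model was trained on these same ratings, such a figure would measure fit to the training data rather than performance on held-out ratings.

```python
import numpy as np

def masked_rmse(preds, Y, R):
    """RMSE over the entries where R == 1 (rated pairs only)."""
    mask = R == 1
    return np.sqrt(np.mean((preds[mask] - Y[mask]) ** 2))

# Hypothetical small example: two rated entries, each off by 0.1.
demo_preds = np.array([[0.2, 0.5], [0.9, 0.3]])
demo_Y = np.array([[0.1, 0.0], [1.0, 0.0]])
demo_R = np.array([[1, 0], [1, 0]])

rmse = masked_rmse(demo_preds, demo_Y, demo_R)
```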


In [11]:
coords_0 =  np.argwhere(R == 0).tolist()
np.random.shuffle(coords_0)

print('Predictions for unknown ratings')
for r, c in coords_0[:3]:
    print(f'Actual: {Y[r, c]:.3}, Predicted: {preds[r, c]:.3}')

Predictions for unknown ratings
Actual: 0.0, Predicted: 0.281
Actual: 0.0, Predicted: 0.256
Actual: 0.0, Predicted: 0.162
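
Predictions for unrated entries are what a recommender would act on. One possible way to turn them into recommendations is sketched below: for a given user, keep only the items with no existing rating and take the highest predicted ones. The helper `top_n_unrated` and the small frames are hypothetical stand-ins for `preds_df` and `R_df`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for preds_df and R_df (items in rows, users in columns).
items = ['A', 'B', 'C', 'D']
users = [1, 2]
demo_preds_df = pd.DataFrame(
    [[0.9, 0.2], [0.4, 0.8], [0.7, 0.3], [0.1, 0.6]], index=items, columns=users
)
demo_R_df = pd.DataFrame(
    [[1, 0], [0, 1], [0, 0], [0, 0]], index=items, columns=users
)

def top_n_unrated(preds_df, R_df, user_id, n=2):
    """Return the n highest-predicted items that the user has not rated yet."""
    unrated = R_df[user_id] == 0
    return preds_df.loc[unrated, user_id].nlargest(n)

recs = top_n_unrated(demo_preds_df, demo_R_df, user_id=1, n=2)
```

In this toy example user 1 has already rated item A, so it is excluded even though it has the highest predicted value.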
