Implement collaborative filtering to build a recommender for movies.

In [2]:
# Initial imports:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from recsys_utils import *

As in the notes, the notation is:

$r(i, j) = 1$ if user $j$ rated movie $i$, else $0$

$y(i, j) :=$ user $j$'s rating of movie $i$, if $r(i, j) == 1$

$w^{(j)}$: weight vector (parameters) for user $j$

$b^{(j)}$: bias parameter for user $j$

$x^{(i)}$: feature vector for movie $i$

$n_u$ (`num_users`): number of users

$n_m$ (`num_movies`): number of movies

$n$ (`num_features`): number of features

$X$: matrix of vectors $x^{(i)}$

$W$: matrix of vectors $w^{(j)}$

$b$: vector of bias parameters $b^{(j)}$

$R$: matrix of elements $r(i, j)$
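
To make the shapes concrete, here is a minimal NumPy sketch with made-up toy values (2 movies, 3 users, 2 features; none of these numbers come from the data set). The predicted rating of movie $i$ by user $j$ is $w^{(j)} \cdot x^{(i)} + b^{(j)}$, and the matrix form computes all of them at once:

```python
import numpy as np

# Toy values, purely illustrative (not from the ratings data)
X = np.array([[1.0, 0.0],
              [0.5, 2.0]])          # (n_m, n) = (2, 2): one row per movie
W = np.array([[2.0, 1.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # (n_u, n) = (3, 2): one row per user
b = np.array([[0.5, 0.0, -1.0]])    # (1, n_u): one bias per user

# Single prediction for movie i = 1 and user j = 0
i, j = 1, 0
single = np.dot(W[j], X[i]) + b[0, j]

# All predictions at once; b broadcasts across the movie rows
P = X @ W.T + b                     # (n_m, n_u)
print(single, P[i, j], P.shape)
```

The `(1, n_u)` shape of `b` is what lets it broadcast over every movie row.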



In [3]:
# Load data:
X, W, b, num_movies, num_features, num_users = load_precalc_params_small()
Y, R = load_ratings_small()

# Brief overview
print(f"""
Y: {Y.shape}
R: {R.shape}
X: {X.shape}
W: {W.shape}
b: {b.shape}
num_movies: {num_movies}
num_features: {num_features}
num_users: {num_users}
""")


Y: (4778, 443)
R: (4778, 443)
X: (4778, 10)
W: (443, 10)
b: (1, 443)
num_movies: 4778
num_features: 10
num_users: 443



In [4]:
# From these base data, get average rating:
tsmean = np.mean(Y[0, R[0, :].astype(bool)])
print(f"Average rating for movie 1: {tsmean:0.3f} / 5")

Average rating for movie 1: 3.400 / 5


See the notes for the collaborative filtering (cofi) cost function; it's a doozy.

Do note that the summation over $(i, j): r(i, j) = 1$ can be expressed as a summation over all elements, each multiplied by $r(i, j)$, which zeroes out the unrated entries anyway.
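
Written out with the notation above, the cost that the code in this notebook computes (squared error plus regularization on $W$ and $X$, with no regularization on $b$) can be sketched as:

$$J = \frac{1}{2}\sum_{(i,j):\,r(i,j)=1}\left(w^{(j)} \cdot x^{(i)} + b^{(j)} - y(i,j)\right)^2 + \frac{\lambda}{2}\sum_{j}\sum_{k}\left(w^{(j)}_k\right)^2 + \frac{\lambda}{2}\sum_{i}\sum_{k}\left(x^{(i)}_k\right)^2$$

Because $r(i, j)$ is either 0 or 1, the first term is the same as a sum over every pair:

$$\frac{1}{2}\sum_{i}\sum_{j} r(i,j)\left(w^{(j)} \cdot x^{(i)} + b^{(j)} - y(i,j)\right)^2$$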

Initially, we start with a for-loop:

In [5]:
def cofi_cost_func_loop(X, W, b, Y, R, lambda_):
    """
    Returns the cost for collaborative filtering.
    Args:
        X (ndarray (num_movies, num_features)): matrix of item features
        W (ndarray (num_users, num_features)): matrix of user parameters
        b (ndarray (1, num_users)): vector of user bias parameters
        Y (ndarray (num_movies, num_users)): matrix of user ratings
        R (ndarray (num_movies, num_users)): matrix where R(i, j) = 1 if user j has rated movie i
        lambda_ (float): regularization parameter
        
    Returns:
        J (float): Cost
    """
    
    nm, nu = Y.shape
    J = 0
    
    for j in range(nu):
        w = W[j, :]
        bj = b[0, j]
        
        for i in range(nm):
            x = X[i, :]
            r = R[i, j]
            y = Y[i, j]
            
            J += np.multiply(r, (np.dot(w, x) + bj - y) ** 2)
    
    J += lambda_ * (np.sum(np.power(W, 2)) + np.sum(np.power(X, 2)))
    J /= 2
    
    return J

In [7]:
# Test
nu_r = 4
nm_r = 5
n_r = 3
X_r = X[:nm_r, :n_r]
W_r = W[:nu_r, :n_r]
b_r = b[0, :nu_r].reshape(1, -1)
Y_r = Y[:nm_r, :nu_r]
R_r = R[:nm_r, :nu_r]

# Without reg
J_r = cofi_cost_func_loop(X_r, W_r, b_r, Y_r, R_r, 0)
print(J_r)
# Expect 13.67

# With reg
J_r = cofi_cost_func_loop(X_r, W_r, b_r, Y_r, R_r, 1.5)
print(J_r)
# Expect 28.09

13.670725805579915
28.09383799145902


Loops are obviously slow, so try a vectorized version!

In [25]:
def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the cost for collaborative filtering.
    Args:
        X (ndarray (num_movies, num_features)): matrix of item features
        W (ndarray (num_users, num_features)): matrix of user parameters
        b (ndarray (1, num_users)): vector of user bias parameters
        Y (ndarray (num_movies, num_users)): matrix of user ratings
        R (ndarray (num_movies, num_users)): matrix where R(i, j) = 1 if user j has rated movie i
        lambda_ (float): regularization parameter
        
    Returns:
        J (float): Cost
    """
    
    # NumPy version, kept for reference:
    # j = np.sum(np.power((np.matmul(X, np.transpose(W)) + b - Y) * R, 2))
    # J = 0.5 * (j + lambda_ * (np.sum(np.power(X, 2)) + np.sum(np.power(W, 2))))

    # ...Except we want this in a TensorFlow custom loop, so it should all use TensorFlow ops.
    # Converting a NumPy result to tf.Tensor afterwards would not do: the gradient tape
    # only records TensorFlow operations.
    err = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y) * R  # errors, zeroed where unrated
    J = 0.5 * (tf.reduce_sum(err ** 2) + lambda_ * (tf.reduce_sum(W ** 2) + tf.reduce_sum(X ** 2)))
    
    return J
    

In [26]:
print(cofi_cost_func(X_r, W_r, b_r, Y_r, R_r, 0))
print(cofi_cost_func(X_r, W_r, b_r, Y_r, R_r, 1.5))

tf.Tensor(13.670725805579915, shape=(), dtype=float64)
tf.Tensor(28.09383799145902, shape=(), dtype=float64)


Now to actually train a model, with my own ratings added as a new user.

In [19]:
movieList, movieList_df = load_Movie_List_pd()

my_ratings = np.zeros(num_movies)
# lol not building a frontend for this though, so just say toy story 3 is a 5/5
my_ratings[2700] = 5
# and persuasion is a 2/5
my_ratings[2609] = 2
# ...etc:

# We have selected a few movies we liked / did not like and the ratings we
# gave are as follows:
my_ratings[929]  = 5   # Lord of the Rings: The Return of the King, The
my_ratings[246]  = 5   # Shrek (2001)
my_ratings[2716] = 3   # Inception
my_ratings[1150] = 5   # Incredibles, The (2004)
my_ratings[382]  = 2   # Amelie (Fabuleux destin d'Amélie Poulain, Le)
my_ratings[366]  = 5   # Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone) (2001)
my_ratings[622]  = 5   # Harry Potter and the Chamber of Secrets (2002)
my_ratings[988]  = 3   # Eternal Sunshine of the Spotless Mind (2004)
my_ratings[2925] = 1   # Louis Theroux: Law & Disorder (2008)
my_ratings[2937] = 1   # Nothing to Declare (Rien à déclarer)
my_ratings[793]  = 5   # Pirates of the Caribbean: The Curse of the Black Pearl (2003)

my_rated = [i for i in range(len(my_ratings)) if my_ratings[i] > 0]


print('\nNew user ratings:\n')
for i in range(len(my_ratings)):
    if my_ratings[i] > 0:
        print(f'Rated {my_ratings[i]} for  {movieList_df.loc[i,"title"]}')


New user ratings:

Rated 5.0 for  Shrek (2001)
Rated 5.0 for  Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone) (2001)
Rated 2.0 for  Amelie (Fabuleux destin d'Amélie Poulain, Le) (2001)
Rated 5.0 for  Harry Potter and the Chamber of Secrets (2002)
Rated 5.0 for  Pirates of the Caribbean: The Curse of the Black Pearl (2003)
Rated 5.0 for  Lord of the Rings: The Return of the King, The (2003)
Rated 3.0 for  Eternal Sunshine of the Spotless Mind (2004)
Rated 5.0 for  Incredibles, The (2004)
Rated 2.0 for  Persuasion (2007)
Rated 5.0 for  Toy Story 3 (2010)
Rated 3.0 for  Inception (2010)
Rated 1.0 for  Louis Theroux: Law & Disorder (2008)
Rated 1.0 for  Nothing to Declare (Rien à déclarer) (2010)


Add these ratings to Y and R, then normalize:

In [20]:
# Reload
Y, R = load_ratings_small()

# Add 'my' ratings to it
Y = np.c_[my_ratings, Y]
R = np.c_[(my_ratings != 0).astype(int), R] # ...or my_rated

# Normalize
Ynorm, Ymean = normalizeRatings(Y, R)
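
`normalizeRatings` comes from `recsys_utils`, so its body isn't shown here. A minimal sketch of what per-movie mean normalization typically looks like, assuming it subtracts each movie's mean over rated entries only (the name `mean_normalize_sketch` and the toy data are hypothetical, not the library's code):

```python
import numpy as np

def mean_normalize_sketch(Y, R):
    """Subtract each movie's mean rating, computed over rated entries only."""
    counts = np.sum(R, axis=1, keepdims=True)             # number of ratings per movie
    # np.maximum(counts, 1) avoids dividing by zero for movies nobody rated
    Ymean = np.sum(Y * R, axis=1, keepdims=True) / np.maximum(counts, 1)
    Ynorm = Y - Ymean * R                                  # unrated entries stay 0
    return Ynorm, Ymean

# Toy ratings: 3 movies x 3 users, 0 means "not rated"
Y_toy = np.array([[5.0, 3.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [4.0, 0.0, 2.0]])
R_toy = (Y_toy != 0).astype(int)
Ynorm_toy, Ymean_toy = mean_normalize_sketch(Y_toy, R_toy)
print(Ymean_toy.ravel())
print(Ynorm_toy)
```

Training then happens on the normalized ratings, which is why the per-movie mean has to be added back at prediction time.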

Then prep the model training, init params and default to Adam optimizer:

In [21]:
num_movies, num_users = Y.shape
num_features = 100

# Initial params (W, X) - remember to use tf.Variable to keep them tracked
tf.random.set_seed(1234)
W = tf.Variable(tf.random.normal((num_users, num_features), dtype=tf.float64), name='W')
X = tf.Variable(tf.random.normal((num_movies, num_features), dtype=tf.float64), name='X')
b = tf.Variable(tf.random.normal((1, num_users), dtype=tf.float64), name='b')

# Instantiate Adam
optimizer = keras.optimizers.Adam(learning_rate=1e-1)



Recall from the notes that this process doesn't map directly onto the neural-net layer architecture that TensorFlow supports by default, so we need a custom training loop.

We still use TensorFlow for its automatic differentiation, though!
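
For reference, the derivatives that TensorFlow computes automatically can also be written out by hand. This is a NumPy sketch under the same cost as `cofi_cost_func`, checked against a finite difference on random toy data (the helper names `cost_np` and `grads_np` are made up for this illustration, and it assumes `R` is strictly 0/1):

```python
import numpy as np

def cost_np(X, W, b, Y, R, lambda_):
    E = (X @ W.T + b - Y) * R                      # errors, zeroed where unrated
    return 0.5 * (np.sum(E ** 2) + lambda_ * (np.sum(W ** 2) + np.sum(X ** 2)))

def grads_np(X, W, b, Y, R, lambda_):
    E = (X @ W.T + b - Y) * R                      # (num_movies, num_users)
    dX = E @ W + lambda_ * X                       # (num_movies, num_features)
    dW = E.T @ X + lambda_ * W                     # (num_users, num_features)
    db = np.sum(E, axis=0, keepdims=True)          # (1, num_users); b is not regularized
    return dX, dW, db

# Random toy problem, purely illustrative
rng = np.random.default_rng(0)
nm, nu, n = 5, 4, 3
X_t = rng.normal(size=(nm, n))
W_t = rng.normal(size=(nu, n))
b_t = rng.normal(size=(1, nu))
R_t = rng.integers(0, 2, size=(nm, nu))
Y_t = rng.integers(1, 6, size=(nm, nu)) * R_t

dX, dW, db = grads_np(X_t, W_t, b_t, Y_t, R_t, 1.5)

# Central finite difference on a single entry of X as a sanity check
eps = 1e-6
Xp, Xm = X_t.copy(), X_t.copy()
Xp[0, 0] += eps
Xm[0, 0] -= eps
numeric = (cost_np(Xp, W_t, b_t, Y_t, R_t, 1.5)
           - cost_np(Xm, W_t, b_t, Y_t, R_t, 1.5)) / (2 * eps)
print(numeric, dX[0, 0])
```

Deriving and maintaining these by hand gets error-prone quickly, which is the practical argument for letting the framework do it.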

In [27]:
iters = 200
lambda_ = 1

for i in range(iters):
    # GradientTape to track the Variable operations
    with tf.GradientTape() as tape:
        # fwd pass
        cost_value = cofi_cost_func(X, W, b, Ynorm, R, lambda_)
    
    # Use the tape to auto retrieve grads of trainable vars wrt loss
    grads = tape.gradient(cost_value, [X, W, b])
    
    # 1 step of grad descent by updating vals of vars to minimize loss
    optimizer.apply_gradients(zip(grads, [X, W, b]))
    
    # Logging periodically
    if i % 20 == 0:
        print(f"Training loss at iter {i}: {cost_value:0.1f}")

Training loss at iter 0: 2321191.3
Training loss at iter 20: 136168.7
Training loss at iter 40: 51863.3
Training loss at iter 60: 24598.8
Training loss at iter 80: 13630.4
Training loss at iter 100: 8487.6
Training loss at iter 120: 5807.7
Training loss at iter 140: 4311.6
Training loss at iter 160: 3435.2
Training loss at iter 180: 2902.1


With the loss reduced, compute the predicted ratings and recommend movies:

In [28]:
p = np.matmul(X.numpy(), np.transpose(W.numpy())) + b.numpy()

pm = p + Ymean

my_preds = pm[:, 0]
ix = tf.argsort(my_preds, direction="DESCENDING")

for i in range(17):
    j = ix[i]
    if j not in my_rated:
        print(f"Predicting rating: {my_preds[j]} for movie {movieList[j]}")
        
print(f"\n\nOriginal vs Predicted ratings:\n")
for i in range(len(my_ratings)):
    if my_ratings[i] > 0:
        print(f"For movie: {movieList[i]}\nOriginal {my_ratings[i]}\nPredicted {my_preds[i]}\n")

Predicting rating: 4.486691770397451 for movie My Sassy Girl (Yeopgijeogin geunyeo) (2001)
Predicting rating: 4.479742429385017 for movie Martin Lawrence Live: Runteldat (2002)
Predicting rating: 4.477791782939328 for movie Memento (2000)
Predicting rating: 4.47009814826897 for movie Delirium (2014)
Predicting rating: 4.46874178703725 for movie Laggies (2014)
Predicting rating: 4.468021490016086 for movie One I Love, The (2014)
Predicting rating: 4.464647838177511 for movie Particle Fever (2013)
Predicting rating: 4.448565212570453 for movie Eichmann (2007)
Predicting rating: 4.448547909251748 for movie Battle Royale 2: Requiem (Batoru rowaiaru II: Chinkonka) (2003)
Predicting rating: 4.44846062310467 for movie Into the Abyss (2011)


Original vs Predicted ratings:

For movie: Shrek (2001)
Original 5.0
Predicted 4.8971371696530674

For movie: Harry Potter and the Sorcerer's Stone (a.k.a. Harry Potter and the Philosopher's Stone) (2001)
Original 5.0
Predicted 4.843374744146587

For movi
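
The ranking step can also be done entirely in NumPy. A small sketch on made-up predictions (the names `preds_toy`, `rated_toy`, and `top_k` are illustrative, not from the notebook) that drops already-rated movies before taking the top k, so exactly k recommendations come back:

```python
import numpy as np

preds_toy = np.array([4.9, 3.1, 4.5, 2.0, 4.7])   # made-up predicted rating per movie
rated_toy = [0, 4]                                  # indices the user has already rated
top_k = 2

# Sort indices by predicted rating, highest first
order = np.argsort(-preds_toy)
# Keep only the movies the user has not rated yet
unrated = order[~np.isin(order, rated_toy)]
recommend = unrated[:top_k]
print(recommend, preds_toy[recommend])
```

Filtering before slicing avoids having to guess how far down the sorted list to look.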

In [29]:
popular = (movieList_df["number of ratings"] > 20)  # renamed to avoid shadowing the built-in filter()
movieList_df['pred'] = my_preds
movieList_df = movieList_df.reindex(columns=["pred", "mean rating", "number of ratings", "title"])
movieList_df.loc[ix[:300]].loc[popular].sort_values("mean rating", ascending=False)

      pred      mean rating  number of ratings  title
1743  4.030965  4.252336     107                Departed, The (2006)
2112  3.985287  4.238255     149                Dark Knight, The (2008)
211   4.477792  4.122642     159                Memento (2000)
929   4.887053  4.118919     185                Lord of the Rings: The Return of the King, The...
2700  4.79653   4.109091     55                 Toy Story 3 (2010)
653   4.357304  4.021277     188                Lord of the Rings: The Two Towers, The (2002)
1122  4.004469  4.006494     77                 Shaun of the Dead (2004)
1841  3.980647  4.0          61                 Hot Fuzz (2007)
3083  4.084633  3.993421     76                 Dark Knight Rises, The (2012)
2804  4.434171  3.989362     47                 Harry Potter and the Deathly Hallows: Part 1 (...
