
# Collaborative Filtering — my Exercise 1 notes (MovieLens small)

I'm working through this the way I usually do: load the provided data, implement the regularized cost function with explicit loops,
check the result, then write a vectorized version and compare the two. I'm keeping the flow and names close to the assignment.


In [None]:

import numpy as np
from recsys_utils import *  # provided with the assignment

# Load precomputed parameters and ratings (small set)
X, W, b, num_movies, num_features, num_users = load_precalc_params_small()
Y, R = load_ratings_small()

print("Y", Y.shape, "R", R.shape)
print("X", X.shape, "W", W.shape, "b", b.shape)



## Exercise 1 — cost function (loops) + regularization

This is the straightforward loop version. Only entries with `R[i,j] == 1` (movies the user actually rated) contribute to the squared-error term.
Regularization is applied to `X` and `W`, but not to `b`, to match the assignment.
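
Written out, the cost that the loop version computes looks like this. It is a transcription of the code in my own notation, not copied from the assignment text: I'm assuming $n_m$ movies, $n_u$ users, $n$ features, and writing $\mathbf{x}^{(i)}$ for row `X[i,:]`, $\mathbf{w}^{(j)}$ for row `W[j,:]`, and $b^{(j)}$ for `b[0,j]`.

$$
J = \frac{1}{2} \sum_{(i,j)\,:\,r(i,j)=1} \left( \mathbf{w}^{(j)} \cdot \mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)} \right)^2
  + \frac{\lambda}{2} \sum_{j=0}^{n_u-1} \sum_{k=0}^{n-1} \left( w_k^{(j)} \right)^2
  + \frac{\lambda}{2} \sum_{i=0}^{n_m-1} \sum_{k=0}^{n-1} \left( x_k^{(i)} \right)^2
$$

The first sum is the masked squared error; the other two are the regularization terms on `W` and `X`.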


In [None]:

def cofi_cost_func(X, W, b, Y, R, lambda_):
    """
    Returns the cost for collaborative filtering
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users))           : vector of user bias parameters
      Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
      R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if the i-th movie was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    nm, nu = Y.shape
    J = 0
    n = W.shape[1]
    ### START CODE HERE ### 
    for j in range(nu):
        w = W[j, :]          # parameters of user j
        b_j = b[0, j]        # bias of user j
        for i in range(nm):
            x = X[i, :]      # features of movie i
            y = Y[i, j]
            r = R[i, j]      # 1 if user j rated movie i, else 0
            J += np.square(r * (np.dot(w, x) + b_j - y))

    J = J / 2

    # Regularization: sum of squared entries of W and X (b is not regularized)
    lambda_w = 0
    for j in range(nu):
        for k in range(n):
            lambda_w += W[j, k] ** 2
    lambda_x = 0
    for i in range(nm):
        for k in range(n):
            lambda_x += X[i, k] ** 2

    reg = (lambda_ * (lambda_w + lambda_x)) / 2
    J = J + reg
    ### END CODE HERE ### 

    return J


In [None]:

lambda_ = 1.0
J_loop = cofi_cost_func(X, W, b, Y, R, lambda_)
print("Cost (loops):", float(J_loop))



## Vectorized cost function (same result, no loops)

Same signature and behavior; computes the same cost using matrix ops.


In [None]:

def cofi_cost_func_vec(X, W, b, Y, R, lambda_):
    """Vectorized version of cofi_cost_func."""
    pred = X @ W.T          # (num_movies × num_users)
    pred = pred + b         # b is (1, num_users); broadcasts over rows
    err  = (pred - Y) * R
    J = 0.5 * np.sum(err * err)
    J += 0.5 * lambda_ * (np.sum(W * W) + np.sum(X * X))
    return J


In [None]:

J_vec = cofi_cost_func_vec(X, W, b, Y, R, lambda_)
print("Cost (vectorized):", float(J_vec))
print("abs diff:", abs(float(J_vec) - float(J_loop)))
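
The printed difference will usually be a tiny nonzero number, because the two versions add floating-point terms in a different order. Here is a minimal self-contained sketch of the same comparison on random toy data, using `np.isclose` instead of eyeballing the difference. All the `*_toy` names, the sizes, and the two helper functions are my own stand-ins so that it runs without `recsys_utils`; they are not part of the assignment.

```python
import numpy as np

def cost_loops_toy(X, W, b, Y, R, lambda_):
    """Loop-based regularized cost; same formula as cofi_cost_func."""
    nm, nu = Y.shape
    J = 0.0
    for j in range(nu):
        for i in range(nm):
            if R[i, j]:  # only rated entries contribute
                J += (np.dot(W[j], X[i]) + b[0, j] - Y[i, j]) ** 2
    return J / 2 + (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))

def cost_vec_toy(X, W, b, Y, R, lambda_):
    """Vectorized regularized cost; same formula as cofi_cost_func_vec."""
    err = (X @ W.T + b - Y) * R  # b of shape (1, nu) broadcasts over rows
    return 0.5 * np.sum(err ** 2) + 0.5 * lambda_ * (np.sum(W ** 2) + np.sum(X ** 2))

# Hypothetical toy problem: 6 movies, 4 users, 3 features
rng = np.random.default_rng(0)
nm_toy, nu_toy, n_toy = 6, 4, 3
X_toy = rng.normal(size=(nm_toy, n_toy))
W_toy = rng.normal(size=(nu_toy, n_toy))
b_toy = rng.normal(size=(1, nu_toy))
R_toy = (rng.random((nm_toy, nu_toy)) < 0.5).astype(float)   # about half rated
Y_toy = rng.integers(1, 6, size=(nm_toy, nu_toy)) * R_toy    # ratings 1..5, 0 if unrated

J_loop_toy = cost_loops_toy(X_toy, W_toy, b_toy, Y_toy, R_toy, 1.5)
J_vec_toy = cost_vec_toy(X_toy, W_toy, b_toy, Y_toy, R_toy, 1.5)
print("match within tolerance:", np.isclose(J_loop_toy, J_vec_toy))
```

The same `np.isclose(J_vec, J_loop)` call should work on the real `J_vec` and `J_loop` above.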



## Why this is cool (my take)

The model doesn’t know genres, tags, or any hand-crafted features. It doesn’t even know which “axes” matter. 
When we learn `X` (movie features) and `W` (user preferences), we’re letting the model **invent the axes** that best explain the ratings.  
Each dimension ends up capturing some latent factor that influences taste. We never label those factors; the training creates them implicitly.  
That’s why the same basic implementation carries over to domains beyond movies: whenever you have items, users, and sparse feedback,
retraining on that data learns a new set of axes suited to it.
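
This notebook only evaluates the cost on precomputed parameters, so here is a rough sketch of what "learning" `X` and `W` could look like: plain gradient descent on the same cost, in NumPy, on made-up toy ratings. Everything in it is my assumption rather than the assignment's method: the `*_toy` names, the sizes, the learning rate, the step count, and the hand-derived gradients.

```python
import numpy as np

def cost_toy(X, W, b, Y, R, lambda_):
    """Regularized collaborative filtering cost (vectorized)."""
    err = (X @ W.T + b - Y) * R
    return 0.5 * np.sum(err ** 2) + 0.5 * lambda_ * (np.sum(W ** 2) + np.sum(X ** 2))

def gradient_step_toy(X, W, b, Y, R, lambda_, lr):
    """One gradient descent step on X, W and b."""
    err = (X @ W.T + b - Y) * R                   # (nm, nu), zero where unrated
    grad_X = err @ W + lambda_ * X                # (nm, n)
    grad_W = err.T @ X + lambda_ * W              # (nu, n)
    grad_b = np.sum(err, axis=0, keepdims=True)   # (1, nu), b is not regularized
    return X - lr * grad_X, W - lr * grad_W, b - lr * grad_b

# Hypothetical toy problem: 8 movies, 5 users, 2 latent features
rng = np.random.default_rng(1)
nm_toy, nu_toy, n_toy = 8, 5, 2
R_toy = (rng.random((nm_toy, nu_toy)) < 0.6).astype(float)
Y_toy = rng.integers(1, 6, size=(nm_toy, nu_toy)) * R_toy

# Small random start: no genres or tags, the features begin as noise
X_toy = rng.normal(scale=0.1, size=(nm_toy, n_toy))
W_toy = rng.normal(scale=0.1, size=(nu_toy, n_toy))
b_toy = np.zeros((1, nu_toy))

J_start = cost_toy(X_toy, W_toy, b_toy, Y_toy, R_toy, 0.1)
for _ in range(500):
    X_toy, W_toy, b_toy = gradient_step_toy(X_toy, W_toy, b_toy, Y_toy, R_toy,
                                            lambda_=0.1, lr=0.005)
J_end = cost_toy(X_toy, W_toy, b_toy, Y_toy, R_toy, 0.1)
print("cost before:", J_start, "after:", J_end)
```

The columns of `X_toy` start as noise and end up as whatever two directions best explain the toy ratings, which is the "invented axes" idea in miniature.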
