## What is Matrix Factorization?

- **Matrix factorization** is a mathematical technique commonly used in recommendation systems to factorize a user-item interaction matrix into lower-dimensional matrices representing users and items.
- The core concept behind matrix factorization is to approximate the original matrix by decomposing it into two or more matrices.
- This process helps uncover latent factors that influence user-item interactions.


### Let's Break Down the Process of Matrix Factorization in Simple Steps

- User-Item Interaction Matrix:

 - In recommendation systems, you start with a user-item interaction matrix, often denoted as R.
 - Rows correspond to users, columns correspond to items, and the entries $R_{i,j}$ represent user ratings, interactions, or preferences for items.

 <center><img src="https://drive.google.com/uc?id=1gGxM-opp1666trtmfMlc6JY4Ue46imbn"></center>

- Decomposition:

 - Matrix factorization decomposes this user-item interaction matrix R into two lower-dimensional matrices, typically denoted as U (user matrix) and I (item matrix).

 <center><img src="https://drive.google.com/uc?id=1t3poDcWoe-ewAi_QCgEXe1SBoSgqZGOB"></center>

- Dimensions:

 - The dimensions of the user matrix U are M x K, where M is the number of users, and K is the number of latent factors. The item matrix I has dimensions of K x N, where N is the number of items.

- Objective Function:

 - The goal of matrix factorization is to find the matrices U and I such that the product U * I approximates the original matrix R.
 - To achieve this, an objective function is defined, often using a loss function like Mean Squared Error (MSE) or a variant of it.

 > $Loss = \sum_{i,j} (R_{ij} - U_{i} \cdot I_{j})^{2}$

 - Here, $U_i$ (the i-th row of U) and $I_j$ (the j-th column of I) are the latent factor vectors for the i-th user and j-th item, respectively, and the sum runs over the observed entries of R.

- Optimization:

 - The matrices U and I are optimized to minimize the loss function. This is typically done using optimization techniques like Gradient Descent or Alternating Least Squares (ALS).

 - We'll use gradient descent to optimize the loss function.

 <center><img src="https://drive.google.com/uc?id=1fgyQ-KUppMDSxL1zs78mJSu06FyDaG-2"></center>

 - The optimization process updates the latent factor vectors in U and I iteratively to improve the approximation of the original matrix R.
 - The objective is to find the U and I that best fit the observed user-item interactions.

- Prediction:

 - Once the optimization is complete, the factorized matrices U and I can be used to predict missing values in the original matrix R.
 - These predictions are used to generate recommendations for users.
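
The steps above can be sketched on a toy example. This is a minimal illustration rather than the notebook's implementation: the 3 x 4 matrix `R`, the random seed, and the names `observed` and `R_hat` are assumptions chosen for the demo. It shows the shapes from the Dimensions step, the squared-error loss restricted to observed entries, and how the product U * I also yields a value for a missing cell.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy interaction matrix: M = 3 users, N = 4 items, 0 marks a missing rating
R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0]])
M, N = R.shape
K = 2  # number of latent factors

U = rng.normal(size=(M, K))  # user matrix, M x K
I = rng.normal(size=(K, N))  # item matrix, K x N

R_hat = U @ I                # approximation of R, M x N
observed = R > 0             # only observed ratings enter the loss
loss = np.sum((R[observed] - R_hat[observed]) ** 2)

print(R_hat.shape)
print(loss)
# The prediction for the missing entry (user 0, item 2) is a dot product
print(U[0] @ I[:, 2])
```

With random factors the loss is large; optimization adjusts U and I until the observed entries are reproduced closely.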





In [None]:
# Import Libraries
import numpy as np
import pandas as pd
from datetime import datetime
from sklearn.preprocessing import StandardScaler

import warnings
warnings.filterwarnings('ignore')

In [None]:
!wget --no-check-certificate 'https://drive.google.com/uc?export=download&id=1Q9UJtrN_v_dS-garl5gQ1I_SotGhye_1' -O "movies.csv"
!wget --no-check-certificate 'https://drive.google.com/uc?export=download&id=1HOFWUAMFlYbd-gk1B2IyV2-hXDZI7gKR' -O "ratings.csv"
!wget --no-check-certificate 'https://drive.google.com/uc?export=download&id=1b7_yRRBs3s3atp1WQHN2GU577vxY8u_h' -O "users.csv"



--2025-12-10 06:59:50--  https://drive.google.com/uc?export=download&id=1Q9UJtrN_v_dS-garl5gQ1I_SotGhye_1
Resolving drive.google.com (drive.google.com)... 74.125.132.138, 74.125.132.101, 74.125.132.139, ...
Connecting to drive.google.com (drive.google.com)|74.125.132.138|:443... connected.
HTTP request sent, awaiting response... 303 See Other
Location: https://drive.usercontent.google.com/download?id=1Q9UJtrN_v_dS-garl5gQ1I_SotGhye_1&export=download [following]
--2025-12-10 06:59:51--  https://drive.usercontent.google.com/download?id=1Q9UJtrN_v_dS-garl5gQ1I_SotGhye_1&export=download
Resolving drive.usercontent.google.com (drive.usercontent.google.com)... 74.125.201.132, 2607:f8b0:4001:c01::84
Connecting to drive.usercontent.google.com (drive.usercontent.google.com)|74.125.201.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 515699 (504K) [application/octet-stream]
Saving to: ‘movies.csv’


2025-12-10 06:59:52 (102 MB/s) - ‘movies.csv’ saved [515699/

In [None]:
movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')
users = pd.read_csv('users.csv')

print("Movies -> ", movies.shape)
print("Ratings -> ", ratings.shape)
print("Users -> ", users.shape)

Movies ->  (10329, 3)
Ratings ->  (105339, 4)
Users ->  (668, 3)


In [None]:
movies.head(3)

Unnamed: 0,movieId,title,genres
0,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
1,2,Jumanji (1995),Adventure|Children|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance


In [None]:
users.head()

Unnamed: 0,userId,age,time_spent_per_day
0,1,16,3.976315
1,2,24,1.891303
2,3,20,4.521478
3,4,23,2.095284
4,5,35,1.75986


In [None]:
ratings.head(3)

Unnamed: 0,userId,movieId,rating,timestamp
0,1,16,4.0,1217897793
1,1,24,1.5,1217895807
2,1,32,4.0,1217896246


- The long-format ratings table is pivoted into a user-item matrix where rows represent users, columns represent movies, and the entries contain user ratings.
- Missing values (unrated movies) are filled with 0, so a 0 in the matrix means "no rating", not a rating of zero. The result is a matrix suitable for collaborative filtering in recommendation systems.
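
A small sketch of the same pivot on made-up data may make the reshaping easier to follow. The `toy` DataFrame and its values are hypothetical; only the column names mirror the ratings table.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, movie) interaction
toy = pd.DataFrame({
    "userId":  [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating":  [4.0, 1.5, 5.0, 3.0],
})

# Rows become users, columns become movies; unrated pairs start as NaN
wide = toy.pivot(index="userId", columns="movieId", values="rating")
filled = wide.fillna(0)  # 0 now stands for "no rating"

print(filled)
print(int(wide.notna().sum().sum()))  # number of observed ratings
```

Any code that consumes the filled matrix has to treat zeros as missing, which is what the `if R[i][j] == 0` check in the factorization loop later does.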

In [None]:
rm = ratings.pivot(index = 'userId', columns = 'movieId', values = 'rating').fillna(0)
rm.head()

movieId,1,2,3,4,5,6,7,8,9,10,...,144482,144656,144976,146344,146656,146684,146878,148238,148626,149532
userId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,5.0,0.0,2.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,3.0,0.0,3.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,4.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [None]:
# Density of the matrix: the fraction of cells that hold a rating
# (sparsity is 1 minus this value)

(rm > 0).sum().sum() / (rm.shape[0] * rm.shape[1])

# Numerator: number of non-zero values in the table
# Denominator: number of cells = rows * columns

np.float64(0.015272940801206305)

# Matrix Factorization from Scratch

In [None]:
rm_small = rm.copy()
rm_small = rm_small[rm_small.columns[:100]]
rm_small = rm_small.head(100)

print(rm_small)
rm_small.head()

movieId  1    2    3    4    5    6    7    8    9    10   ...  100  101  102  \
userId                                                     ...                  
1        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
2        5.0  0.0  2.0  0.0  3.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
3        0.0  0.0  0.0  0.0  3.0  0.0  3.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
4        0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
5        4.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
...      ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...   
96       5.0  0.0  4.0  0.0  3.0  4.0  3.0  0.0  1.0  0.0  ...  0.0  0.0  0.0   
97       3.5  4.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
98       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
99       0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0   
100      3.0  0.0  0.0  0.0 

movieId,1,2,3,4,5,6,7,8,9,10,...,100,101,102,103,104,105,107,108,110,111
userId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,0.0
2,5.0,0.0,2.0,0.0,3.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,3.0,0.0,3.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0
5,4.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0


- The function below performs matrix factorization with stochastic gradient descent, minimizing a regularized squared-error cost function.
- It iteratively updates the latent factor matrices P and Q so that their product approximates the user-item interaction matrix R. After optimization, the predicted user-item interactions can be compared with the actual ratings.
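
For reference, the per-rating update that stochastic gradient descent applies can be written out. This is the standard regularized form, given as a sketch: $\alpha$ (learning rate) and $\beta$ (regularization strength) correspond to the `alpha` and `beta` arguments of the function, and the $\beta/2$ scaling of the penalty is an assumption made so that the gradient comes out without an extra factor.

$$e_{ij} = R_{ij} - \sum_{k=1}^{K} P_{ik} Q_{jk}$$

$$L_{ij} = e_{ij}^{2} + \frac{\beta}{2} \sum_{k=1}^{K} \left( P_{ik}^{2} + Q_{jk}^{2} \right)$$

$$\frac{\partial L_{ij}}{\partial P_{ik}} = -2\, e_{ij} Q_{jk} + \beta P_{ik}, \qquad \frac{\partial L_{ij}}{\partial Q_{jk}} = -2\, e_{ij} P_{ik} + \beta Q_{jk}$$

$$P_{ik} \leftarrow P_{ik} + \alpha \left( 2\, e_{ij} Q_{jk} - \beta P_{ik} \right), \qquad Q_{jk} \leftarrow Q_{jk} + \alpha \left( 2\, e_{ij} P_{ik} - \beta Q_{jk} \right)$$

Each update moves the factors a small step against the gradient, and only cells with an observed rating contribute.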

In [None]:
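K = 2
P = np.random.normal(size = (rm_small.shape[0], K)) # user factors, one row per user
Q = np.random.normal(size = (rm_small.shape[1], K)) # item factors, one row per movie

def matrix_factorization(R, P, Q, K, steps = 10000, alpha = 0.0002, beta = 0.02):
  # alpha: learning rate, beta: regularization strength
  Q = Q.T # Transpose of Q, so Q[:, j] is the factor vector of movie j

  for step in range(steps):
    for i in range(len(R)): # loop through all users
      for j in range(len(R[i])): # loop through all movies of that user
          if R[i][j] == 0: # skip unrated movies
            continue
          eij = R[i][j] - np.dot(P[i,:], Q[:,j]) # Calculate prediction error

          for k in range(K): # Updating the latent factors to optimize
            x = P[i][k] # keep the old value for the update of Q

            # The regularization term belongs inside the learning-rate step
            P[i][k] += alpha * (2 * eij * Q[k][j] - beta * P[i][k])
            Q[k][j] += alpha * (2 * eij * x - beta * Q[k][j])

  # Return only after all steps have run (a return inside the loop stops after one pass)
  return P, Q.T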


In [None]:
P_, Q_ = matrix_factorization(rm_small.values, P.copy(), Q.copy(), 2)

In [None]:
Q_

array([[-3.13630772e-01, -5.00471034e-01],
       [-4.02478570e-04, -1.39500832e+00],
       [-2.20395321e+00, -9.12192588e-01],
       [-1.62960176e+00,  6.53248929e-01],
       [-1.06031812e+00,  1.99861242e-01],
       [ 4.83642944e-01,  8.12942314e-03],
       [-1.14805413e+00,  1.72649396e-01],
       [-1.30753886e+00,  8.43149503e-02],
       [ 2.02547944e-02, -4.31780588e-01],
       [ 8.30814487e-01,  3.27670873e-01],
       [ 1.63124590e+00,  6.52555703e-01],
       [ 5.22429508e-01,  8.82772411e-01],
       [ 1.57645463e-01,  6.01153796e-01],
       [ 1.46421082e+00, -5.00943162e-01],
       [ 3.67069580e-01,  5.76011578e-01],
       [-1.75974591e+00,  1.96667257e-01],
       [ 6.87498554e-02, -4.35522869e-01],
       [-5.65655250e-01, -2.21491919e-01],
       [ 4.65679200e-01, -3.39263366e-01],
       [-9.69951457e-01,  1.19760845e+00],
       [ 1.69722315e-01, -1.82534836e-01],
       [ 9.86428083e-01,  7.74808680e-01],
       [-1.25631891e+00,  9.55987483e-01],
       [-1.

In [None]:
# Predicted values vs. actual ratings (an actual value of 0.0 means the movie is unrated)
print(np.dot(P_[4], Q_[36]), rm_small.values[4, 36])
print(np.dot(P_[1], Q_[0]), rm_small.values[1, 0])
print(np.dot(P_[1], Q_[2]), rm_small.values[1, 2])
print(np.dot(P_[3], Q_[17]), rm_small.values[3, 17])

-0.9903443885481151 0.0
0.22133225860179867 5.0
1.4679178937914588 2.0
-0.5025383199180044 0.0


In [None]:
from sklearn.metrics import mean_squared_error as mse

rm_ = np.dot(P_, Q_.T) # Predicted ratings matrix

# MSE over all cells, including unrated ones stored as 0
print(mse(rm_small.values, rm_))

2.00262645302048


In [None]:
# Filter only the known (non-zero) ratings from the original and predicted matrices
known = rm_small.values > 0
mse(rm_small.values[known], rm_[known])


13.818499359703978
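
As a sanity check of the approach, the following is a compact sketch of SGD matrix factorization on a made-up 4 x 4 matrix. It is a variant for illustration, not the function used in this notebook: it iterates only over observed cells via `np.nonzero`, updates whole factor vectors at once, and uses a hypothetical seed, initialization scale, and learning rate. The helper names `sgd_mf` and `masked_mse` are assumptions as well.

```python
import numpy as np

def sgd_mf(R, K=2, steps=2000, alpha=0.01, beta=0.02, seed=0):
    """Factorize R ~ P @ Q.T with SGD, skipping zero (unrated) cells."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(R.shape[0], K))
    Q = rng.normal(scale=0.1, size=(R.shape[1], K))
    users, items = np.nonzero(R)       # indices of observed ratings
    for _ in range(steps):
        for i, j in zip(users, items):
            e = R[i, j] - P[i] @ Q[j]  # prediction error for this rating
            p_old = P[i].copy()        # keep old user factors for the Q update
            P[i] += alpha * (2 * e * Q[j] - beta * P[i])
            Q[j] += alpha * (2 * e * p_old - beta * Q[j])
    return P, Q

def masked_mse(R, R_hat):
    """Mean squared error over the observed (non-zero) entries only."""
    known = R > 0
    return np.mean((R[known] - R_hat[known]) ** 2)

R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 5.0],
              [0.0, 1.0, 5.0, 4.0]])

P0, Q0 = sgd_mf(R, steps=0)     # untrained factors (same seed, no updates)
P1, Q1 = sgd_mf(R, steps=2000)  # trained factors
before = masked_mse(R, P0 @ Q0.T)
after = masked_mse(R, P1 @ Q1.T)
print(before, after)
```

Comparing the masked error before and after training is a quick way to confirm that the updates are actually being applied over many passes.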

# Using cmfrec library -> Collective matrix factorization in recommender systems

- Here the ratings are passed to cmfrec as a DataFrame in long format (one row per rating) rather than as the pivoted user-item matrix.

- The DataFrame needs three columns named UserId, ItemId, and Rating.
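
In this notebook the long-format `ratings` table is already available, so only a column rename is needed. If only a pivoted matrix were at hand, one way to get back to the three-column layout is sketched here with pandas. The `wide` DataFrame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical pivoted matrix: rows are users, columns are movies, 0 = unrated
wide = pd.DataFrame(
    [[4.0, 0.0, 1.5],
     [0.0, 5.0, 0.0]],
    index=pd.Index([1, 2], name="userId"),
    columns=pd.Index([10, 20, 30], name="movieId"),
)

# Melt into one row per (user, movie) pair, then drop the unrated pairs
long = wide.reset_index().melt(
    id_vars="userId", var_name="movieId", value_name="Rating"
)
long = long[long["Rating"] > 0]
long = long.rename(columns={"userId": "UserId", "movieId": "ItemId"})

print(long)
```

Dropping the zero rows matters: otherwise the placeholder zeros would be treated as real ratings.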

In [None]:
rm_raw = ratings[['userId', 'movieId', 'rating']].copy()
rm_raw.head()

Unnamed: 0,userId,movieId,rating
0,1,16,4.0
1,1,24,1.5
2,1,32,4.0
3,1,47,4.0
4,1,50,4.0


In [None]:
# !pip install cmfrec

In [None]:
from cmfrec import CMF

In [None]:
rm_raw.rename(columns = {'userId': 'UserId', 'movieId':'ItemId', 'rating':'Rating'}, inplace = True)

In [None]:
rm_raw.head(2)

Unnamed: 0,UserId,ItemId,Rating
0,1,16,4.0
1,1,24,1.5


In [None]:
df = rm_raw.copy()

model = CMF(
    method="als",
    k=2,
    lambda_=0.1,
    user_bias=False,
    item_bias=False,
    verbose=False
)

Collective matrix factorization model
(explicit-feedback variant)


**An instance of the CMF model is created with various hyperparameters:**

- method="als": Specifies the alternating least squares (ALS) optimization method, commonly used for matrix factorization in recommendation systems.

- k=2: Sets the number of latent factors to 2, determining the dimensionality of the latent factor space.

- lambda_=0.1: Sets the regularization strength to 0.1. Regularization is used to prevent overfitting in the model.

- user_bias=False: Indicates that user bias terms are not included in the model. User bias represents a user's overall rating tendency.

- item_bias=False: Excludes item bias terms in the model. Item bias represents an item's overall rating tendency.

- verbose=False: Suppresses progress messages during training.
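
The roles of `method="als"` and `lambda_` can be made concrete with the textbook form of the problem. This is a sketch of the generic regularized objective, not a statement of cmfrec's exact internals: constant factors, centering, and weighting may differ in the library. Here $\Omega$ is the set of observed ratings, $\Omega_u$ the items rated by user $u$, $a_u$ and $b_i$ the user and item factor vectors, $\lambda$ the regularization strength, and $\mathrm{Id}_k$ the $k \times k$ identity matrix.

$$\min_{A,B} \sum_{(u,i) \in \Omega} \left( r_{ui} - a_u^{\top} b_i \right)^{2} + \lambda \left( \sum_{u} \lVert a_u \rVert^{2} + \sum_{i} \lVert b_i \rVert^{2} \right)$$

With the item factors held fixed, each user vector has a closed-form least-squares solution:

$$a_u = \left( \sum_{i \in \Omega_u} b_i b_i^{\top} + \lambda\, \mathrm{Id}_k \right)^{-1} \sum_{i \in \Omega_u} r_{ui}\, b_i$$

ALS alternates between this step and the symmetric one for the item vectors, which is why it needs no learning rate.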

In [None]:
# model.fit(rm_raw)

model.fit(df)

model.A_.shape, model.B_.shape

((668, 2), (10325, 2))

CMF() trains much faster than the from-scratch matrix factorization, which runs its update loops in pure Python, whereas cmfrec's solver is implemented in optimized compiled code.

In [None]:
model.A_

array([[ 0.6181083 , -1.296816  ],
       [ 1.0773844 ,  0.44746563],
       [ 0.7495232 , -0.2906971 ],
       ...,
       [ 0.72334623, -1.0197821 ],
       [ 0.64136606, -0.22247148],
       [-0.2533895 , -1.4885318 ]], dtype=float32)

In [None]:
model.B_

array([[ 0.75743455, -0.06085442],
       [-0.23201275,  1.1658956 ],
       [ 0.6025151 , -0.24058555],
       ...,
       [ 0.10675118,  0.6455457 ],
       [-0.05072206, -0.30672663],
       [ 0.001769  ,  0.01069749]], dtype=float32)
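
Given the shapes above, rows of `model.A_` hold user factors and rows of `model.B_` hold item factors. The arithmetic that turns such factors into predictions and a top-N list can be sketched with small stand-in arrays. Everything here is hypothetical: the numbers, the `already_rated` mask, and the `global_mean` offset, which assumes the model centers ratings and adds the mean back at prediction time. The row order in the real matrices follows the library's internal indexing, so raw IDs would need to be mapped first; the library's own prediction utilities handle that.

```python
import numpy as np

# Stand-in factor matrices laid out like model.A_ and model.B_
A = np.array([[0.6, -1.3],
              [1.1,  0.4]])     # 2 users x 2 factors
B = np.array([[0.8, -0.1],
              [-0.2, 1.2],
              [0.6, -0.2]])     # 3 items x 2 factors
global_mean = 3.5               # hypothetical offset added back after centering

scores = A @ B.T + global_mean  # predicted rating for every (user, item) pair

user = 0
already_rated = np.array([True, False, False])  # hypothetical: user 0 rated item 0
candidates = np.where(already_rated, -np.inf, scores[user])
top_n = np.argsort(candidates)[::-1][:2]        # best unseen items first

print(scores.shape)
print(top_n)
```

Masking already-rated items before ranking keeps the recommendations limited to movies the user has not seen.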