----

# **Dimensionality Reduction And Matrix Factorization**

## **Author**   :  **Muhammad Adil Naeem**

## **Contact**   :   **madilnaeem0@gmail.com**
<br>

----



### **Import Libraries**

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statistics
from math import sqrt
import random
import scipy.sparse as sp
from scipy.sparse.linalg import svds

from sklearn.model_selection import  train_test_split
from sklearn.metrics.pairwise import cosine_similarity, pairwise_distances
from sklearn.metrics import mean_squared_error

import warnings
warnings.filterwarnings('ignore')

### **Matrix Factorization for Collaborative Filtering**

The `matrix_factorization` function approximates a user-item rating matrix as the product of two low-rank matrices, one holding user features and one holding item features. Both matrices are learned with stochastic gradient descent (SGD) so that their product reproduces the observed ratings.

### Parameters:
- **R**: The user-item rating matrix of shape `N x M`. A value of `0` is treated as a missing rating.
- **P**: The user feature matrix of shape `N x K` (initially random).
- **Q**: The item feature matrix of shape `M x K` (initially random).
- **K**: Number of latent features.
- **steps**: Maximum number of passes over the observed ratings (default: 5000).
- **alpha**: Learning rate (default: 0.0002).
- **beta**: Regularization parameter (default: 0.2).

### Process:
1. The function transposes `Q` to shape `K x M`, so that `np.dot(P, Q)` has the same shape as `R`.
2. It repeats up to `steps` passes over the rating matrix.
3. In each pass, for every entry of `R` greater than zero, it computes the error between the actual and the predicted rating, then adjusts the matching row of `P` and column of `Q` using the learning rate `alpha` and the regularization parameter `beta`.
4. After each pass, it computes the regularized squared error over the observed ratings and stops early if that error falls below `0.001`.
5. Finally, it returns the updated user and item feature matrices, with `Q` transposed back to shape `M x K`.

This method is commonly used in recommendation systems to learn latent factors that explain observed ratings.
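
The update in step 3 can be written out explicitly. The notation below is introduced here for illustration and is not part of the original notebook: $r_{ij}$ is an observed rating, $p_{ik}$ an entry of `P`, $q_{kj}$ an entry of the transposed `Q`, $\alpha$ the learning rate, and $\beta$ the regularization parameter.

$$
e_{ij} = r_{ij} - \sum_{k=1}^{K} p_{ik}\, q_{kj}
$$

$$
p_{ik} \leftarrow p_{ik} + \alpha \left( 2\, e_{ij}\, q_{kj} - \beta\, p_{ik} \right),
\qquad
q_{kj} \leftarrow q_{kj} + \alpha \left( 2\, e_{ij}\, p_{ik} - \beta\, q_{kj} \right)
$$

These steps follow the negative gradient of the quantity the function monitors in step 4, summed over the observed ratings:

$$
E = \sum_{(i,j)\,:\, r_{ij} > 0} \left[ e_{ij}^{2} + \frac{\beta}{2} \sum_{k=1}^{K} \left( p_{ik}^{2} + q_{kj}^{2} \right) \right]
$$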

In [4]:
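def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.2):
    # Work with Q as a K x M matrix so that np.dot(P, Q) has the shape of R.
    # Q.T is a view of a NumPy array, so P and Q are updated in place.
    Q = Q.T
    for step in range(steps):
        # One SGD pass over every observed (greater than zero) rating.
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    # Prediction error for user i and item j.
                    eij = R[i][j] - np.dot(P[i, :], Q[:, j])
                    for k in range(K):
                        # Read both old values first so that the Q update
                        # does not use the already-updated P[i][k].
                        p_ik, q_kj = P[i][k], Q[k][j]
                        P[i][k] = p_ik + alpha * (2 * eij * q_kj - beta * p_ik)
                        Q[k][j] = q_kj + alpha * (2 * eij * p_ik - beta * q_kj)

        # Regularized squared error over the observed ratings.
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    e = e + pow(R[i][j] - np.dot(P[i, :], Q[:, j]), 2)
                    for k in range(K):
                        e = e + (beta / 2) * (pow(P[i][k], 2) + pow(Q[k][j], 2))
        # Stop early once the error is small enough.
        if e < 0.001:
            break
    return P, Q.T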
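
The innermost loop over `k` can also be expressed with NumPy row operations. The following is only a sketch of that idea, not the notebook's implementation: the name `matrix_factorization_vec` and the `tol` parameter are hypothetical, `Q` is kept in its `M x K` layout, and the inputs are copied rather than modified in place.

```python
import numpy as np

def matrix_factorization_vec(R, P, Q, steps=5000, alpha=0.0002, beta=0.2, tol=0.001):
    """Hypothetical variant: per-rating SGD with the loop over k vectorized."""
    R = np.asarray(R, dtype=float)
    P = np.array(P, dtype=float)  # copies, so the caller's arrays stay unchanged
    Q = np.array(Q, dtype=float)  # shape (M, K), not transposed
    rows, cols = np.nonzero(R > 0)  # indices of the observed ratings
    for _ in range(steps):
        for i, j in zip(rows, cols):
            eij = R[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()  # keep the old user row for the item update
            P[i] += alpha * (2 * eij * Q[j] - beta * P[i])
            Q[j] += alpha * (2 * eij * p_old - beta * Q[j])
        # Regularized squared error over the observed ratings.
        residual = R[rows, cols] - np.sum(P[rows] * Q[cols], axis=1)
        penalty = (beta / 2) * np.sum(P[rows] ** 2 + Q[cols] ** 2)
        if np.sum(residual ** 2) + penalty < tol:
            break
    return P, Q
```

Because it returns `Q` in `M x K` layout, predictions are obtained the same way as before, with `np.dot(P, Q.T)`.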


### **Applying Matrix Factorization**

The code below applies the `matrix_factorization` function to a small sample user-item rating matrix `R`.

1. **Rating Matrix**: `R` is a NumPy array in which each row is a user, each column is an item, and `0` marks a missing rating:
   ```
   [
     [5, 3, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [1, 0, 0, 4],
   ]
   ```

2. **Initialization**:
   - `N` is the number of users (rows in `R`).
   - `M` is the number of items (columns in `R`).
   - `K` is the number of latent features (set to 2).
   - `P` and `Q` are initialized with random values to represent user and item features, respectively.

3. **Matrix Factorization**:
   - The `matrix_factorization` function is called with the rating matrix `R`, and the feature matrices `P` and `Q`.
   - The resulting matrices `nP` and `nQ` represent the learned user and item features.

4. **Predicted Ratings**:
   - The predicted ratings matrix `nR` is calculated by multiplying `nP` and the transpose of `nQ`.

Because `nR` is dense, it also contains predicted ratings for the entries that are `0` (unrated) in `R`. Those predictions are the basis for recommendations.
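
The initialization step can be sketched with a seeded random generator so that repeated runs start from the same `P` and `Q`. This is an illustrative assumption: the seed value `42` and the name `rng` are not part of the original notebook, which initializes without a seed.

```python
import numpy as np

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
])

N, M = R.shape  # number of users, number of items
K = 2           # number of latent features

rng = np.random.default_rng(42)  # assumed seed, for reproducibility only
P = rng.random((N, K))           # user features, values in [0, 1)
Q = rng.random((M, K))           # item features, values in [0, 1)

# The product of P and the transpose of Q has the same shape as R.
print(P.shape, Q.shape, (P @ Q.T).shape)  # (4, 2) (4, 2) (4, 4)
```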

In [5]:
R = [
     [5, 3, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [1, 0, 0, 4],
]

R = np.array(R)

N = len(R)
M = len(R[0])
K = 2

P = np.random.rand(N,K)
Q = np.random.rand(M,K)


nP, nQ = matrix_factorization(R, P, Q, K)
nR = np.dot(nP, nQ.T)

In [6]:
nR

array([[4.75324889, 2.88722617, 1.46054792, 1.01257229],
       [3.82602389, 2.33906783, 1.21670348, 0.99473299],
       [1.00485565, 0.9873359 , 1.33684069, 4.71227721],
       [0.99469024, 0.90806417, 1.13436368, 3.83786531]])
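
One way to judge this result is to compare `nR` with `R` on the observed entries only. The sketch below copies the predicted values from the output above; because `P` and `Q` start from random values, a fresh run will produce somewhat different numbers. The variable names `observed` and `rmse` are illustrative.

```python
import numpy as np

R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
])

# Predicted ratings copied from the output shown above.
nR = np.array([
    [4.75324889, 2.88722617, 1.46054792, 1.01257229],
    [3.82602389, 2.33906783, 1.21670348, 0.99473299],
    [1.00485565, 0.9873359,  1.33684069, 4.71227721],
    [0.99469024, 0.90806417, 1.13436368, 3.83786531],
])

observed = R > 0  # mask of ratings that actually exist

# Root mean squared error on the observed ratings only.
rmse = np.sqrt(np.mean((R[observed] - nR[observed]) ** 2))
print(f"RMSE on observed ratings: {rmse:.3f}")

# Predictions for the (user, item) pairs that have no rating in R.
for i, j in zip(*np.nonzero(~observed)):
    print(f"user {i}, item {j}: predicted {nR[i, j]:.2f}")
```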