# Matrix Factorization

Matrix factorization is a popular technique in recommendation systems, especially for collaborative filtering.

- Matrix factorization assumes that any user-song interaction can be explained by latent factors. 
- For instance, in the context of songs, these factors might represent genres, moods, or other abstract features, but they aren't explicitly labeled.
- Classical factorization methods such as SVD need a fully observed (dense) user-song matrix. Real rating matrices are usually very sparse, so in practice the factors are fitted on the observed entries only, using techniques like alternating least squares (ALS) or stochastic gradient descent (SGD), instead of treating missing values as zeros.
- Over time, as more user interactions are gathered, the model may need to be re-trained or updated to reflect new patterns in the data.
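
To make the point about sparse data concrete, here is a minimal sketch of fitting latent factors with SGD on observed ratings only. The function name `factorize_sgd`, the hyperparameter values, and the toy triples are all assumptions for illustration, not part of the notebook's pipeline.

```python
import numpy as np

def factorize_sgd(ratings, n_users, n_items, k=2, lr=0.01, reg=0.05, epochs=500, seed=0):
    """Fit user factors P and item factors Q on observed (user, item, rating) triples."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))  # user-factor matrix
    Q = rng.normal(scale=0.1, size=(n_items, k))  # item-factor matrix
    for _ in range(epochs):
        # Visit the observed ratings in a fresh random order each epoch.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - P[u] @ Q[i]          # error on this observed rating only
            p_u = P[u].copy()              # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Hypothetical toy data: 3 users, 3 songs, 6 of the 9 possible ratings observed.
observed = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
P, Q = factorize_sgd(observed, n_users=3, n_items=3)

# Training error on the observed entries; missing entries never enter the loss.
train_rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in observed]))
print(f"train RMSE: {train_rmse:.3f}")
```

The `reg` term is an L2 penalty that shrinks the factors; unobserved user-song pairs are simply skipped rather than filled with a placeholder value.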

**Steps**

#### 1. **Prepare the Data**:
Start with the user-song matrix where each row represents a user, each column represents a song, and the values represent the user's rating of that song (or interaction strength).
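
As a small illustration of this step, the sketch below pivots a long table of ratings into a user-song matrix with pandas. The column names follow the ones used in the notebook cell further down; the toy rows themselves are made up for the example.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, song) interaction.
toy = pd.DataFrame({
    "User_Name": ["Alice", "Alice", "Bob", "Bob", "Carol"],
    "Song": ["Song1", "Song2", "Song1", "Song3", "Song2"],
    "Star_Rating": [5, 3, 4, 2, 1],
})

# Rows are users, columns are songs; unrated pairs stay NaN instead of becoming 0.
user_song = toy.pivot_table(index="User_Name", columns="Song", values="Star_Rating")

# Fraction of user-song pairs with no rating.
sparsity = user_song.isna().to_numpy().mean()
print(user_song)
print(f"sparsity: {sparsity:.2f}")
```

Leaving missing entries as `NaN` keeps "not rated" distinct from "rated 0", which matters once the matrix is fed to a model.
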
#### 2. **Choose the Number of Factors (k)**:
Decide on the number of latent factors `k`. These represent abstract features that can explain the patterns in the ratings. A higher `k` will capture more subtle structures but may also overfit. The optimal value of `k` is often determined through cross-validation.
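
One way to compare candidate values of `k` is sketched below. It is a simplification: a single hold-out split instead of full cross-validation, synthetic data, and a truncated SVD on a mean-imputed matrix rather than the Surprise model used later. All names and numbers are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "ratings" with an underlying rank of 3 plus a little noise (hypothetical data).
true_R = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 30)) + 0.1 * rng.normal(size=(40, 30))

# Hold out roughly 20% of the entries for validation.
held_out = rng.random(true_R.shape) < 0.2

# Hide the validation entries by replacing them with the mean of the visible ones.
train_R = np.where(held_out, true_R[~held_out].mean(), true_R)

# Decompose once, then truncate to each candidate k.
U, s, Vt = np.linalg.svd(train_R, full_matrices=False)

def validation_rmse(k):
    """RMSE on the held-out entries for a rank-k reconstruction."""
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return float(np.sqrt(np.mean((approx[held_out] - true_R[held_out]) ** 2)))

scores = {k: validation_rmse(k) for k in (1, 2, 3, 5, 10, 20)}
best_k = min(scores, key=scores.get)
print(scores)
print("best k:", best_k)
```

The same loop structure applies with any factorization model: fit for each `k`, score on data the model did not see, and keep the `k` with the lowest validation error.
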
#### 3. **Factorize the Matrix**:
Use a matrix factorization technique like Singular Value Decomposition (SVD) to decompose the user-song matrix `R` into three matrices, `U`, `Σ`, and `V^T`, keeping only the `k` largest singular values (a truncated SVD).

- `U` (User matrix): Represents the relationship between users and the latent factors.
- `Σ` (Diagonal matrix): Represents the strength of each latent factor.
- `V^T` (Song matrix): Represents the relationship between songs and the latent factors.

Mathematically, $$ R \approx U \Sigma V^T $$
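
The decomposition can be sketched with NumPy on a small, fully observed matrix. The values in `R` and the choice `k = 2` are made up for illustration.

```python
import numpy as np

# Hypothetical dense user-song matrix (4 users x 4 songs).
R = np.array([
    [5, 4, 1, 1],
    [4, 5, 1, 2],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

# Full SVD: singular values in s are sorted from largest to smallest.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep the k strongest latent factors.
k = 2
U_k, S_k, Vt_k = U[:, :k], np.diag(s[:k]), Vt[:k, :]

# Rank-k approximation of R.
R_hat = U_k @ S_k @ Vt_k
print(np.round(R_hat, 2))
```

The Frobenius norm of `R - R_hat` equals the square root of the sum of the squared discarded singular values, so small trailing values in `s` mean little is lost by truncating.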

#### 4. **Generate Predicted Ratings**:
Using the factorized matrices, we can predict a user's rating for a song as follows:
$$ \hat{r}_{ui} = (\text{row of } U \text{ for user } u) \, \Sigma \, (\text{column of } V^T \text{ for song } i) $$
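
Written out per factor, and assuming $\sigma_f$ denotes the $f$-th diagonal entry of $\Sigma$ and $k$ factors are kept, the prediction for user $u$ and song $i$ is a weighted sum over the latent factors:

$$ \hat{r}_{ui} = \sum_{f=1}^{k} U_{uf} \, \sigma_f \, (V^T)_{fi} $$

For example, with $k = 2$, a user row $(0.6, -0.2)$, singular values $(10, 4)$, and a song column $(0.5, 0.5)$ (all values hypothetical), the prediction is $0.6 \cdot 10 \cdot 0.5 + (-0.2) \cdot 4 \cdot 0.5 = 3.0 - 0.4 = 2.6$.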

#### 5. **Recommend Songs**:
To recommend songs to a specific user:

- Compute the predicted ratings for all songs for that user using the factorized matrices.
- Sort the songs based on the predicted ratings.
- Recommend the top-N songs that the user hasn't interacted with (or rated) yet.
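
The three steps above can be sketched as follows. The helper name `recommend_top_n`, the predicted scores, and the mask of already-rated songs are hypothetical.

```python
import numpy as np

def recommend_top_n(predicted, rated_mask, user_idx, n=3):
    """Return indices of the n highest-scoring songs the user has not rated yet."""
    scores = predicted[user_idx].astype(float).copy()
    scores[rated_mask[user_idx]] = -np.inf           # exclude songs already rated
    n_unrated = int((~rated_mask[user_idx]).sum())   # never return excluded songs
    ranked = np.argsort(-scores)                     # highest predicted rating first
    return ranked[:min(n, n_unrated)].tolist()

# Hypothetical predicted ratings (2 users x 4 songs) and which pairs are already rated.
predicted = np.array([[4.5, 3.0, 2.0, 4.8],
                      [1.0, 4.9, 3.5, 2.2]])
rated = np.array([[True, False, False, False],
                  [False, True, False, False]])

top = recommend_top_n(predicted, rated, user_idx=0, n=2)
print(top)  # [3, 1]
```

Masking rated songs before sorting keeps the function from recommending something the user has already heard, even when it has the highest predicted score.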

#### 6. **Evaluate and Tune**:
- To check that the recommendations are good, split your data into training and testing sets. Train the matrix factorization model on the training set and evaluate its performance on the test set. Common metrics include RMSE (Root Mean Square Error) and MAE (Mean Absolute Error).
- Based on the performance on the test set, you might need to:
    - Adjust the number of latent factors `k`.
    - Use regularization to prevent overfitting if your method supports it (e.g., SVD++ or the `SVD` algorithm in the `surprise` library, which exposes regularization hyperparameters such as `reg_all`).
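
For reference, both metrics can be computed in a few lines. The true and predicted ratings below are made-up values for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more heavily."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: the average size of the error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Hypothetical held-out ratings and model predictions.
y_true = np.array([5.0, 3.0, 4.0, 1.0])
y_pred = np.array([4.0, 3.0, 2.0, 1.0])

print(f"RMSE: {rmse(y_true, y_pred):.4f}")  # RMSE: 1.1180
print(f"MAE:  {mae(y_true, y_pred):.4f}")   # MAE:  0.7500
```

RMSE is never smaller than MAE on the same predictions; a large gap between the two suggests that a few predictions are badly off.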


In [20]:
import numpy as np
import pandas as pd
from surprise import Dataset, Reader, SVD
from surprise.model_selection import train_test_split
from surprise import accuracy
from src.utils import load_dataset

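# Create user-song matrix
dataframe = load_dataset()
# Caution: fill_value=0 turns every unrated user-song pair into an explicit rating of 0,
# so after stack() the model is trained and evaluated on those zeros as if they were real ratings.
user_song_matrix = dataframe.pivot_table(index='User_Name', columns='Song', values='Star_Rating', fill_value=0)
# Convert the wide matrix back to long (User, Song, Rating) rows, the format Surprise expects
df = user_song_matrix.stack().reset_index()
df.columns = ['User', 'Song', 'Rating']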

reader = Reader(rating_scale=(df.Rating.min(), df.Rating.max()))  # Define the rating scale based on your dataset
data = Dataset.load_from_df(df[['User', 'Song', 'Rating']], reader)

# Split data into train and test set
trainset, testset = train_test_split(data, test_size=0.25)

# Fit Surprise's SVD algorithm with default hyperparameters
# (a regularized latent-factor model trained with SGD, not a classical SVD decomposition)
model = SVD()
model.fit(trainset)
predictions = model.test(testset)

# Evaluate the model
rmse = accuracy.rmse(predictions)
print(f"RMSE: {rmse}")

def get_top_n_recommendations(predictions, n=10):
    """Return the top-N recommendation for each user from a set of predictions."""
    
    top_n = {}
    for uid, iid, true_r, est, _ in predictions:
        top_n.setdefault(uid, []).append((iid, est))

    # Then sort the predictions for each user and retrieve the N highest ones.
    for uid, user_ratings in top_n.items():
        user_ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = user_ratings[:n]

    return top_n

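# Note: `predictions` only covers the held-out test pairs, so each list ranks the
# user's test-set songs by estimated rating rather than all songs the user has not rated.
top_n_recommendations = get_top_n_recommendations(predictions, n=10)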


RMSE: 1.4163
RMSE: 1.4163442377854456


In [22]:
top_n_recommendations['Alice']

[('Song292', 2.439305897868699),
 ('Song253', 1.8549174882774002),
 ('Song87', 1.7108502215512469),
 ('Song7', 1.6820684334196407),
 ('Song67', 1.5834276396147273),
 ('Song272', 1.5168783496560199),
 ('Song82', 1.433099071529223),
 ('Song223', 1.3914276189153707),
 ('Song146', 1.3502125351920484),
 ('Song211', 1.320084119557865)]
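
The list above is drawn from test-set predictions. To recommend songs a user has not rated at all, one common pattern with Surprise is to predict over the "anti-testset" (every user-song pair absent from the training set). The sketch below assumes `algo` is a fitted Surprise algorithm and `trainset` a Surprise `Trainset`; the helper name `top_n_unseen` is hypothetical. It also assumes the data holds only real ratings: if unrated pairs were filled with 0 beforehand, no pair counts as unseen.

```python
from collections import defaultdict

def top_n_unseen(algo, trainset, n=10):
    """Top-n songs per user among user-song pairs that are not in the training set."""
    # All (user, song, fill_value) triples with no rating in the training set.
    anti_testset = trainset.build_anti_testset()
    predictions = algo.test(anti_testset)

    top_n = defaultdict(list)
    for uid, iid, _true_r, est, _details in predictions:
        top_n[uid].append((iid, est))

    # Sort each user's candidates by estimated rating and keep the n best.
    for uid, scored in top_n.items():
        scored.sort(key=lambda pair: pair[1], reverse=True)
        top_n[uid] = scored[:n]
    return dict(top_n)

# Usage with the objects from the cell above (not executed here):
# unseen_recs = top_n_unseen(model, trainset, n=10)
# unseen_recs['Alice']
```

The anti-testset grows with the number of users times the number of songs, so for large catalogs it may be preferable to score one user at a time.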

# Main Problems with Matrix Factorization:

1. **Cold Start Problem**: MF struggles with new users or items (songs, in your case) that have no or very few ratings. Because MF is purely collaborative, it can't recommend items to new users or suggest new items to any users until enough data about these new entities accumulates.

2. **Sparsity**: User-item matrices are typically very sparse, meaning most users have not rated most items. Even though MF deals relatively well with sparsity compared to other methods, extreme sparsity can still be a challenge.

3. **Scalability**: Matrix factorization methods can be computationally expensive, particularly as the size of the user-item matrix grows. This can be mitigated with stochastic gradient descent or other optimization approaches, but it's still a concern for very large datasets.

4. **Overfitting**: The model can overfit to the training data, especially when the number of latent factors is high or when using neural (deep) matrix factorization. Regularization, such as an L2 penalty on the factors, helps mitigate this, and dropout can be used in the neural variants.

5. **Static Model**: The classic MF does not inherently consider temporal dynamics. User preferences can change over time, and newer models like time-aware matrix factorization attempt to address this, but the basic MF does not.

6. **Lack of Interpretability**: Unlike content-based methods, MF lacks interpretability. It's hard to explain why a particular recommendation was made based solely on latent factors.


# RMSE (Root Mean Square Error):

RMSE is expressed in the same units as the ratings, so its value depends on your rating scale; the values presented above are not normalized. If your song ratings are on a scale from 1 to 5, then an RMSE of 0.5 means that the model's predictions are typically off by about half a rating point (RMSE weights large errors more heavily than a plain average of absolute errors does). In the context of a 1-5 rating scale:

- An **RMSE of 0.5** or lower could be considered good.
- An **RMSE of 1.0** would mean that you're off by a full rating point on average, which might be acceptable but not ideal.
- An **RMSE higher than 1.0** could be problematic.

However, it's crucial to compare RMSE to some baseline (like a naive recommender) and possibly other metrics, depending on your application. Also, low RMSE doesn't always mean that users will be satisfied with the recommendations. For instance, the model might provide accurate but obvious and uninteresting recommendations. That's why it's often helpful to combine RMSE with other evaluation metrics and real-world user testing.
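
A baseline comparison might look like the sketch below, where the naive recommender predicts the global mean training rating for every pair. All ratings and the `model_preds` values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted ratings."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# Hypothetical ratings on a 1-5 scale.
train_ratings = np.array([5, 4, 4, 1, 2, 5, 3], dtype=float)
test_ratings = np.array([4, 2, 5, 3], dtype=float)
model_preds = np.array([3.8, 2.5, 4.4, 3.1])  # stand-in for a model's output

# Naive baseline: always predict the mean rating seen during training.
global_mean = train_ratings.mean()
baseline_preds = np.full_like(test_ratings, global_mean)

baseline_rmse = rmse(test_ratings, baseline_preds)
model_rmse = rmse(test_ratings, model_preds)
improvement = 1 - model_rmse / baseline_rmse  # relative reduction in RMSE

print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"model RMSE:    {model_rmse:.3f}")
print(f"improvement:   {improvement:.1%}")
```

A model whose RMSE is close to the baseline's has learned little beyond the average rating, whatever the absolute number looks like.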