# Recommender Systems

## Approaches to collaborative filtering

**User – user approaches**: find the users most similar to me (based only on the 
items that we have both rated), and predict my scores for other items from the 
average of those users' ratings

**Item – item approaches**: find the items most similar to a given item (based on 
all users who rated both items), and predict a user's score for that item from the 
average of that user's ratings on the similar items

**Matrix factorization approaches**: find a low-rank decomposition of the ratings 
matrix 𝑋 that agrees with 𝑋 at the observed entries
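
As a minimal sketch of the item-item idea, the snippet below computes Pearson similarity between item columns over the users who rated both items, then predicts a user's score as a similarity-weighted average of that user's own ratings. The toy matrix, the convention that 0 means "not rated", the choice to keep only positively correlated items, and the name `predict_item_item` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy ratings: rows = users, columns = items, 0 = not rated (assumed convention)
ratings = pd.DataFrame(
    [[5, 3, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [1, 0, 0, 4],
     [0, 1, 5, 4]],
    columns=["Item1", "Item2", "Item3", "Item4"],
).replace(0, np.nan)

# corr() works column-wise and skips NaN pair by pair,
# so each item pair is compared only on users who rated both
item_sim = ratings.corr(method="pearson")

def predict_item_item(user, item):
    """Hypothetical helper: similarity-weighted average of the user's own ratings."""
    rated = ratings.loc[user].dropna()        # items this user has rated
    sims = item_sim.loc[item, rated.index]    # similarity of those items to the target
    sims = sims[sims > 0]                     # drops NaN and non-positive similarities
    if sims.empty:
        return np.nan                         # no usable neighbour items
    return float((sims * rated[sims.index]).sum() / sims.sum())

print(predict_item_item(4, "Item1"))
```

With so few ratings, many item pairs share at most one user and their similarity is undefined (`NaN`), which is why the helper falls back to `NaN` when no neighbour item is usable.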

## User-User Pearson correlation

In [1]:
import numpy as np
import pandas as pd

# Example user-item matrix with ratings (rows = users, columns = items, 0 = not rated)
ratings_matrix = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])

# Convert the matrix to a DataFrame for easier handling
ratings_df = pd.DataFrame(ratings_matrix, columns=["Item1", "Item2", "Item3", "Item4"])

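# corr() is column-wise, so transpose to correlate users (rows) with each other.
# Note: zeros (unrated items) are treated as real ratings in this computation.
user_similarity = ratings_df.T.corr(method='pearson')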

print("User-User Pearson Correlation Matrix:\n", user_similarity)

User-User Pearson Correlation Matrix:
           0         1         2         3         4
0  1.000000  0.774291 -0.186441 -0.178683 -0.978839
1  0.774291  1.000000  0.019854  0.162791 -0.628768
2 -0.186441  0.019854  1.000000  0.972828  0.221028
3 -0.178683  0.162791  0.972828  1.000000  0.258904
4 -0.978839 -0.628768  0.221028  0.258904  1.000000
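
The correlation above is computed on the raw matrix, so the zeros count as actual ratings. A sketch of the co-rated variant, assuming 0 marks a missing rating: replacing zeros with `NaN` makes pandas drop, pair by pair, any item that was not rated by both users. The `min_periods=2` argument and the name `user_similarity_corated` are illustrative choices, not part of the original notebook.

```python
import numpy as np
import pandas as pd

ratings_df = pd.DataFrame(
    [[5, 3, 0, 1],
     [4, 0, 0, 1],
     [1, 1, 0, 5],
     [1, 0, 0, 4],
     [0, 1, 5, 4]],
    columns=["Item1", "Item2", "Item3", "Item4"],
)

# Treat 0 as "not rated" so it is excluded from each pairwise correlation
masked = ratings_df.replace(0, np.nan)

# Require at least two co-rated items; pairs with fewer become NaN
user_similarity_corated = masked.T.corr(method="pearson", min_periods=2)

print(user_similarity_corated.round(3))
```

On a matrix this small, many user pairs share only two rated items, and a Pearson correlation over two points is always +1, -1, or undefined, so these similarities are far less stable than they would be on realistic data.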


In [2]:
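# Predict a user's rating for an item from similarity-weighted, mean-centered neighbour ratings
def predict_rating(user_index, item_index, ratings_df, similarity_matrix):
    # Per-user mean over rated items only (0 means "not rated")
    user_means = ratings_df.replace(0, np.nan).mean(axis=1).values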

    # Get the ratings for the item from all other users
    item_ratings = ratings_df.iloc[:, item_index]
    
    # Get the similarity of the target user with all other users
    user_similarities = similarity_matrix[user_index]
    
    # Keep only users who rated the item. The target user is not excluded
    # explicitly; they drop out only when they have not rated the item.
    non_zero_indices = item_ratings[item_ratings > 0].index
    similarities = user_similarities[non_zero_indices]
    ratings = item_ratings[non_zero_indices]
    means = user_means[non_zero_indices]
    
    # If no other user has rated this item, return 0 as prediction
    if len(similarities) == 0:
        return 0
    
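    # Mean-center each neighbour's rating, weight it by similarity,
    # and normalize by the sum of absolute similarities
    ratings_centered = ratings - means
    weighted_sum = np.dot(similarities, ratings_centered)

    sum_of_similarities = np.sum(np.abs(similarities))

    predicted = weighted_sum / sum_of_similarities if sum_of_similarities != 0 else 0

    # Baseline: the item's mean rating among the users who rated it.
    # (The usual formulation adds the target user's mean, user_means[user_index].)
    return ratings.mean() + predicted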

print(predict_rating(4, 0, ratings_df, user_similarity))

1.0331985858244794
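
For reference, the mean-centered user-user prediction is commonly written as below. The symbols are introduced here for illustration: $s(u,v)$ is the similarity between users $u$ and $v$, $\bar r_v$ is user $v$'s mean over the items they rated, and $N_i$ is the set of users who rated item $i$. The function above computes the same fraction but adds the mean rating of item $i$ among its raters in place of $\bar r_u$.

```latex
\hat r_{u,i} \;=\; \bar r_u \;+\;
  \frac{\sum_{v \in N_i} s(u,v)\,\bigl(r_{v,i} - \bar r_v\bigr)}
       {\sum_{v \in N_i} \lvert s(u,v) \rvert}
```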


## Matrix factorization

In [3]:
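import numpy as np
from sklearn.decomposition import NMF

# Small non-negative example matrix (6 rows, 2 columns)
X = np.array([[1, 1], [2, 1], [3, 1.2], [4, 1], [5, 0.8], [6, 1]])

# Factorize X ≈ W @ H with non-negative factors
model = NMF(n_components=2, init='random', random_state=0)
W = model.fit_transform(X)   # shape (6, 2)
H = model.components_        # shape (2, 2)

# Reconstruct X from the two factors
nR = np.dot(W, H)
print(nR)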

[[1.00063558 0.99936347]
 [1.99965977 1.00034074]
 [2.99965485 1.20034566]
 [3.9998681  1.0001321 ]
 [5.00009002 0.79990984]
 [6.00008587 0.999914  ]]
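
scikit-learn's `NMF` fits every entry of `X`, so it has no notion of unobserved ratings. The sketch below instead fits a factorization only to the observed entries, using plain gradient descent on a masked, lightly regularized squared error. The function name `factorize_observed`, the hyperparameters, and the convention that 0 means "unobserved" are assumptions for illustration; this is not a tuned or definitive implementation.

```python
import numpy as np

def factorize_observed(R, rank=2, steps=5000, lr=0.01, reg=0.02, seed=0):
    """Hypothetical helper: fit R ≈ U @ V.T using only the entries where R > 0."""
    rng = np.random.default_rng(seed)
    mask = R > 0                                  # observed entries
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    for _ in range(steps):
        err = mask * (R - U @ V.T)                # error on observed entries only
        U_step = err @ V - reg * U                # negative gradient w.r.t. U
        V_step = err.T @ U - reg * V              # negative gradient w.r.t. V
        U += lr * U_step
        V += lr * V_step
    return U, V

# Toy ratings matrix: rows = users, columns = items, 0 = unobserved
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

U, V = factorize_observed(R)
print(np.round(U @ V.T, 2))   # the cells that were 0 in R now hold predictions
```

The mask is the key design choice: unobserved cells contribute nothing to the loss, so the low-rank structure learned from the observed ratings is what fills them in.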
