### 📊 Create a Recommendation System based on Item Factors and User Factors

In this section, we’ll build a simple recommendation system using **Linear Regression**. We'll assume we already have:

- **User Factors**: A matrix representing user preferences (e.g., taste profiles).
- **Item Factors**: A matrix representing item characteristics (e.g., genres, style).


#### 🔧 Steps:
1. Load the precomputed user and item factor matrices.
2. Fit a linear regression model for each user, using the item factors as features.
3. Predict the user's ratings for the items they have not rated.
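
The idea behind these steps can be sketched as follows: with an item's factors as features, a user's predicted rating is an intercept plus the dot product of that user's weight vector with the item factors. All numbers below are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

# Hypothetical item factors: one row per item, columns are (slick, lofi).
item_factors = np.array([
    [8.0, 2.0],
    [2.0, 9.0],
])

# Hypothetical user vector (one weight per factor) and intercept.
user_vector = np.array([0.5, 0.25])
intercept = 1.0

# Predicted rating for each item: intercept + item_factors . user_vector
predicted = intercept + item_factors @ user_vector
print(predicted)
```

With these made-up values the two items come out at 5.5 and 4.25. A linear regression estimates `user_vector` and `intercept` from the ratings a user has already given.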


In [76]:
import os
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

from sklearn.linear_model import LinearRegression

#### Data

This example uses a synthetic dataset of reviews from five individuals and five albums. The dataset is loaded and displayed below; a missing value means the person has not rated that album. Two additional columns, `slick` and `lofi`, rate the nature of the music and serve as the item factors.


In [85]:
reviews = pd.read_csv('data/sample_reviews.csv', index_col=0)
reviews.head()

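| | Alfred | Mandy | Lenny | Joan | Tino | slick | lofi |
|---|---|---|---|---|---|---|---|
| Michael Jackson | 3.0 | NaN | 2.0 | 3.0 | 1.0 | 8 | 2 |
| Clint Black | 4.0 | 9.0 | 5.0 | NaN | 1.0 | 8 | 2 |
| Dropdead | NaN | NaN | 8.0 | 9.0 | NaN | 2 | 9 |
| Anti-Cimex | 4.0 | 3.0 | 9.0 | 4.0 | 9.0 | 2 | 10 |
| Cardi B | 4.0 | 8.0 | NaN | 9.0 | 5.0 | 9 | 3 |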
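
If `data/sample_reviews.csv` is not at hand, the same table can be rebuilt inline. This is a minimal sketch that copies the values from the table above; only the construction is new, the numbers are the ones displayed.

```python
import numpy as np
import pandas as pd

# Ratings copied from the table above; np.nan marks an album the user has not rated.
reviews = pd.DataFrame(
    {
        "Alfred": [3.0, 4.0, np.nan, 4.0, 4.0],
        "Mandy": [np.nan, 9.0, np.nan, 3.0, 8.0],
        "Lenny": [2.0, 5.0, 8.0, 9.0, np.nan],
        "Joan": [3.0, np.nan, 9.0, 4.0, 9.0],
        "Tino": [1.0, 1.0, np.nan, 9.0, 5.0],
        "slick": [8, 8, 2, 2, 9],
        "lofi": [2, 2, 9, 10, 3],
    },
    index=["Michael Jackson", "Clint Black", "Dropdead", "Anti-Cimex", "Cardi B"],
)

# Count the missing ratings per user (item-factor columns excluded).
print(reviews.drop(columns=["slick", "lofi"]).isna().sum())
```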


#### Create a recommendation for one user


In [86]:
# Rows Tino has rated form the training set; the rows he has not rated are the ones to predict.
rated = reviews['Tino'].notna()
X = reviews.loc[rated, ['slick', 'lofi']]
y = reviews.loc[rated, 'Tino']
new_X = reviews.loc[~rated, ['slick', 'lofi']]

# Fit on the rated albums, then predict the unrated one (Dropdead is Tino's only missing rating).
tino_lr = LinearRegression().fit(X, y)
tino_dd_predict = tino_lr.predict(new_X)

output_dataframe = pd.DataFrame(tino_dd_predict.reshape(1, 1), columns=["Predicted Rating"], index=["Tino"]).map(lambda x: f"{x:0.2f}")

output_dataframe

| | Predicted Rating |
|---|---|
| Tino | 6.71 |


#### Tino's user vector


In [99]:

tino_vector = tino_lr.coef_

pd.DataFrame(tino_vector.reshape(1, 2), columns = ['slick', 'lofi'], index = ['Tino']).map(lambda x: f"{x:0.2f}")

| | slick | lofi |
|---|---|---|
| Tino | 1.71 | 2.29 |
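
The user vector can be checked by hand: the prediction is the model's intercept plus the dot product of the coefficients with an item's factors. The sketch below refits Tino's model on his four observed ratings (copied from the reviews table) so that it runs on its own, then recomputes the Dropdead prediction manually. The names `tino_rated`, `dropdead`, and `manual` are introduced here for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Tino's four observed ratings with the item factors, copied from the reviews table.
tino_rated = pd.DataFrame(
    {"slick": [8, 8, 2, 9], "lofi": [2, 2, 10, 3], "Tino": [1.0, 1.0, 9.0, 5.0]},
    index=["Michael Jackson", "Clint Black", "Anti-Cimex", "Cardi B"],
)
tino_lr = LinearRegression().fit(tino_rated[["slick", "lofi"]], tino_rated["Tino"])

# Dropdead's item factors: slick=2, lofi=9.
dropdead = np.array([2.0, 9.0])

# Manual prediction: intercept + coefficients . item factors
manual = tino_lr.intercept_ + tino_lr.coef_ @ dropdead
print(f"{manual:0.2f}")
```

The printed value should match the 6.71 predicted earlier. Note that the intercept is part of the prediction, so the user vector alone does not reproduce the rating.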


#### Create recommendations for all users

In [100]:
reviews_df_full = reviews.copy()

# Fit one model per user column; 'slick' and 'lofi' are item factors, not users.
for column in reviews.columns.drop(['slick', 'lofi']):

    not_null_mask = reviews[column].notna()
    null_mask = reviews[column].isna()

    # X: item factors of the rows this user has rated
    X = reviews.loc[not_null_mask, ['slick', 'lofi']]
    # y: the ratings this user has given
    y = reviews.loc[not_null_mask, column]
    # X_new: item factors of the rows this user has not rated
    X_new = reviews.loc[null_mask, ['slick', 'lofi']]

    if not X_new.empty:
        # Fit the linear regression and predict the missing ratings
        lr = LinearRegression().fit(X, y)
        prediction = lr.predict(X_new)

        # Write the predictions into the missing cells of reviews_df_full
        reviews_df_full.loc[null_mask, column] = prediction

reviews_df_full


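| | Alfred | Mandy | Lenny | Joan | Tino | slick | lofi |
|---|---|---|---|---|---|---|---|
| Michael Jackson | 3.0 | 9.0 | 2.0 | 3.0 | 1.0 | 8 | 2 |
| Clint Black | 4.0 | 9.0 | 5.0 | 4.664444 | 1.0 | 8 | 2 |
| Dropdead | 3.75 | 3.857143 | 8.0 | 9.0 | 6.714286 | 2 | 9 |
| Anti-Cimex | 4.0 | 3.0 | 9.0 | 4.0 | 9.0 | 2 | 10 |
| Cardi B | 4.0 | 8.0 | 4.916667 | 9.0 | 5.0 | 9 | 3 |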
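
The same per-user fit also yields a user vector for everyone, not just Tino. As a sketch, the coefficients and intercepts can be collected into one table of user factors. The name `user_factors` and the `intercept` column are introduced here for illustration, and the data is rebuilt inline from the reviews table so the block runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Reviews copied from the table shown earlier; np.nan marks an unrated album.
reviews = pd.DataFrame(
    {
        "Alfred": [3.0, 4.0, np.nan, 4.0, 4.0],
        "Mandy": [np.nan, 9.0, np.nan, 3.0, 8.0],
        "Lenny": [2.0, 5.0, 8.0, 9.0, np.nan],
        "Joan": [3.0, np.nan, 9.0, 4.0, 9.0],
        "Tino": [1.0, 1.0, np.nan, 9.0, 5.0],
        "slick": [8, 8, 2, 2, 9],
        "lofi": [2, 2, 9, 10, 3],
    },
    index=["Michael Jackson", "Clint Black", "Dropdead", "Anti-Cimex", "Cardi B"],
)

factor_cols = ["slick", "lofi"]
rows = {}
for user in reviews.columns.drop(factor_cols):
    # Train only on the albums this user has rated.
    rated = reviews[user].notna()
    lr = LinearRegression().fit(reviews.loc[rated, factor_cols], reviews.loc[rated, user])
    rows[user] = {"slick": lr.coef_[0], "lofi": lr.coef_[1], "intercept": lr.intercept_}

# One row per user: the weights on each item factor plus the intercept.
user_factors = pd.DataFrame.from_dict(rows, orient="index")
print(user_factors.round(2))
```

Tino's row should show the same `slick` and `lofi` weights as his user vector above.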


In [101]:
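# Cap every rating column at 5.0; 'slick' and 'lofi' are left untouched.
# Note: this also lowers observed ratings above 5 (for example, 9.0 becomes 5.0).
rating_columns = reviews_df_full.columns.drop(['slick', 'lofi'])
reviews_df_full[rating_columns] = reviews_df_full[rating_columns].clip(upper=5.0)

reviews_df_full.map(lambda x: f"{x:0.2f}")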

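| | Alfred | Mandy | Lenny | Joan | Tino | slick | lofi |
|---|---|---|---|---|---|---|---|
| Michael Jackson | 3.0 | 5.0 | 2.0 | 3.0 | 1.0 | 8.0 | 2.0 |
| Clint Black | 4.0 | 5.0 | 5.0 | 4.66 | 1.0 | 8.0 | 2.0 |
| Dropdead | 3.75 | 3.86 | 5.0 | 5.0 | 5.0 | 2.0 | 9.0 |
| Anti-Cimex | 4.0 | 3.0 | 5.0 | 4.0 | 5.0 | 2.0 | 10.0 |
| Cardi B | 4.0 | 5.0 | 4.92 | 5.0 | 5.0 | 9.0 | 3.0 |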
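
Clipping whole columns changes observed ratings as well as predicted ones. One alternative, sketched below, clips only the cells that were predicted and leaves the observed ratings as they are. This is a different technique from the cell above, not the author's method. The bounds `MIN_RATING` and `MAX_RATING` are assumptions, since the rating scale is not stated, and the 11.2 is a hypothetical out-of-range prediction used only to make the clipping visible.

```python
import numpy as np
import pandas as pd

# Assumed rating scale; the source does not state it.
MIN_RATING, MAX_RATING = 1.0, 10.0

# Observed ratings: np.nan marks a rating that had to be predicted.
observed = pd.DataFrame(
    {"Mandy": [np.nan, 9.0, np.nan], "Tino": [1.0, 1.0, np.nan]},
    index=["Michael Jackson", "Clint Black", "Dropdead"],
)

# Filled table: 11.2 is a hypothetical out-of-range prediction.
filled = pd.DataFrame(
    {"Mandy": [9.0, 9.0, 3.857143], "Tino": [1.0, 1.0, 11.2]},
    index=observed.index,
)

# Keep observed cells as they are; clip only the cells that were predicted.
clipped = filled.where(observed.notna(), filled.clip(lower=MIN_RATING, upper=MAX_RATING))
print(clipped)
```

Here `DataFrame.where` keeps the value from `filled` wherever a rating was observed and takes the clipped value everywhere else.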
