<a href="https://colab.research.google.com/github/lucarenz1997/recommender_systems/blob/main/Modell-Based-Coll-Filtering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model-Based Collaborative Filtering
**Authors**: Rafaella and Luca

This notebook explores the development and evaluation of a music recommendation system, comparing different collaborative filtering methods. ALS serves as the baseline model, while SVD is tested for potential improvements in accuracy, precision, recall, and F1-score.

## Setup

In [76]:
!pip install --no-cache-dir scikit-surprise
!pip install implicit



In [77]:
import warnings

import numpy as np
import pandas as pd
from google.colab import drive
from pyspark.ml.recommendation import ALS
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
# Dataset and train_test_split come from Surprise (not torch / scikit-learn)
from surprise import SVD, Dataset, Reader
from surprise.model_selection import train_test_split

drive.mount('/content/drive')

# Suppress all warnings
warnings.filterwarnings("ignore")


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Load and Preprocess Data

This section loads the preprocessed sample from Google Drive and previews the first 100 rows. Encoding media_id for the Surprise library happens later, in the SVD section.

In [78]:
#Load data
data_sample_prep = pd.read_csv("/content/drive/MyDrive/Recommender/sample_preprocessed.csv")
data_sample_prep.head(100)

Unnamed: 0,genre_id,ts_listen,media_id,album_id,context_type,release_date,platform_name,platform_family,media_duration,listen_type,...,days_since_release,genre_popularity,media_popularity,artist_popularity,album_popularity,songs_listened,song_popularity_7d,artist_popularity_7d,album_popularity_7d,month
0,10,2016-11-12 22:01:41,3092645,299421,1,2002-12-31,2,1,198,1,...,5065,12408,1,45,17,24,0,6,1,11
1,1129,2016-11-10 02:28:23,2247915,224543,0,2005-12-05,0,0,223,0,...,3993,249,11,309,16,66,2,46,2,11
2,10,2016-11-02 07:41:53,917717,103376,0,2005-08-22,0,0,201,0,...,4090,12408,13,21,17,87,1,1,1,11
3,0,2016-11-24 17:23:28,132625720,14101012,0,2016-09-23,0,0,187,0,...,62,168707,324,584,465,10,35,81,69,11
4,7,2016-11-11 11:55:23,921901,103798,0,1998-01-07,1,0,264,0,...,6883,42397,32,138,32,11,7,25,7,11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,7,2016-11-01 16:42:12,2225892,222779,22,2004-12-31,2,1,262,0,...,4323,42397,1,1,1,7,0,0,0,11
96,723,2016-11-10 04:25:50,6744852,623660,0,2010-06-14,0,0,173,0,...,2341,730,1,3,1,104,0,0,0,11
97,7,2016-11-21 10:03:31,63103296,6197720,16,2011-06-13,2,1,265,0,...,1988,42397,2,148,32,5,1,26,8,11
98,297,2016-11-03 12:02:38,1044131,114005,3,1995-04-25,0,0,224,0,...,7863,6518,145,222,176,36,34,46,38,11


## ALS

### Initialize Spark and Prepare Data

Before making predictions, we first set up a Spark session and load the dataset. This includes selecting relevant columns and converting the is_listened field to an integer for ALS compatibility.

In [79]:
# Create Spark session
spark = SparkSession.builder.appName("MusicRecommender").getOrCreate()

# Select relevant columns for the model
columns = ["user_id", "media_id", "is_listened"]
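# Copy the selection so the type conversion below does not write into a view of the original frame
data_sample_prep_sel_als = data_sample_prep[columns].copy()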

# Convert is_listened from Boolean to Integer
data_sample_prep_sel_als["is_listened"] = data_sample_prep_sel_als["is_listened"].astype(int)

# Convert to a Spark DataFrame
df_spark = spark.createDataFrame(data_sample_prep_sel_als)

Min-Max scaling is unnecessary here. The is_listened target is already 0 or 1, and ALS predictions are dot products of latent user and item factors that express relative preference rather than calibrated absolute values, so rescaling the input adds nothing and can distort those relationships.
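To make the dot-product point concrete, here is a minimal sketch with made-up two-dimensional latent factors. The vectors and song names are illustrative assumptions, not values learned by the model in this notebook. Each score is the dot product of the user vector with an item vector, and only the relative order of the scores matters for ranking.

```python
def dot(u, v):
    """Dot product of two equally long factor vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical latent factors (assumed values for illustration only)
user_factors = [0.9, 0.2]
item_factors = {
    "song_a": [0.8, 0.1],
    "song_b": [0.1, 0.9],
    "song_c": [0.5, 0.5],
}

# Predicted preference = user vector . item vector
scores = {item: dot(user_factors, f) for item, f in item_factors.items()}

# Rank items from highest to lowest predicted preference
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

The scores are not probabilities; they only need to be comparable with each other, which is why a decision threshold is tuned later instead of assuming 0.5.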

### Split Data into Training and Testing Sets

We split the dataset into 80% training data and 20% test data to train the ALS model and evaluate its performance.

In [80]:
# Train-Test Split
train, test = df_spark.randomSplit([0.8, 0.2], seed=42)

### Train the ALS Recommendation Model

We define an Alternating Least Squares (ALS) model, setting key parameters such as number of iterations, regularization, and cold-start strategy to optimize the recommendations.

In [81]:
# Create ALS model
als = ALS(
    maxIter=10,
    regParam=0.1,
    userCol="user_id",
    itemCol="media_id",
    ratingCol="is_listened",
    coldStartStrategy="drop"
)

# Train the model
model = als.fit(train)
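For intuition about what "alternating least squares" means, the following is a small dense NumPy sketch. It is not Spark's implementation (which is distributed and differs in details such as how regularization is applied); the function name `als_explicit`, the toy matrix, and the mask are assumptions made for illustration. The idea shown: hold the item factors fixed and solve a ridge regression per user, then hold the user factors fixed and solve one per item, and repeat.

```python
import numpy as np

def als_explicit(R, mask, rank=2, reg=0.1, n_iters=10, seed=0):
    """Toy ALS on a dense matrix R, using only entries where mask is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    V = rng.normal(scale=0.1, size=(n_items, rank))  # item factors
    eye = np.eye(rank)
    for _ in range(n_iters):
        # Fix V: each user's factors are a ridge-regression solution
        for u in range(n_users):
            idx = mask[u]
            if idx.any():
                Vo = V[idx]
                U[u] = np.linalg.solve(Vo.T @ Vo + reg * eye, Vo.T @ R[u, idx])
        # Fix U: each item's factors are a ridge-regression solution
        for i in range(n_items):
            idx = mask[:, i]
            if idx.any():
                Uo = U[idx]
                V[i] = np.linalg.solve(Uo.T @ Uo + reg * eye, Uo.T @ R[idx, i])
    return U, V

# Toy interaction matrix (assumed data); mask marks which entries were observed
R = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 1]], dtype=float)
mask = np.array([[True, True, True, False],
                 [True, True, False, True],
                 [False, True, True, True]])

U, V = als_explicit(R, mask)
pred = U @ V.T  # predicted score for every user-item pair
rmse = float(np.sqrt(np.mean((pred[mask] - R[mask]) ** 2)))
print(f"RMSE on observed entries: {rmse:.3f}")
```

In the Spark model above, `maxIter` plays the role of `n_iters` and `regParam` the role of `reg` in this sketch.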

### Generate Predictions and Optimize F1 Score

The trained model generates predictions on the test set. Since ALS produces continuous values, we fine-tune a threshold to convert them into binary classifications (0 or 1) using the F1-score as the evaluation metric.

In [82]:
# Perform predictions on the test set
predictions = model.transform(test)

# Collect labels and scores together, once: this keeps the rows aligned and
# avoids rerunning the Spark job on every loop iteration
pred_pdf = predictions.select("is_listened", "prediction").toPandas()
actual = pred_pdf["is_listened"].values
scores = pred_pdf["prediction"].values

# F1 score optimization: sweep the decision threshold
best_threshold = 0.0
best_f1 = 0.0
thresholds = np.arange(0.1, 1.0, 0.1)

for threshold in thresholds:
    pred = np.where(scores >= threshold, 1, 0)
    f1 = f1_score(actual, pred)
    if f1 > best_f1:
        best_f1 = f1
        best_threshold = threshold

# Final predictions with the best threshold
final_pred = np.where(scores >= best_threshold, 1, 0)
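The cell above tunes the threshold on the same test set that is later used for reporting, which tends to make the reported F1 somewhat optimistic. One common alternative, sketched here in plain Python with made-up labels and scores, is to pick the threshold on a separate validation fold and then apply it unchanged to the test fold. The helper names `f1_at` and `pick_threshold` and all data values are assumptions for illustration.

```python
def f1_at(y_true, scores, threshold):
    """F1 of the rule 'predict 1 when score >= threshold'."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def pick_threshold(y_true, scores, thresholds):
    """Return the threshold with the highest F1 (first one wins on ties)."""
    return max(thresholds, key=lambda t: f1_at(y_true, scores, t))

thresholds = [round(0.1 * k, 1) for k in range(1, 10)]  # 0.1, 0.2, ..., 0.9

# Validation fold (assumed toy data): used only to choose the threshold
val_y = [1, 1, 0, 1, 0, 0]
val_s = [0.9, 0.6, 0.55, 0.4, 0.3, 0.1]
best = pick_threshold(val_y, val_s, thresholds)

# Test fold (assumed toy data): used only to report the final score
test_y = [1, 0, 1, 0]
test_s = [0.8, 0.45, 0.35, 0.2]
test_f1 = f1_at(test_y, test_s, best)
print(best, test_f1)
```

With Spark, a three-way `randomSplit` (for example 70/10/20) would provide such a validation fold.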

### Evaluate Model Performance

We calculate key metrics including accuracy, precision, and recall to assess the quality of our recommendation system.

In [83]:
# Calculation of evaluation metrics
accuracy = accuracy_score(actual, final_pred)
precision = precision_score(actual, final_pred)
recall = recall_score(actual, final_pred)

# Display evaluation results
print(f"Optimized Threshold: {best_threshold}, Highest F1 Score: {best_f1}")
print(f"Accuracy: {accuracy}, Precision: {precision}, Recall: {recall}")

Optimized Threshold: 0.2, Highest F1 Score: 0.8042624789680314
Accuracy: 0.6973525360901872, Precision: 0.7157789652944976, Recall: 0.9177081764794459
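As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the two values printed above:

```python
# Precision and recall as printed by the cell above
precision = 0.7157789652944976
recall = 0.9177081764794459

# F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8043
```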


In [84]:
# Compute Evaluation Metrics
accuracy = accuracy_score(actual, final_pred)
precision = precision_score(actual, final_pred)
recall = recall_score(actual, final_pred)
f1 = f1_score(actual, final_pred)

# Optimized threshold output
print("\n" + "=" * 50)
print("Optimized ALS Model Evaluation")
print("=" * 50)
print(f"Optimal Threshold: {best_threshold:.2f}")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
print("=" * 50)


Optimized ALS Model Evaluation
Optimal Threshold: 0.20
Accuracy: 0.6974
Precision: 0.7158
Recall: 0.9177
F1 Score: 0.8043


### Recommend Songs with Star Ratings

We generate personalized song recommendations for users using the trained ALS model. Songs are assigned 1 to 3 stars (⭐) based on their predicted score, while low-confidence recommendations receive no rating.

In [85]:
def recommend_songs_with_star_ratings(user_id, model, best_threshold, top_k=10):
    """
    Recommends Top-K songs for a user with a star rating system.
    Low-confidence predictions (below the best threshold) receive no rating.
    """
    # Get recommendations from ALS model
    user_recommendations = model.recommendForAllUsers(top_k)
    user_rec = user_recommendations.filter(col("user_id") == user_id).select("recommendations").collect()

    # Return empty DataFrame (same columns as the populated result) if no recommendations exist
    if not user_rec:
        return pd.DataFrame(columns=["Recommended Media_IDs", "Predicted Score", "Normalized Score", "Star Rating"])

    # Extract media IDs and predicted scores
    recommendations = user_rec[0]["recommendations"]
    media_ids = [row["media_id"] for row in recommendations]
    scores = np.array([row["rating"] for row in recommendations])

    # Normalize scores using Min-Max Scaling; if all scores are equal, treat them all as 1.0
    min_score, max_score = scores.min(), scores.max()
    normalized_scores = (scores - min_score) / (max_score - min_score) if max_score > min_score else np.ones_like(scores)

    # Define thresholds for star ratings
    percentile_33 = np.percentile(normalized_scores, 33)
    percentile_66 = np.percentile(normalized_scores, 66)

    def score_to_star_rating(score, raw_score):
        if raw_score < best_threshold:
            return ""  # No rating if below threshold
        elif score < percentile_33:
            return "⭐"
        elif score < percentile_66:
            return "⭐⭐"
        else:
            return "⭐⭐⭐"

    # Assign star ratings
    star_ratings = [score_to_star_rating(score, raw_score) for score, raw_score in zip(normalized_scores, scores)]

    # Create DataFrame with recommendations
    recommendations_df = pd.DataFrame({
        "Recommended Media_IDs": media_ids,
        "Predicted Score": scores,
        "Normalized Score": normalized_scores,
        "Star Rating": star_ratings
    })

    return recommendations_df

# Example: Recommend songs for a user
user_id_example = 123  # Adjust user ID
recommended_songs_df = recommend_songs_with_star_ratings(user_id_example, model, best_threshold)

# Display recommendations
from IPython.display import display
display(recommended_songs_df)

# Stop Spark session
spark.stop()


Unnamed: 0,Recommended Media_IDs,Predicted Score,Normalized Score,Star Rating
0,94727052,0.810078,1.0,⭐⭐⭐
1,2275917,0.810078,1.0,⭐⭐⭐
2,2275911,0.810078,1.0,⭐⭐⭐
3,917421,0.810078,1.0,⭐⭐⭐
4,72273753,0.793355,0.842736,⭐⭐
5,7270018,0.793355,0.842736,⭐⭐
6,3148015,0.770897,0.631545,⭐⭐
7,14681034,0.718009,0.134188,⭐
8,41292391,0.717604,0.130379,⭐
9,4253913,0.703739,0.0,⭐
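The "Normalized Score" column can be reproduced by hand from the distinct predicted scores in the table above. Because the displayed scores are rounded to six decimals, the recomputed values agree with the table only approximately.

```python
# Distinct predicted scores from the table above (rounded as displayed)
scores = [0.810078, 0.793355, 0.770897, 0.718009, 0.717604, 0.703739]

# Min-Max scaling: (x - min) / (max - min)
lo, hi = min(scores), max(scores)
normalized = [(s - lo) / (hi - lo) for s in scores]

for s, n in zip(scores, normalized):
    print(f"{s:.6f} -> {n:.4f}")
```

Note that the normalization is relative to the ten recommended items only: the lowest-ranked recommendation always maps to 0 and receives one star, even when its raw score is close to the top one.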


## SVD

In this SVD-based collaborative filtering model, the binary variable is_listened (0 = not listened, 1 = listened) is used as implicit feedback and treated as a rating. Min-Max scaling is unnecessary because the values already lie in the 0 to 1 range, and SVD is a matrix factorization model that does not rely on distance metrics requiring normalization.

In [86]:
#Encode media_id for Surprise compatibility
item_encoder = LabelEncoder()
data_sample_prep["media_id_encoded"] = item_encoder.fit_transform(data_sample_prep["media_id"])

#Keep only necessary columns
data_sample_prep_sel = data_sample_prep[['user_id', 'media_id_encoded', 'is_listened']]

#Convert to Surprise dataset format
reader = Reader(rating_scale=(0, 1))  # Binary scale (0 = not listened, 1 = listened)
data = Dataset.load_from_df(data_sample_prep_sel[['user_id', 'media_id_encoded', 'is_listened']], reader)


### Train the SVD Model

This section splits the dataset into training and test sets, initializes the SVD model, and trains it.

In [87]:
# Split training and test data
trainset, testset = train_test_split(data, test_size=0.2)

# Initialize and train the SVD model
model = SVD(n_factors=50, reg_all=0.1, n_epochs=20)
model.fit(trainset)

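# Generate predictions
# Caution: build_testset() converts the training interactions into a test set and overwrites
# the held-out split created above, so the metrics reported below are in-sample.
# For an out-of-sample estimate, call model.test() on the testset returned by train_test_split.
testset = trainset.build_testset()
predictions = model.test(testset)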
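Surprise documents the SVD estimate as the global mean plus a user bias, an item bias, and the dot product of the user and item factor vectors, and by default it clips the estimate to the reader's rating scale. The sketch below applies that rule to hypothetical numbers; the function name `svd_predict` and every value passed to it are assumptions for illustration, not parameters taken from the trained model.

```python
def svd_predict(mu, b_u, b_i, p_u, q_i, low=0.0, high=1.0):
    """r_hat = mu + b_u + b_i + q_i . p_u, clipped to [low, high]."""
    raw = mu + b_u + b_i + sum(p * q for p, q in zip(p_u, q_i))
    return min(high, max(low, raw))

# Hypothetical parameters: global mean, user bias, item bias, 2-d factors
est = svd_predict(mu=0.7, b_u=0.05, b_i=-0.1, p_u=[0.2, -0.1], q_i=[0.5, 0.3])
print(round(est, 2))  # 0.72

# A raw estimate above the rating scale is clipped to its upper bound
clipped = svd_predict(mu=0.9, b_u=0.2, b_i=0.1, p_u=[0.2, -0.1], q_i=[0.5, 0.3])
print(clipped)  # 1.0
```

Because estimates are clipped to the (0, 1) scale passed to `Reader`, thresholds between 0.1 and 0.9 cover the useful range.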

### Optimize Classification Threshold

Since SVD outputs continuous scores, we must convert them into binary recommendations (0 or 1). This section finds the best threshold to maximize the F1-score.

In [88]:
# Optimize Threshold for Best F1 Score
best_threshold = 0.5
best_f1 = 0
thresholds = np.arange(0.1, 1.0, 0.1)

# True labels do not depend on the threshold, so extract them once
y_true = [true_r for (_, _, true_r, _, _) in predictions]

for threshold in thresholds:
    y_pred = [1 if est >= threshold else 0 for (_, _, _, est, _) in predictions]

    f1 = f1_score(y_true, y_pred, zero_division=1)

    if f1 > best_f1:
        best_f1 = f1
        best_threshold = threshold

print(f"\n**Optimal Threshold for F1 Score: {best_threshold:.2f} with F1: {best_f1:.4f}**")



**Optimal Threshold for F1 Score: 0.50 with F1: 0.8693**


### Define SVD Recommendation Function with Star Ratings

This function generates Top-K song recommendations for a user, applies Min-Max normalization, and assigns a star rating.

In [89]:
def recommend_songs_svd(user_id, model, data, item_encoder, best_threshold, top_k=10):
    """
    Generates Top-K recommended songs for a user using an SVD model with a star rating system.
    Predictions below the best threshold stay in the list but receive no star rating.

    :param user_id: User ID for whom recommendations are generated.
    :param model: The trained SVD recommendation model.
    :param data: The dataset containing user-media interactions.
    :param item_encoder: LabelEncoder for `media_id` to decode item IDs to original values.
    :param best_threshold: The optimal threshold for valid predictions.
    :param top_k: Number of songs to recommend.
    :return: DataFrame with recommended `media_id`s, predicted scores, and star ratings.
    """

    # Get all unique media_id values (items)
    all_items = data["media_id_encoded"].unique()

    # Generate predictions for the user on all items
    predictions = [model.predict(user_id, media_id) for media_id in all_items]

    # Extract estimated scores
    media_ids = [pred.iid for pred in predictions]
    scores = np.array([pred.est for pred in predictions])

    # Sort scores in descending order and select the Top-K items
    top_indices = np.argsort(scores)[::-1][:top_k]
    top_items = np.array(media_ids)[top_indices]
    top_scores = scores[top_indices]

    # Normalize scores using Min-Max Scaling (0 to 1 range)
    scaler = MinMaxScaler()
    top_scores_normalized = scaler.fit_transform(top_scores.reshape(-1, 1)).flatten()

    # Define percentiles for star ratings
    percentile_33 = np.percentile(top_scores_normalized, 33)  # 1⭐ cutoff
    percentile_66 = np.percentile(top_scores_normalized, 66)  # 2⭐ cutoff

    # Assign star ratings dynamically, but remove if below best_threshold
    def score_to_star_rating(score, raw_score):
        if raw_score < best_threshold:
            return ""  # No rating if below best threshold
        elif score < percentile_33:
            return "⭐"
        elif score < percentile_66:
            return "⭐⭐"
        else:
            return "⭐⭐⭐"

    star_ratings = [score_to_star_rating(score, raw_score) for score, raw_score in zip(top_scores_normalized, top_scores)]

    # Convert `media_id_encoded` back to original values using item_encoder
    recommended_songs = item_encoder.inverse_transform(top_items)

    # Create DataFrame for recommendations
    recommendations_df = pd.DataFrame({
        "Recommended Media_IDs": recommended_songs,
        "Predicted Score": top_scores,
        "Normalized Score": top_scores_normalized,
        "Star Rating": star_ratings  # Some rows may have an empty rating
    })

    return recommendations_df


### Evaluate Model Performance

This section evaluates the model using Accuracy, Precision, Recall, and F1-score.

In [90]:
# Evaluate Model with Optimal Threshold
y_true = [true_r for (_, _, true_r, _, _) in predictions]
y_pred = [1 if est >= best_threshold else 0 for (_, _, _, est, _) in predictions]

accuracy_val = accuracy_score(y_true, y_pred)
precision_val = precision_score(y_true, y_pred, zero_division=1)
recall_val = recall_score(y_true, y_pred, zero_division=1)
f1_val = f1_score(y_true, y_pred, zero_division=1)

# Print Evaluation Results with Optimal Threshold
print("\n" + "=" * 50)
print("SVD Model Evaluation with Optimal Threshold:")
print("=" * 50)
print(f"Optimal Threshold: {best_threshold:.2f}")
print(f"Accuracy: {accuracy_val:.4f}")
print(f"Precision: {precision_val:.4f}")
print(f"Recall: {recall_val:.4f}")
print(f"F1 Score: {f1_val:.4f}")


SVD Model Evaluation with Optimal Threshold:
Optimal Threshold: 0.50
Accuracy: 0.8068
Precision: 0.8100
Recall: 0.9379
F1 Score: 0.8693


### Example Usage: Generate Recommendations for a User

This section calls the recommendation function and displays the top song recommendations for a user.

In [91]:
# Example: Recommend songs for a user with the best model
user_id_example = 123  # Adjust user ID as needed
recommended_songs_df = recommend_songs_svd(user_id_example, model, data_sample_prep, item_encoder, best_threshold)

# Display recommendations
from IPython.display import display
display(recommended_songs_df)

Unnamed: 0,Recommended Media_IDs,Predicted Score,Normalized Score,Star Rating
0,867060,0.694164,1.0,⭐⭐⭐
1,132286790,0.670421,0.387619,⭐⭐⭐
2,129636210,0.667136,0.302893,⭐⭐⭐
3,4708087,0.666768,0.293402,⭐⭐⭐
4,101326476,0.665663,0.264885,⭐⭐
5,133595562,0.664409,0.232553,⭐⭐
6,7420885,0.662092,0.172786,⭐⭐
7,1115061,0.6616,0.160103,⭐
8,132144404,0.65692,0.039379,⭐
9,123368950,0.655393,0.0,⭐


## Conclusion: SVD vs. ALS Performance Comparison

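The ALS model, used as the baseline, scores lower than SVD on every reported metric: accuracy 0.6974 vs. 0.8068, precision 0.7158 vs. 0.8100, recall 0.9177 vs. 0.9379, and F1-score 0.8043 vs. 0.8693. The comparison is not like-for-like, however: the ALS metrics come from the held-out 20% test split, whereas the SVD cell rebuilds its test set from the training set with trainset.build_testset(), so the SVD figures are in-sample and likely optimistic. The significantly lower optimal threshold for ALS (0.2) vs. SVD (0.5) suggests that ALS predictions skew towards lower values. On these numbers SVD looks like the better choice for this dataset, but this should be confirmed by evaluating SVD on its held-out split.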