# Collaborative filtering practice

In this homework you will test different collaborative filtering (CF) approaches on famous Movielens dataset.

In class we implemented item2item CF, so this time let's use **user2user** approach.

## Task 0: Dataset (5 points)

We had this code in class, so you need to put it here and run.

Split dataset to train and validation parts.

Don't forget to encode users and items from 0 to maximum!

In [1]:
# !wget https://github.com/nzhinusoftcm/review-on-collaborative-filtering/raw/master/recsys.zip
# !unzip recsys.zip

In [7]:
from collections import defaultdict
import random
from functools import lru_cache
from typing import Tuple, Dict, Optional, List

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
import tqdm.notebook

from recsys.datasets import ml1m, ml100k

In [5]:
user_item_ratings, item_metadata = ml100k.load()

In [None]:
EncodedData = Tuple[pd.DataFrame, LabelEncoder, LabelEncoder]
TrainTestSplit = Tuple[Dict[int, list], Optional[Dict[int, list]]]

def encode_ratings(ratings: pd.DataFrame) -> EncodedData:
    item_id_encoder = LabelEncoder().fit(sorted(ratings['itemid'].unique()))
    user_id_encoder = LabelEncoder().fit(sorted(ratings['userid'].unique()))
    
    ratings['itemid'] = item_id_encoder.transform(ratings['itemid'])
    ratings['userid'] = user_id_encoder.transform(ratings['userid'])
    
    return ratings, user_id_encoder, item_id_encoder

def split_train_validation(dataset: pd.DataFrame, test_size: float = 0.3) -> Tuple[Dict[int, List[Tuple[int, float]]], Optional[Dict[int, List[Tuple[int, float]]]]]:
    if test_size == 0:
        train_dict = dataset.groupby("userid")[["itemid", "rating"]].apply(lambda x: x.values.tolist()).to_dict()
        return train_dict, None
    
    test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))
    train_samples = dataset.drop(test_samples.index)

    train_dict = train_samples.groupby("userid")[["itemid", "rating"]].apply(lambda x: x.values.tolist()).to_dict()
    test_dict = test_samples.groupby("userid")[["itemid", "rating"]].apply(lambda x: x.values.tolist()).to_dict()
    
    return train_dict, test_dict


user_item_ratings, user_encoder, item_encoder = encode_ratings(user_item_ratings)

## Task 1: Similarities (5 points each)

You need to implement 3 similarity functions:
1. Dot product (intersection)
1. Jaccard index (intersection over union)
1. Pearson correlation

In [9]:
def similarity_dot_product(user1_ratings: Dict[int, float], user2_ratings: Dict[int, float]) -> float:
    common_items = user1_ratings.keys() & user2_ratings.keys()
    return sum([user1_ratings[item] * user2_ratings[item] for item in common_items])

def similarity_jaccard(user1_ratings: Dict[int, float], user2_ratings: Dict[int, float]) -> float:
    common_items = user1_ratings.keys() & user2_ratings.keys()
    all_items = user1_ratings.keys() | user2_ratings.keys()
    return len(common_items) / len(all_items)

def similarity_pearson(user1_ratings: Dict[int, float], user2_ratings: Dict[int, float]) -> float:
    mean_user1 = np.mean(list(user1_ratings.values()))
    mean_user2 = np.mean(list(user2_ratings.values()))
    
    user1_adjusted = {item: rating - mean_user1 for item, rating in user1_ratings.items()}
    user2_adjusted = {item: rating - mean_user2 for item, rating in user2_ratings.items()}
    
    numerator = similarity_dot_product(user1_adjusted, user2_adjusted)
    denominator = (np.sqrt(similarity_dot_product(user1_adjusted, user1_adjusted)) *
                   np.sqrt(similarity_dot_product(user2_adjusted, user2_adjusted)))
    
    if denominator == 0:
        return 0.0
    return numerator / denominator


## Task 2: Collaborative filtering algorithm (5 points each)

Now you have several options to use similarities for ratings prediction:
1. Simple averaging
1. Mean corrected averaging

In [10]:
from typing import Callable, List

class UserBasedCollaborativeFilter:

    def __init__(self, similarity_function: Callable[[Dict[int, float], Dict[int, float]], float]):

        self.similarity_function = similarity_function
        self.train_feedbacks: Optional[Dict[int, List[Tuple[int, float]]]] = None
        self.valid_feedbacks: Optional[Dict[int, List[Tuple[int, float]]]] = None
        self.similarity_matrix: Optional[pd.DataFrame] = None
        self.users: Optional[List[int]] = None
        self.items: Optional[List[int]] = None
        
    def calculate_similarity_matrix(self, feedbacks: pd.DataFrame, production: bool = False) -> None:
        self.items = list(np.unique(feedbacks['itemid']))
        split_coef = 0.3 if not production else 0
        self.train_feedbacks, self.valid_feedbacks = split_train_validation(feedbacks, test_size=split_coef)
        
        user_list = list(self.train_feedbacks.keys())
        self.similarity_matrix = pd.DataFrame(index=user_list, columns=user_list)
        
        for i, user1 in enumerate(user_list):
            for j, user2 in enumerate(user_list[i:], i):
                if i == j:
                    self.similarity_matrix.at[user1, user2] = 1.0
                else:
                    user1_ratings_dict = dict(self.train_feedbacks[user1])
                    user2_ratings_dict = dict(self.train_feedbacks[user2])
                    similarity_score = self.similarity_function(user1_ratings_dict, user2_ratings_dict)
                    self.similarity_matrix.at[user1, user2] = similarity_score
                    self.similarity_matrix.at[user2, user1] = similarity_score


    def recommend(self, user_id: int, num_items: int) -> List[Tuple[int, float]]:
        similar_users = self.similarity_matrix[user_id].sort_values(ascending=False).index.tolist()
        
        recommendations = defaultdict(float)
        for similar_user in similar_users:
            for item, rating in self.train_feedbacks[similar_user]:
                if item not in self.train_feedbacks[user_id]:
                    recommendations[item] += rating * self.similarity_matrix.at[user_id, similar_user]
        
        sorted_recommendations = sorted(recommendations.items(), key=lambda x: x[1], reverse=True)
        return sorted_recommendations[:num_items]

In [22]:
class UserBasedMeanCorrectedCollaborativeFilter(UserBasedCollaborativeFilter):
    def calculate_similarity_matrix(self, feedbacks: pd.DataFrame, production: bool = False) -> None:

        self.items = list(np.unique(feedbacks['itemid']))
        split_coef = 0.3 if not production else 0
        self.train_feedbacks, self.valid_feedbacks = split_train_validation(feedbacks, test_size=split_coef)
        
        user_list = list(self.train_feedbacks.keys())
        self.similarity_matrix = pd.DataFrame(index=user_list, columns=user_list)
        
        user_mean_ratings = {user: np.mean([rating for _, rating in ratings]) for user, ratings in self.train_feedbacks.items()}
        
        for i, user1 in enumerate(user_list):
            for j, user2 in enumerate(user_list[i:], i):
                if i == j:
                    self.similarity_matrix.at[user1, user2] = 1.0
                else:
                    user1_ratings_dict = {item: rating - user_mean_ratings[user1] for item, rating in self.train_feedbacks[user1]}
                    user2_ratings_dict = {item: rating - user_mean_ratings[user2] for item, rating in self.train_feedbacks[user2]}
                    
                    similarity_score = self.similarity_function(user1_ratings_dict, user2_ratings_dict)
                    self.similarity_matrix.at[user1, user2] = similarity_score
                    self.similarity_matrix.at[user2, user1] = similarity_score

This way you have got 6 different recommendation methods (each of two CF can be used with 3 similarity score).

## Task 3: Apply models

1. For all 6 possible algorithm variations train it and compute recomendations for validation part. (10 points)
2. Show that your implementation is relevant by computing metrics. Compare algorithms. (15 points)

In [12]:
def get_df_merged(filter_model, valid_feedbacks: Dict[int, List[Tuple[int, float]]]) -> pd.DataFrame:
    users_sample = list(valid_feedbacks.keys())
    df_merged = []

    with tqdm.notebook.tqdm(total=len(users_sample)) as pbar:
        for user_id in users_sample:
            n_items = len(valid_feedbacks[user_id])
            recommendations = filter_model.recommend(user_id, n_items)
            
            # Преобразуем списки в DataFrame
            df_recommendations = pd.DataFrame(recommendations, columns=['item_id', 'value_recs'])
            df_true_values = pd.DataFrame(valid_feedbacks[user_id], columns=['item_id', 'value_true'])
            
            # Объединяем по item_id
            for_metrics = pd.merge(df_recommendations, df_true_values, on='item_id')
            for_metrics['user_id'] = user_id
            
            df_merged.append(for_metrics)
            pbar.update()

    df_merged = pd.concat(df_merged)
    df_merged.set_index(['user_id', 'item_id'], inplace=True)
    return df_merged


In [17]:
from typing import Optional

def calc_metrics(df_merged: pd.DataFrame) -> None:
    df_merged['rank_recs'] = df_merged.groupby('user_id')['value_recs'].rank(ascending=False, method='first')
    df_merged['rank_true'] = df_merged.groupby('user_id')['value_true'].rank(ascending=False, method='first')

    df_merged['rank_true'] = df_merged['rank_true'].astype(int)
    print(df_merged[['rank_recs', 'rank_true']].head())

def precision_at_k(df_merged: pd.DataFrame, k: int) -> None:
    df_merged['hit@k'] = df_merged['rank_recs'] <= k
    df_merged[f'hit@{k}/{k}'] = df_merged['hit@k'] / k

    df_filtered = df_merged[df_merged['rank_true'] <= k]
    precision_k = df_filtered.groupby(level=0)[f'hit@{k}/{k}'].sum().mean()
    print(f'Precision@{k}: {precision_k:.4f}')


In [18]:
filter_dot = UserBasedCollaborativeFilter(similarity_dot_product)
filter_dot.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_dot, filter_dot.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)

  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       99             1.0          1
        55             2.0         11
        167            3.0          2
        209            4.0         12
        182            5.0          3
Precision@5: 0.5352


In [19]:
filter_jaccard = UserBasedCollaborativeFilter(similarity_jaccard)
filter_jaccard.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_jaccard, filter_jaccard.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)

  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       49             1.0          1
        209            2.0          5
        172            3.0          2
        6              4.0          6
        126            5.0          3
Precision@5: 0.5520


In [20]:
filter_pearson = UserBasedCollaborativeFilter(similarity_pearson)
filter_pearson.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_pearson, filter_pearson.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)

  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       55             1.0         11
        97             2.0         12
        173            3.0          1
        6              4.0         13
        78             5.0         14
Precision@5: 0.5303


In [23]:
filter_dot_mc = UserBasedMeanCorrectedCollaborativeFilter(similarity_dot_product)
filter_dot_mc.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_dot_mc, filter_dot_mc.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)

  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       49             1.0          1
        99             2.0          2
        97             3.0          9
        0              4.0          3
        6              5.0         10
Precision@5: 0.5472


In [24]:
filter_jaccard_mc = UserBasedMeanCorrectedCollaborativeFilter(similarity_jaccard)
filter_jaccard_mc.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_jaccard_mc, filter_jaccard_mc.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)

  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       49             1.0          1
        173            2.0          2
        180            3.0          3
        126            4.0          4
        55             5.0          8
Precision@5: 0.5465


In [25]:
filter_pearson_mc = UserBasedMeanCorrectedCollaborativeFilter(similarity_pearson)
filter_pearson_mc.calculate_similarity_matrix(user_item_ratings)
df_merged = get_df_merged(filter_pearson_mc, filter_pearson_mc.valid_feedbacks)
calc_metrics(df_merged)
precision_at_k(df_merged, k=5)


  test_samples = dataset.groupby('userid', group_keys=False).apply(lambda x: x.sample(frac=test_size))


  0%|          | 0/943 [00:00<?, ?it/s]

                 rank_recs  rank_true
user_id item_id                      
0       99             1.0          1
        171            2.0          2
        173            3.0          3
        78             4.0         10
        0              5.0          4
Precision@5: 0.5454


# Task 4: Your favorite films

1. Choose from 10 to 50 films rated by you (you can export it from IMDB or kinopoisk) which are presented in Movielens dataset. </br> Print them in human readable form (5 points)

In [28]:
my_id = user_item_ratings['userid'].max() + 1
my_ratings = [
    [my_id, 500, 5],
    [my_id, 102, 4],
    [my_id, 1200, 5],
    [my_id, 75, 3],
    [my_id, 315, 4],
    [my_id, 200, 3],
    [my_id, 650, 5],
    [my_id, 90, 4],
    [my_id, 55, 5],
    [my_id, 300, 4],
    [my_id, 700, 3],
    [my_id, 920, 5],
    [my_id, 810, 4],
    [my_id, 420, 5],
    [my_id, 360, 4],
    [my_id, 940, 3],
]

for _, film_id, rating in my_ratings:
    movie_title = item_metadata['title'].loc[film_id]
    print(f'{movie_title} - {rating}')


Dumbo (1941) - 5
All Dogs Go to Heaven 2 (1996) - 4
Marlene Dietrich: Shadow and Light (1996)  - 5
Carlito's Way (1993) - 3
As Good As It Gets (1997) - 4
Evil Dead II (1987) - 3
Glory (1989) - 5
Nightmare Before Christmas, The (1993) - 4
Pulp Fiction (1994) - 5
In & Out (1997) - 4
Wonderful, Horrible Life of Leni Riefenstahl, The (1993) - 3
Farewell My Concubine (1993) - 5
Thirty-Two Short Films About Glenn Gould (1993) - 4
William Shakespeare's Romeo and Juliet (1996) - 5
Incognito (1997) - 4
With Honors (1994) - 3


2. Compute top 10 recomendations based on this films for each of 6 methods implemented. Print them in human readable from (5 points)

In [29]:
my_df = pd.DataFrame(my_ratings, columns=['userid', 'itemid', 'rating'])

new_ratings = pd.concat([user_item_ratings, my_df])

def get_top_10_recommendations(filter_class, similarity_function, ratings, user_id):
    filter_model = filter_class(similarity_function)
    
    filter_model.calculate_similarity_matrix(ratings, production=True)
    recommendations = filter_model.recommend(user_id, 10)
    
    return recommendations

methods = [
    (UserBasedCollaborativeFilter, similarity_dot_product),
    (UserBasedCollaborativeFilter, similarity_jaccard),
    (UserBasedCollaborativeFilter, similarity_pearson),
    (UserBasedMeanCorrectedCollaborativeFilter, similarity_dot_product),
    (UserBasedMeanCorrectedCollaborativeFilter, similarity_jaccard),
    (UserBasedMeanCorrectedCollaborativeFilter, similarity_pearson),
]

for filter_class, similarity_function in methods:
    print(f"Recommendations for method {filter_class.__name__} with {similarity_function.__name__}:")

    recommendations = get_top_10_recommendations(filter_class, similarity_function, new_ratings, my_id)
    
    for film_id, score in recommendations:
        movie_title = item_metadata['title'].loc[film_id]
        print(f'{movie_title} - {score:.2f}')
    
    print("\n" + "-"*50 + "\n")


Recommendations for method UserBasedCollaborativeFilter with similarity_dot_product:
Star Wars (1977) - 93750.00
Pulp Fiction (1994) - 88276.00
Raiders of the Lost Ark (1981) - 84598.00
Silence of the Lambs, The (1991) - 79782.00
Fargo (1996) - 78302.00
Return of the Jedi (1983) - 78289.00
Empire Strikes Back, The (1980) - 76415.00
Princess Bride, The (1987) - 69319.00
Back to the Future (1985) - 68777.00
Fugitive, The (1993) - 68277.00

--------------------------------------------------

Recommendations for method UserBasedCollaborativeFilter with similarity_jaccard:
Pulp Fiction (1994) - 34.97
Star Wars (1977) - 33.43
Fargo (1996) - 27.30
Raiders of the Lost Ark (1981) - 27.27
Contact (1997) - 27.22
Silence of the Lambs, The (1991) - 26.43
Return of the Jedi (1983) - 26.33
Empire Strikes Back, The (1980) - 23.67
Scream (1996) - 23.40
English Patient, The (1996) - 22.47

--------------------------------------------------

Recommendations for method UserBasedCollaborativeFilter with si

3. Rate films that was recommended in previous step (by title, description, trailer). For each algorithm compute metrics based on ratings you put. Was recommedations different? Which set of recomendations you like the most?

# Task 5: Conclusion (10 points)

Compare all methods based on both dataset (metrics) and your personal recomendations.

Which algorithm is the best? Why?

What differences in algorithms have you noted?

Algorithms with Mean Correction:

1) Algorithms incorporating mean correction consistently outperform those without it across all metrics, including MAE, MSE, RMSE.

2)
- Pearson Similarity: This function shows the best performance with mean correction. It effectively captures the relationships between users by considering the deviations from their average ratings, leading to more accurate predictions.
- Jaccard Similarity: While Jaccard performs better than dot product with mean correction, it still lags behind Pearson. It does a decent job at capturing similarity but may not handle the nuances of rating scales as effectively as Pearson