# Recommender systems

__Recommender systems__ are software algorithms or techniques that provide personalized recommendations to users, suggesting items or content that they are likely to find interesting, useful, or relevant. These systems are widely used in various industries, including e-commerce, entertainment, social media, and content streaming platforms.

The concept behind recommender systems is based on the idea of leveraging user preferences, historical data, and patterns to predict and suggest __items__ that __a user__ might enjoy or benefit from. By analyzing user behavior, such as past purchases, ratings, search history, and interactions with items or content, recommender systems aim to understand user preferences and make accurate recommendations.

We can use large language models (LLMs) as recommender systems. E.g. you may ask ChatGPT about what movie to watch based on what movies you like or ask for a music that are similar to the provided example.

You can use the following prompts:

```
I want you to act as a movie recommender system. I will provide you with a list of movies that I like, and your goal is to recommend 10 other movies that I will also enjoy.
```


```
I want you to act as a song recommender. I will provide you with a song and you will create a playlist of 10 songs that are similar to the given song. And you will provide a playlist name and description for the playlist. Do not choose songs that are same name or artist. Do not write any explanations or other words, just reply with the playlist name, description and the songs. My first song is "...".
```

However, language models like ChatGPT typically don't have a built-in knowledge base. Therefore, they may recommend items that don't actually exist.

Furthermore, most recommender systems are designed for specific business domains where the items are not publicly available.

If you have a small number of items, you can directly provide them to ChatGPT for better recommendations. For example, you can use the following prompt:

```
I want you to act as an online shop recommender system. My shop has ten products with the following descriptions:

Product: Wireless Bluetooth Earbuds
Description: Enjoy immersive sound with these wireless Bluetooth earbuds. They provide crystal-clear audio and a comfortable fit, perfect for music lovers on the go.

Product: Portable Power Bank
Description: Stay charged wherever you are with this portable power bank. It offers high-capacity charging, multiple ports, and a compact design, making it ideal for travel or emergencies.

Product: Smart Fitness Tracker
Description: Keep track of your fitness goals with this smart fitness tracker. It monitors your heart rate, steps, sleep patterns, and more, while providing real-time notifications and a sleek design.

Product: Programmable Robot Toy
Description: Spark your child's imagination with this programmable robot toy. It teaches coding concepts through play, enabling kids to explore STEM subjects in a fun and interactive way.

Product: Waterproof DSLR Camera Bag
Description: Protect your valuable camera gear with this waterproof DSLR camera bag. It features padded compartments, adjustable dividers, and a durable exterior, ensuring your equipment stays safe and dry.

Product: Electric Air Fryer
Description: Cook healthier meals with this electric air fryer. It uses hot air circulation to fry food with little to no oil, resulting in crispy and delicious dishes while reducing fat intake.

Product: Portable Bluetooth Speaker
Description: Take your music anywhere with this portable Bluetooth speaker. It delivers high-quality sound, has a long battery life, and is water-resistant, making it perfect for outdoor adventures.

Product: Wireless Charging Pad
Description: Simplify charging with this wireless charging pad. Compatible with Qi-enabled devices, it allows you to power up your phone or other devices by simply placing them on the pad.

Product: Virtual Reality Headset
Description: Immerse yourself in virtual worlds with this virtual reality headset. It offers a 360-degree panoramic view, adjustable straps for comfort, and compatibility with most smartphones.

Product: Instant Pot Pressure Cooker
Description: Save time in the kitchen with this versatile Instant Pot pressure cooker. It combines multiple functions like pressure cooking, slow cooking, sautéing, and more, making meal preparation a breeze.

Your goal is to recommend items from this list based on the previous purchases. I will provide you with list of purchases for specific user and you will give me the product name.
```

Due to the limited size of the LLM's memory, we cannot pass a large number of items. Therefore, we still need to train different models.

Let's consider the Movielens-1m dataset. It is a benchmark dataset used for recommender systems and machine learning research. It consists of approximately 1 million ratings from 6,000 users on movies. The dataset includes user demographics, movie titles, and genre information. It's widely used for evaluating recommendation algorithms and collaborative filtering techniques.

Here are the main components of the MovieLens-1M dataset:

1. _Ratings_: The dataset includes approximately 1 million movie ratings provided by around 6,000 users. Each rating is represented by a triplet consisting of a user ID, a movie ID, and a rating value on a scale of 1 to 5. These ratings were collected from the MovieLens website between 2000 and 2003.

2. _Users_: The dataset provides demographic information about the users, such as age, gender, occupation, and ZIP code. User IDs are anonymized to preserve privacy.

3. _Movies_: The dataset contains details about the movies, including their titles, genres, and release year. Each movie is identified by a unique movie ID.

4. _Genre Information_: MovieLens-1M includes a list of predefined genres, and each movie is associated with one or more genres from this set. The genre information allows researchers to explore the relationship between user preferences and movie genres.

In [None]:
! wget https://files.grouplens.org/datasets/movielens/ml-1m.zip
! unzip ml-1m.zip

--2023-05-18 16:03:55--  https://files.grouplens.org/datasets/movielens/ml-1m.zip
Распознаётся files.grouplens.org (files.grouplens.org)… 128.101.65.152
Подключение к files.grouplens.org (files.grouplens.org)|128.101.65.152|:443... соединение установлено.
HTTP-запрос отправлен. Ожидание ответа… 200 OK
Длина: 5917549 (5,6M) [application/zip]
Сохранение в: «ml-1m.zip.1»


2023-05-18 16:04:22 (349 KB/s) - «ml-1m.zip.1» сохранён [5917549/5917549]

Archive:  ml-1m.zip
replace ml-1m/movies.dat? [y]es, [n]o, [A]ll, [N]one, [r]ename: 

In [1]:
import pandas as pd

# Load ratings data
ratings_file = 'ml-1m/ratings.dat'
ratings_cols = ['user_id', 'movie_id', 'rating', 'timestamp']
ratings = pd.read_csv(ratings_file, sep='::', engine='python', names=ratings_cols, encoding='latin-1')

# Load users data
users_file = 'ml-1m/users.dat'
users_cols = ['user_id', 'gender', 'age', 'occupation', 'zipcode']
users = pd.read_csv(users_file, sep='::', engine='python', names=users_cols, encoding='latin-1')

# Load movies data
movies_file = 'ml-1m/movies.dat'
movies_cols = ['movie_id', 'title', 'genres']
movies = pd.read_csv(movies_file, sep='::', engine='python', names=movies_cols, encoding='latin-1')

# Example usage: Print the first few rows of each dataframe
print("Ratings:")
print(ratings.head())

print("\nUsers:")
print(users.head())

print("\nMovies:")
print(movies.head())


Ratings:
   user_id  movie_id  rating  timestamp
0        1      1193       5  978300760
1        1       661       3  978302109
2        1       914       3  978301968
3        1      3408       4  978300275
4        1      2355       5  978824291

Users:
   user_id gender  age  occupation zipcode
0        1      F    1          10   48067
1        2      M   56          16   70072
2        3      M   25          15   55117
3        4      M   45           7   02460
4        5      M   25          20   55455

Movies:
   movie_id                               title                        genres
0         1                    Toy Story (1995)   Animation|Children's|Comedy
1         2                      Jumanji (1995)  Adventure|Children's|Fantasy
2         3             Grumpier Old Men (1995)                Comedy|Romance
3         4            Waiting to Exhale (1995)                  Comedy|Drama
4         5  Father of the Bride Part II (1995)                        Comedy


Before we train any model, it is important to properly define the train and validation subsamples. The choice of splitting strategy depends on your business and recommendation process. In recommender systems, we usually follow two types of train/test split:

1. __Split by user/items__: This strategy involves organizing the data in a way that all observations for a specific user or item are either in the train or test set. This approach is useful for evaluating how well the model performs in a cold-start scenario. For instance, when dealing with an app where user profiles change frequently, and there is limited historical data, you still want to provide recommendations based on user characteristics. This method is typically applied in content-based or hybrid models.

2. __Temporal Split__: When the data has a temporal aspect, such as user ratings or interactions over time, it is crucial to consider the temporal order of the data. In this case, you can split the data into training and testing sets using a cutoff point. All interactions that occur before the cutoff are used for training, while interactions that happen after the cutoff are used for testing. This ensures that the model is evaluated on the most recent user-item interactions.

Let us follow the second one.

In [2]:
ratings_sorted = ratings.sort_values('timestamp')

# Calculate the cutoff timestamp for the temporal split
cutoff_ratio = 0.8  # 80% of data for training
cutoff_index = int(len(ratings_sorted) * cutoff_ratio)
cutoff_timestamp = ratings_sorted.iloc[cutoff_index]['timestamp']

# Split ratings into training and testing sets based on the cutoff timestamp
train_ratings = ratings_sorted[ratings_sorted['timestamp'] < cutoff_timestamp]
test_ratings = ratings_sorted[ratings_sorted['timestamp'] >= cutoff_timestamp]

There are generally three main types of recommender systems:

1. __Content-Based Filtering__: This approach focuses on the attributes or characteristics of items and users. It analyzes the features and properties of items (e.g., product descriptions, genres, or tags) and user profiles (e.g., user demographics, preferences, or historical interactions) to recommend items that match a user's preferences. Content-based filtering is particularly useful when there is limited or no historical data available about user preferences.

2. __Collaborative Filtering__: This approach is based on the assumption that users with similar preferences in the past will have similar preferences in the future. Collaborative filtering methods analyze the behavior of a large group of users to find patterns and make recommendations. It can be further divided into two subtypes:

    1. __User__-Based Collaborative Filtering: It recommends items to a user based on the interests and preferences of users who are similar to them.

    2. __Item__-Based Collaborative Filtering: It recommends items that are similar to the ones a user has already interacted with or shown interest in.

3. __Hybrid Methods__: These methods combine collaborative filtering and content-based filtering to take advantage of their respective strengths. By leveraging both user behavior and item attributes, hybrid recommender systems can provide more accurate and diverse recommendations.


The common factor in these methods is __similarity__. Therefore, we need a way to measure the similarity between users or items. For example, we can encode features of users and items to find similar ones. This is an example of content-based filtering.

In [3]:
import numpy as np

user_features = users[['gender', 'age']].values
user_features = np.hstack([pd.get_dummies(users['occupation']).values])

To find the closests users we can utilize nearest neighbors search

In [76]:
from sklearn.neighbors import NearestNeighbors
from tqdm import trange

# Create a Nearest Neighbors model
k = 3  # Number of nearest neighbors to consider
nn_model = NearestNeighbors(n_neighbors=k, metric="cosine")
nn_model.fit(user_features)

def user_based_knn_preds(ratings, users, user_features, nn_model, all_items):
    predictions = []
    for target_user_index in trange(users.shape[0]):
        target_user_features = user_features[target_user_index].reshape(1, -1)
        # Find the nearest neighbors of the target user
        distances, indices = nn_model.kneighbors(target_user_features)
        
        # Get the user IDs of the nearest neighbors
        nearest_user_ids = users.iloc[indices[1:, 0]]['user_id']
        target_rated_items = ratings[ratings['user_id'] == users.iloc[target_user_index].user_id]['movie_id'].values

        # Filter ratings data for the nearest users
        user_ratings = ratings[ratings['user_id'].isin(nearest_user_ids)]
        rated_items = user_ratings['movie_id'].values
        unrated_items = np.setdiff1d(rated_items, target_rated_items)
        # Calculate average ratings of unrated items
        avg_ratings = user_ratings.groupby('movie_id')['rating'].mean().reset_index()

        # Sort unrated items by average rating in descending order
        sorted_items = avg_ratings[avg_ratings['movie_id'].isin(unrated_items)]
        sorted_items = sorted_items.sort_values('rating', ascending=False)
        # Get the top recommended items
        top_items = sorted_items.head(20)  # Change the number of recommendations as desired
        preds = top_items.movie_id.tolist()
        if len(preds) < 20:
            preds.extend(np.random.choice(list(all_items - set(rated_items)), size=10-len(preds)))
        predictions.append(preds)
    return np.array(predictions)

all_items = set(ratings.movie_id)
predicted_ratings = user_based_knn_preds(train_ratings, users, user_features, nn_model, all_items)

100%|██████████████████████████████████████| 6040/6040 [00:38<00:00, 158.45it/s]


Metrics for recommender systems can be splitted into two groups:
1. Offline Metrics:

    * Offline metrics are used to evaluate recommender systems based on historical data without considering real-time user interactions or business impact.
    * These metrics help to understand the performance of the model during the training and validation phases, providing insights into its effectiveness in learning from the data.
    * Examples of offline metrics include precision, recall, F1 score, mean average precision (MAP), mean absolute error (MAE), root mean squared error (RMSE), and others.
    * These metrics assess the accuracy, coverage, ranking quality, and prediction errors of the recommender system based on the historical data.
    * Offline metrics are useful for comparing different models, tuning hyperparameters, and assessing the general performance of the recommender system.
2. Online Metrics:

    * Online metrics are focused on evaluating the performance of recommender systems in real-world scenarios and measure their impact on user engagement and business goals.
    * These metrics consider real-time user interactions, such as clicks, purchases, likes, dislikes, time spent in the app, conversion rates, revenue generated, and other user engagement signals.
    * Online metrics are more directly aligned with business objectives and provide insights into how the recommender system influences user behavior and generates value.
    * Examples of online metrics include click-through rate (CTR), conversion rate, revenue per user, engagement rate, retention rate, and others.
    * Online metrics enable organizations to assess the effectiveness of their recommender systems in driving user engagement, user satisfaction, and business outcomes.

While offline metrics are valuable during the development and testing phases of a recommender system, online metrics provide a more comprehensive evaluation of the system's performance in real-world usage. It's important to consider both types of metrics to understand the strengths and limitations of the recommender system and make informed decisions regarding model selection, algorithm improvements, and business strategies.

Let us implement the __HitRate@K__. It measures the proportion of users for whom at least one relevant item is present in the top-K recommendations. It focuses on whether the recommended items contain at least one relevant item for each user.

In [83]:
def calculate_hit_rate_at_k(test_ratings, predicted_ratings):
    hits = 0
    for user_id, user_ratings in test_ratings.groupby('user_id'):
        actual_ratings = set(user_ratings['movie_id'])
        top_K_predicted = predicted_ratings[users[users.user_id == user_id].index[0]]

        if any(movie_id in actual_ratings for movie_id in top_K_predicted):
            hits += 1

    hit_rate_at_K = hits / len(test_ratings['user_id'].unique())
    return hit_rate_at_K

print(f"HitRate@20: {calculate_hit_rate_at_k(test_ratings, predicted_ratings):.2f}")

HitRate@20: 0.22


__Mean Average Precision at K (MAP@K)__ is an evaluation metric commonly used in information retrieval tasks, including recommender systems. It measures the average precision at different recall levels within the top-K recommendations.

Here's a step-by-step explanation of MAP@K:

1. Precision:

    * Precision is a metric that quantifies the accuracy of the recommendations. It calculates the proportion of relevant items among the recommended items at a given position.
    * Precision is computed as the number of relevant items found in the recommendations divided by the total number of items recommended at that position.
2. Average Precision:

    * Average Precision (AP) extends the concept of precision by considering multiple positions within the recommendation list.
    * For each user, AP is calculated by summing the precisions at relevant positions and dividing it by the total number of relevant items.
    * AP ranges from 0 to 1, with higher values indicating better performance.
3. Mean Average Precision at K:

    * MAP@K is the average of the AP values across all users in the evaluation set.
    * It provides a single score that represents the average performance of the recommender system in terms of precision at various recall levels within the top-K recommendations.

To calculate MAP@K, follow these steps:

1. For each user in the evaluation set:

    * Determine the set of relevant items based on the ground truth or user feedback.
    * Retrieve the top-K recommendations generated by the recommender system.
2. Calculate the precision at each position up to K for each user:

    * Check if the recommended item at each position is relevant.
    * If it is, calculate the precision by dividing the number of relevant items found so far by the position.
3. Calculate the average precision (AP) for each user:
    * Sum the precisions at relevant positions and divide by the total number of relevant items.
4. Compute the mean of the average precision (AP) values across all users to obtain MAP@K.

MAP@K provides a measure of the quality of the ranking order within the top-K recommendations. It considers both the relevance of the items and their positions, providing a comprehensive evaluation of the recommender system's performance in terms of precision at different recall levels.

__Note__: The value of K in MAP@K represents the cutoff position within the recommendation list. It determines the length of the list to consider when calculating the metric.

In [78]:
def calculate_map_at_k(test_ratings, predicted_ratings):
    average_precisions = []
    for user_id, user_ratings in test_ratings.groupby('user_id'):
        actual_ratings = set(user_ratings['movie_id'])
        top_K_predicted = predicted_ratings[users[users.user_id == user_id].index[0]]

        precision = 0
        num_correct = 0

        for j, movie_id in enumerate(top_K_predicted):
            if movie_id in actual_ratings:
                num_correct += 1
                precision += num_correct / (j + 1)

        if num_correct > 0:
            average_precision = precision / num_correct
            average_precisions.append(average_precision)
    return np.mean(average_precisions)

print(f"MAP@10: {calculate_map_at_k(test_ratings, predicted_ratings):.2f}")

MAP@10: 0.30


Looks like simple similarity between user features yields some results. However, such an approach does not account for user behavior.

To incorporate user behavior, we need to encode interactions between users and items. The common practice is to utilize a user-item interaction matrix. This matrix captures user interactions with items and represents them in a matrix format, where users are rows and items are columns. By utilizing this matrix, we can capture user behavior and preferences, enabling personalized recommendations.

The user-item interaction matrix serves as a foundation for various recommendation algorithms and facilitates efficient processing and analysis of user-item interactions.

In [81]:
# Create the user-item interaction matrix
interaction_matrix = train_ratings.pivot(index='user_id', columns='movie_id', values='rating')

interaction_matrix = interaction_matrix.reindex(movies.movie_id.tolist(), axis=1)
interaction_matrix = interaction_matrix.reindex(users.user_id.values)

# Fill missing values with 0 (indicating no interaction)
interaction_matrix = interaction_matrix.fillna(0)

# Convert the interaction matrix to a NumPy array if needed
interaction_matrix = interaction_matrix.to_numpy()

Let us use this matrix as vectors for similarity search

In [82]:
# Create a Nearest Neighbors model
k = 3  # Number of nearest neighbors to consider
nn_model = NearestNeighbors(n_neighbors=k, metric="cosine")
nn_model.fit(interaction_matrix)

predicted_ratings = user_based_knn_preds(train_ratings, users, interaction_matrix, nn_model, all_items)
print(f"HitRate@10: {calculate_hit_rate_at_k(test_ratings, predicted_ratings):.2f}")
print(f"MAP@10: {calculate_map_at_k(test_ratings, predicted_ratings):.2f}")

100%|███████████████████████████████████████| 6040/6040 [09:49<00:00, 10.24it/s]


HitRate@10: 0.22
MAP@10: 0.31


The interaction matrix plays a crucial role in improving the performance of recommendations. However, these vectors are often sparse and high-dimensional, which can lead to poor performance when using nearest neighbor (NN) search due to the curse of dimensionality.

To overcome this issue, models with latent factors or embeddings are commonly used, with matrix factorization being a popular approach. By decomposing the interaction matrix into lower-dimensional latent factors or embeddings, these models effectively reduce sparsity and address the curse of dimensionality. They capture the underlying relationships between users and items, resulting in better recommendations compared to simple nearest neighbor methods.

Matrix factorization and latent factor models help overcome the challenges posed by sparse and high-dimensional user-item interaction matrices. They provide a more accurate and efficient way to model user preferences and item characteristics, enhancing the performance of recommender systems.

In [79]:
import scipy.sparse as sp

# Convert interaction matrix to sparse form
sparse_interactions = sp.csr_array(interaction_matrix)

# Perform matrix factorization using SVD
user_factors, _, item_factors = sp.linalg.svds(sparse_interactions, k=16)

We can similarly utilize nearest neighbors search

In [80]:
# Create a Nearest Neighbors model
k = 3  # Number of nearest neighbors to consider
nn_model = NearestNeighbors(n_neighbors=k, metric="cosine")
nn_model.fit(user_factors)

predicted_ratings = user_based_knn_preds(train_ratings, users, user_factors, nn_model, all_items)
print(f"HitRate@10: {calculate_hit_rate_at_k(test_ratings, predicted_ratings):.2f}")
print(f"MAP@10: {calculate_map_at_k(test_ratings, predicted_ratings):.2f}")

100%|██████████████████████████████████████| 6040/6040 [00:36<00:00, 165.66it/s]


HitRate@10: 0.23
MAP@10: 0.33


Or directly predict ratings in initial matrix using dot product

In [75]:
%%time
predicted_ratings = user_factors.dot(item_factors)
predicted_ratings = np.argsort(predicted_ratings)[:, -20:]

print(f"HitRate@10: {calculate_hit_rate_at_k(test_ratings, predicted_ratings):.2f}")
print(f"MAP@10: {calculate_map_at_k(test_ratings, predicted_ratings):.2f}")

HitRate@10: 0.42
MAP@10: 0.25
CPU times: user 3.03 s, sys: 253 ms, total: 3.28 s
Wall time: 1.43 s


We also can use latent representations to find similar items.

In [64]:
# Create a Nearest Neighbors model
k = 10  # Number of nearest neighbors to consider
nn_model = NearestNeighbors(n_neighbors=k, metric="cosine")
nn_model.fit((item_factors / (item_factors ** 2).sum(axis=0)[np.newaxis, :]).T)

def find_similar_movies(title, item_factors, nn_model, movies):
    idx = movies[movies.title == title].index
    _, preds = nn_model.kneighbors(item_factors[idx])
    print(f"Because you watched {title}:")
    for i in preds[0][1:]:
        print("\t", movies.iloc[i].title)

find_similar_movies("Toy Story (1995)", (item_factors / (item_factors ** 2).sum(axis=0)[np.newaxis, :]).T, nn_model, movies)

Because you watched Toy Story (1995):
	 Babe (1995)
	 Toy Story 2 (1999)
	 Aladdin (1992)
	 Groundhog Day (1993)
	 Babe: Pig in the City (1998)
	 Mighty Peking Man (Hsing hsing wang) (1977)
	 Wrong Trousers, The (1993)
	 Bug's Life, A (1998)
	 Grand Day Out, A (1992)


This vectors do not about film genres and preserves only the information about user-item interactions.