# Recommender Systems 

## Recommender Systems: Why And How?

Recommender systems are algorithms that provide personalized suggestions of the items most relevant to each user. With the massive growth of available online content, users are inundated with choices. It is therefore crucial for web platforms to recommend items to each user, in order to increase user satisfaction and engagement.

![image.png](attachment:image.png)

YouTube recommends videos to help users discover and watch relevant content among a huge number of available videos. (Image by Author)

## Explicit Feedback vs. Implicit Feedback

In recommender systems, machine learning models are used to predict the `rating rᵤᵢ of a user u on an item i`. At inference time, we recommend to each user u the items i with the highest predicted rating rᵤᵢ.
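As a minimal sketch of this inference step, the snippet below ranks items per user by predicted rating and keeps the top n. The matrix `predicted`, the mask `seen`, and the helper `recommend_top_n` are hypothetical names introduced for illustration, and excluding items the user has already interacted with is an assumption, not something the text above prescribes.

```python
import numpy as np

def recommend_top_n(predicted, seen, n=2):
    """For each user, return the indices of the n unseen items with the highest predicted rating."""
    # Assumption: items the user already interacted with are not recommended again
    scores = np.where(seen, -np.inf, predicted)
    # Sorting the negated scores in ascending order yields a descending ranking
    order = np.argsort(-scores, axis=1, kind="stable")
    return order[:, :n]

# Hypothetical predicted ratings for 2 users (rows) and 4 items (columns)
predicted = np.array([[4.5, 2.0, 3.8, 1.0],
                      [1.2, 4.9, 2.5, 4.0]])
# True where the user has already interacted with the item
seen = np.array([[True, False, False, False],
                 [False, True, False, False]])

top = recommend_top_n(predicted, seen, n=2)
print(top.tolist())  # [[2, 1], [3, 2]]
```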

We therefore need to collect user feedback, so that we can have a ground truth for training and evaluating our models. An important distinction has to be made here between `explicit feedback` and `implicit feedback`.

![image.png](attachment:image.png)

Explicit vs. implicit feedback for recommender systems. (Image by Author)


<b>Explicit feedback</b> is a rating explicitly given by the user to express their satisfaction with an item. Examples are: number of stars on a scale from 1 to 5 given after buying a product, thumbs up/down given after watching a video, etc. This feedback provides `detailed information` on how much a user liked an item, but it is `hard to collect`, as most users typically don’t write reviews or give explicit ratings for each item they purchase.

<b>Implicit feedback</b>, on the other hand, assumes that user-item interactions are an indication of preferences. Examples are: the purchase or browsing history of a user, the list of songs played by a user, etc. This feedback is `extremely abundant`, but at the same time it is `less detailed` and `noisier` (e.g. someone may buy a product as a present for someone else). However, this noise tends to become negligible when compared to the sheer size of available data of this kind, and `most modern Recommender Systems tend to rely on implicit feedback`.

![image-2.png](attachment:image-2.png)

User-item rating matrix for explicit feedback and implicit feedback datasets. (Image by Author)
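To make the two kinds of matrix concrete, here is a small sketch that builds both from toy interaction logs. The data are made up, and the encoding is only one common convention assumed here: `NaN` for the unobserved "?" entries of the explicit matrix, and 1/0 for observed/unobserved interactions in the implicit matrix.

```python
import numpy as np

n_users, n_items = 3, 4

# Hypothetical explicit feedback: (user, item, stars from 1 to 5)
ratings = [(0, 0, 5), (0, 2, 3), (1, 1, 4), (2, 3, 1)]
# Hypothetical implicit feedback: (user, item) events such as purchases or plays
events = [(0, 0), (0, 2), (0, 3), (1, 1), (1, 2), (2, 0), (2, 3)]

# Explicit matrix: NaN marks the unobserved "?" entries
explicit = np.full((n_users, n_items), np.nan)
for u, i, stars in ratings:
    explicit[u, i] = stars

# Implicit matrix: 1 where an interaction was observed, 0 elsewhere
implicit = np.zeros((n_users, n_items), dtype=int)
for u, i in events:
    implicit[u, i] = 1

# Fraction of user-item pairs for which we have any signal
explicit_density = np.mean(~np.isnan(explicit))
implicit_density = implicit.mean()
print(explicit_density, implicit_density)
```

Note that a 0 in the implicit matrix only means that no interaction was observed; it does not mean the user dislikes the item.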




## Content-Based vs. Collaborative Filtering Approaches

Recommender systems can be classified, according to the kind of information used to predict user preferences, as <b>Content-Based</b> or <b>Collaborative Filtering</b>.

![image.png](attachment:image.png)

Content-Based vs. Collaborative Filtering approaches for recommender systems. (Image by Author)

## Content-Based Approach

Content-based filtering uses item features to recommend other items similar to what the user likes, based on their previous actions or explicit feedback.

Content-based methods describe users and items by their known metadata. Each item i is represented by a set of relevant tags (e.g. movies on the IMDb platform can be tagged as “action”, “comedy”, etc.). Each user u is represented by a user profile, which can be created from known user information (e.g. sex and age) or from the user’s past activity.

A simple Machine Learning model for this approach is k-nearest neighbors (k-NN). For instance, if we know that user u bought an item i, we can recommend to u the available items whose features are most similar to those of i.
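The nearest-neighbor idea can be sketched as follows, assuming each item is described by a binary tag vector and that similarity is measured with cosine similarity (one common choice; the text does not fix a metric). The tag matrix and the helper `most_similar_items` are hypothetical.

```python
import numpy as np

# Hypothetical item-tag matrix: rows are items, columns follow the order of `tags`
tags = ["action", "comedy", "romance"]
item_features = np.array([
    [1, 0, 0],  # item 0: action
    [1, 1, 0],  # item 1: action, comedy
    [0, 1, 1],  # item 2: comedy, romance
    [0, 0, 1],  # item 3: romance
], dtype=float)

def most_similar_items(item_id, features, k=2):
    """Return the k items whose tag vectors have the highest cosine similarity to item_id."""
    # Assumes every item has at least one tag, so no norm is zero
    norms = np.linalg.norm(features, axis=1)
    sims = features @ features[item_id] / (norms * norms[item_id])
    sims[item_id] = -np.inf  # never recommend the item itself
    return np.argsort(-sims, kind="stable")[:k]

# User u bought item 0, so we look for the items closest to it in tag space
neighbors = most_similar_items(0, item_features, k=2)
print(neighbors.tolist())  # [1, 2]
```

Item 1 ranks first because it shares the "action" tag with item 0; items 2 and 3 share no tag with it and are tied at zero similarity.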

The advantage of this approach is that item metadata is known in advance, so we can also apply it to Cold-Start scenarios, where a new item or user is added to the platform and we don’t yet have user-item interactions to train our model. The disadvantages are that we don’t use the full set of known user-item interactions (each user is treated independently), and that we need metadata for each item and user.


## Collaborative Filtering Approach

Collaborative filtering is a technique that identifies items a user might like on the basis of the reactions of similar users.

Collaborative filtering methods do not use item or user metadata. Instead, they leverage the feedback or activity history of all users: by inferring interdependencies between users and items from the observed activity, they predict the rating of a user on a given item.

To train a Machine Learning model with this approach, we typically try to cluster or factorize the rating matrix rᵤᵢ in order to make predictions on the unobserved pairs (u,i), i.e. where rᵤᵢ = “?”. In the rest of this article we present the Matrix Factorization algorithm, which is the most popular method of this class.
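As a rough preview of the factorization idea, the sketch below approximates the observed entries of a toy rating matrix with the product of two low-rank factor matrices, fitted by stochastic gradient descent with L2 regularization. This is one simple variant under stated assumptions, not necessarily the formulation used later in the article; the data, the hyperparameters and the names `P`, `Q` and `rmse` are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4 users x 4 items rating matrix; NaN marks the unobserved pairs (u, i)
R = np.array([[5, 4, np.nan, 1],
              [4, np.nan, 1, 1],
              [1, 1, np.nan, 5],
              [np.nan, 1, 5, 4]], dtype=float)
observed = np.argwhere(~np.isnan(R))

k, lr, reg, epochs = 2, 0.02, 0.02, 1000  # illustrative hyperparameters
P = 0.1 * rng.standard_normal((R.shape[0], k))  # user factors
Q = 0.1 * rng.standard_normal((R.shape[1], k))  # item factors

def rmse(P, Q):
    """Root mean squared error computed on the observed entries only."""
    errors = [R[u, i] - P[u] @ Q[i] for u, i in observed]
    return float(np.sqrt(np.mean(np.square(errors))))

rmse_before = rmse(P, Q)
for _ in range(epochs):
    for u, i in observed:
        err = R[u, i] - P[u] @ Q[i]
        p_u = P[u].copy()  # keep the old user vector for the item update
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * p_u - reg * Q[i])
rmse_after = rmse(P, Q)

# Dense predictions, including the previously unobserved pairs
R_hat = P @ Q.T
print(rmse_before, rmse_after)
```

The entries of `R_hat` at the positions that were `NaN` in `R` are the model's predictions for the unobserved pairs.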

The advantage of this approach is that the whole set of user-item interactions (i.e. the matrix rᵤᵢ) is used, which typically yields higher accuracy than Content-Based models. The disadvantage is that each user or item needs a few interactions before the model can be fitted.