# BLU12 - Learning Notebook - Part 1 of 3 - Taxonomy of Recommender Systems

# 1 The Taxonomy of a Recommender System (RS)

We have learned about the fundamental building blocks of an RS. It's time to learn how an RS fits into the bigger picture.

For that, we need a taxonomy, i.e., principles that allow us to classify different RS, according to essential characteristics.

![recommender_systems_framework_taxonomy](../media/recommender_systems_framework_taxonomy.png)

Some characteristics are by nature (i.e., Domain, Purpose, Context), some are by design (i.e., Raters, Interfaces, Privacy, and Trustworthiness).

Those by nature set the rules and boundaries, constraints and opportunities, in which the RS operates.

The design of the RS encapsulates the multi-disciplinary decisions that shape how it works and the interactions with the real world.

Before zooming in on the principles of our taxonomy, two preliminary notes on algorithms and explainability.

# 2 Domain

In which domain is the RS providing recommendations?

Examples include (but are not limited to):
* Search (Google)
* Books and consumer products (Amazon)
* Friends (Facebook)
* Content (Facebook, Medium, YouTube)
* Jobs (LinkedIn)
* Accommodation (Airbnb, Booking)
* Music (Spotify, Pandora)
* Movies (Netflix)
* Dating (Tinder).

The domain constrains the RS in different ways. We highlight three.

## 2.1 Recurrent recommendations

Depending on the domain, we may recommend new items (e.g., movies) or re-recommend old ones (e.g., groceries).

## 2.2 Bundles 

We may be recommending individual items (e.g., a piece of clothing) or bundles of items (e.g., shop the look).

## 2.3 Sequences

Some bundles are consumed sequentially (e.g., a playlist), while others are not.

# 3 Purpose

What exactly do we want to achieve with the RS?

Traditional purposes of an RS are boosting sales and improving cart value, retention, and engagement.

Nonetheless, other use cases exist, such as information, training, and education, among others.

Again, the ultimate goal frames the RS we need to build. 

# 4 Context

The context is the situation within which the recommendation happens, and that can help explain it.

We focus on two aspects of context.

## 4.1 User

The user context includes, among others:
* What the user is doing
* Where the user is
* Whom the user is with and people nearby
* Level of attention and/or interruption
* Location
* Time
* Available resources.

For example, the user can be commuting alone, working, browsing, or shopping online or offline.

## 4.2 Automatic consumption

We focus on a particular way of interacting with the context, which is automatic consumption.

Some RS have automatic consumption, e.g., a new song starts playing on Spotify, or a new video starts playing on YouTube.

In case of automatic consumption, the ability to blend with the context is essential.

# 5 Raters

Whose opinions are considered, i.e., who feeds the RS?

Until now, we assumed users and raters as a single entity, but sometimes this isn't the case.

An RS can rely on expert opinions (e.g., stylists, doctors), all the users, or similar users, among others.
    
# 6 Privacy

Regarding privacy, some aspects to consider:
* Protecting personal information
* Right to be forgotten
* Possibility of private browsing (i.e., incognito)
* Deniability of preferences, i.e., users being able to change their profile.

# 7 Trustworthiness

Honesty is a hot topic in RS.

Watch out for bias, shady business rules, vulnerability to external manipulation and lack of transparency.

Reputation mechanisms are also on the rise, especially with all the fake news buzz.

# 8 Interfaces

## 8.1 Input 

The main types of input for RS are explicit and implicit feedback.

### 8.1.1 Explicit Feedback

Feedback is explicit when the user provides an opinion about an item on a rating scale.

### 8.1.2 Implicit Feedback

Implicit feedback occurs when the RS infers opinions based on user actions (e.g., clicks), generating unary data.

Later in this BLU, we learn how to work with unary data.
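
To make the idea of unary data concrete, here is a minimal sketch in plain Python. The click log, the user and item names, and the helper `to_unary` are all hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical click log: (user_id, item_id) pairs; repeated clicks may occur.
clicks = [
    ("ana", "song_a"),
    ("ana", "song_b"),
    ("ana", "song_a"),
    ("rui", "song_b"),
    ("rui", "song_c"),
]


def to_unary(events):
    """Collapse raw events into unary data: user -> set of items interacted with."""
    unary = defaultdict(set)
    for user, item in events:
        unary[user].add(item)
    return dict(unary)


unary = to_unary(clicks)

# Build a user-item indicator matrix. A 1 means "an interaction was observed";
# a 0 only means "no interaction observed", which is not the same as a dislike.
items = sorted({item for _, item in clicks})
matrix = {
    user: [1 if item in seen else 0 for item in items]
    for user, seen in unary.items()
}

print(items)
print(matrix)
```

Note that repeated clicks collapse into a single 1: unary data records that something happened, not how much the user liked it.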

## 8.2 Output

Frequently, the RS returns either a predicted rating or a top-$N$ list of recommended items.

Although this is not mandatory, predicting ratings is associated with explicit feedback.

In the presence of implicit feedback, the RS tends to return a top-$N$ list, as predictions are less interpretable.
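
As a rough sketch of how a top-$N$ list can be derived from scores, consider the snippet below. The scores, the item names, and the helper `top_n` (including its `exclude` parameter) are assumptions for illustration, not part of any library.

```python
import heapq


def top_n(scores, n, exclude=()):
    """Return the n highest-scoring item ids, skipping items in `exclude`."""
    candidates = {item: s for item, s in scores.items() if item not in exclude}
    return heapq.nlargest(n, candidates, key=candidates.get)


# Hypothetical predicted scores for a single user.
scores = {"item_a": 0.91, "item_b": 0.15, "item_c": 0.78, "item_d": 0.40}

# Skip item_a, e.g., because the user already consumed it.
recommendations = top_n(scores, n=2, exclude={"item_a"})
print(recommendations)
```

Only the ordering of the scores matters here, which is one reason why top-$N$ lists pair well with implicit feedback.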

We can present both predictions and lists of recommended items explicitly or organically.

# 9 Recommendation Algorithms

The choice of recommendation algorithm is a critical decision when building an RS.

Nonetheless, algorithms are fluid and subject to optimization, while the other principles are rigid.

Also, the choice of the algorithm is contingent on the principles of our system, but also on the existing tech and research.

These are the main distinctions between algorithms.

## 9.1 Personalization level

With respect to personalization, we can label an RS as:

* Generic, i.e., not personalized
* Demographic
* Ephemeral, if it matches the current activity
* Persistent, if it considers long-term interests.

## 9.2 Algorithms

The main types of algorithms are:
* Non-personalized ([BLU10](https://github.com/LDSSA/batch2-blu10))
* Memory-based ([BLU11](https://github.com/LDSSA/batch2-blu11))
    * Content-based filtering
    * Collaborative filtering
* Model-based (covered in [BLU08](https://github.com/LDSSA/batch2-blu08), especially the [SVD and PCA notebook](https://github.com/LDSSA/batch2-blu08/blob/master/Learning%20Notebooks/BLU08%20-%20Learning%20Notebook%20-%20Part%202%20of%203%20-%20SVD%20and%20PCA.ipynb))
    * Latent-factor models (matrix factorization).

Matrix factorization, especially SVD (covered in the NLP specialization), is a good option for RS because it:
* Compresses the dimension space
* Uncovers latent factors
* Reduces sparsity.
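
As a reminder of the general shape of such a factorization, a rank-$k$ truncated SVD approximates the ratings matrix as written below. The symbols $R$, $U_k$, $\Sigma_k$, $V_k$, $m$, $n$, and $k$ are notation assumed for this sketch, not taken from the BLU08 notebook.

$$R \approx U_k \Sigma_k V_k^\top, \quad R \in \mathbb{R}^{m \times n},\ U_k \in \mathbb{R}^{m \times k},\ \Sigma_k \in \mathbb{R}^{k \times k},\ V_k \in \mathbb{R}^{n \times k}$$

Here, each of the $m$ users and each of the $n$ items is described by $k$ latent factors, with $k$ much smaller than $m$ and $n$.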

## 9.3 Ratings Scaling

Despite the global rating scale, each user has their own scale, i.e., some users are more critical than others.

These different scales make it harder for the RS to identify liked/disliked items and to provide recommendations on the appropriate scale.

Scaling the ratings addresses this by making them comparable across all users.

### 9.3.1 Mean-centering

Mean-centering determines if a rating is positive or negative by comparing it to the user mean rating.

A raw rating $r_{ui}$ is transformed into a mean-centered one $h(r_{ui})$ by subtracting from $r_{ui}$ the average $\mu_u$ of the ratings given by user $u$ to the items $i \in I_u$.

$$h(r_{ui}) = r_{ui} - \mu_u$$

We can adapt our algorithms to use $h(r_{ui})$, instead of the raw rating $r_{ui}$.
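
A minimal sketch of mean-centering in plain Python follows. The two users, their ratings, and the helper `mean_center` are hypothetical; a real pipeline would more likely operate on a ratings matrix.

```python
from statistics import mean

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
ratings = {
    "harsh_user": {"i1": 1, "i2": 2, "i3": 3},
    "generous_user": {"i1": 3, "i2": 4, "i3": 5},
}


def mean_center(user_ratings):
    """Apply h(r_ui) = r_ui - mu_u to every rating of a single user."""
    mu_u = mean(user_ratings.values())
    return {item: r - mu_u for item, r in user_ratings.items()}


centered = {user: mean_center(r) for user, r in ratings.items()}
print(centered)
```

In this toy data, a raw rating of 3 is positive for the harsh user (above their mean of 2) and negative for the generous user (below their mean of 4), which is exactly the distinction mean-centering is meant to expose.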

### 9.3.2 Z-score

The z-score considers both the mean $\mu_u$ and the spread, measured as the standard deviation $\sigma_u$, in individual rating scales.

$$h(r_{ui}) = \frac{r_{ui} - \mu_u}{\sigma_u}$$

Again, we can adapt our algorithms to use the z-score instead of $r_{ui}$.
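
The z-score can be sketched in the same way. The use of the population standard deviation (`pstdev`) and the handling of users with zero spread are assumptions of this sketch, as are the example users.

```python
from statistics import mean, pstdev


def z_score(user_ratings):
    """Apply h(r_ui) = (r_ui - mu_u) / sigma_u to every rating of a single user."""
    values = list(user_ratings.values())
    mu_u = mean(values)
    sigma_u = pstdev(values)  # assumption: population standard deviation
    if sigma_u == 0:
        # Assumption: a user who always gives the same rating carries no
        # preference signal, so all of their ratings are mapped to 0.
        return {item: 0.0 for item in user_ratings}
    return {item: (r - mu_u) / sigma_u for item, r in user_ratings.items()}


narrow = {"i1": 3, "i2": 4, "i3": 5}  # uses a small part of the scale
wide = {"i1": 1, "i2": 3, "i3": 5}    # uses the full scale

z_narrow = z_score(narrow)
z_wide = z_score(wide)
print(z_narrow)
print(z_wide)
```

Both hypothetical users end up with the same z-scores, even though one of them spreads ratings twice as far from the mean as the other.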

### 9.3.3 Normalization

We already referred to normalization, which ensures that all vectors have the same length (typically unit length).

When we use the cosine similarity, we embed normalization in the similarity computations.
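
The snippet below is a small sketch of why the cosine similarity already contains the normalization step. The vectors and the helper names (`norm`, `normalize`, `dot`, `cosine`) are made up for illustration.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Rescale v to unit length (zero vectors are returned unchanged)."""
    length = norm(v)
    return [x / length for x in v] if length else list(v)


def dot(u, v):
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: the dot product divided by both vector lengths."""
    denom = norm(u) * norm(v)
    return dot(u, v) / denom if denom else 0.0


u = [5.0, 3.0, 0.0]
v = [10.0, 6.0, 0.0]  # same direction as u, twice the length

# Normalizing first and then taking the dot product gives the same value as
# the cosine similarity of the raw vectors.
print(cosine(u, v), dot(normalize(u), normalize(v)))
```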

## 9.4 Selection of Neighbors

### 9.4.1 Top-*N* filtering

An essential hyperparameter to tune is the maximum number of neighbors to use in the predictions.

Some libraries (e.g., Surprise) allow us to run a parameter sweep to explore the hyperparameter space.

(Something to explore on our own, perhaps.)
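
Independently of any library, top-*N* filtering of neighbors can be sketched in a few lines. The similarity values and the helper `top_n_neighbors` are hypothetical.

```python
import heapq


def top_n_neighbors(similarities, n):
    """Keep the n neighbors with the highest similarity to the target."""
    return dict(heapq.nlargest(n, similarities.items(), key=lambda kv: kv[1]))


# Hypothetical similarities between a target user and other users.
sims = {"u1": 0.9, "u2": 0.1, "u3": -0.4, "u4": 0.6, "u5": 0.3}

# Trying a few values of n is the simplest form of a parameter sweep.
for n in (1, 3):
    print(n, top_n_neighbors(sims, n))
```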

### 9.4.2 Threshold filtering

Another option is to filter neighbors using a similarity threshold.

### 9.4.3 Negative filtering

Finally, positive similarities have the most predictive power.

In that sense, excluding neighbors with negative similarities might be a sensible thing to do.
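
Threshold and negative filtering are both simple predicates over the similarities, as the sketch below illustrates. The helper names, the similarity values, and the threshold of 0.5 are assumptions chosen for the example.

```python
def filter_by_threshold(similarities, threshold):
    """Keep only neighbors whose similarity is at least `threshold`."""
    return {n: w for n, w in similarities.items() if w >= threshold}


def drop_negative(similarities):
    """Keep only neighbors with strictly positive similarity."""
    return {n: w for n, w in similarities.items() if w > 0}


# Hypothetical similarities between a target user and other users.
sims = {"u1": 0.9, "u2": 0.1, "u3": -0.4, "u4": 0.6, "u5": 0.3}

print(filter_by_threshold(sims, 0.5))
print(drop_negative(sims))
```

Unlike top-*N* filtering, these strategies do not fix the number of neighbors, so a target with few similar neighbors may end up with very few (or none).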

## 9.5 Amplified Weights

Another strategy is to use amplified similarity weights at prediction time.

Similarities $w_{uv}$ and $w_{ij}$ can be amplified as $w_{uv}^\alpha$ and $w_{ij}^\alpha$, respectively, where $\alpha > 1$.

This amplification gives greater importance to the closest neighbors.
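
To see the effect of $\alpha$, here is a sketch of a similarity-weighted average with amplified weights. The function `predict`, the neighbor data, and the assumption that similarities are non-negative (e.g., after negative filtering) are all specific to this example.

```python
def predict(neighbors, alpha=1.0):
    """Similarity-weighted average of neighbor ratings, with weights w ** alpha.

    `neighbors` is a list of (similarity, rating) pairs. Similarities are
    assumed to be non-negative here (e.g., after negative filtering).
    """
    weights = [w ** alpha for w, _ in neighbors]
    total = sum(weights)
    if total == 0:
        return None  # no usable neighbors
    return sum(wa * r for wa, (_, r) in zip(weights, neighbors)) / total


# Hypothetical neighbors: one close neighbor and two distant ones.
neighbors = [(0.9, 5.0), (0.3, 2.0), (0.2, 1.0)]

plain = predict(neighbors, alpha=1.0)
amplified = predict(neighbors, alpha=2.0)
print(plain, amplified)
```

With $\alpha = 2$, the prediction moves closer to the rating of the most similar neighbor (5.0), because the weights of the distant neighbors shrink much faster than the weight of the closest one.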

## 9.6 Drifting Preferences

Most user-item interactions have a time-drifting nature.

In that sense, some RS use timestamps, as we've seen in [BLU10](https://github.com/LDSSA/batch2-blu10), to weight user actions and ratings and to decay old profiles.
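
One possible way to weight by time is exponential decay, sketched below. Exponential decay, the half-life parameter, and the helpers `decay_weight` and `decayed_mean` are assumptions of this example, not necessarily the approach used in BLU10.

```python
import math


def decay_weight(age_days, half_life_days=30.0):
    """Exponential decay: the weight halves every `half_life_days`."""
    return math.exp(-math.log(2) * age_days / half_life_days)


def decayed_mean(ratings, now, half_life_days=30.0):
    """Weighted mean of (rating, day) pairs that favors recent ratings."""
    weights = [decay_weight(now - day, half_life_days) for _, day in ratings]
    return sum(w * r for w, (r, _) in zip(weights, ratings)) / sum(weights)


# Hypothetical history: (rating, day on which it was given).
history = [(5.0, 0), (4.0, 60), (1.0, 90)]

recent_view = decayed_mean(history, now=90)
plain_mean = sum(r for r, _ in history) / len(history)
print(decay_weight(30), recent_view, plain_mean)
```

In this toy history, the decayed mean sits well below the plain mean, because the most recent rating is low and carries the largest weight.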