# Intro to Recommender Systems

## 1 Framework

We need a general framework to reason about recommender systems.

![recommender-systems-framework.png](attachment:recommender-systems-framework.png)

### 1.1 Basic Model

The general framework we are going to use includes three fundamental components:

1. Users
2. Items
3. Data

#### 1.1.1 Users

The *people* in our system, who generate data and receive item recommendations.

#### 1.1.2 Items

The *things* in our system, with which the users interact and that we want to recommend to them.

#### 1.1.3 Data

The different ways a user can express opinions about our items, i.e. where users and items *meet*.

The data we use in our recommender systems can be:

* *Explicit* - when a user shares their opinion about an item (e.g. ratings, reviews, upvotes and/or downvotes)
* *Implicit* - when we *infer* opinions from user actions (e.g. clicks, purchase history)

##### Explicit Data

Typically, you have explicit data when you explicitly ask users what they think about a given item.

This is very common in aggregators of ephemeral items that are cheap to rate, e.g. [Reddit](https://medium.com/hacking-and-gonzo/how-reddit-ranking-algorithms-work-ef111e33d0d9) and [Hacker News](https://medium.com/hacking-and-gonzo/how-hacker-news-ranking-algorithm-works-1d9b0cf2c08d).

Most of the time, explicit data is composed of ratings on a continuous scale, which can have a few problems:

* Different users can have different rating scales
* Ratings may change over time, even for the same user (i.e. *drifting* preferences)
* Ratings are heavily influenced by other things such as consumption, memory and expectations.
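
One common way to soften the first problem (different rating scales) is to mean-center each user's ratings, so that a score reads as "above or below this user's own average". The sketch below is only an illustration: the `ratings` triples, the user and item names, and the `mean_center` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical explicit ratings as (user, item, rating) triples.
ratings = [
    ("alice", "item_a", 5.0),
    ("alice", "item_b", 4.0),
    ("bob", "item_a", 3.0),
    ("bob", "item_b", 1.0),
]


def mean_center(ratings):
    """Subtract each user's average rating from that user's ratings."""
    per_user = defaultdict(list)
    for user, _item, rating in ratings:
        per_user[user].append(rating)
    user_mean = {user: sum(vals) / len(vals) for user, vals in per_user.items()}
    return [(user, item, rating - user_mean[user]) for user, item, rating in ratings]


centered = mean_center(ratings)
for row in centered:
    print(row)
```

After centering, both users rate `item_a` above their own average and `item_b` below it, even though their raw scores are far apart.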

##### Implicit Data

We assume that user actions are based on *underlying preferences*, so we collect the actions without explicitly asking for preferences.

Implicit data often takes the form of *unary data*: positive-only interactions such as clicks or purchases, i.e. we don't have *negative* feedback.

One of the big limitations of click data is that when a user doesn't click on an item, we don't know whether they disliked it or simply didn't see it.

The *time spent* on a given item or page, measured on a continuous scale, is often used to emulate ratings with implicit data.
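
As a rough sketch of how time spent could stand in for a rating, the snippet below sums the seconds per user and item and scales the total to the range [0, 1]. The `events` data, the `dwell_scores` helper and the 300-second cap are assumptions made for illustration, not values from any particular system.

```python
# Hypothetical implicit events: (user, item, seconds spent on the item's page).
events = [
    ("alice", "item_a", 30.0),
    ("alice", "item_a", 90.0),
    ("alice", "item_b", 600.0),
    ("bob", "item_a", 15.0),
]

MAX_SECONDS = 300.0  # Assumed cap so one very long session does not dominate.


def dwell_scores(events, max_seconds=MAX_SECONDS):
    """Sum the time per (user, item) pair and scale it to the range [0, 1]."""
    totals = {}
    for user, item, seconds in events:
        totals[(user, item)] = totals.get((user, item), 0.0) + seconds
    return {pair: min(total, max_seconds) / max_seconds for pair, total in totals.items()}


scores = dwell_scores(events)
for pair, score in scores.items():
    print(pair, score)
```

Pairs with no events, such as `("bob", "item_b")`, are simply absent from the result: as with clicks, a missing score does not tell us whether the user disliked the item or never saw it.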

## 2 Types of Recommenders

1. Non-personalized recommenders
    * Users who bought X also bought Y
    * Association rules
2. Personalized recommenders
    * Content-based filtering: uses item attributes and data to create user profiles
    * Collaborative filtering: uses users' interactions with items
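
To make "users who bought X also bought Y" concrete, here is a minimal co-occurrence sketch: it counts how often other items share a basket with a given item. The `baskets` data and the `also_bought` helper are hypothetical, and a real system would usually normalize these counts (for example, into association-rule confidence) rather than rank by raw counts.

```python
from collections import Counter

# Hypothetical purchase baskets, one set of items per order.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"butter", "jam"},
    {"milk"},
]


def also_bought(baskets, item, top_n=2):
    """Rank items by how often they appear in the same basket as `item`."""
    co_counts = Counter()
    for basket in baskets:
        if item in basket:
            co_counts.update(basket - {item})
    # Sort by count (descending), then by name so that ties are deterministic.
    return sorted(co_counts, key=lambda other: (-co_counts[other], other))[:top_n]


print(also_bought(baskets, "bread"))
```

The result is the same for every user who looks at `bread`, which is what makes this recommender non-personalized.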

## 3 Extended Model

We can combine the 3 basic components (users, items and data) to build non-personalized recommenders, such as *trending topics*, or *most popular* items.
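
A *most popular* recommender needs nothing beyond these three components. The sketch below ranks items by the number of distinct users who interacted with them; the `interactions` pairs and the `most_popular` helper are illustrative assumptions.

```python
from collections import Counter

# Hypothetical (user, item) interactions; repeated pairs are allowed.
interactions = [
    ("alice", "item_a"),
    ("bob", "item_a"),
    ("bob", "item_a"),
    ("dave", "item_a"),
    ("alice", "item_b"),
    ("carol", "item_b"),
    ("carol", "item_c"),
]


def most_popular(interactions, top_n=2):
    """Rank items by the number of distinct users who interacted with them."""
    unique_pairs = set(interactions)  # Count each user at most once per item.
    counts = Counter(item for _user, item in unique_pairs)
    # Sort by count (descending), then by name so that ties are deterministic.
    return sorted(counts, key=lambda item: (-counts[item], item))[:top_n]


print(most_popular(interactions))
```

Restricting `interactions` to a recent time window would turn the same idea into a simple *trending* list.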

But what about more sophisticated approaches?

There are three optional components that can be used to extend our model, applied in specific recommenders:

* User attributes
* Item attributes
* User model

### 3.1 User Attributes

Ways to *describe* our users, such as demographics.

### 3.2 Item Attributes

Ways to *describe* our items, such as text/content, tags or metadata. 

This is the key to content-based recommenders, which assume that if a user likes an item, then they like its attributes, and thus other items with *similar content*.
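
One simple way to measure *similar content* when the attributes are tags is Jaccard similarity: the number of shared tags divided by the number of distinct tags across both items. The snippet below is a sketch under that assumption; the `item_tags` catalogue and the helper names are hypothetical, and other measures (such as cosine similarity over text features) are equally common.

```python
# Hypothetical item attributes: a set of tags per item.
item_tags = {
    "item_a": {"sci-fi", "action", "dystopia"},
    "item_b": {"sci-fi", "dystopia", "noir"},
    "item_c": {"romance", "comedy"},
}


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(item_tags, liked_item, top_n=1):
    """Rank the other items by tag overlap with `liked_item`."""
    liked = item_tags[liked_item]
    scores = {
        other: jaccard(liked, tags)
        for other, tags in item_tags.items()
        if other != liked_item
    }
    return sorted(scores, key=lambda other: (-scores[other], other))[:top_n]


print(similar_items(item_tags, "item_a"))
```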

### 3.3 User Model

User models are a representation of user preferences.

This is used in content-based recommenders, which infer a model from data and item properties and match it against unseen items.
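
A very small user model can be sketched as a tag-frequency profile: count how often each tag appears among the items a user liked, then score unseen items by summing the profile weights of their tags. Everything here (the `item_tags` catalogue, the liked items, and the `build_profile` and `recommend` helpers) is an illustrative assumption rather than a prescribed method.

```python
from collections import Counter

# Hypothetical item attributes and one user's liked items.
item_tags = {
    "item_a": {"sci-fi", "action"},
    "item_b": {"sci-fi", "dystopia"},
    "item_c": {"sci-fi", "noir"},
    "item_d": {"romance", "comedy"},
}
liked_items = ["item_a", "item_b"]


def build_profile(liked_items, item_tags):
    """User model: how often each tag appears among the liked items."""
    profile = Counter()
    for item in liked_items:
        profile.update(item_tags[item])
    return profile


def recommend(liked_items, item_tags, top_n=1):
    """Score unseen items by summing the profile weight of each of their tags."""
    seen = set(liked_items)
    profile = build_profile(liked_items, item_tags)
    scores = {
        item: sum(profile[tag] for tag in tags)  # Unknown tags count as 0.
        for item, tags in item_tags.items()
        if item not in seen
    }
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


print(recommend(liked_items, item_tags))
```

Here the profile weights `sci-fi` most heavily, so the unseen sci-fi item outranks the romantic comedy.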