# Practical Challenges for Recommenders

## Objectives

Know 4 common problems affecting recommender systems and ways to address them.

1. Wrong metric
1. Cold start
1. Hard to incorporate all useful data
1. Speed and stability

# Wrong Metric

What cost function does our matrix factorization recommender use?

$$ \sum_{u,i} (R_{u,i} - p_u^T q_i)^2 $$

Is it _actually_ important to minimize this squared error? If I'm trying to recommend 5 things you're most likely to buy, is it important that I know exactly how likely you are to buy?
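As a concrete reference point, here is a minimal NumPy sketch of that cost. It assumes unobserved ratings are stored as `np.nan` and sums only over observed entries; that encoding, and the names `squared_error`, `P`, and `Q`, are illustrative assumptions rather than anything fixed by these notes.

```python
import numpy as np

def squared_error(R, P, Q):
    """Sum of squared errors over the observed entries of R.

    R: (n_users, n_items) ratings, np.nan where unobserved (assumed encoding).
    P: (n_users, k) user factors; Q: (n_items, k) item factors.
    """
    observed = ~np.isnan(R)
    residual = R - P @ Q.T              # NaN wherever R is unobserved
    return float(np.sum(residual[observed] ** 2))

# Toy data: two users, two items, one latent factor.
R = np.array([[5.0, np.nan],
              [np.nan, 1.0]])
P = np.array([[2.0], [1.0]])
Q = np.array([[2.0], [2.0]])
loss = squared_error(R, P, Q)
```

Note that nothing in this number says whether the five highest-scored items are the right five, which is the concern raised above.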

# Alternative Metrics

* Average _true_ rating of top N recommendations.
* Approximate the true objective with something optimizable, such as $\sum_i r_i e_i^2$, where $e_i$ is the prediction error on item $i$ and $r_i$ is its true rating.
    * This would penalize bad misses on things rated highly.
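Both alternatives are easy to compute for a single user. The sketch below assumes ratings are held in plain dicts keyed by item id and that true ratings are known for the held-out items being scored; the function names are hypothetical.

```python
def mean_true_rating_at_n(predicted, actual, n):
    """Average true rating of the n items with the highest predicted score.

    predicted, actual: dicts mapping item id -> rating for one user,
    restricted to held-out items whose true rating is known. Assumes n >= 1.
    """
    top = sorted(predicted, key=predicted.get, reverse=True)[:n]
    return sum(actual[i] for i in top) / len(top)

def rating_weighted_sse(predicted, actual):
    """Sum of r_i * e_i^2: errors on highly rated items cost more."""
    return sum(actual[i] * (actual[i] - predicted[i]) ** 2 for i in actual)

predicted = {"a": 4.5, "b": 2.0, "c": 4.0}
actual = {"a": 5.0, "b": 1.0, "c": 3.0}
top2 = mean_true_rating_at_n(predicted, actual, 2)    # items "a" and "c"
weighted = rating_weighted_sse(predicted, actual)
```

Items "b" and "c" are both off by one star, but the miss on "c" contributes three times as much to `weighted` because its true rating is higher.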

# Cold Start Problem

Matrix factorization requires information about items' latent features and users' affinity for those features. Learning these well requires data, so what do we do about new users or new items?

| Method | Affected by Cold Start? |
|--------|------------|
| Item-Item | highly |
| Factorization | moderately |
| Observed factors / Popularity | slightly |

# Cold Start
## Item-Item


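$$ R_{u,i} = \frac{\sum_{j \in N} s_{i,j} R_{u,j}}{\sum_{j \in N} |s_{i,j}|} $$

Here $N$ is the neighborhood of items similar to item $i$ that user $u$ has rated, and $s_{i,j}$ is the similarity between items $i$ and $j$.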



Why is it hard to get good results from this in a cold-start scenario?
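One way to explore the question is to run the formula on a user with no ratings. The sketch below is an assumption-laden illustration: it takes the neighborhood to be the `k` rated items with the largest absolute similarity, encodes missing ratings as `np.nan`, and returns `nan` when there is nothing to average. The function name and toy similarities are made up.

```python
import numpy as np

def predict_item_item(user_ratings, sims_to_target, k=2):
    """Similarity-weighted average of the user's ratings on the k rated
    items most similar to the target item.

    user_ratings: (n_items,) array, np.nan where the user has not rated.
    sims_to_target: (n_items,) similarities s_{i,j} to the target item i.
    Returns nan when there is nothing to average (cold start).
    """
    rated = np.flatnonzero(~np.isnan(user_ratings))
    if rated.size == 0:
        return float("nan")
    # Neighborhood: the k rated items with the largest |similarity|.
    order = np.argsort(-np.abs(sims_to_target[rated]))[:k]
    neighbors = rated[order]
    weights = sims_to_target[neighbors]
    denom = np.abs(weights).sum()
    if denom == 0:
        return float("nan")
    return float(weights @ user_ratings[neighbors] / denom)

sims = np.array([0.5, 0.25, 0.0, 0.25])        # similarity of 4 items to the target
warm_user = np.array([4.0, np.nan, 1.0, 2.0])  # has rated three of them
new_user = np.full(4, np.nan)                  # has rated nothing
warm = predict_item_item(warm_user, sims)
cold = predict_item_item(new_user, sims)       # nan: empty neighborhood
```

A brand-new item fails in a similar way from the other direction: with no ratings, its similarities to other items cannot be estimated in the first place.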

# Cold Start

## Factorization

$$ r_{u,i} = p_u^T q_i $$

Why is it hard to compute a good rating from this method in a cold-start scenario?

# Cold Start
## Observed Factors

$$ r_{u, i} = \sum_j \beta_{u, j} x_{i,j} $$

When can you expect good results from this method in a cold-start scenario?
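As a rough illustration, the per-user coefficients $\beta_{u,j}$ can be estimated by ordinary least squares on the items that user has already rated, then applied to an item nobody has rated yet. The feature columns and numbers below are invented for the example, and fitting one regression per user is just one possible choice.

```python
import numpy as np

# Hypothetical observed item features x_{i,j}: [is_comedy, is_action].
X_rated = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 1.0]])
ratings = np.array([4.0, 1.0, 5.0])   # one user's ratings of those items

# Least-squares estimate of this user's coefficients beta_{u,j}.
beta, *_ = np.linalg.lstsq(X_rated, ratings, rcond=None)

# A brand-new item has no ratings, but its features are known.
x_new = np.array([1.0, 0.0])
prediction = float(x_new @ beta)
```

The prediction needs no ratings for the new item, only its features, but it still needs enough ratings from the user to estimate `beta`.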

# Cold Start
## Popularity

In what kind of cold-start scenario would we expect to use popularity?
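A popularity recommender can be as small as a counter. This sketch assumes interactions arrive as `(user_id, item_id)` pairs and ranks by raw interaction count; both the input format and the ranking rule are assumptions, and a real system might rank by average rating or recent activity instead.

```python
from collections import Counter

def most_popular(interactions, n):
    """Return the n item ids with the most interactions.

    interactions: iterable of (user_id, item_id) pairs.
    """
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]

log = [("u1", "a"), ("u2", "a"), ("u3", "a"),
       ("u1", "b"), ("u2", "b"), ("u3", "c")]
recommendations_for_new_user = most_popular(log, 2)
```

The function never looks at who is asking, so it returns the same list for every user, including one with no history.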

# Useful Data Not Captured by SVD

* Overall item quality
* Overall user ratings
* Implicit feedback (what a user has rated may matter in addition to how they rated it)
* Observed features of users or items.

# Bias Terms

We don't need a fancy model to capture the average rating of an item, or the average rating of a user:

$$ r_{u,i} = b_i + b_u + p_u^Tq_i $$

We usually capture a global bias term as well:

$$ r_{u,i} =  \mu + b_i + b_u + p_u^Tq_i $$
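The bias terms can be approximated directly from averages before any factors are fit. The sketch below uses one simple recipe (global mean, then item offsets, then user offsets on what remains); in practice the biases are often learned jointly with $p_u$ and $q_i$, so treat this as an illustrative baseline. Missing ratings are assumed to be `np.nan`.

```python
import numpy as np

R = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 2.0],
              [np.nan, 2.0, 1.0]])       # np.nan = unobserved (assumed encoding)

mu = np.nanmean(R)                       # global mean rating
b_i = np.nanmean(R, axis=0) - mu         # item offsets from the global mean
b_u = np.nanmean(R - mu - b_i, axis=1)   # user offsets after removing item effects

# Prediction from biases alone, i.e. with the p_u^T q_i term set to zero.
baseline = mu + b_u[:, None] + b_i[None, :]
```

The factor term $p_u^T q_i$ then only has to explain what is left after this baseline is subtracted.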

# Implicit Feedback

We can incorporate information about _what_ a user has rated in addition to how they rated it.

This is known as SVD++.

$$ r_{u,i} =  \mu + b_i + b_u + q_i^T(p_u + |N(u)|^{-1/2} \sum_{j\in N(u)} y_j) $$

Here $N(u)$ is the set of all items for which user $u$ has provided an implicit preference.

The final implicit-preference term lets us adjust predictions based on which movies a user has or has not rated. For example, a user who has never rated Star Wars Episode III could be expected to give a lower rating to Episode IV.
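The prediction rule translates almost term for term into code. This is a sketch of the scoring step only, with hand-picked parameter values; learning `p_u`, `q_i`, and the implicit factors `Y` is not shown, and the function name is hypothetical.

```python
import numpy as np

def predict_svdpp(mu, b_u, b_i, p_u, q_i, Y, implicit_items):
    """SVD++ prediction for one (user, item) pair.

    Y: (n_items, k) implicit item factors y_j.
    implicit_items: list of item indices in N(u).
    """
    user_vec = p_u.copy()
    if len(implicit_items) > 0:
        # |N(u)|^{-1/2} * sum of y_j over the items in N(u)
        user_vec += Y[implicit_items].sum(axis=0) / np.sqrt(len(implicit_items))
    return float(mu + b_i + b_u + q_i @ user_vec)

p_u = np.array([1.0, 0.0])
q_i = np.array([1.0, 1.0])
Y = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [0.0, 2.0],
              [2.0, 0.0]])

with_implicit = predict_svdpp(3.0, 0.5, -0.5, p_u, q_i, Y, [0, 1, 2, 3])
without_implicit = predict_svdpp(3.0, 0.5, -0.5, p_u, q_i, Y, [])
```

With an empty $N(u)$ the rule falls back to the biased factorization model, so the two predictions differ only by the implicit term.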

# Observed User or Item Features

Diagram.

Note similarity to OLS.

# Speed and Performance

* Serialize the Model
* Vectorize your operations
* Work in batches (if possible)
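The three points above can be sketched together with NumPy. The factor matrices here are random stand-ins for a fitted model, an in-memory buffer stands in for a file on disk, and the batch size is arbitrary; all of these are assumptions for illustration.

```python
import io
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(1000, 8))   # user factors (random stand-ins)
Q = rng.normal(size=(200, 8))    # item factors (random stand-ins)

def top_n_batched(P, Q, n=5, batch_size=256):
    """Top-n item indices per user, scored one batch of users at a time."""
    out = np.empty((P.shape[0], n), dtype=np.int64)
    for start in range(0, P.shape[0], batch_size):
        scores = P[start:start + batch_size] @ Q.T   # one matmul per batch
        # argpartition finds the n largest scores without a full sort.
        top = np.argpartition(-scores, n - 1, axis=1)[:, :n]
        # Order those n candidates by score, best first.
        order = np.argsort(-np.take_along_axis(scores, top, axis=1), axis=1)
        out[start:start + batch_size] = np.take_along_axis(top, order, axis=1)
    return out

# Serialize the fitted factors so serving does not need to retrain.
buffer = io.BytesIO()            # stands in for a file on disk
np.savez(buffer, P=P, Q=Q)
buffer.seek(0)
loaded = np.load(buffer)
top = top_n_batched(loaded["P"], loaded["Q"])
```

Batching bounds the size of the score matrix held in memory at once, while each batch is still scored with a single vectorized matrix product rather than a Python loop over users.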


# In practice

* Spark implements a matrix factorization model, but none of the advanced techniques we've seen here.
* GraphLab implements almost all of what we've seen, but is not open-source.
* Many companies that do large-scale recommendations probably roll their own solutions.