# Introduction to Recommender Systems
> A starter blog for recommender systems 

- toc: true
- badges: true
- comments: true
- categories: [jupyter]

## Introduction
I am learning about Recommendation Systems from the popular book/course [Mining Massive Datasets](http://www.mmds.org/), taught by Jure Leskovec, Anand Rajaraman and Jeff Ullman. Most of the content here is a summary of one of the chapters in the book.

Recommendation Systems are broadly classified into 2 groups:
* **Content Based Systems**: Recommend to customer x items similar to previous items rated highly by x. Examples: 1) Movie recommendations based on genre, cast, language etc. 2) News recommendations based on similar topics/content.
* **Collaborative Filtering Systems**: Recommend based on the similarity between items and/or users. The items recommended to a user are those liked/reviewed by similar users.

In any recommendation system, there are 2 entities: **users** and **items**. Users have preferences for certain types of items. This data is represented as a **utility matrix**: rows represent users, columns represent items, and the element for a given user-item pair represents the degree of preference of that user for that item.

The **utility matrix** is sparse. Most of the elements are unknown, i.e. we don't have any information about that user-item pair. It is not necessary to fill all the unknown entries in the matrix. It is only useful to find the entries in each row that are likely to be high.
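
To make the idea of a sparse utility matrix concrete, here is a minimal sketch in plain Python. The user names, item names and ratings are made up for illustration, and storing each row as a dict of known entries is just one possible representation.

```python
# Hypothetical ratings on a 1-5 scale: (user, item, rating).
ratings = [
    ("alice", "matrix", 5),
    ("alice", "titanic", 1),
    ("bob", "matrix", 4),
    ("carol", "titanic", 5),
    ("carol", "amelie", 4),
]

# Sparse utility matrix: one row (dict) per user, holding only the known entries.
utility = {}
for user, item, rating in ratings:
    utility.setdefault(user, {})[item] = rating

users = sorted(utility)
items = sorted({item for _, item, _ in ratings})

# Fraction of user-item pairs for which a rating is known.
known = sum(len(row) for row in utility.values())
density = known / (len(users) * len(items))

print(utility["alice"].get("amelie"))  # None: the entry is unknown, not zero
print(f"density = {density:.2f}")
```

Keeping unknown entries absent, rather than storing them as 0, avoids confusing "not rated" with "rated very low".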

## Long Tail Problem
Why are recommendation systems necessary? A phenomenon called the long tail makes them necessary. Traditionally, physical delivery systems have a scarcity of resources and stores have limited shelf space, so it is not possible to show a customer every available choice. Online stores, on the other hand, can show all the choices to the customer. There can be millions of products in the category you are interested in. Suddenly, we have the problem of **abundance**. This distinction between physical and online stores is called the long tail phenomenon. It forces online institutions to recommend items to users.

## Content Based Recommendation Systems
These types of systems focus on the properties of items. You find items similar to those liked/purchased/read by the user. Similarity is measured based on the properties of the items.

### Item Profiles
We construct a profile for each item based on its properties. A profile can be a record or a collection of records describing the important characteristics of the item. Example: the profile of a movie can consist of the following features:

1. Cast
2. Director
3. Genre
4. Language
5. Year of release
    
Sometimes items can be documents, images or webpages. There are different ways of extracting features from an item depending on its type. For example, extracting features from documents starts with eliminating stop words and then computing the TF-IDF score of each remaining word in each document. The words with the highest scores characterise the document. Item profiles can be represented as vectors with different components, and we can use different metrics to measure the similarity between them (Jaccard distance, cosine similarity etc.).
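
The document example above can be sketched as follows. This is a minimal illustration, assuming the TF-IDF definition where term frequency is normalized by the most frequent word in the document and IDF is $\log_2(N/n_i)$; the stop-word list, the three documents and the helper names are made up.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "the", "of", "and", "in", "is", "to"}

# Hypothetical documents.
docs = {
    "d1": "the cat sat in the hat",
    "d2": "the dog chased the cat",
    "d3": "a dog is a friend of the family",
}


def tokenize(text):
    """Lowercase, split on whitespace and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
n_docs = len(docs)
# Number of documents in which each word appears.
doc_freq = Counter(word for c in counts.values() for word in c)


def tf_idf(doc_id):
    """TF-IDF score of every word in one document."""
    c = counts[doc_id]
    max_count = max(c.values())
    return {
        w: (f / max_count) * math.log2(n_docs / doc_freq[w])
        for w, f in c.items()
    }


def top_words(doc_id, n=2):
    """The n highest-scoring words; ties are broken alphabetically."""
    scores = tf_idf(doc_id)
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]


print(top_words("d1"))
```

Words that appear in several documents (here "cat" and "dog") get a lower score than words unique to one document, so they are less likely to end up in the item profile.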

### User Profiles
We need to create vectors with the same components that describe the user's preferences. A simple choice is a weighted average of the profiles of the items the user has rated.
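
As a rough sketch of this idea, the snippet below builds a user profile as a rating-weighted average of item profiles. The feature layout, the items and the ratings are hypothetical, and using the raw rating as the weight is an assumption; subtracting the user's mean rating first is another common choice.

```python
# Hypothetical item profiles over the features [action, romance, comedy].
item_profiles = {
    "matrix": [1.0, 0.0, 0.0],
    "titanic": [0.0, 1.0, 0.0],
    "amelie": [0.0, 1.0, 1.0],
}

# Hypothetical ratings given by one user.
user_ratings = {"matrix": 5, "titanic": 1, "amelie": 4}


def user_profile(ratings, profiles):
    """Rating-weighted average of the profiles of the items the user rated."""
    total = sum(ratings.values())
    dims = len(next(iter(profiles.values())))
    profile = [0.0] * dims
    for item, rating in ratings.items():
        for d, value in enumerate(profiles[item]):
            profile[d] += rating * value / total
    return profile


print(user_profile(user_ratings, item_profiles))
```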

### Recommend Items to Users
We can estimate the degree to which a user would prefer a new item by computing the similarity between the user's and the item's profiles (vector representations).

$$ u(x,i) = \cos(x,i) = \frac{x \cdot i}{\lVert x \rVert \, \lVert i \rVert} $$
for a user profile $x$ and an item profile $i$.
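
The scoring formula above can be sketched in a few lines of plain Python. The user profile and the candidate items are hypothetical, and returning 0 for a zero vector is a convention chosen here, since the cosine is undefined in that case.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical user profile over the features [action, romance, comedy].
user = [0.5, 0.5, 0.4]

# Hypothetical profiles of items the user has not seen yet.
candidates = {
    "speed": [1.0, 0.0, 0.0],
    "notting_hill": [0.0, 1.0, 1.0],
    "documentary": [0.0, 0.0, 0.0],
}

# Rank candidate items by their estimated utility u(x, i) for this user.
ranked = sorted(candidates, key=lambda i: cosine(user, candidates[i]), reverse=True)
print(ranked)
```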

### Pros and Cons of Content Based Approach
Pros:
* No need for data on other users
* Able to recommend to users with unique tastes
* Able to recommend new and unpopular items
* Can provide the reasoning behind recommendations

Cons:
* Creating features is hard 
* Cannot recommend to new users, since they have no profile yet
* Never recommends items outside the user's existing interests
* Does not use the quality judgments of other users

## Collaborative Filtering
Instead of looking at the properties of items, we focus on the similarity of the user ratings for 2 items. Likewise, users are similar if their rating vectors are close according to some distance measure. The process of identifying similar users and recommending what similar users like is called collaborative filtering.

### Measuring Similarity
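There are different ways to measure the similarity (or distance) between the rating vectors $r_x$ and $r_y$ of two users:
* Jaccard Similarity: It ignores the rating values and focuses only on the sets of items rated. The Jaccard distance is $1 - J(r_x,r_y)$.
$$ J(r_x,r_y) = \frac{|r_x \cap r_y|}{|r_x \cup r_y|} $$

* Cosine Similarity: Missing values are treated as 0, which effectively means a "negative" rating. The larger the value, the more similar the users are.
$$ \cos(r_x,r_y) = \frac{r_x \cdot r_y}{\lVert r_x \rVert \, \lVert r_y \rVert} $$

* Pearson Correlation Coefficient: Computed only over the set $S_{xy}$ of items rated by both users, where $\overline{r_x}$ and $\overline{r_y}$ are the mean ratings of $x$ and $y$.
$$ sim(x,y) = \frac{\sum_{s \in S_{xy}} (r_{xs} - \overline{r_x})(r_{ys} - \overline{r_y})}
{\sqrt{\sum_{s \in S_{xy}} (r_{xs} - \overline{r_x})^2} \sqrt{\sum_{s \in S_{xy}} (r_{ys} - \overline{r_y})^2}} $$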

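In order to solve the problem of treating missing entries as 0 ("negative"), you can normalize the entries by subtracting the user's mean rating (the mean of that row in the utility matrix) from each of the user's ratings. After this step, a missing entry treated as 0 corresponds to an average rating rather than a negative one.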
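
The three measures can be compared on a small example. This is a minimal sketch with made-up ratings for three hypothetical users; the Pearson version centers on each user's overall mean rating and returns 0 when the correlation is undefined, both of which are choices made here for illustration.

```python
import math

# Hypothetical ratings on a 1-5 scale; items missing from a dict are unrated.
r_x = {"m1": 4, "m2": 5, "m4": 1}
r_y = {"m1": 5, "m2": 4, "m3": 2, "m4": 1}
r_z = {"m1": 1, "m4": 5, "m5": 4}


def jaccard(a, b):
    """Jaccard similarity of the sets of rated items (rating values are ignored)."""
    rated_a, rated_b = set(a), set(b)
    return len(rated_a & rated_b) / len(rated_a | rated_b)


def cosine(a, b):
    """Cosine similarity with missing ratings treated as 0."""
    dot = sum(a[i] * b[i] for i in a if i in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def pearson(a, b):
    """Pearson correlation over co-rated items, centered on each user's mean."""
    common = [i for i in a if i in b]
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0  # convention chosen here when the correlation is undefined
    return num / (den_a * den_b)


for name, other in (("y", r_y), ("z", r_z)):
    print(name, jaccard(r_x, other), cosine(r_x, other), pearson(r_x, other))
```

On this toy data, users x and z rate the two items they share in opposite ways. The cosine similarity is still positive because all raw ratings are positive, while the Pearson correlation comes out negative, which is the behaviour the mean-centering is meant to achieve.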

### Making Predictions
Let $r_x$ be the vector of user $x$'s ratings, and let $N$ be the set of the $k$ users most similar to $x$ who have rated item $i$. The predicted rating of user $x$ for item $i$ can be the simple average of their ratings:

$$ r_{xi} = \frac{1}{k} \sum_{y \in N} r_{yi} $$

or the average weighted by similarity:

$$ r_{xi} = \frac{\sum_{y \in N} s_{xy} r_{yi}}{\sum_{y \in N} s_{xy}} $$

where $s_{xy} = sim(x,y)$.
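
Both prediction formulas can be sketched as below. The ratings are made up, Pearson correlation is used as $sim(x,y)$, and dropping neighbours with non-positive similarity is an assumption made here so that the weights in the weighted average stay positive.

```python
import math

# Hypothetical utility matrix: user -> {item: rating}.
ratings = {
    "x": {"m1": 4, "m2": 5, "m4": 1},
    "y": {"m1": 5, "m2": 4, "m3": 2, "m4": 1},
    "z": {"m1": 1, "m3": 5, "m4": 5},
    "w": {"m1": 4, "m2": 4, "m3": 1},
}


def pearson(a, b):
    """Pearson correlation over co-rated items, centered on each user's mean."""
    common = [i for i in a if i in b]
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den_a = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    den_b = math.sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    if den_a == 0 or den_b == 0:
        return 0.0
    return num / (den_a * den_b)


def predict(user, item, ratings, k=2, weighted=True):
    """Predict a rating from the k most similar users who rated the item."""
    neighbours = [
        (pearson(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    # Assumption: keep only positively correlated users, then take the top k.
    neighbours = sorted((n for n in neighbours if n[0] > 0), reverse=True)[:k]
    if not neighbours:
        return None  # no usable neighbour, so no prediction
    if weighted:
        return sum(s * r for s, r in neighbours) / sum(s for s, _ in neighbours)
    return sum(r for _, r in neighbours) / len(neighbours)


print(predict("x", "m3", ratings, weighted=False))  # simple average
print(predict("x", "m3", ratings, weighted=True))   # similarity-weighted average
```

Here users y and w agree with x on the items they share, so they become the neighbours, and the low ratings they gave to m3 lead to a low prediction for x.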

So far, we have looked at finding the most similar users, which is called **User-User Collaborative Filtering**. Another view is the **Item-Item based approach**. To estimate user $U$'s rating for item $I$, you can find the $m$ items most similar to $I$, for some value of $m$, and take the average of the ratings that $U$ has given to those $m$ items. As for user-user similarity, we consider only those items among the $m$ that $U$ has rated, and it is probably wise to normalize item ratings first.
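
A minimal sketch of the item-item view follows. The ratings are hypothetical, items are compared with a centered cosine (each item's ratings minus that item's mean, with missing entries treated as 0 after centering), and only positively similar items are kept; these are choices made for this illustration, not the only way to do it.

```python
import math

# Hypothetical utility matrix: user -> {item: rating}.
ratings = {
    "u1": {"i1": 5, "i2": 4, "i3": 1, "i4": 5},
    "u2": {"i1": 4, "i2": 5, "i3": 2, "i4": 4},
    "u3": {"i1": 1, "i2": 2, "i3": 5, "i4": 2},
    "u4": {"i2": 5, "i3": 1, "i4": 4},
}


def transpose(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    by_item = {}
    for user, row in ratings.items():
        for item, rating in row.items():
            by_item.setdefault(item, {})[user] = rating
    return by_item


def centered_cosine(a, b):
    """Cosine similarity after subtracting each vector's own mean rating."""
    mean_a = sum(a.values()) / len(a)
    mean_b = sum(b.values()) / len(b)
    ca = {k: v - mean_a for k, v in a.items()}
    cb = {k: v - mean_b for k, v in b.items()}
    dot = sum(ca[k] * cb[k] for k in ca if k in cb)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_item_item(user, item, ratings, m=2):
    """Predict from the m items most similar to `item` that `user` has rated."""
    by_item = transpose(ratings)
    target = by_item[item]
    scored = [
        (centered_cosine(target, by_item[other]), rating)
        for other, rating in ratings[user].items()
        if other != item
    ]
    # Assumption: ignore items with non-positive similarity to the target item.
    top = sorted((s for s in scored if s[0] > 0), reverse=True)[:m]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)


print(predict_item_item("u4", "i1", ratings))
```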

In practice, the **Item-Item** approach often works better than the **User-User** approach, since items are simpler while users have complex tastes.

### Pros and Cons of Collaborative Filtering
Pros:
* Works for any kind of item, no feature creation needed

Cons:
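* Cold start problem: needs enough users in the system to find similar ones
* Utility matrix is sparse: it is hard to find users who have rated the same items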
* First rater problem: Cannot recommend an item that has not been rated 
* Cannot recommend items to someone with unique taste

## Latent Factor Models