## Recommendation Systems Overview

### Introduction

For users, recommendation systems...
* Help users find related content
* Help users explore new items
* Improve user decision making

For producers, recommendation systems...
* Increase user engagement
* Learn more about customers
* Change user behavior

Recommendation Systems provide a way to **model people's preferences**

### Content-Based or Collaborative

**Content-based filtering** uses item features to recommend new items similar to what the user has liked in the past.

**Collaborative filtering** uses similarities between users and items simultaneously to determine recommendations.

**Knowledge-based** recommender systems use explicit knowledge about the user, items, and recommendation criteria.

**Deep neural networks**
![](images/dnn.png)

At inference time, we can apply this model to rate previously unseen videos and recommend the highest-scoring video to the user.

### Recommendation System Pitfalls

The user space and product space are **sparse** and **skewed**

The user space and product space are sparse:
* Most items are rated by very few users.
* Most users rate only a small fraction of items.

The user space and product space are skewed:
* Some properties are very popular.
* Some users are very prolific.

The **cold start problem** occurs when there aren't enough interactions for users or items.

Explicit user feedback is often rare or unobservable

**Implicit feedback** is much more readily available

![](images/implicit_feedback.png)

## Building a Simple Vector-Based Model

### Content-Based Recommendation Systems

**Content-based filtering** uses item features to recommend new items similar to what the user has liked in the past.

### Similarity Measures

An embedding is a map from our collection of items to some finite dimensional vector space.

A **similarity measure** is a metric for items in an embedding space.

Two common similarity measures are:
* Dot product
* Cosine similarity
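
The two measures can be compared in a short NumPy sketch. The embedding values and the function names below are made up for illustration. The dot product grows with vector length as well as alignment, while cosine similarity depends only on the angle between the vectors.

```python
import numpy as np

def dot_similarity(a, b):
    """Dot product: sensitive to both direction and vector length."""
    return float(np.dot(a, b))

def cosine_similarity(a, b):
    """Dot product of the length-normalized vectors: depends on direction only."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-dimensional item embeddings.
movie_a = np.array([1.0, 0.0, 1.0])
movie_b = np.array([2.0, 0.0, 2.0])  # same direction as movie_a, twice as long
movie_c = np.array([0.0, 3.0, 0.0])  # orthogonal to movie_a

print(dot_similarity(movie_a, movie_b))     # large: aligned and long
print(cosine_similarity(movie_a, movie_b))  # close to 1: same direction
print(cosine_similarity(movie_a, movie_c))  # 0: nothing in common
```

Cosine similarity is the usual choice when vector length mostly reflects popularity or rating volume and should not influence the match.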

### Building a User Vector

![](images/user_vector.png)
![](images/user_vector2.png)
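
A minimal sketch of one way to build a user vector, assuming each item is described by a feature row (for example one-hot genres) and the user vector is the rating-weighted sum of those rows, normalized to sum to 1. The catalog, ratings, and the `build_user_vector` name are hypothetical.

```python
import numpy as np

def build_user_vector(ratings, item_features):
    """Rating-weighted sum of item features, normalized to sum to 1.

    ratings:       (items,) array, 0 where the user has not rated the item
    item_features: (items, features) array
    """
    weighted = ratings[:, None] * item_features  # scale each item's features by its rating
    user_vector = weighted.sum(axis=0)           # add up over all items
    total = user_vector.sum()
    return user_vector / total if total > 0 else user_vector

# Hypothetical catalog: 4 movies, 3 genre features (action, comedy, sci-fi).
movie_features = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
], dtype=float)
ratings = np.array([5.0, 0.0, 3.0, 0.0])  # this user rated movies 0 and 2 only

user_vector = build_user_vector(ratings, movie_features)
print(user_vector)  # weights for action, comedy, sci-fi
```

The resulting vector lives in the same feature space as the items, so it can be compared with any item using a similarity measure.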

### Making Recommendations Using a User Vector

![](images/user_vector_recommend.png)
![](images/user_vector_recommend2.png)

### Making Recommendations for Many Users

![](images/matrix.png)
![](images/create_matrix.png)
![](images/many_user1.png)
![](images/many_user_code1.png)
(users, movies, features)
![](images/many_user2.png)
![](images/many_user_code2.png)
![](images/many_user3.png)
![](images/many_user_code3.png)
![](images/many_user4.png)
![](images/many_user5.png)
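
The many-user case can be sketched with NumPy broadcasting. This is an illustrative version, not the code shown in the images: it builds the intermediate `(users, movies, features)` array, sums over movies to get one vector per user, scores every movie, and masks movies the user has already rated. The data and the `recommend_for_all` name are assumptions.

```python
import numpy as np

def recommend_for_all(ratings, item_features, k=1):
    """Content-based recommendations for every user at once.

    ratings:       (users, items) array, 0 where a user has not rated an item
    item_features: (items, features) array
    """
    # (users, items, features): every item's features scaled by every user's rating.
    weighted = ratings[:, :, None] * item_features[None, :, :]
    # Sum over items to get one feature vector per user, then normalize each row.
    user_vectors = weighted.sum(axis=1)
    totals = user_vectors.sum(axis=1, keepdims=True)
    user_vectors = np.divide(user_vectors, totals,
                             out=np.zeros_like(user_vectors), where=totals > 0)
    # Score every item for every user with a dot product.
    scores = user_vectors @ item_features.T
    # Hide items the user has already rated so only new items are recommended.
    scores = np.where(ratings > 0, -np.inf, scores)
    top_k = np.argsort(-scores, axis=1)[:, :k]
    return user_vectors, top_k

# Hypothetical data: 2 users, 4 movies, 3 genre features.
movie_features = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
], dtype=float)
ratings = np.array([
    [5, 0, 3, 0],
    [0, 4, 0, 2],
], dtype=float)

user_vectors, top_k = recommend_for_all(ratings, movie_features, k=1)
print(top_k)  # one recommended movie index per user
```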

## Building a Content-Based Recommendation System with a Neural Network
### Using Neural Networks for Content-Based Recommendation Systems
![](images/kurier.png)

## ALS, a Matrix Factorization Algorithm For Collaborative Filtering

### Types of User Feedback Data

Content-based recommendations use similarities between items in an embedding space.

What if we don't know the best factors to compare with?

Collaborative filtering learns latent factors and can explore outside a user's personal bubble.

Collaborative filtering recommendations use similarities between **items and users simultaneously** in an embedding space.

Start from a user-interaction matrix where rows are users and columns are items.
* Sometimes these ratings are explicit.
* Most often these ratings are implicit.

If we were creating a YouTube video recommender system where we had “like” and “dislike” data and also the duration a video was watched, which feedback would be considered explicit and which would be considered implicit?

Explicit, Implicit

> Correct! Like/dislike is explicit because it speaks directly to how the user feels about a video, and watch duration is implicit because it doesn't directly speak to how a user feels about a video.


### Embedding Users and Items

We can organize items by similarity in two dimensions

Simply take the dot product between users and items in embedding space
![](images/based_on_user.png)
![](images/based_on_item.png)

Each user and each item is a d-dimensional point within an embedding space.

Embeddings can be **learned from data**.

We're compressing the data to find the best generalities to rely on, called **latent factors**.

![](images/factorization.png)

### Factorization Approaches

Collaborative filtering is usually carried out using matrix factorization
* Factorize the user-interaction matrix into user-factors and item-factors.
* Given user ID, multiply by item-factors to get predicted rating for all items.
* Return top k rated items for this user.
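
The three steps above can be sketched with a truncated SVD in NumPy. This is one possible factorization, chosen here for brevity. It treats missing interactions as zeros, and the matrix values and function names are illustrative assumptions.

```python
import numpy as np

def factorize_svd(interactions, d):
    """Rank-d factorization via truncated SVD (missing entries already filled with 0)."""
    u, s, vt = np.linalg.svd(interactions, full_matrices=False)
    root = np.sqrt(s[:d])
    user_factors = u[:, :d] * root     # (users, d)
    item_factors = vt[:d, :].T * root  # (items, d)
    return user_factors, item_factors

def top_k_items(user_id, user_factors, item_factors, k):
    """Predict a rating for every item, then return the k best item indices."""
    predicted = item_factors @ user_factors[user_id]
    return np.argsort(-predicted)[:k]

# Hypothetical user-interaction matrix: 3 users x 4 items, 0 = no interaction.
interactions = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
], dtype=float)

user_factors, item_factors = factorize_svd(interactions, d=3)
print(top_k_items(0, user_factors, item_factors, k=2))
```

With a smaller `d` the product `user_factors @ item_factors.T` no longer reproduces the matrix exactly, and that compression is what produces non-zero predictions for items a user has not interacted with.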

Alternating least squares (ALS)

Unobserved pairs
![](images/SVD.png)
![](images/ALS.png)
![](images/WALS.png)

There are many ways to handle unobserved user-interaction matrix pairs. _______ explicitly sets all missing values to zero. _______ simply ignores missing values. _____ uses weights instead of zeros that can be thought of as representing _____.

**SVD, ALS, WALS, low confidence**
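
The difference between the three approaches can be written as three objectives. The notation here is an assumption introduced for illustration: $A$ is the user-interaction matrix, $u_i$ and $v_j$ are the user and item factor vectors, $\Omega$ is the set of observed (user, item) pairs, and $w_0$ is a small weight expressing low confidence in unobserved pairs.

$$
\begin{aligned}
\text{SVD:}\quad & \min_{U,V} \sum_{(i,j)} \left(A_{ij} - u_i^\top v_j\right)^2, \qquad A_{ij} = 0 \text{ for } (i,j) \notin \Omega \\
\text{ALS:}\quad & \min_{U,V} \sum_{(i,j) \in \Omega} \left(A_{ij} - u_i^\top v_j\right)^2 \\
\text{WALS:}\quad & \min_{U,V} \sum_{(i,j) \in \Omega} \left(A_{ij} - u_i^\top v_j\right)^2 + w_0 \sum_{(i,j) \notin \Omega} \left(0 - u_i^\top v_j\right)^2
\end{aligned}
$$

Setting $w_0 = 1$ recovers the SVD-style objective and $w_0 = 0$ recovers plain ALS, so WALS sits between the two.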


### The ALS Algorithm

The WALS Estimator in TensorFlow does not need any labels; it just needs the ratings matrix organized into rows and columns

![](images/WALS_feeding_tf.png)
![](images/WALS_weighted.png)
![](images/ALS_algorithm.png)
![](images/ALS_solving_quiz.png)
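
A minimal NumPy sketch of the alternating idea, under stated assumptions: it fits observed entries only, adds a small L2 penalty (`reg`) to keep each solve well conditioned, and loops over rows and columns one at a time. It is not the TensorFlow WALS implementation, and all names and data are hypothetical.

```python
import numpy as np

def als(ratings, observed, d=2, reg=0.1, iterations=20, seed=0):
    """Alternating least squares over observed entries only.

    ratings:  (users, items) array
    observed: boolean mask of the same shape, True where a rating exists
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = rng.normal(scale=0.1, size=(n_users, d))
    V = rng.normal(scale=0.1, size=(n_items, d))
    penalty = reg * np.eye(d)

    for _ in range(iterations):
        # Fix item factors; solve a small regularized least-squares problem per user.
        for i in range(n_users):
            cols = observed[i]
            if cols.any():
                Vi = V[cols]
                U[i] = np.linalg.solve(Vi.T @ Vi + penalty, Vi.T @ ratings[i, cols])
        # Fix user factors; solve per item.
        for j in range(n_items):
            rows = observed[:, j]
            if rows.any():
                Uj = U[rows]
                V[j] = np.linalg.solve(Uj.T @ Uj + penalty, Uj.T @ ratings[rows, j])
    return U, V

def observed_rmse(ratings, observed, U, V):
    """Root-mean-square error on the observed entries."""
    errors = (ratings - U @ V.T)[observed]
    return float(np.sqrt(np.mean(errors ** 2)))

# Hypothetical ratings matrix, 0 = unobserved.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)
observed = ratings > 0

U, V = als(ratings, observed)
print(observed_rmse(ratings, observed, U, V))
```

Each inner solve is an ordinary least-squares problem because one side of the factorization is held fixed, which is why the algorithm needs whole rows or whole columns at a time.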

### Preparing Input Data for ALS

#### Getting data to the ALS algorithm

Feed the ALS algorithm whole rows (or columns) at a time. Because it is difficult to know which stage the algorithm is in, feed it both!
![](images/iterative_algorithm.png)

![](images/map_id.png)
Save the mapping to persistent storage, because at prediction time you'll need it to convert raw input IDs to the enumerated indices.
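
A small sketch of such a mapping, assuming raw IDs are strings and indices are assigned in first-seen order. The IDs and the `build_id_mapping` name are hypothetical. In practice the serialized mapping would be written to a file or other persistent store, and here it is only round-tripped through JSON.

```python
import json

def build_id_mapping(raw_ids):
    """Assign each distinct raw ID a contiguous integer index, in first-seen order."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping

visitor_ids = ["v-9031", "v-17", "v-9031", "v-204"]  # hypothetical raw IDs
user_index = build_id_mapping(visitor_ids)

# Serialize the mapping so it can be stored and reloaded at prediction time.
serialized = json.dumps(user_index)
restored = json.loads(serialized)

# The inverse mapping turns enumerated indices back into raw IDs.
index_to_visitor = {index: raw for raw, index in restored.items()}
print(user_index, index_to_visitor)
```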

![](images/precomputing_Gramian.png)

![](images/WALS_map_quiz.png)

## Implementing ALS in TensorFlow

### Creating Sparse Tensors For Efficient WALS Input

Because WALS requires whole rows or columns, the data has to be preprocessed to provide SparseTensors of rows/columns

![](images/preprocess_SparseTensors.png)
![](images/sparse_merge.png)

If we want to recommend items for a user, when we are writing out to the TF Record file:

Our key train feature should be _________.

Our indices train feature should be _________.

Our values train feature should be _________.

**E. user, itemID, rating**

A. item, userID, rating  --> recommend users for an item
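
The two groupings can be sketched in plain Python, before any TFRecord serialization. The record layout (`key`, `indices`, `values`) follows the quiz above. The `group_ratings` helper and the sample triples are assumptions for illustration.

```python
from collections import defaultdict

def group_ratings(triples, by="user"):
    """Group (user, item, rating) triples into one record per key.

    by="user": key=user, indices=item IDs (recommend items for a user)
    by="item": key=item, indices=user IDs (recommend users for an item)
    """
    grouped = defaultdict(lambda: {"indices": [], "values": []})
    for user, item, rating in triples:
        key, index = (user, item) if by == "user" else (item, user)
        grouped[key]["indices"].append(index)
        grouped[key]["values"].append(rating)
    return [{"key": key, **record} for key, record in sorted(grouped.items())]

# Hypothetical enumerated (userId, itemId, rating) triples.
triples = [(0, 2, 4.0), (0, 5, 3.0), (1, 2, 5.0)]

items_for_user = group_ratings(triples, by="user")
users_for_item = group_ratings(triples, by="item")
print(items_for_user)
print(users_for_item)
```

Each record holds one whole row (or column) of the ratings matrix in sparse form, which is the shape of input WALS needs.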

### Instantiating a WALS Estimator: From Input to Estimator

![](images/WALS_estimator.png)
![](images/input_fn.png)

The WALS MF estimator takes the INPUT_ROWS and INPUT_COLS as features. If we have our items_for_user and users_for_item TF Records, which filename and vocab_size should we use for each of these features, respectively?

**items_for_user, nitems, users_for_item, nusers**


### Instantiating a WALS Estimator: Decoding TFRecords

![](images/decode_SparseTensor.png)

In our decode_example function, we use VarLenFeature indices and values because of ______, we perform a sparse_merge because of ______, and we concatenate the key because of ______.

**having many ratings per row/col, needing SparseTensors, batching**

We concatenate the key to the indices tensor because batching overwrites the first dimension of the indices with the batch index. A remap-keys function applied later uses the concatenated key to restore the correct row numbers.

### Instantiating a WALS Estimator: Recovering Keys

Remap key to SparseTensor to fix re-indexing after batching

![](images/remap.png)
![](images/remap2.png)
![](images/remap3.png)
![](images/remap4.png)
![](images/remap5.png)
![](images/remap6.png)

![](images/remap_keys_quiz.png)
![](images/after_remap_keys.png)
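
The idea behind the remapping can be illustrated with a NumPy analogue. This is not the TensorFlow code from the images. It assumes the batched sparse indices are rows of `[batch_index, column]`, that batch indices run from 0 upward, and that the last row of each example is the placeholder whose column holds the original key.

```python
import numpy as np

def remap_keys(indices, values):
    """Replace batch indices with the original keys and drop the placeholder rows.

    indices: (n, 2) array of [batch_index, column]; the last row of each
             example is a placeholder whose column stores that example's key
    values:  (n,) array aligned with indices
    """
    batch_ids = indices[:, 0]
    # A row is a key row if the next row belongs to a different example (or it is the last row).
    is_key_row = np.append(batch_ids[1:] != batch_ids[:-1], True)
    keys = indices[is_key_row, 1]  # one key per example, in batch order
    keep = ~is_key_row
    fixed = np.stack([keys[batch_ids[keep]], indices[keep, 1]], axis=1)
    return fixed, values[keep]

# Hypothetical batch of two examples: user 17 rated items 2 and 5, user 42 rated item 1.
indices = np.array([[0, 2], [0, 5], [0, 17], [1, 1], [1, 42]])
values = np.array([4.0, 3.0, 0.0, 5.0, 0.0])

fixed_indices, fixed_values = remap_keys(indices, values)
print(fixed_indices)  # rows are now [user key, item]
print(fixed_values)
```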

### Instantiating a WALS Estimator: Training and Prediction

Flip recommendations from users to items.
![](images/project_row.png)

![](images/train_and_evaluate.png)

![](images/finding_top_k.png)
![](images/top_k_batch_predict.png)

We saw how to recommend the top K items for users, but what if we wanted to instead **recommend the top K users for items**? What would be the correct change in our batch prediction function?
```
topk = tf.squeeze(tf.map_fn(lambda ____: 
              find_top_k(____, _____,args['topk']), _____,
              dtype=tf.int64))
```

**item, item, user_factors, item_factors**
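
The symmetry behind this answer can be shown in a NumPy sketch, used here instead of the TensorFlow ops in the quiz. The same top-k helper serves both directions; only the roles of the two factor matrices swap. The factor values are made up, and this `find_top_k` is an illustrative stand-in, not the course function.

```python
import numpy as np

def find_top_k(query, candidates, k):
    """Score every candidate against one query vector; return the k best indices."""
    scores = candidates @ query
    return np.argsort(-scores)[:k]

# Hypothetical learned factors: 3 users and 2 items in a 2-dimensional embedding space.
user_factors = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.5]])
item_factors = np.array([[2.0, 0.0], [0.0, 3.0]])

# Top items for each user: iterate over users, score against item factors.
top_items = np.array([find_top_k(user, item_factors, 1) for user in user_factors])

# Top users for each item: iterate over items, score against user factors.
top_users = np.array([find_top_k(item, user_factors, 2) for item in item_factors])

print(top_items)
print(top_users)
```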

### Issues with Collaborative Filtering

Our batch predictions were problematic

What we really need are visitorId and contentId in our prediction files, not the enumerated userId and itemId

We should use TensorFlow-Transform to:
1. Create the group-by dataset.
2. Create the vocabulary of visitorId -> userId.
3. Use the vocabulary when doing predictions.

![](images/transform_assets.png)
![](images/convert_back.png)

We want a scalable way to generate predictions that directly tie back to the original data and not just enumerated indices. We should use TensorFlow-Transform to first create ____________, then create ____________, and lastly create ____________.

**group-by datasets, vocabularies, predictions**

### Productionized WALS Demo

https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/10_recommend/wals_tft.ipynb

### Cold Starts

#### Collaborative filtering seems powerful... but it's not without its drawbacks.
Pros:
* No domain knowledge
* Serendipity
* Great starting point

Cons:
* Fresh items/users?
* Context features?

#### WALS is thus a way to get user and item embeddings
* These embeddings are created solely from user behavior.
* Would be nice to also use knowledge about the item (content-based) and knowledge about the user (knowledge-based).
* How would we combine multiple predictors?

![](images/how_to_hybrid.png)
![](images/cold_start_strategies_quiz.png)