# Recommendation system

## Scope

- Given a user and context (time, location, etc.), predict the probability of engagement for each movie and order the movies by that probability.
- Use implicit feedback (whether the user watched the movie) rather than explicit feedback (the user's rating of the movie) to gather a large amount of training data.

## Metrics

Online
- Engagement rate: sessions in which the user clicked a movie / total number of sessions.
- Videos watched: number of videos the user watched for at least some minimum duration.
- Session watch time: total time the user spent watching recommended movies in a session.

Offline
- Mean Average Precision (mAP @ N)
    - $AP@N = \dfrac{1}{m}\displaystyle\sum_{k=1}^{N}P(k) \cdot rel(k)$
    - $P(k)$ = precision over the first $k$ recommendations
    - Precision = number of relevant recommendations / total number of recommendations
    - $rel(k)$ = 1 if the $k^{th}$ item is relevant, 0 otherwise
    - N = length of recommendation list
    - m = number of movies relevant to user based on historical data
    - Measures how system performs overall.

- Mean Average Recall (mAR @ N)
    - Recall = number of relevant recommendations / number of all relevant movies
    - Measures how many of the user's relevant movies (based on historical data) the system can place in the recommendation list.
    
- F1 score = 2 * (mAP*mAR) / (mAP+mAR)
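
As a rough illustration of the offline metrics above, the sketch below computes AP@N with $m$ taken as the number of relevant movies, a plain recall@N, and the F1 combination. The function names and sample data are hypothetical. The notes do not spell out how recall is averaged for mAR, so `recall_at_n` is only a simple stand-in.

```python
def average_precision_at_n(recommended, relevant, n):
    """AP@N = (1/m) * sum over k=1..N of P(k) * rel(k), where m = len(relevant)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for k, movie in enumerate(recommended[:n], start=1):
        if movie in relevant:       # rel(k) = 1
            hits += 1
            total += hits / k       # P(k): precision over the first k items
    return total / len(relevant)


def recall_at_n(recommended, relevant, n):
    """Fraction of the user's relevant movies that appear in the top N."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant.intersection(recommended[:n])) / len(relevant)


def f1_score(precision, recall):
    """Harmonic mean of the two aggregate scores."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: one (recommended list, relevant set) pair per user.
users = [
    (["a", "b", "c", "d"], {"a", "c", "e"}),
    (["x", "y", "z", "w"], {"y"}),
]
map_at_4 = sum(average_precision_at_n(r, rel, 4) for r, rel in users) / len(users)
mean_recall_at_4 = sum(recall_at_n(r, rel, 4) for r, rel in users) / len(users)
print(map_at_4, mean_recall_at_4, f1_score(map_at_4, mean_recall_at_4))
```

Averaging per-user scores gives the "mean" in mAP; a relevant movie that never appears in the list (like `"e"`) lowers both scores through the $1/m$ factor.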

## Architecture

<img src="img/recommendation_system1.png" style="width:800px;height:600px;">

## Feature engineering

<img src="img/recommendation_system2.png" style="width:1000px;height:200px;">

User
- age
- gender
- language
- country
- average_session_time
- last_genre_watched
- user_actor_histogram: histogram showing historical interaction between users and actors in movies.
- user_genre_histogram
- user_language_histogram

Context
- season_of_the_year
- upcoming_holiday
- days_to_upcoming_holiday
- time_of_day
- day_of_week
- device

Media
- public-platform-rating
- revenue
- time_passed_since_release_date
- time_on_platform
- media_watch_history
- genre
- movie_duration
- content_set_time_period
- content_tags
- show_season_number
- country_of_origin
- release_country
- release_year
- release_type
- maturity_rating

User-media
- user_genre_historical_interaction_3months
- user_genre_historical_interaction_1year
- user_and_movie_embedding_similarity
- user_actor
- user_director
- user_language_match
- user_age_match

Sparse feature
- movie_id
- title_of_media
- synopsis
- original_title
- distributor
- creator
- original_language
- director
- first_release_year
- music_composer
- actors

## Candidate generation

- Select top $k$ movies to recommend to user.
- Focuses on recall.

### Collaborative filtering
- Find users similar to the active user based on historical watches.

<img src="img/recommendation_system3.png" style="width:300px;height:300px;">

1. Nearest neighbors
- Computationally expensive.

<img src="img/recommendation_system4.png" style="width:500px;height:300px;">

- Consider an $n$ by $m$ feedback matrix with one row per user $n_{i}$ and one column per movie $m_{j}$.
- 1: user watched the movie.
- 0: user ignored the movie.
- empty: no impression yet.

<img src="img/recommendation_system5.png" style="width:500px;height:300px;">

- The task is to predict the feedback for movies that a user has had no impression of yet (the empty cells).
- Compute (for example) the cosine similarity between user $i$ and every other user, and select the top $k$ most similar users (the nearest neighbors).
- Then, take the similarity-weighted average of the feedback from the top $k$ similar users for movie $j$.
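
The nearest-neighbor steps above can be sketched with NumPy as follows. This is a minimal illustration, not a production approach: the function name and sample matrix are hypothetical, missing cells are treated as 0 when computing similarity, and only users with known feedback on the target movie are allowed to vote. Both of those are assumptions made here.

```python
import numpy as np


def predict_feedback(feedback, user, movie, k=2):
    """Predict `user`'s feedback on `movie` from the k most similar users.

    feedback: (n_users, n_movies) float array; 1 = watched, 0 = ignored,
    NaN = no impression yet.
    """
    filled = np.nan_to_num(feedback, nan=0.0)    # assumption: NaN counts as 0 for similarity
    norms = np.linalg.norm(filled, axis=1)
    norms[norms == 0] = 1.0                      # avoid division by zero for empty rows
    sims = filled @ filled[user] / (norms * norms[user])   # cosine similarity to every user

    # Only other users with known feedback on this movie can contribute.
    candidates = np.where(~np.isnan(feedback[:, movie]))[0]
    candidates = candidates[candidates != user]
    top = candidates[np.argsort(sims[candidates])[::-1][:k]]

    weights = sims[top]
    if weights.sum() <= 0:
        return 0.0                               # no informative neighbors
    return float(weights @ feedback[top, movie] / weights.sum())


feedback = np.array([
    [1, 1, 0, np.nan],   # active user: no impression of movie 3 yet
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
], dtype=float)
print(predict_feedback(feedback, user=0, movie=3, k=2))
```

Computing similarity against every other user for each prediction is what makes this approach computationally expensive as the number of users grows.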

2. Matrix factorization

- Choose a latent dimension $M$ such that
    - User profile matrix is $n$ by $M$.
    - Media profile matrix is $M$ by $m$.
- The $M$ latent dimensions can be thought of as learned features of users and movies.
- Initialize user and movie vectors randomly.
- For each known feedback value $f_{ij}$, predict feedback by taking the dot product between user profile vector $n_{i}$ and movie profile vector $m_{j}$.
- The difference between actual and predicted feedback is the error.
    - $e_{ij} = f_{ij} - n_{i} \cdot m_{j}$
- Use stochastic gradient descent to update user and movie latent vectors.
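
A minimal sketch of the SGD procedure above, assuming a squared-error loss on $e_{ij}$ with a small L2 penalty. The function name, hyperparameters, and regularization term are illustrative assumptions, not something these notes specify. For convenience the movie matrix is stored as $m$ by $M$ (the transpose of the media profile matrix).

```python
import numpy as np


def factorize(feedback, latent_dim=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Learn user and movie latent vectors from the known cells of `feedback`.

    feedback: (n_users, n_movies) float array with NaN for unknown cells.
    Returns (users, movies) with shapes (n_users, M) and (n_movies, M).
    """
    rng = np.random.default_rng(seed)
    n_users, n_movies = feedback.shape
    users = rng.normal(scale=0.1, size=(n_users, latent_dim))    # random init
    movies = rng.normal(scale=0.1, size=(n_movies, latent_dim))
    known = [(i, j) for i in range(n_users) for j in range(n_movies)
             if not np.isnan(feedback[i, j])]

    for _ in range(epochs):
        for idx in rng.permutation(len(known)):      # visit known cells in random order
            i, j = known[idx]
            err = feedback[i, j] - users[i] @ movies[j]          # e_ij
            user_old = users[i].copy()
            users[i] += lr * (err * movies[j] - reg * users[i])
            movies[j] += lr * (err * user_old - reg * movies[j])
    return users, movies


feedback = np.array([
    [1, 1, 0, np.nan],
    [1, np.nan, 0, 0],
    [0, 0, 1, 1],
    [np.nan, 0, 1, 1],
], dtype=float)
users, movies = factorize(feedback)
predicted = users @ movies.T     # the NaN cells now hold predicted feedback
print(np.round(predicted, 2))
```

Only known cells contribute to the updates; the predictions for empty cells come from the learned latent vectors.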

<img src="img/recommendation_system6.png" style="width:500px;height:300px;">

### Content-based filtering
- Make recommendations based on content of media that user had already interacted with.

Two options for recommending media to user
1. Similarity with historical interactions.
2. Similarity between media and user profiles.
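
As one possible reading of the second option, the sketch below builds a user profile as the mean of the content vectors of the media the user has watched, then ranks unseen media by cosine similarity to that profile. The mean-based profile, the function names, and the toy genre vectors are all assumptions for illustration.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def rank_by_profile(media_vectors, watched_ids):
    """Rank unwatched media by similarity to the user's content profile.

    media_vectors: dict of media id -> content feature vector.
    watched_ids: non-empty set of media ids the user interacted with.
    """
    profile = np.mean([media_vectors[m] for m in watched_ids], axis=0)
    scores = {m: cosine(profile, v)
              for m, v in media_vectors.items() if m not in watched_ids}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical content vectors: [action, romance] genre weights.
media_vectors = {
    "action1": np.array([1.0, 0.0]),
    "action2": np.array([0.9, 0.1]),
    "romance1": np.array([0.0, 1.0]),
}
print(rank_by_profile(media_vectors, {"action1"}))
```

The first option (similarity with historical interactions) would instead compare each candidate directly against individual watched items rather than an aggregated profile.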

### Embedding-based similarity
- Use deep learning to generate latent vectors/embeddings to represent both movies and users.

## Training data generation

- User watched 80% or more of the movie? positive example
- User watched 10% or less of the movie? negative example
- Between 10% and 80%? uncertain example
- Make sure to downsample over-represented class.
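
The labeling thresholds and downsampling step can be sketched as below. The helper names and sample logs are hypothetical. Two choices here are assumptions rather than part of the notes: uncertain examples are simply skipped, and the larger class is downsampled to a 1:1 ratio.

```python
import random


def label_example(watch_fraction):
    """Return 1 (positive), 0 (negative), or None (uncertain) from the fraction watched."""
    if watch_fraction >= 0.8:
        return 1
    if watch_fraction <= 0.1:
        return 0
    return None


def downsample(examples, seed=0):
    """Randomly drop examples from the larger class until both classes match in size.

    examples: list of (features, label) pairs with label in {0, 1}.
    """
    rng = random.Random(seed)
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    size = min(len(positives), len(negatives))
    balanced = rng.sample(positives, size) + rng.sample(negatives, size)
    rng.shuffle(balanced)
    return balanced


# Hypothetical watch logs: (user_id, movie_id, fraction of the movie watched).
logs = [("u1", "m1", 0.95), ("u1", "m2", 0.05), ("u2", "m1", 0.50),
        ("u2", "m3", 0.02), ("u3", "m2", 0.00), ("u3", "m3", 0.85)]
labeled = [((u, m), label_example(f)) for u, m, f in logs]
labeled = [ex for ex in labeled if ex[1] is not None]   # assumption: skip uncertain examples
train = downsample(labeled)
print(train)
```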

## Ranking

- Predict the probability of the user watching each candidate media item.
- Focuses on precision.

### Logistic regression
- When training data is limited.

<img src="img/recommendation_system7.png" style="width:500px;height:300px;">

### Deep learning
- When a large amount of training data (on the order of 100M examples) is available.

Two sparse features to consider
1. Videos user watched in the past.
2. User's search terms.

How to feed these into network
<img src="img/recommendation_system8.png" style="width:1000px;height:700px;">

Start with 2-3 hidden layers with ReLU activations.
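
A rough forward-pass sketch of such a ranking network, assuming the two sparse features are turned into fixed-size vectors by averaging learned embeddings and then concatenated with the dense features. The averaging scheme, layer sizes, and all names are assumptions for illustration; the weights here are random and untrained, so the output is not a meaningful prediction.

```python
import numpy as np


def init_params(n_dense, n_videos, n_terms, emb_dim=8, hidden=16, seed=0):
    """Randomly initialize embeddings and weights for two ReLU hidden layers."""
    rng = np.random.default_rng(seed)
    in_dim = n_dense + 2 * emb_dim
    return {
        "video_emb": rng.normal(scale=0.1, size=(n_videos, emb_dim)),
        "term_emb": rng.normal(scale=0.1, size=(n_terms, emb_dim)),
        "W1": rng.normal(scale=0.1, size=(hidden, in_dim)), "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.1, size=(hidden, hidden)), "b2": np.zeros(hidden),
        "w_out": rng.normal(scale=0.1, size=hidden), "b_out": 0.0,
    }


def watch_probability(params, dense, watched_ids, search_ids):
    """Forward pass: averaged embeddings + dense features -> ReLU layers -> sigmoid.

    watched_ids and search_ids must be non-empty lists of integer ids.
    """
    watched_vec = params["video_emb"][watched_ids].mean(axis=0)   # videos watched in the past
    search_vec = params["term_emb"][search_ids].mean(axis=0)      # user's search terms
    x = np.concatenate([dense, watched_vec, search_vec])
    h1 = np.maximum(params["W1"] @ x + params["b1"], 0.0)         # hidden layer 1 (ReLU)
    h2 = np.maximum(params["W2"] @ h1 + params["b2"], 0.0)        # hidden layer 2 (ReLU)
    logit = params["w_out"] @ h2 + params["b_out"]
    return float(1.0 / (1.0 + np.exp(-logit)))                    # probability of a watch


params = init_params(n_dense=4, n_videos=100, n_terms=50)
dense = np.array([0.3, 1.0, 0.0, 0.7])      # hypothetical dense features
print(watch_probability(params, dense, watched_ids=[3, 17, 42], search_ids=[5, 9]))
```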

<img src="img/recommendation_system9.png" style="width:1000px;height:700px;">