# Types of Recommender Systems

In this notebook, we'll go through a brief overview of the types of recommender systems out there.

---
# 4 Types of Recommender Systems

1. Rank-Based
    - Recommend the most popular items to every user.
2. Knowledge-Based
    - Recommend items based on what the user has filtered, e.g. Adidas --> Shoes --> Ultraboost
3. Content-Based 
    - Recommend items based on the customer's previous purchases or browsing history, e.g. they browsed for a bike, so you recommend related items such as a helmet, bell, knee guards or wheels. (*Note that no rating given by the user to the item is used, which is what distinguishes this from Item-Based Collaborative Filtering.*)
4. Collaborative Filtering
    - Neighborhood-Based methods
        1. Define a similarity metric
            - Similarity:
                - Pearson's correlation coefficient (Cosine Similarity of Centered data)
                - Spearman's correlation coefficient
                - Kendall's Tau
            - Distance:
                - Euclidean Distance
                - Manhattan Distance
        2. Choose Perspective
            - User-Based
                - Recommend items based on what similar __users__ rated highly
            - Item-Based
                - Recommend __items__ similar to those that the user has rated highly
    - Model-Based methods
        1. Matrix Factorization
            - Singular Value Decomposition (SVD)
            $$ \large{\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^T} $$
                - $\mathbf{U}$ gives information about how users are related to latent features. 
                - $\boldsymbol{\Sigma}$ gives information about how much latent features matter towards recreating the user-item matrix. 
                - $\mathbf{V}^T$ gives information about how much each movie is related to latent features.
                    - FunkSVD (*for Matrices with Missing Values*)
                    $$\large{\mathbf{A} = \mathbf{U} \mathbf{V}^T}$$
                    $$J = \sum_{i,j} w_{i,j} \left( \mathbf{A}_{i,j} - u_i v_j^T \right)^2 + \lambda \left( ||\mathbf{U}||_F^2 + ||\mathbf{V}||_F^2 \right)$$
                    $$\text{where } w_{i,j} = \begin{cases} 1 & \text{if } \mathbf{A}_{i,j} \text{ is known} \\ 0 & \text{if } \mathbf{A}_{i,j} \text{ is unknown} \end{cases}$$
                    - $\mathbf{U}$ gives information about how users are related to latent features.
                    - $\mathbf{V}^T$ gives information about how much each movie is related to latent features.
                        1. Randomly Initialize ${\mathbf{U}}$ and ${\mathbf{V}}$
                        2. Use Optimization method
                            1. Gradient Descent to minimize the squared error between the known values of ${\mathbf{A}}$ and their predicted values, updating one known entry at a time (regularization term omitted) via
                                - `v_new = v_old + learn_rate*2*(actual - pred)*u_old`
                                - `u_new = u_old + learn_rate*2*(actual - pred)*v_old` 
                                    - Here `pred` is the dot product of the user's row of ${\mathbf{U}}$ and the movie's row of ${\mathbf{V}}$ (a column of ${\mathbf{V}^T}$). `u_old` is the existing value in the ${\mathbf{U}}$ matrix, and `v_old` is the corresponding value in the ${\mathbf{V}}$ matrix that it is multiplied by when computing that dot product.
                            2. Alternating Least Squares: fix either $\mathbf{U}$ or $\mathbf{V}$ and solve a regularized least-squares problem for the other. With one side fixed, e.g. $\mathbf{U}$, the reconstruction error $||\mathbf{A}-\mathbf{U}\mathbf{V}^T||_F^2 = \sum_{i,j}\left(\mathbf{A}_{i,j}-u_i v_j^T\right)^2$ (summed over the known entries) plus the $L_2$ penalty separates into one problem per $v_j$ of the form $||y - X\beta||_2^2 + \lambda ||\beta||_2^2$, which is Ridge Regression, therefore
                            $$\forall{u_i}: J\left({u_i}\right) = ||\mathbf{A}_{i,:} - u_i \mathbf{V}^T||_2^2 + \lambda \cdot ||u_i||_2^2$$
                            $$\forall{v_j}: J\left({v_j}\right) = ||\mathbf{A}_{:,j} - \mathbf{U} v_j^T||_2^2 + \lambda \cdot ||v_j||_2^2$$
            - Non-negative Matrix Factorization
        2. Neural Networks
            - Restricted Boltzmann Machines
            - Sequential Recommendation engines
                - GRU4Rec
                - Caser
                - TransRec
                - SASRec
                - [BERT4Rec](https://arxiv.org/pdf/1904.06690.pdf)
            - Deep Autoencoders
            - Variational Autoencoders
            - Neural Matrix Factorization
            - Deep Collaborative
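
The gradient descent updates listed under FunkSVD can be sketched in a few lines of NumPy. This is only a minimal illustration, not a reference implementation: the function name `funk_svd`, its parameters, the toy `ratings` matrix and the use of `NaN` to mark unknown ratings are all assumptions made for the example, and the regularization term is left out to match the update rules as written.

```python
import numpy as np

def funk_svd(ratings, n_latent=2, learn_rate=0.01, n_iters=500, seed=0):
    """Factor `ratings` (users x movies, NaN = unknown) into U @ V.T.

    Hypothetical helper: trains only on the known entries, one at a time.
    """
    rng = np.random.default_rng(seed)
    n_users, n_movies = ratings.shape
    # Step 1: randomly initialize U and V.
    U = rng.random((n_users, n_latent))
    V = rng.random((n_movies, n_latent))
    # (user, movie) index pairs for which a rating is known.
    known = np.argwhere(~np.isnan(ratings))

    # Step 2: gradient descent on the squared error of each known rating.
    for _ in range(n_iters):
        for i, j in known:
            error = ratings[i, j] - U[i] @ V[j]      # actual - pred
            u_old = U[i].copy()                      # keep u_old for the V update
            U[i] += learn_rate * 2 * error * V[j]    # u_new = u_old + lr*2*err*v_old
            V[j] += learn_rate * 2 * error * u_old   # v_new = v_old + lr*2*err*u_old
    return U, V

# Toy user-movie matrix (made up for illustration); NaN marks a missing rating.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])

U, V = funk_svd(ratings)
predicted = U @ V.T                      # includes predictions for the NaN cells
mask = ~np.isnan(ratings)
mse = np.mean((ratings[mask] - predicted[mask]) ** 2)
print("MSE on known ratings:", round(float(mse), 4))
```

The cells of `predicted` where `ratings` is `NaN` are the values you would rank to make recommendations for that user.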

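The Alternating Least Squares option can be sketched the same way: with one factor matrix fixed, each row of the other has a closed-form ridge regression solution, $(X^T X + \lambda I)^{-1} X^T y$, computed over that row's known ratings only. The function name `als`, its parameters, the toy matrix and the `NaN` convention for unknown ratings are assumptions for this example, not part of any particular library.

```python
import numpy as np

def als(ratings, n_latent=2, lam=0.1, n_iters=50, seed=0):
    """Hypothetical ALS sketch: alternate ridge solves for rows of U and V."""
    rng = np.random.default_rng(seed)
    n_users, n_movies = ratings.shape
    known = ~np.isnan(ratings)                 # True where a rating exists
    U = rng.random((n_users, n_latent))
    V = rng.random((n_movies, n_latent))
    reg = lam * np.eye(n_latent)               # lambda * I from the L2 penalty

    for _ in range(n_iters):
        # Fix V, solve a ridge regression for every user row u_i.
        for i in range(n_users):
            X = V[known[i]]                    # latent features of movies user i rated
            y = ratings[i, known[i]]           # the ratings user i gave
            U[i] = np.linalg.solve(X.T @ X + reg, X.T @ y)
        # Fix U, solve a ridge regression for every movie row v_j.
        for j in range(n_movies):
            X = U[known[:, j]]                 # latent features of users who rated movie j
            y = ratings[known[:, j], j]        # the ratings movie j received
            V[j] = np.linalg.solve(X.T @ X + reg, X.T @ y)
    return U, V

# Toy user-movie matrix (made up for illustration); NaN marks a missing rating.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 1.0, 4.0, 5.0],
])

U, V = als(ratings)
predicted = U @ V.T
mask = ~np.isnan(ratings)
mse = np.mean((ratings[mask] - predicted[mask]) ** 2)
print("MSE on known ratings:", round(float(mse), 4))
```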
---
# Cold Start Problem

Collaborative filtering using FunkSVD still isn't helpful for new users and new movies, because neither has any ratings from which to learn latent features. To make recommendations in these cases, we need to combine content-based and rank-based recommendations with FunkSVD.
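
One simple way to combine the approaches is a fallback rule: use the factorization's predictions for users it has seen, and the rank-based list for everyone else. The sketch below covers only the new-user half of the problem; the function `recommend`, the user names and all numbers are invented for illustration, and new movies would need a content-based step that is not shown here.

```python
import numpy as np

def recommend(user_id, known_users, predicted_ratings, popularity_rank, top_n=3):
    """Hypothetical hybrid: model-based for known users, rank-based otherwise."""
    if user_id in known_users:
        # Known user: rank movies by the predicted rating from the factorization.
        row = predicted_ratings[known_users[user_id]]
        return [int(j) for j in np.argsort(-row)[:top_n]]
    # Cold start: no latent features exist, so fall back to the most popular movies.
    return popularity_rank[:top_n]

# Made-up data: row index per known user, and predicted ratings (e.g. U @ V.T).
known_users = {"alice": 0, "bob": 1}
predicted_ratings = np.array([
    [4.5, 1.2, 3.3, 2.0],
    [1.0, 4.8, 2.1, 3.9],
])
# Made-up number of ratings per movie, used as the popularity signal.
rating_counts = np.array([120, 45, 300, 80])
popularity_rank = [int(j) for j in np.argsort(-rating_counts)]

print(recommend("alice", known_users, predicted_ratings, popularity_rank, top_n=2))
print(recommend("carol", known_users, predicted_ratings, popularity_rank, top_n=2))
```

A fuller version would also filter out movies the user has already rated before taking the top `top_n`.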

---
# Goals of Recommender Systems

1. Diversity
2. Coverage
3. Serendipity
4. Novelty

---
## Resources

- [FunkSVD]()
- [Singular Values ~ **SQRT**(Eigenvalues)](https://math.stackexchange.com/questions/127500/what-is-the-difference-between-singular-value-and-eigenvalue)
- [Why Eigenvalues are Variances along its Eigenvector](https://math.stackexchange.com/questions/2147211/why-are-the-eigenvalues-of-a-covariance-matrix-equal-to-the-variance-of-its-eige)
- [Recommender Systems are not all about accuracy](https://gab41.lab41.org/recommender-systems-its-not-all-about-the-accuracy-562c7dceeaff)
- [Overview of why we use matrix factorization to solve co-clustering](https://datasciencemadesimpler.wordpress.com/tag/alternating-least-squares/)
- [ALS Recommender System](https://towardsdatascience.com/prototyping-a-recommender-system-step-by-step-part-2-alternating-least-square-als-matrix-4a76c58714a1)
- [Collaborative filtering](https://www.youtube.com/watch?v=wDxTWp3KMMs)
- [Collaborative Variational Autoencoder for Recommender Systems](http://eelxpeng.github.io/assets/paper/Collaborative_Variational_Autoencoder.pdf)
- [How Variational Autoencoders make classical recommender systems obsolete.](https://medium.com/snipfeed/how-variational-autoencoders-make-classical-recommender-systems-obsolete-4df8bae51546)