# WRIT 340E: Advanced Writing for Engineers

---
# Illumin Article

### Journey into Recommender Systems - How does Netflix / Spotify / Target / Amazon know what to recommend me and what's next for the future of recommendations?

By: Chengyi (Jeff) Chen

### Abstract

*Recommender systems (as the name discreetly suggests) fall under a category of systems that filter information from consumers to suggest the products with the maximum likelihood of being purchased. As the amount of data attainable from consumers explodes, better and more personalized recommendations can be, and have been, built to increase the likelihood of purchase even more. This article seeks to elucidate the world of recommendation systems and how they have grown to make more personalized recommendations for users of large corporations such as Netflix, Spotify, Target, Amazon, and YouTube.*

### Recommender Systems

Your Generic Coffee Shop Owner Bill: "*Morning Tom! Your usual Americano?*"

Tom: "*Yes please!*"

Your Generic Coffee Shop Owner Bill: "*How about a sandwich to pair with your drink?*"

Recommendations are everywhere. And where there exist recommendations, there must exist a system to make those recommendations. Why didn't Bill recommend another Americano to Tom? Or maybe another drink like an espresso? Maybe Bill has identified that drinks and sandwiches are complements and tend to "go together" as observed from past transactions. But we can't let Bill do all the work, right? How do Netflix, Spotify, Target, Amazon, YouTube, and many others do it? Let us discover how the applications and platforms that we all love make recommendations, because these recommendations have a huge impact on our user experience.

### Popularity

Let's examine YouTube's landing page and see if there're any clues as to what methods they're using to recommend videos:

<img src='./data/youtube.png' style='border: 5px solid black; border-radius: 5px;'/>

We observe that for a new user, YouTube recommends the most popular videos. This is probably the simplest recommender system we can build. However, the definition of "popularity" differs between companies and across products. In the eyes of YouTube, the most "popular" videos might be the ones with the highest number of views, the videos with the most likes, or the videos that have the most shares over the past week or month (note that time is indeed an important factor in making recommendations too, and we'll address more complex ways to handle this temporal dependency later on when we talk about model-based recommendation systems). This method, however, has several disadvantages. Most importantly, there is a lack of personalization in these recommendations. What if you're a biology researcher who only watches YouTube for videos about DNA sequencing? "*Superfan Brad Pitt Distracts Ellen While Sitting in the...*" might be highly irrelevant to you. Space on a landing page is a highly valuable piece of real estate that you want to fill with videos that the user is likely to click on. In the eyes of Netflix, "Personalization enables us to find an audience even for relatively niche videos that would not make sense for broadcast TV models because their audiences would be too small to support significant advertising revenue, or to occupy a broadcast or cable channel time slot" [1]. So how do we make more personalized recommendations?
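As a point of reference, the popularity baseline described above can be sketched in a few lines. This is only an illustration under stated assumptions: "popularity" is taken to mean raw view count, and the view log and the `most_popular` helper are hypothetical, not YouTube's actual method.

```python
from collections import Counter

# Hypothetical view log: one (user, video) pair per view.
views = [
    ("ann", "cat_video"),
    ("bob", "cat_video"),
    ("cat", "cat_video"),
    ("ann", "dna_sequencing"),
    ("bob", "talk_show_clip"),
    ("cat", "talk_show_clip"),
]


def most_popular(view_log, k):
    """Return the k most-viewed video ids, most viewed first."""
    counts = Counter(video for _, video in view_log)
    return [video for video, _ in counts.most_common(k)]


# Assumption: popularity is the raw view count over the whole log.
top_videos = most_popular(views, k=2)
print(top_videos)
```

Note that the function never looks at who is asking: every user receives the same list, which is exactly the lack of personalization discussed above.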

### Collaborative Filtering

Let's find a way to make more personalized recommendations to our users. To do this, we'll use **Collaborative Filtering**, which asserts that users who are similar purchase similar products; the users' collective behavior drives the recommendations, hence **collaborative**. Our job can be approached from 2 perspectives - **User-based** or **Item-based**. Imagine we have 2 very similar users, Tom and Harry. **User-based** recommendations would involve recommending Harry new products that Tom had given positive reviews for. **Item-based** recommendations would involve recommending items similar to the ones that Harry had reviewed positively.

#### Memory-based Techniques: User-based and Item-based

But how do we find "similar" users or items? To understand how we'll quantify "similarity", let's take a look at the type of data we'll have when trying to build a recommendation system.

$$
\begin{aligned}
X_{n\times m} &= 
\underbrace{
\begin{bmatrix}
x_{11} & x_{12} & \ldots & x_{1j} & \ldots & x_{1m} \\ 
x_{21} & x_{22} &        &        &        &        \\ 
x_{31} &        & \ddots &        &        &        \\ 
\vdots &        &        & x_{ij} &        & \vdots \\ 
       &        &        &        & \ddots &        \\ 
x_{n1} &        & \ldots &        &        & x_{nm} \\ 
\end{bmatrix}
}_{n\,\text{rows (users)}\,\times\,m\,\text{columns (items)}}
\end{aligned}
$$

Suppose that we're trying to build a simple recommendation system for Amazon. Each row of the matrix above would represent a user, while each column would represent a product on Amazon that the user **may or may not have rated**. The entries within the matrix are then the ratings / reviews a user has given for a specific product. Our job now is to define a **similarity metric**, a way to score how similar each pair of users or products is. A very common **similarity metric** is the cosine similarity, which is given by:

$$
\begin{aligned}
\cos\theta &= \frac{\mathbf{\vec{a}} \cdot \mathbf{\vec{b}}}{\vert\vert\mathbf{\vec{a}}\vert\vert\vert\vert\mathbf{\vec{b}}\vert\vert}
\end{aligned}
$$

For example, to implement User-based collaborative filtering, for every possible pair of users $a$ and $b$ (represented by the row vectors $\mathbf{\vec{a}} = \begin{bmatrix} x_{a1} & x_{a2} & x_{a3} & \ldots & x_{am} \end{bmatrix}$ and $\mathbf{\vec{b}} = \begin{bmatrix} x_{b1} & x_{b2} & x_{b3} & \ldots & x_{bm} \end{bmatrix}$), the cosine similarity is as follows:

$$
\begin{aligned}
\cos(\theta) &= \frac{\begin{bmatrix} x_{a1} & x_{a2} & x_{a3} & \ldots & x_{am} \end{bmatrix} \cdot \begin{bmatrix} x_{b1} \\ x_{b2} \\ x_{b3} \\ \vdots \\ x_{bm} \end{bmatrix}}{\sqrt{x_{a1}^2 + x_{a2}^2 + x_{a3}^2 + \ldots + x_{am}^2}\sqrt{x_{b1}^2 + x_{b2}^2 + x_{b3}^2 + \ldots + x_{bm}^2}} \\
&= \frac{x_{a1}x_{b1} + x_{a2}x_{b2} + x_{a3}x_{b3} + \ldots + x_{am}x_{bm}}{\sqrt{x_{a1}^2 + x_{a2}^2 + x_{a3}^2 + \ldots + x_{am}^2}\sqrt{x_{b1}^2 + x_{b2}^2 + x_{b3}^2 + \ldots + x_{bm}^2}}
\end{aligned}
$$

Following the steps above, we're checking whether the vectors (users) point in the same direction. The smaller the angle $\theta$ between the two vectors (in the image below, $\theta < 90^\circ$ between the two lines), the more similar the 2 users are, since $\cos\theta$ approaches 1 as $\theta$ approaches $0^\circ$.

<img src='https://i2.wp.com/dataaspirant.com/wp-content/uploads/2015/04/cosine.png' style='border: 5px solid black; border-radius: 5px;'/>
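The cosine similarity formula above translates almost directly into NumPy. The sketch below is illustrative only: the rating vectors are made up, the user Sally is hypothetical, and encoding an unrated product as 0 is an assumption rather than something the formula requires.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm_product = np.linalg.norm(a) * np.linalg.norm(b)
    if norm_product == 0.0:
        # Assumption: a user with no ratings is treated as similar to nobody.
        return 0.0
    return float(a @ b / norm_product)


# Hypothetical ratings over the same 4 products (0 = not rated).
tom = [5, 4, 0, 1]
harry = [4, 5, 0, 1]
sally = [0, 1, 5, 4]

sim_tom_harry = cosine_similarity(tom, harry)  # 41 / 42, close to 1
sim_tom_sally = cosine_similarity(tom, sally)  # 8 / 42, much smaller
print(sim_tom_harry, sim_tom_sally)
```

Tom and Harry rate the same products highly, so their vectors point in nearly the same direction and the score is close to 1, while Tom and Sally barely overlap and score much lower.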

After calculating the cosine similarity for each pair of users, we store them in **memory**, which is why we categorize user-based and item-based collaborative filtering as memory-based techniques. When we want to recommend a product to a user like Harry, we'll sort all the other users by their cosine similarity to Harry in decreasing order. We then go through the most similar users, also known as Harry's **nearest neighbors**, one at a time, find new products that they have reviewed positively, and recommend them to Harry [3].
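The nearest-neighbor procedure described above could look roughly like the following. Everything here is a sketch under assumptions: the ratings are invented, 0 stands for "not rated", a rating of 4 or more counts as a positive review, and the `recommend_for` helper with its `k` and `like_threshold` parameters is hypothetical.

```python
import numpy as np

# Hypothetical ratings (rows: users, columns: items); 0 marks "not rated".
users = ["harry", "tom", "sally"]
items = ["americano", "espresso", "sandwich", "muffin"]
ratings = np.array([
    [5, 4, 0, 0],  # harry
    [5, 5, 4, 1],  # tom
    [1, 0, 2, 5],  # sally
], dtype=float)


def recommend_for(target, ratings, k=1, like_threshold=4.0):
    """Indices of unrated items liked by the target's k nearest neighbors.

    Assumes every user has rated at least one item (non-zero row norm).
    """
    norms = np.linalg.norm(ratings, axis=1)
    sims = ratings @ ratings[target] / (norms * norms[target])
    # Other users sorted by decreasing cosine similarity to the target.
    neighbors = [u for u in np.argsort(-sims) if u != target][:k]
    recommended = []
    for u in neighbors:
        for item in np.flatnonzero(ratings[u] >= like_threshold):
            if ratings[target, item] == 0 and item not in recommended:
                recommended.append(int(item))
    return recommended


harry = users.index("harry")
recommendations = [items[i] for i in recommend_for(harry, ratings, k=1)]
print(recommendations)
```

With `k=1`, Harry's only neighbor is Tom, so the sketch suggests the one item Tom liked that Harry has not rated. Raising `k` pulls in less similar users and, with them, less reliable suggestions.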

There is, however, a problem of **Scalability** with these memory-based collaborative filtering techniques. Because the number of computations we have to handle grows with both the number of users and the number of products, we'll need a more scalable solution [11].
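To make that growth concrete, here is a rough count. It assumes the naive approach in which every pair of the $n$ users is compared once and each comparison touches all $m$ products:

$$
\begin{aligned}
\text{number of user pairs} &= \binom{n}{2} = \frac{n(n-1)}{2} \\
\text{total multiplications} &\approx \frac{n(n-1)}{2} \cdot m
\end{aligned}
$$

With $n = 1{,}000$ users that is $499{,}500$ pairs, but with $n = 1{,}000{,}000$ users it is already about $5 \times 10^{11}$ pairs, before multiplying by the number of products $m$.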

#### Model-based Techniques: Matrix Factorization and Temporal

Introducing Singular Value Decomposition (SVD):

$$
\begin{aligned}
X_{n\times m} 
&= 
U_{n\times r} \cdot S_{r\times r} \cdot V^\top_{r\times m} \\
\begin{bmatrix}
x_{11} & x_{12} & x_{1j} & \ldots & x_{1m} \\ 
x_{21} & x_{22} &        &        &        \\ 
x_{31} &        & x_{ij} &        &        \\ 
\vdots &        &        & \ddots & \vdots \\
x_{n1} &        &        &        & x_{nm} \\ 
\end{bmatrix} 
&=
\begin{bmatrix}
u_{11} & u_{12} & u_{1j} & \ldots & u_{1r} \\ 
u_{21} & u_{22} &        &        &        \\ 
u_{31} &        & u_{ij} &        &        \\ 
\vdots &        &        & \ddots & \vdots \\
u_{n1} &        &        &        & u_{nr} \\ 
\end{bmatrix} 
\cdot
\begin{bmatrix}
s_{11} & 0      & 0      & \ldots & 0      \\ 
0      & s_{22} &        &        &        \\ 
0      &        & s_{ii} &        &        \\ 
\vdots &        &        & \ddots & \vdots \\
0      &        &        &        & s_{rr} \\ 
\end{bmatrix} 
\cdot
\begin{bmatrix}
v_{11} & v_{12} & v_{1j} & \ldots & v_{1m} \\ 
v_{21} & v_{22} &        &        &        \\ 
v_{31} &        & v_{ij} &        &        \\ 
\vdots &        &        & \ddots & \vdots \\
v_{r1} &        &        &        & v_{rm} \\ 
\end{bmatrix} 
\end{aligned}
$$

SVD is a matrix factorization technique that splits a matrix $X$ into 3 others: $U$ (whose row vectors represent users), $S$ (a matrix full of zeroes except for the entries on its diagonal), and $V^\top$ (whose column vectors represent products, items, movies, etc.). The most crucial matrix is $S$: its diagonal entries are the singular values, which indicate how much each of the $r$ latent factors contributes to reconstructing $X$.
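The factorization can be tried out with NumPy's `np.linalg.svd`. This is a minimal sketch, not a production recommender: the ratings matrix is made up, it is assumed to have no missing entries, and the choice of keeping `k = 2` factors is arbitrary.

```python
import numpy as np

# Hypothetical ratings matrix X (4 users x 3 items), assumed fully observed.
X = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Reduced SVD: U is 4 x 3, s holds the 3 singular values, Vt is 3 x 3.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
S = np.diag(s)

# Multiplying the three factors back together recovers X.
X_full = U @ S @ Vt

# Keeping only the k largest singular values gives a low-rank approximation.
k = 2
X_approx = U[:, :k] @ S[:k, :k] @ Vt[:k, :]

print(np.round(s, 3))
print(np.round(X_approx, 2))
```

NumPy returns the singular values in decreasing order, so slicing the first `k` keeps the most important factors, and the approximation error (in the Frobenius norm) equals the singular value that was dropped.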

However, there are still some issues that come with this approach. Remember how we said at the start that users **may or may not** have rated a particular item? On platforms such as Amazon, the number of products that a user has not rated is overwhelmingly larger than the number of products they have purchased. This is also known as **Sparsity**. Because of how unlikely it is to find two users who have both purchased very similar items, given the sheer mass of products available for purchase on Amazon, the idea of using "similarity" between users or items degrades in its usefulness. Another reason why **Sparsity** can be problematic can be observed in the case of movies - "... when the movies are not too popular or are just released, then the items will have few ratings or will not have at all. Therefore, for an algorithm to find the nearest neighbor and create a recommendation for a user will be extremely difficult, and the accuracy of the output will be really low" [11].
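Sparsity can be put into numbers as the fraction of the ratings matrix that is empty. The small matrix below is invented for illustration, and marking unrated products with `np.nan` is an assumption about how the data is stored.

```python
import numpy as np

# Hypothetical ratings matrix; np.nan marks a product the user never rated.
ratings = np.array([
    [5.0, np.nan, np.nan, np.nan, 1.0],
    [np.nan, 4.0, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, 2.0, np.nan],
])

# Sparsity: share of (user, product) cells that hold no rating.
observed = np.count_nonzero(~np.isnan(ratings))
sparsity = 1.0 - observed / ratings.size

# Products rated by BOTH user 0 and user 1; similarity needs this overlap.
co_rated = ~np.isnan(ratings[0]) & ~np.isnan(ratings[1])

print(f"sparsity = {sparsity:.2f}, co-rated products = {co_rated.sum()}")
```

Here 11 of the 15 cells are empty, and the first two users share no rated product at all, so there is nothing for a similarity metric to compare.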

### Content-based Filtering

### Hybrid Filtering

### Future of Recommender Systems

We also need to care about the serendipity of the recommendations - no one wants to watch the same video for eternity, even if it's the most popular.

### References

[1] Carlos Gomez-Uribe, Neil Hunt. (2015). *The Netflix Recommender System: Algorithms, Business Value, and Innovation* [Online]. Available: http://delivery.acm.org/10.1145/2850000/2843948/a13-gomez-uribe.pdf?ip=68.180.70.23&id=2843948&acc=OA&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2EE5B8A747884E71D5&__acm__=1567825268_41bb18002e4f22845f3426c1f2b54cd9
- This source will be referenced to explain the details of how the Netflix Recommendation System works and what Netflix has done to continue improving on the model.

[2] Yehuda Koren. (2010). *Factor in the neighbors: Scalable and accurate collaborative filtering* [Online]. Available: http://courses.ischool.berkeley.edu/i290-dm/s11/SECURE/a1-koren.pdf
- This source will be referenced in the section on *The Future of Recommender Systems* when we talk about how critical it is for recommendation systems to scale in size as the number of users grows.

[3] Leidy Esperanza Molina Fernandez. (2018). *Recommendation System for Netflix* [Online]. Available: https://beta.vu.nl/nl/Images/werkstuk-fernandez_tcm235-874624.pdf
- A bulk of the content with regards to traditional recommender systems and how they work will come from here.

[4] Pedro G. Campos, Fernando Díez, Iván Cantador. (2013). *Time-aware recommender systems: a comprehensive survey and analysis of existing evaluation protocols* [Online]. Available: https://link.springer.com/article/10.1007/s11257-012-9136-x
- To be used in the section on *Model-based Techniques* to describe how some recommendation systems also account for timings between user interactions with the website to generate different recommendations.

[5] Matthew Hindman. (2018). *The Internet Trap: How the Digital Economy Builds Monopolies and Undermines Democracy* [Online]. Available: https://www.jstor.org/stable/j.ctv36zrf8
- To be used in the abstract to give context about the significance of recommendation systems and how multinational conglomerates are using these to "*grab all the profits from the attention economy*".

[6] Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay. (2018). *Deep Learning based Recommender System: A Survey and New Perspectives* [Online]. Available: https://arxiv.org/pdf/1707.07435.pdf
- To be referenced in the section on *The Future of Recommender Systems* to explain the state-of-the-art methods of recommending products.

[7] Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu. (2016). *Wavenet: A Generative Model for Raw Audio* [Online]. Available: https://arxiv.org/pdf/1609.03499.pdf
- To be referenced in the section on *The Future of Recommender Systems* to explain the state-of-the-art methods of recommending products.

[8] Balazs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk. (2016). *Session-based Recommendations with Recurrent Neural Networks* [Online]. Available: https://arxiv.org/pdf/1511.06939.pdf
- To be referenced in the section on *The Future of Recommender Systems* to explain the state-of-the-art methods of recommending products that uses session data from users.

[9] Paul Covington, Jay Adams, Emre Sargin. (2016). *Deep Neural Networks for YouTube Recommendations* [Online]. Available: https://www.semanticscholar.org/paper/Deep-Neural-Networks-for-YouTube-Recommendations-Covington-Adams/760948698540118031e590fbc884fcea209f9104
- To be referenced in the abstract to explain how YouTube recommends videos to its users.

[10] Michael Jahrer, Andreas Töscher, Robert Legenstein. (2010). *Combining Predictions for Accurate Recommender Systems* [Online]. Available: http://elf-project.sourceforge.net/CombiningPredictionsForAccurateRecommenderSystems.pdf
- To be referenced in *The Future of Recommender Systems* to explain another technique to improve on recommendations by combining the results from multiple recommendation models.

[11] Badrul Sarwar et al. (2001). *Item-based Collaborative Filtering Recommendation Algorithms* [Online]. Available: https://dl.acm.org/citation.cfm?doid=371920.372071