# Introduction

In content-based recommendation systems, we saw a simple product recommender built from the features of each item. The defining characteristic of a content-based system is that the model built for each user does not depend on other users, only on the profile of each item. This saves memory and computing time. The system can also exploit the information in each item's description, which can be written by the supplier or collected by asking users to tag items. Building a feature vector for each item often involves Natural Language Processing (NLP) techniques.

This approach has two basic disadvantages:
- When building a model for a single user, content-based systems do not take advantage of information from other users. This information is often useful because users' purchasing behavior tends to fall into a few simple groups; if the purchasing behavior of a few users in a group is known, the system should be able to infer the behavior of the remaining users.
- We do not always have a description for every item. Asking users to tag items is even harder, because not everyone is willing to do it, and those who do tag according to personal preference. The NLP algorithms involved are also more complicated because they have to deal with synonyms, abbreviations, misspellings, and words written in different languages.

These disadvantages can be addressed by Collaborative Filtering (CF). This article presents a CF method called Neighborhood-based Collaborative Filtering (NBCF). The next article will present another CF method, Matrix Factorization Collaborative Filtering. From here on, "Collaborative Filtering" without further qualification refers to the neighborhood-based method.

The basic idea of NBCF is to estimate a user's interest in an item from other users who are similar to this user. The similarity between two users can be determined from their interest in other items that the system already knows about. For example, suppose A and B both like the movie Criminal Police, meaning they both rate it 5 stars. If we know that A also likes The Judge, then B is likely to like that movie as well.

As you can imagine, the two most important questions in a Neighborhood-based Collaborative Filtering system are:
- How do we determine the similarity between two users?
- Once similar users are identified, how do we predict a user's interest in an item?
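
One common way to answer the second question, given here only as a sketch and not necessarily the formula this article settles on, is a similarity-weighted average over the most similar users who rated the item. The symbols $\hat{y}_{i,u}$ (predicted rating of user $u$ for item $i$), $y_{i,u_j}$ (a known rating), and $\mathcal{N}(u, i)$ (the set of neighbors of $u$ who rated $i$) are notation assumed for this illustration:

$$\hat{y}_{i,u} = \frac{\sum_{u_j \in \mathcal{N}(u, i)} sim(u, u_j)\, y_{i,u_j}}{\sum_{u_j \in \mathcal{N}(u, i)} |sim(u, u_j)|}$$

The absolute value in the denominator keeps the weights well behaved when some similarities are negative.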

# User-user Collaborative Filtering

### Similarity function

The first and most important task in User-user Collaborative Filtering is to determine the similarity between two users. The only data we have is the utility matrix $Y$, so this similarity must be computed from the two columns of $Y$ that correspond to the two users.

![image.png](attachment:image.png)

Suppose there are users $u_0$ to $u_6$ and items $i_0$ to $i_4$; each number in the matrix is the rating a user gave to an item.
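
As a minimal sketch of this layout, the snippet below stores a small utility matrix with items as rows and users as columns, and pulls out the column of a given user. The values are made up for illustration and are not the ones in the figure; the helper names `user_column` and `co_rated` and the use of `None` for an unknown rating are assumptions of this sketch.

```python
# Hypothetical toy utility matrix Y: rows are items, columns are users.
# None marks a rating the system does not know yet.
Y = [
    [5, 5, 2],        # i0
    [4, None, None],  # i1
    [None, 4, 1],     # i2
]


def user_column(Y, u):
    """Return the ratings of user u, i.e. column u of Y, as a list."""
    return [row[u] for row in Y]


def co_rated(Y, u, v):
    """Return the indices of items that both user u and user v have rated."""
    return [
        i for i, row in enumerate(Y)
        if row[u] is not None and row[v] is not None
    ]


print(user_column(Y, 0))
print(co_rated(Y, 0, 1))
```

Any similarity between two users can only draw on these columns, and in practice the items both users have rated carry most of the signal.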

In the figure, a first observation is that $u_0$ and $u_1$ like $i_0$, $i_1$, $i_2$ and show little interest in $i_3$, $i_4$, in contrast to the other users. A good similarity function should therefore ensure:

$$sim(u_0, u_1) > sim(u_0, u_i), \forall i > 1$$
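
The text so far does not fix a particular similarity function. One common choice, used here purely as an assumption, is cosine similarity on mean-centered ratings, with unknown ratings treated as 0 after centering. The sketch below checks the inequality above on a hypothetical toy matrix (made-up values, not the ones in the figure); the names `ratings`, `center`, `cosine`, and `sim` are illustrative.

```python
import math

# Hypothetical ratings: one list per user, one entry per item (i0..i4).
# None marks an item the user has not rated.
ratings = {
    "u0": [5, 4, None, 2, 2],
    "u1": [5, None, 4, 2, 0],
    "u2": [2, None, 1, 3, 4],
    "u3": [0, 0, None, 4, None],
}


def center(column):
    """Subtract the user's mean rating; unknown ratings become 0."""
    rated = [r for r in column if r is not None]
    mean = sum(rated) / len(rated)
    return [0.0 if r is None else r - mean for r in column]


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sim(u, v):
    """Similarity between two users from their mean-centered columns."""
    return cosine(center(ratings[u]), center(ratings[v]))


for other in ("u1", "u2", "u3"):
    print(other, round(sim("u0", other), 3))
```

With these toy values, `sim("u0", "u1")` is positive while the similarities to `u2` and `u3` are negative, which is the ordering the inequality asks for.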


Because $u_0$ and $u_1$ are similar and $u_1$ likes $i_2$, the system should recommend $i_2$ to $u_0$.
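
This last step can be sketched as follows, assuming the similarities to the target user are already available. Everything here is hypothetical: the ratings, the similarity scores, the `like_threshold` of 4 stars, and the function name `recommend_from_nearest` are chosen only to illustrate the idea of borrowing liked items from the most similar user.

```python
# Hypothetical known ratings, stored per user as {item: rating}.
ratings = {
    "u0": {"i0": 5, "i1": 4, "i3": 2, "i4": 2},
    "u1": {"i0": 5, "i2": 4, "i3": 2, "i4": 0},
    "u2": {"i0": 2, "i2": 1, "i3": 3, "i4": 4},
}

# Hypothetical similarity scores between u0 and the other users.
similarity_to_u0 = {"u1": 0.8, "u2": -0.6}


def recommend_from_nearest(target, ratings, sims, like_threshold=4):
    """Suggest items the most similar user liked and the target has not rated."""
    neighbor = max(sims, key=sims.get)  # user with the highest similarity
    return [
        item
        for item, rating in ratings[neighbor].items()
        if rating >= like_threshold and item not in ratings[target]
    ]


print(recommend_from_nearest("u0", ratings, similarity_to_u0))
```

Relying on a single neighbor keeps the sketch short; a real system would usually combine several neighbors rather than trust one user's taste.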