# Introduction

The basic models for recommender systems work with two kinds of data:
- (i) user-item interactions, such as ratings or buying behavior, which are used by **collaborative filtering methods**, and
- (ii) attribute information about the users and items, such as textual profiles or relevant keywords, which is used by **content-based recommender methods**.

Two further families build on or depart from these:
- In **knowledge-based recommender systems**, the recommendations are based on explicitly specified user requirements. Instead of using historical rating or buying data, external knowledge bases and constraints are used to create the recommendation.
- **Hybrid systems** combine the strengths of various types of recommender systems to create techniques that perform more robustly in a wide variety of settings.

# Content-based Recommender Systems

In content-based recommender systems, the descriptive attributes of items are used to make recommendations. The item descriptions, which are labeled with ratings, are used as training data to create a user-specific classification or regression modeling problem. For each user, the training documents correspond to the descriptions of the items she has bought or rated.
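
As a minimal sketch of this idea, the snippet below fits a per-user ridge regression from item keyword vectors to that user's ratings and then scores two items that nobody has rated. The keyword columns, the ratings, the `fit_user_model` helper, and the choice of ridge regression are illustrative assumptions, not something prescribed by these notes.

```python
import numpy as np

# Hypothetical item descriptions as keyword indicator vectors
# (columns: action, comedy, romance, sci-fi).
item_features = np.array([
    [1, 0, 0, 1],  # item 0
    [1, 0, 0, 0],  # item 1
    [0, 1, 1, 0],  # item 2
    [0, 0, 1, 0],  # item 3
    [1, 0, 0, 1],  # item 4 (new item, no ratings from anyone)
    [0, 1, 1, 0],  # item 5 (new item, no ratings from anyone)
], dtype=float)

# Ratings given by ONE target user to items 0..3 (assumed data).
rated_items = np.array([0, 1, 2, 3])
ratings = np.array([5.0, 4.0, 1.0, 2.0])

def fit_user_model(X, y, reg=1.0):
    """Ridge regression on mean-centered ratings for a single user."""
    bias = y.mean()
    k = X.shape[1]
    # Solve (X^T X + reg * I) w = X^T (y - bias)
    w = np.linalg.solve(X.T @ X + reg * np.eye(k), X.T @ (y - bias))
    return w, bias

w, bias = fit_user_model(item_features[rated_items], ratings)
predicted = item_features @ w + bias

# The unrated items are scored purely from their attributes.
print(np.round(predicted[[4, 5]], 2))  # item 4 -> 4.4, item 5 -> 1.6
```

Only the target user's own ratings enter the model, which is why new items can be scored but new users cannot.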

ADVANTAGES:
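- New items: recommendations can be made for a new item even when sufficient rating data are not yet available for that item.
- Transparency: a recommendation can be explained by the item attributes that matched the user's profile.
- User independence: no data from other users is needed, because the model is built only from the target user's own ratings.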

Popular for: **cold-start problems**
- One of the major problems in recommender systems is that the **number of initially available ratings is relatively small**. In such cases, it becomes more difficult to apply traditional collaborative filtering models. Content-based models are more robust than collaborative models in the presence of cold starts.

DISADVANTAGES: 
- Limited to content: content-based methods provide obvious recommendations because of the use of keywords or content.
- Over-specified: the constructed model is specific to the user at hand, and the community knowledge from similar users is not leveraged. This phenomenon tends to reduce the diversity of the recommended items, which is undesirable.
- New users: content-based methods are not effective at providing recommendations for new users. The training model for the target user needs to use the history of her ratings. In fact, it is usually important to have a large number of ratings available for the target user in order to make robust predictions without overfitting.

# Collaborative filtering

Collaborative filtering models use the collaborative power of the ratings provided by multiple users to make recommendations. 

The basic idea of collaborative filtering methods is that the unspecified entries of the ratings matrix can be imputed because the observed ratings are often highly correlated across various users and items. This correlation can be used to make inferences about the incompletely specified values. Most of the models for collaborative filtering focus on leveraging either inter-item correlations or inter-user correlations for the prediction process. Some models use both types of correlations. Furthermore, some models use carefully designed optimization techniques to create a training model in much the same way a classifier creates a training model from the labeled data. This model is then used to impute the missing values in the matrix, in the same way that a classifier imputes the missing test labels. There are two types of methods that are commonly used in collaborative filtering:
- **Memory-based methods** or *neighborhood-based collaborative filtering algorithms*: ratings of user-item combinations are predicted on the basis of their neighborhoods. These neighborhoods can be defined in two ways:
    - **User-based collaborative filtering:** the ratings provided by like-minded users of a target user A are used in order to make the recommendations for A. Thus, the basic idea is to determine users who are similar to the target user A (**user similarity**), and to predict the unobserved ratings of A by computing weighted averages of the ratings of this peer group.
        - Similarity functions are computed between the *rows of the ratings matrix* to discover **similar users**.
        - The ratings are predicted using the ratings of neighboring users.
    - **Item-based collaborative filtering:** In order to make the rating prediction for target item B by user A, the first step is to determine a set S of items that are most similar to target item B. The ratings in item set S, which are specified by A, are used to predict whether the user A will like item B.
        - Similarity functions are computed between the *columns of the ratings matrix* to discover **similar items**.
        - The ratings are predicted using the user's own ratings on neighboring (closely related) items.
    - *PROS*: memory-based techniques are simple to implement, and the resulting recommendations are often easy to explain.
    - *CONS*: memory-based algorithms do not work very well with sparse ratings matrices.
- **Model-based methods:** machine learning and data mining methods are used in the context of predictive models.
    - Ex: decision trees, rule-based models, Bayesian methods and latent factor models
        - Latent factors: Latent factor models are considered to be state-of-the-art in recommender systems. These models leverage well-known dimensionality reduction methods to fill in the missing entries. 
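
The neighborhood-based prediction described above can be sketched as follows. This is a minimal illustration under stated assumptions: missing ratings are encoded as `NaN`, similarity is the cosine of mean-centered ratings over co-rated entries, only positively similar neighbors are kept, and the toy matrix `R` and both function names are hypothetical. The item-based variant here simply applies the same routine to the transposed matrix, which centers by item means; other formulations center differently.

```python
import numpy as np

def predict_user_based(R, user, item, k=2):
    """Predict R[user, item] from the k most similar rows that rated `item`."""
    observed = ~np.isnan(R)
    means = np.array([R[u, observed[u]].mean() for u in range(R.shape[0])])
    centered = np.where(observed, R - means[:, None], 0.0)

    def similarity(a, b):
        # Cosine of mean-centered ratings, restricted to co-rated columns.
        common = observed[a] & observed[b]
        if not common.any():
            return 0.0
        x, y = centered[a, common], centered[b, common]
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        return float(x @ y / denom) if denom > 0 else 0.0

    # Candidate neighbors: other rows with an observed rating for `item`.
    candidates = [v for v in range(R.shape[0]) if v != user and observed[v, item]]
    sims = np.array([similarity(user, v) for v in candidates])
    order = np.argsort(-sims)[:k]
    peers = [(candidates[i], sims[i]) for i in order if sims[i] > 0]
    if not peers:
        return float(means[user])  # fall back to the row mean
    # Weighted average of the neighbors' mean-centered ratings.
    num = sum(s * centered[v, item] for v, s in peers)
    den = sum(abs(s) for _, s in peers)
    return float(means[user] + num / den)

def predict_item_based(R, user, item, k=2):
    """Same routine on the transpose: rows become items, columns become users."""
    return predict_user_based(R.T, item, user, k)

# Toy ratings matrix (rows = users, columns = items), NaN = unobserved.
R = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 5.0, 1.0],
    [1.0, 2.0, 1.0, 5.0],
    [5.0, 5.0, 4.0, np.nan],
])

user_pred = predict_user_based(R, user=0, item=2)
item_pred = predict_item_based(R, user=0, item=2)
print(round(user_pred, 2), round(item_pred, 2))
```

With few co-rated entries the similarities become unreliable, which is the sparsity weakness noted in the list above.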
    
PROBLEMS:
- The main challenge in designing collaborative filtering methods is that the underlying ratings matrices are sparse.
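- Cold-start: there need to be enough other users in the system to find a match.
- First rater: items that have not been previously rated cannot be recommended.
- Popularity bias: the methods tend to recommend popular items and struggle to serve users with unusual tastes.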

PROS:
- Collaborative filtering models are closely related to missing value analysis


## Latent Factor Models: Matrix Factorization

Factorization is, in fact, a more general way of approximating a matrix when it is amenable to dimensionality reduction because of correlations between columns (or rows). Most dimensionality reduction methods can also be expressed as matrix factorizations. The m × n ratings matrix R is approximately factorized into an m × k matrix U and an n × k matrix V: $R \approx U V^T$. Each column of U (or V) is referred to as a latent vector or latent component, whereas each row of U (or V) is referred to as a latent factor.
- The ith row $\overline{u_i}$ of U is referred to as a user factor, and it contains k entries corresponding to the affinity of user i towards the k concepts in the ratings matrix.
- Similarly, the jth row $\overline{v_j}$ of V is referred to as an item factor, and it represents the affinity of the jth item towards these k concepts.

Therefore, each rating $r_{ij}$ in R can be approximately expressed as a dot product of the ith user factor and the jth item factor: $r_{ij} \approx \overline{u_i} \cdot \overline{v_j}$
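
A minimal sketch of fitting $R \approx U V^T$ on the observed entries only is given below. It assumes missing ratings are encoded as `NaN` and uses plain gradient descent with a small L2 penalty; the `factorize` function, its hyperparameters (`k`, `steps`, `lr`, `reg`), and the toy matrix are illustrative assumptions rather than a method prescribed by these notes.

```python
import numpy as np

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit R ~ U @ V.T on the observed (non-NaN) entries by gradient descent."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))  # row i is the user factor u_i
    V = 0.1 * rng.standard_normal((n, k))  # row j is the item factor v_j
    observed = ~np.isnan(R)
    R_filled = np.where(observed, R, 0.0)
    for _ in range(steps):
        # Residuals on observed entries; missing entries contribute no error.
        E = np.where(observed, R_filled - U @ V.T, 0.0)
        U_step = E @ V - reg * U
        V_step = E.T @ U - reg * V
        U += lr * U_step
        V += lr * V_step
    return U, V

# Toy ratings matrix (rows = users, columns = items), NaN = unobserved.
R = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 5.0, 1.0],
    [1.0, 2.0, 1.0, 5.0],
    [5.0, 5.0, 4.0, np.nan],
])

U, V = factorize(R)
R_hat = U @ V.T  # every entry, including the missing ones, is now filled in
observed = ~np.isnan(R)
rmse = np.sqrt(np.mean((R[observed] - R_hat[observed]) ** 2))
print(np.round(R_hat, 2))
print("RMSE on observed ratings:", round(float(rmse), 3))
```

Entry `R_hat[i, j]` is exactly the dot product of row `i` of `U` and row `j` of `V`, so the missing ratings are imputed by the same formula that approximates the observed ones.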


## Formulation of the RecSys problem

We assume that the user-item ratings matrix is an incomplete m × n matrix $R = [r_{uj}]$ containing m users and n items.
1. Predicting the rating value of a user-item combination: This is the simplest and most primitive formulation of a recommender system. In this case, the missing rating $r_{uj}$ of the user u for item j is predicted.
2. Determining the top-k items or top-k users: Learn the top-k most relevant items for a particular user, or the top-k most relevant users for a particular item.
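
The second formulation can be built on top of the first: once scores are available for every user-item pair, the top-k items for a user are the highest-scoring items that the user has not rated yet. The sketch below assumes a hypothetical `scores` matrix produced by some prediction model and a ratings matrix with `NaN` for missing entries; the function name `top_k_items` is likewise illustrative.

```python
import numpy as np

def top_k_items(scores, R, user, k=2):
    """Indices of the k unrated items with the highest predicted score for `user`."""
    unrated = np.isnan(R[user])
    # Items that are already rated are pushed to the end of the ranking.
    candidate_scores = np.where(unrated, scores[user], -np.inf)
    order = np.argsort(-candidate_scores, kind="stable")
    return order[:min(k, int(unrated.sum()))]

# One user who has rated only item 1 (assumed data).
R = np.array([[np.nan, 5.0, np.nan, np.nan]])
# Hypothetical predicted scores for all four items.
scores = np.array([[4.0, 2.0, 4.5, 3.0]])

top2 = top_k_items(scores, R, user=0, k=2)
print(top2)  # [2 0]
```

The top-k users for a particular item can be obtained with the same function by passing the transposed matrices, for example `top_k_items(scores.T, R.T, item, k)`.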