# Recommender Systems

## Amazon (shopping)
<img src="images/amazon1.png" width=800/>

## Netflix (video)
<img src="images/netflix1.png" width=800/>

## Spotify (music)
<img src="images/spotify1.png" width=800/>

## Kayak (travel)
<img src="images/kayak1.png" width=800/>

The four illustrations above are examples of recommendations. The
companies "guess" the products we might like, and these guesses are different
for each one of us.

### Internet users (all of us) interact in many ways: 
* clicks (University Websites)
* reactions such as like, sad, angry (Meta)
* ratings (Netflix)

### Secondary information is available about our internet usage
* how much time is spent on a site
* how much time is spent between clicks
* which objects the mouse hovers over
* the date and time at which these actions occur

### Additional Content
* scrape information from Facebook, Twitter, Instagram, etc.
* sentiment analysis
* topic analysis
* analysis of reviews
* ...

## Objective
* Use the data above to maximize profits *and* improve the user experience

# Classification of Collaborative Filtering in 2007
Ref: 

<img src="images/collaborative_filtering_2007.png" width=1200/>

# Basic approaches to recommendation systems: 

## Matrix-factorization
<img src="images/matrix_factorization.png" width=800/>

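The matrix contains the known ratings of items by users. The objective is to fill in the missing entries by approximating the matrix as a product of low-rank user and item factor matrices.
In general, the matrix is extremely sparse: most users have rated only a small fraction of the items.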
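
As a rough illustration of how the missing entries can be filled, the sketch below fits $R \approx P Q^T$ on the observed entries only. The fitting method (stochastic gradient descent with L2 regularization), the toy matrix `R`, and the names `factorize`, `k`, `lr`, and `reg` are all assumptions chosen for the example; the figure above does not prescribe a particular algorithm.

```python
import numpy as np

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit R ~ P @ Q.T on the observed (non-NaN) entries using SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.random((n_users, k))  # user factors (hypothetical init)
    Q = rng.random((n_items, k))  # item factors (hypothetical init)
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if not np.isnan(R[u, i])]
    for _ in range(steps):
        for u, i in observed:
            err = R[u, i] - P[u] @ Q[i]
            p_u = P[u].copy()  # keep the old value for the Q update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_u - reg * Q[i])
    return P, Q

# Toy ratings matrix: rows are users, columns are items, NaN is "not rated".
R = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [1.0, np.nan, np.nan, 4.0],
    [np.nan, 1.0, 5.0, 4.0],
])
P, Q = factorize(R)
R_hat = P @ Q.T  # dense matrix: every cell now holds an estimate
print(np.round(R_hat, 2))
```

The cells of `R_hat` that were NaN in `R` are the predicted ratings. On a matrix this small the predictions are not meaningful; the point is only the mechanics.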

## Collaborative Filtering
Consider the ratings table. The goal is to estimate the ratings in the missing cells.

<img src="images/ratings_table1.png" width=800/>

Start with the Pearson correlation, either $w_{u,v}$ between two users $(u,v)$ or $w_{i,j}$ between two items $(i,j)$: <br/>

<img src="images/user_user1.png" width=600/>
<img src="images/item_item1.png" width=600/>

$r_{u,i}$ is the rating of item $i$ by user $u$. 
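
A minimal sketch of the user-user weight $w_{u,v}$, assuming the standard Pearson correlation computed over the items both users have rated, with each user's mean taken over those co-rated items. The function name `pearson` and the sample users are hypothetical. The same function gives $w_{i,j}$ if each dictionary maps users to their ratings of one item.

```python
from math import sqrt

def pearson(ratings_u, ratings_v):
    """Pearson correlation over co-rated items.

    Each argument maps item id -> rating. Returns 0.0 when the
    correlation is undefined (fewer than 2 common items or zero variance).
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    den = sqrt(var_u) * sqrt(var_v)
    return num / den if den else 0.0

# Hypothetical users: bob rates like alice (shifted by one), carol does not.
alice = {"i1": 5, "i2": 3, "i3": 4}
bob = {"i1": 4, "i2": 2, "i3": 3, "i4": 5}
carol = {"i1": 1, "i2": 5, "i3": 2}

print(pearson(alice, bob))    # close to +1: similar taste
print(pearson(alice, carol))  # negative: opposite taste
```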

### User-User
If users $A$ and $B$ have rated a collection of items similarly, recommend to $A$ items that $B$ has rated but $A$ has not.
Below is a formula to estimate the rating given by user $u$ for item $i$:

<img src="images/weighted_sum1.png" width=800/><br/>
Ref: Su et al., "A Survey of Collaborative Filtering Techniques" (2009, review)
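
The estimate can be sketched in code, assuming the mean-centered weighted-sum form $\hat{r}_{u,i} = \bar{r}_u + \sum_v w_{u,v}\,(r_{v,i} - \bar{r}_v) \,/\, \sum_v |w_{u,v}|$, where the sums run over the other users $v$ who rated item $i$. Taking each user's mean over all of that user's ratings is an assumption, and `predict_rating`, the sample ratings, and the similarity values are hypothetical.

```python
def predict_rating(target, item, ratings, weights):
    """Estimate the rating of `item` by `target` from similar users.

    ratings: user -> {item: rating}
    weights: other user -> similarity to `target` (e.g. Pearson weights)
    """
    def mean(user_ratings):
        return sum(user_ratings.values()) / len(user_ratings)

    num = den = 0.0
    for other, w in weights.items():
        if item in ratings[other]:
            # Deviation of the neighbour's rating from the neighbour's mean
            num += w * (ratings[other][item] - mean(ratings[other]))
            den += abs(w)
    base = mean(ratings[target])
    # Fall back to the target's own mean when no neighbour rated the item
    return base + num / den if den else base

ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob": {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i3": 2, "i4": 1},
}
weights = {"bob": 1.0, "carol": -0.5}  # illustrative similarities to alice

estimate = predict_rating("alice", "i4", ratings, weights)
print(round(estimate, 3))
```

A negatively correlated neighbour still contributes: carol's below-average rating of `i4` pushes alice's estimate up.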

### Item-Item
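Recommend to a user items that are similar to the items the user has already rated, where similarity between items $i$ and $j$ is measured by the weight $w_{i,j}$.
The prediction formula based on item-item weights has a similar form.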
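
A minimal sketch of an item-based estimate, assuming the simple weighted average $\hat{r}_{u,i} = \sum_j w_{i,j}\, r_{u,j} \,/\, \sum_j |w_{i,j}|$ over the items $j$ that user $u$ has rated. The function name `predict_item_based` and the sample weights are hypothetical.

```python
def predict_item_based(user_ratings, item_weights):
    """Estimate a user's rating of a target item.

    user_ratings: item -> rating given by this user
    item_weights: item -> similarity w_{i,j} to the target item
    Returns None when no rated item has a nonzero weight.
    """
    num = den = 0.0
    for j, r in user_ratings.items():
        w = item_weights.get(j, 0.0)
        num += w * r
        den += abs(w)
    return num / den if den else None

user_ratings = {"i1": 5, "i2": 2}        # what the user has rated so far
weights_to_i3 = {"i1": 0.8, "i2": 0.2}   # illustrative similarities to i3

estimate_i3 = predict_item_based(user_ratings, weights_to_i3)
print(estimate_i3)
```

Because `i3` is much closer to `i1` than to `i2`, the estimate lands near the user's rating of `i1`.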

## Content-Based Filtering
* Text analysis of item descriptions, reviews, and other text-based data related to the items
rated by the users.
* Given a set of features a user is interested in, match items that have these features.
* [Definition](https://developers.google.com/machine-learning/recommendation/content-based/basics)
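
The feature-matching idea can be sketched as follows, assuming binary item features and a dot-product score between the user profile and each item. The feature names, items, and `rank_items` are hypothetical and chosen only for illustration.

```python
# Hypothetical feature order for every vector below
FEATURES = ["education", "health", "travel", "finance"]

items = {
    "app_a": [1, 0, 0, 1],
    "app_b": [0, 1, 1, 0],
    "app_c": [0, 0, 1, 0],
}
user_profile = [0, 1, 1, 0]  # this user is interested in health and travel

def rank_items(profile, items):
    """Rank items by the dot product of their features with the profile."""
    scores = {
        name: sum(p * f for p, f in zip(profile, vec))
        for name, vec in items.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores

ranking, scores = rank_items(user_profile, items)
print(ranking)
```

No ratings from other users are needed here; the recommendation depends only on the item features and this user's profile.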
    
## Hybrid recommendation systems
* Combine the outputs of two or more recommender systems
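
One simple way to combine outputs is a weighted average of the scores produced by two recommenders. This is only one possible combination rule, and it assumes both systems emit scores on the same scale. The name `blend`, the parameter `alpha`, and the sample scores are hypothetical.

```python
def blend(cf_scores, content_scores, alpha=0.5):
    """Weighted average of two score dictionaries (item -> score).

    An item missing from one recommender contributes 0.0 from that side.
    """
    all_items = set(cf_scores) | set(content_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0)
        + (1 - alpha) * content_scores.get(item, 0.0)
        for item in all_items
    }

cf_scores = {"a": 0.9, "b": 0.2}        # e.g. from collaborative filtering
content_scores = {"b": 0.8, "c": 0.5}   # e.g. from content-based filtering

hybrid = blend(cf_scores, content_scores, alpha=0.5)
best = max(hybrid, key=hybrid.get)
print(best, hybrid[best])
```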

# Taxonomy of Collaborative Filtering in 2022
Ref: Papadakis et al., "Collaborative filtering recommender systems taxonomy" (2022, review)
    
<img src="images/taxonomy1.png" width=800/>

We will concentrate on neural networks, and more specifically on Graph Neural Networks.

In [6]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [7]:
import pandas as pd
import numpy as np

In [10]:
df = pd.read_csv("activity_top10e5.csv", nrows=100000)



In [11]:
df.head()

| Unnamed: 0 | MEMBER_ID | TIER_LEVEL | TIER_LEVEL_DESCRIPTION | PREVIOUS_TIER | LAST_TIER_CHANGE_DATE | STATUS | ENROLL_DATE | GENDER | BIRTH_DATE | NATIONALITY | ... | HOLDING_ORIGIN_REGION | HOLDING_DESTINATION_REGION | HOLDING_ORIGIN_COUNTRY | HOLDING_DESTINATION_COUNTRY | SEGMENT_ORIGIN_REGION | SEGMENT_DESTINATION_REGION | SEGMENT_ORIGIN_COUNTRY | SEGMENT_DESTINATION_COUNTRY | AMOUNT_OF_BAGS | SEAT_ASSIGNMENT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 100031203 | T1 | Silver | B0 | 2020-09-28 23:42:51 | AC | 2016-12-04 | M | 1962-02-14 | Panama | ... | HUB | CAM | PANAMA | COSTA RICA | HUB | CAM | PANAMA | COSTA RICA | 1.0 | 19F |
| 1 | 100031203 | T1 | Silver | B0 | 2020-09-28 23:42:51 | AC | 2016-12-04 | M | 1962-02-14 | Panama | ... |  |  |  |  |  |  |  |  |  |  |
| 2 | 100031203 | T1 | Silver | B0 | 2020-09-28 23:42:51 | AC | 2016-12-04 | M | 1962-02-14 | Panama | ... |  |  |  |  |  |  |  |  |  |  |
| 3 | 100031203 | T1 | Silver | B0 | 2020-09-28 23:42:51 | AC | 2016-12-04 | M | 1962-02-14 | Panama | ... |  |  |  |  |  |  |  |  |  |  |
| 4 | 100031203 | T1 | Silver | B0 | 2020-09-28 23:42:51 | AC | 2016-12-04 | M | 1962-02-14 | Panama | ... | CAM | HUB | COSTA RICA | PANAMA | CAM | HUB | COSTA RICA | PANAMA | 1.0 | 20D |
