# Advanced ML, Recommendation project

### Description of the General topic:  
 
Recommendation systems are a class of machine learning tools designed to suggest 
relevant items to users based on their preferences, behaviors, and other users’ 
activities. They are widely used across e-commerce, streaming platforms, social 
media, and online advertising, aiming to enhance user experience by delivering 
personalized content or product suggestions.

### Flow of the Code Project (All in Python)  :  
 
**Data Preprocessing:** 
- Preprocess textual data using tokenization, stemming, and vectorization 
techniques like TF-IDF or word embeddings. 
 
**Building Collaborative Filtering Models:** 
- Implement SVD for matrix factorization. 
- Build user-based and item-based collaborative filtering models using 
libraries like Scikit-Learn. 
 
**Building Content-Based Filtering Models:**
- Create item profiles using metadata. 
- Use cosine similarity or neural embeddings to identify similar items. 
 
**Combine with Hybrid Techniques:** 
- Experiment with hybrid models (weighted hybrid, feature-augmented 
collaborative filtering, etc.) to combine collaborative and content-based 
methods. 
- Train deep hybrid models if using neural networks, concatenating 
collaborative and content-based embeddings as input. 

**Visualization:**  
- Summaries of the results, visualized with charts 
where possible 

In [1]:
import numpy as np
import pandas as pd
import json

________________________________
### **Import of users data :**

In [2]:
# Import of users data : 

reviews_path = "Musical_Instruments.jsonl"

# Each line of the .jsonl file is one JSON-encoded review
with open(reviews_path, 'r') as f:
    data = [json.loads(line) for line in f]



In [12]:

df_recommendation = pd.DataFrame(
    [{"id": item["parent_asin"], "user": item["user_id"], "rating": item["rating"]} for item in data]
)

In [13]:
# First, check for duplicates in the dataset (i.e. a user who rated the same product more than once)
print(f"{df_recommendation[df_recommendation.duplicated(subset=['user', 'id'], keep=False)].shape}")

# Then collapse those duplicates by averaging the ratings, rounded up to the next integer
df_recommendation = df_recommendation.groupby(['user', 'id'], as_index=False)['rating'].mean()
df_recommendation['rating'] = np.ceil(df_recommendation['rating'])

(77125, 3)


#### **Filter out users who have rated fewer than 20 products**

In [15]:
rating_counts = df_recommendation.groupby('user').size().reset_index(name='count')

# Select the users who have at least 20 ratings
valid_users = rating_counts[rating_counts['count'] >= 20]['user']

# Keep only the corresponding rows of the original DataFrame
df_recommendation = df_recommendation[df_recommendation['user'].isin(valid_users)]

In [16]:
df_recommendation.shape

(170027, 3)

#### **Filter out products with fewer than 20 ratings**

In [17]:
rating_counts = df_recommendation.groupby('id').size().reset_index(name='count')

# Select the products that have at least 20 ratings
valid_ids = rating_counts[rating_counts['count'] >= 20]['id']

# Keep only the corresponding rows of the original DataFrame
df_recommendation = df_recommendation[df_recommendation['id'].isin(valid_ids)]

In [18]:
df_recommendation.shape

(42626, 3)
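
Filtering users first and products second does not guarantee that every remaining user still has at least 20 ratings, because dropping products also drops some of those users' ratings (the counts printed further down, 42,626 ratings for 5,107 users, average out to about 8 ratings per user). A minimal sketch of an iterative variant on a toy DataFrame with the same `user`, `id`, `rating` columns follows. The helper name `filter_k_core` and the toy threshold `k=2` are assumptions made for illustration and are not part of the original pipeline.

```python
import pandas as pd


def filter_k_core(df, k):
    """Repeatedly drop rows whose user or item has fewer than k ratings, until stable."""
    while True:
        # Number of ratings of each row's user and of each row's item
        user_counts = df["user"].map(df["user"].value_counts())
        item_counts = df["id"].map(df["id"].value_counts())
        keep = (user_counts >= k) & (item_counts >= k)
        if keep.all():
            return df
        df = df[keep]


# Toy data: item "c" has a single rating, and removing it leaves "u3" with one rating only.
toy = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u2", "u3", "u3"],
    "id": ["a", "b", "a", "b", "a", "c"],
    "rating": [5.0, 4.0, 3.0, 5.0, 2.0, 1.0],
})

core = filter_k_core(toy, k=2)
print(core)
```

A single user-then-product pass would keep the row `(u3, a)` here, although `u3` ends up with only one rating. The loop removes it on the next iteration.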

In [19]:
print(f"Number of distinct products : {df_recommendation['id'].nunique()}")
print(f"Number of distinct users : {df_recommendation['user'].nunique()}")

Number of distinct products : 1003
Number of distinct users : 5107


In [20]:
df_recommendation.isna().sum() # 0 missing value

user      0
id        0
rating    0
dtype: int64

In [21]:
ratings_per_product = df_recommendation.groupby('id')['user'].nunique()
print(f"The proportion of products rated by different users : \n")
pd.DataFrame(ratings_per_product.describe())

The proportion of products rated by different users : 



|       | user      |
|-------|-----------|
| count | 1003.0    |
| mean  | 42.498504 |
| std   | 39.940309 |
| min   | 20.0      |
| 25%   | 24.0      |
| 50%   | 31.0      |
| 75%   | 44.0      |
| max   | 473.0     |


In [22]:
# index=False avoids writing the row index as an extra unnamed column
df_recommendation.to_csv('base de donnée_20_20.csv', index=False)

**(End preprocessing, csv database)**
___________________________

In [2]:
df_recommendation = pd.read_csv("base de donnée_20_20.csv")
df_recommendation.head(5)

|   | user                         | id         | rating |
|---|------------------------------|------------|--------|
| 0 | AE23JYHGEN3D35CHE5OQQYJOW5RA | B000EEHKVY | 5.0    |
| 1 | AE23JYHGEN3D35CHE5OQQYJOW5RA | B000TGSM6E | 5.0    |
| 2 | AE23JYHGEN3D35CHE5OQQYJOW5RA | B008FDSWJ0 | 5.0    |
| 3 | AE23JYHGEN3D35CHE5OQQYJOW5RA | B012VQ5A7S | 5.0    |
| 4 | AE23JYHGEN3D35CHE5OQQYJOW5RA | B076ZSHQ47 | 3.0    |


#### Import of products metadata (not used for now)


In [None]:
# import of products metadata : 

products_1000_metadata = []
file_metadata = "Musical_Instruments.jsonl"

with open(file_metadata, 'r') as file:
    for i, line in enumerate(file):
        if i >= 1000:  
            break
        products_1000_metadata.append(json.loads(line))

In [None]:
# ### DISPLAY A FEW IMAGES:
# import json
# import random
# import requests
# from PIL import Image
# from io import BytesIO



# def get_random_products_with_images(products, num_products=90):
#     products_with_images = [p for p in products if p.get('images') and len(p['images']) > 0]
#     return random.sample(products_with_images, min(num_products, len(products_with_images)))


# def fetch_and_resize_image(url, size=(30, 30)):
#     try:
#         response = requests.get(url)
#         response.raise_for_status()
#         img = Image.open(BytesIO(response.content))
#         return img.resize(size)
#     except Exception as e:
#         print(f"Error while downloading the image: {e}")
#         return None

# # mosaic
# def create_mosaic(images, grid_size=(10, 9), image_size=(30, 30)):
#     mosaic = Image.new('RGB', (grid_size[0] * image_size[0], grid_size[1] * image_size[1]))
#     for idx, img in enumerate(images):
#         if img:
#             x = (idx % grid_size[0]) * image_size[0]
#             y = (idx // grid_size[0]) * image_size[1]
#             mosaic.paste(img, (x, y))
#     mosaic.show()
#     return mosaic


# selected_products = get_random_products_with_images(products_1000_metadata)
# image_urls = [p['images'][0]['large'] for p in selected_products]

# images = [fetch_and_resize_image(url) for url in image_urls]
# mosaic = create_mosaic(images)

# Data Fields

## For User Reviews

| Field              | Type   | Explanation                                                                                                                                                     |
|--------------------|--------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `rating`           | float  | Rating of the product (from 1.0 to 5.0).                                                                                                                        |
| `title`            | str    | Title of the user review.                                                                                                                                       |
| `text`             | str    | Text body of the user review.                                                                                                                                   |
| `images`           | list   | Images that users post after they have received the product. Each image has different sizes (small, medium, large), represented by `small_image_url`, `medium_image_url`, and `large_image_url`. |
| `asin`             | str    | ID of the product.                                                                                                                                              |
| `parent_asin`      | str    | Parent ID of the product. Note: Products with different colors, styles, sizes usually belong to the same parent ID. The “asin” in previous Amazon datasets is actually the parent ID. Please use parent ID to find product meta. |
| `user_id`          | str    | ID of the reviewer.                                                                                                                                             |
| `timestamp`        | int    | Time of the review (unix time).                                                                                                                                 |
| `verified_purchase`| bool   | User purchase verification.                                                                                                                                     |
| `helpful_vote`     | int    | Helpful votes of the review.                                                                                                                                    |


In [20]:
import EDA_functions
image_url = "https://images-na.ssl-images-amazon.com/images/I/71DFEoJ+Z9L._SL256_.jpg"
EDA_functions.show_image(image_url)

# I- Collaborative Filtering

Collaborative filtering (CF) recommends products to users based on the behavior of other users with similar preferences. CF methods work on the principle that users who agreed on items in the past are likely to agree again. 
It is an alternative to content filtering that relies only on past user behavior (for example, previous transactions or product ratings) without requiring the creation of explicit profiles. Collaborative filtering analyzes relationships between users and interdependencies among products to identify new user-item associations.

There are two main types:

### a) User-Based Collaborative Filtering
This approach identifies users who have similar preferences (based on ratings or clicks) and recommends items that similar users liked.

### Implementation
We can try to implement this method using matrix factorization techniques like **Singular Value Decomposition (SVD)**, which reduces the dimensionality of the data matrix, capturing latent factors that explain user-item interactions.
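
To make the idea of latent factors concrete, here is a minimal sketch of a truncated SVD on a small, fully observed toy matrix using NumPy. The matrix values, the helper name `truncated_svd_reconstruction`, and the choice `k=2` are assumptions made for illustration. On the real data, missing ratings would first have to be handled (for example by filling them in or by fitting only the observed entries), which this sketch does not cover.

```python
import numpy as np

# Toy user x item rating matrix (rows: users, columns: items); values are made up.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])


def truncated_svd_reconstruction(matrix, k):
    """Return the rank-k approximation of `matrix` built from its k largest singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


# Keep two latent factors and rebuild an approximation of the rating matrix.
R_hat = truncated_svd_reconstruction(R, k=2)
print(np.round(R_hat, 2))
```

Each retained singular value corresponds to one latent factor. Keeping fewer factors compresses the matrix more aggressively at the cost of a larger reconstruction error.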


In [22]:
df_recommendation = pd.DataFrame(
    [{"id": item["parent_asin"], "user": item["user_id"], "rating": item["rating"]} for item in products_1000_usersdata])


In [23]:
df_recommendation.head(5)

|   | id         | user                         | rating |
|---|------------|------------------------------|--------|
| 0 | B003LPTAYI | AGKASBHYZPGTEPO6LWZPVJWB2BVA | 5.0    |
| 1 | B06XP6TDVY | AGCI7FAH4GL5FI65HYLKWTMFZ2CQ | 3.0    |
| 2 | B0040FJ27S | AGCI7FAH4GL5FI65HYLKWTMFZ2CQ | 4.0    |
| 3 | B00WJ3HL5I | AEM663T6XHZFWLODF4US2RCOCUSA | 3.0    |
| 4 | B07T9NM5QR | AFJTRBXMURLHS5EGNXLUHDHIZRFQ | 5.0    |


In [24]:
print(f"The number of unique products is : {df_recommendation.id.nunique()}")
print(f"The number of unique users is : {df_recommendation.user.nunique()}")

The number of unique products is : 7193
The number of unique users is : 3107


In [25]:
ratings_per_product = df_recommendation.groupby('id')['user'].nunique()
print(f"The proportion of products rated by different users : \n")
pd.DataFrame(ratings_per_product.describe())

## In the selected sample, at least 75% of the products are rated by a single user (the 75th percentile is 1).


The proportion of products rated by different users : 



|       | user     |
|-------|----------|
| count | 7193.0   |
| mean  | 1.375504 |
| std   | 1.117584 |
| min   | 1.0      |
| 25%   | 1.0      |
| 50%   | 1.0      |
| 75%   | 1.0      |
| max   | 18.0     |


As in the paper **Empirical Analysis of Predictive Algorithms for Collaborative Filtering** by John S. Breese, David Heckerman, and Carl Kadie, we handle multiple ratings by averaging the ratings given by a user. This is a kind of memory-based approach. A user may have bought a product in the past and changed their opinion since.
$$
\bar{v}_i = \frac{1}{|I_i|} \sum_{j \in I_i} v_{i,j}
$$

where \( v_{i,j} \) are the ratings previously given by user \( i \), and \( I_i \) is the set of items that user \( i \) has rated.

In [None]:
# First, check for duplicates in the dataset (i.e. a user who rated the same product more than once)
print(f"{df_recommendation[df_recommendation.duplicated(subset=['user', 'id'], keep=False)].shape}")

# Then collapse those duplicates by averaging the ratings
df_recommendation = df_recommendation.groupby(['user', 'id'], as_index=False)['rating'].mean()

(195, 3)


In [27]:
# Convert the data from long to wide format (user x item matrix) for user-based collaborative filtering
df_recommendation = df_recommendation.pivot(index='user', columns='id', values='rating')

This operation gives us a very sparse matrix that can be hard to handle on large datasets because of its memory footprint.
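
One possible way to limit the memory cost is to store only the observed ratings in a sparse matrix. The sketch below builds a `scipy.sparse.csr_matrix` directly from long-format data. The toy DataFrame `ratings_long` and the variable names are assumptions for illustration. Note two differences from the pivoted DataFrame: unrated entries become implicit zeros rather than NaN, and duplicate `(user, id)` pairs would be summed by the constructor, so they must be averaged beforehand as done above.

```python
import pandas as pd
from scipy.sparse import csr_matrix

# Toy long-format ratings with the same columns as df_recommendation before the pivot.
ratings_long = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u3"],
    "id": ["a", "b", "a", "c"],
    "rating": [5.0, 3.0, 4.0, 2.0],
})

# Map user and item labels to consecutive integer positions.
user_codes, user_index = pd.factorize(ratings_long["user"])
item_codes, item_index = pd.factorize(ratings_long["id"])

# Only the observed ratings are stored; everything else is an implicit zero.
ratings_sparse = csr_matrix(
    (ratings_long["rating"].to_numpy(), (user_codes, item_codes)),
    shape=(len(user_index), len(item_index)),
)

print(ratings_sparse.shape, ratings_sparse.nnz)
```

`user_index` and `item_index` keep the mapping from row and column positions back to the original identifiers.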

In [28]:
df_recommendation

user,0470536454,0739079891,076921990X,0769257631,0793533929,0874872758,0887976646,0899333990,1423414357,1423422465,...,B0CCXBSZ6M,B0CD9YRKV6,B0CDP1GDT1,B0CDTX85C8,B0CF3HD8V5,B0CF7XYWKS,B0CFD7WW4N,B0CFLG9ZNG,B0CG4SQ5MX,B0CGM14629
AE22H5O6TGL6JWSITGUNKH4HIKXA,,,,,,,,,,,...,,,,,,,,,,
AE23NAHINBRGBQ3A46YME3TPRL3A,,,,,,,,,,,...,,,,,,,,,,
AE23UY5SMJW3YHTX2NXRTJ5IWZQA,,,,,,,,,,,...,,,,,,,,,,
AE24GJD7LU2CDYQLP6PYTC3XZNAA,,,,,,,,,,,...,,,,,,,,,,
AE25NQAZI3725GZIL5FS52ZIKWKQ,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
AHZXSEDYECXQARVQJQPF7NDGLA7Q,,,,,,,,,,,...,,,,,,,,,,
AHZXV7EGK4I67ONX26JQICJZPS5A,,,,,,,,,,,...,,,,,,,,,,
AHZYQWEOFG4GKOFO2A3I7KLVEYDA,,,,,,,,,,,...,,,,,,,,,,
AHZZ4SJA2MAP4G6Z6E7BPJELACWA,,,,,,,,,,,...,,,,,,,,,,


In [58]:
user_id = "AE3335XF4PMHSXKTW5B7N7EALG3Q"
# I_i is the set of items that user i has rated
rated_items = df_recommendation.loc[user_id].dropna().index  # products rated by this user

print(f"The user {user_id} has rated the products : {rated_items}")

The user AE3335XF4PMHSXKTW5B7N7EALG3Q has rated the products : Index(['B07Q61G7JP', 'B07VZ7FYWP', 'B08DG5C6ZS', 'B08DXHVGGJ', 'B08HVGYBTK',
       'B08ZHHLHSP', 'B09GBMG83Z', 'B09GJVGV31'],
      dtype='object', name='id')


At this step we need to measure the similarity between users based on their ratings. We start by computing the correlations:

In [30]:
def Correlation(a, b):
    """Compute the Pearson correlation between two pandas Series a and b."""
    return a.corr(b)
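
When the two Series contain missing ratings, `Series.corr` only uses the positions where both values are present, so the correlation is computed over co-rated items. A small sketch with made-up ratings follows. The user and item names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

items = ["a", "b", "c", "d"]
user_1 = pd.Series([5.0, 3.0, np.nan, 1.0], index=items)
user_2 = pd.Series([4.0, 2.0, 5.0, np.nan], index=items)
user_3 = pd.Series([1.0, 3.0, np.nan, 5.0], index=items)

# Only items rated by both users enter the computation.
corr_12 = user_1.corr(user_2)  # co-rated items: a, b (ratings move together)
corr_13 = user_1.corr(user_3)  # co-rated items: a, b, d (ratings move in opposite directions)

# Require at least 3 co-rated items; otherwise the result is NaN.
corr_12_strict = user_1.corr(user_2, min_periods=3)

print(corr_12, corr_13, corr_12_strict)
```

A correlation based on only two co-rated items is always +1 or -1 (when defined), which is why a `min_periods` threshold can be useful on sparse data such as this one.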

#### **Cosine Similarity** : 
Reference for the formula and an explanation of this metric: https://en.wikipedia.org/wiki/Cosine_similarity


$$
\text{cosine similarity}(A, B) = \frac{A \cdot B}{\|A\| \|B\|}
$$

Where:
- \( A \) and \( B \) are vectors.
- \( A \cdot B \) is the dot product of \( A \) and \( B \).
- \( \|A\| \) and \( \|B\| \) are the Euclidean norms of \( A \) and \( B \).


In [None]:
import numpy as np


def Cosine_similarity (a,b) : 
    """cosine similarity between two vectors"""

    vec1 = np.array(a)
    vec2 = np.array(b)
    
    dot_product = np.dot(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)
    norm_vec2 = np.linalg.norm(vec2)
    
    if norm_vec1 == 0 or norm_vec2 == 0:
        return 0  # NO division by zero
    else:
        return dot_product / (norm_vec1 * norm_vec2)
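
As a quick sanity check of the formula, the sketch below computes the cosine similarity of two made-up vectors by hand and compares it with `sklearn.metrics.pairwise.cosine_similarity`. The vectors are assumptions for illustration. Here the dot product is 1 and both norms are \( \sqrt{2} \), so both values should be close to 0.5.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])

# Direct application of the formula: dot product divided by the product of the norms.
manual = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# scikit-learn expects 2D inputs (one row per vector) and returns a matrix.
library = cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))[0, 0]

print(manual, library)
```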


In [59]:
# Cosine similarity for the entire matrix (between all users)
from sklearn.metrics.pairwise import cosine_similarity

df_recommendation = df_recommendation.fillna(0)
# Rows are users, so this compares users; use df_recommendation.T to compare items (columns) instead
cosine_sim_matrix = cosine_similarity(df_recommendation)

cosine_sim_df = pd.DataFrame(cosine_sim_matrix, index= df_recommendation.index, columns= df_recommendation.index)
cosine_sim_df


user,AE23ZBUF2YVBQPH2NN6F5XSA3QYQ,AE2O2C43KTYO4LXXGZWJZLE67GBQ,AE2YCIHHZH57ABZB7EWDTCF3WPKA,AE3335XF4PMHSXKTW5B7N7EALG3Q,AE3IGJOPJP6LFXEJTIXFJVSJLILA,AE3KLVXGZPANXE5XLXYKHTVAZ3FQ,AE3PLZHW6NXWBMZ76TDVFQG2MJFA,AE3TSST7D3QYFO2MUZ3QFNMYAEHQ,AE4FQVS6CJVC3QDQ7C2CPAJAZM4A,AE4JPPM4YPZ4EONHBXME6VWPVS2Q,...,AHYECMONATRG6ZVRAWKQ5RCTXQHA,AHYGVK6W353TCQL63AIECYCGBEJQ,AHYOSWORVZFXM5QMRIAW3JTTFFIQ,AHYTPQ6AS3EL3HUGGGVGRCFN7VPQ,AHYUFOFTGNEV4TGMQLASS6EA7QAQ,AHZ4TADPCXAAIKTFERGG5YB4BNQQ,AHZGQXTGR3WB6CQR3PP2TB2YPTUA,AHZGQXTGR3WB6CQR3PP2TB2YPTUA_1,AHZGQXTGR3WB6CQR3PP2TB2YPTUA_2,AHZJXRSEQSJ5TKYWWINKEORIBYSA
AE23ZBUF2YVBQPH2NN6F5XSA3QYQ,1.000000,0.082756,0.0,0.0,0.000000,0.000000,0.0,0.0,0.030912,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.156269,0.000000,0.000000,0.000000,0.000000
AE2O2C43KTYO4LXXGZWJZLE67GBQ,0.082756,1.000000,0.0,0.0,0.044918,0.050864,0.0,0.0,0.000000,0.0,...,0.0,0.063966,0.0,0.0,0.0,0.000000,0.041623,0.000000,0.070523,0.000000
AE2YCIHHZH57ABZB7EWDTCF3WPKA,0.000000,0.000000,1.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
AE3335XF4PMHSXKTW5B7N7EALG3Q,0.000000,0.000000,0.0,1.0,0.000000,0.000000,0.0,0.0,0.000000,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000
AE3IGJOPJP6LFXEJTIXFJVSJLILA,0.000000,0.044918,0.0,0.0,1.000000,0.297012,0.0,0.0,0.000000,0.0,...,0.0,0.124507,0.0,0.0,0.0,0.000000,0.081018,0.000000,0.137270,0.159647
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
AHZ4TADPCXAAIKTFERGG5YB4BNQQ,0.156269,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.0,...,0.0,0.000000,0.0,0.0,0.0,1.000000,0.000000,0.000000,0.000000,0.000000
AHZGQXTGR3WB6CQR3PP2TB2YPTUA,0.000000,0.041623,0.0,0.0,0.081018,0.091741,0.0,0.0,0.000000,0.0,...,0.0,0.115374,0.0,0.0,0.0,0.000000,1.000000,0.807249,0.590210,0.000000
AHZGQXTGR3WB6CQR3PP2TB2YPTUA_1,0.000000,0.000000,0.0,0.0,0.000000,0.000000,0.0,0.0,0.000000,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.000000,0.807249,1.000000,0.000000,0.000000
AHZGQXTGR3WB6CQR3PP2TB2YPTUA_2,0.000000,0.070523,0.0,0.0,0.137270,0.155438,0.0,0.0,0.000000,0.0,...,0.0,0.195480,0.0,0.0,0.0,0.000000,0.590210,0.000000,1.000000,0.000000


#### **Similar Users:**

We calculate the similarity between a **target user** and the other users in the system to find similar users. The process involves creating a **weight matrix** that accounts for the similarity between the target user and other users, normalizing the similarities to give more weight to users with higher similarity.

1. **Target User Selection:**
   To begin, we first choose a **target user** (let’s denote this user as \( u_t \)) for whom we want to find similar users. For this example, we choose `user id = AE3335XF4PMHSXKTW5B7N7EALG3Q` as the target user:
   
   $$ u_t = \text{AE3335XF4PMHSXKTW5B7N7EALG3Q} $$

2. **Similarity Calculation:**
   We calculate the **similarity** between the target user \( u_t \) and all other users using a **similarity matrix** \( S \), where each entry \( S_{ij} \) represents the similarity between user \( i \) and user \( j \). 

   The **similarity vector** for the target user \( u_t \) is represented as:
   
   $$ \text{similarities}_{u_t} = S_{u_t} $$
   
   We also remove the similarity between the target user and themselves:
   
   $$ \text{similarities}_{u_t, \text{others}} = S_{u_t} \setminus \{ S_{u_t, u_t} \} $$

3. **Weight Calculation:**
   After obtaining the similarity scores between the target user \( u_t \) and other users, we create a **weight matrix** where each user's similarity score is normalized. This ensures that users who are more similar to the target user are given higher weight. The weight for each user \( i \) is calculated by normalizing the similarity score:

   $$ w_i = \frac{S_{u_t, i}}{\sum_{j \neq u_t} S_{u_t, j}} $$

   Where:
   - \( w_i \) is the weight for user \( i \),
   - \( S_{u_t, i} \) is the similarity between the target user \( u_t \) and user \( i \),
   - The denominator is the sum of the similarities between the target user \( u_t \) and all other users (excluding \( u_t \)).



In [60]:
user_id = "AE3335XF4PMHSXKTW5B7N7EALG3Q"

# we select the similarities between the target user and all other users
if user_id not in cosine_sim_df.columns:
    # Stop here: `similarities` would otherwise be undefined below
    raise KeyError(f"User ID {user_id} not found in similarity matrix.")
similarities = cosine_sim_df[user_id].drop(user_id)

print(similarities)

user
AE23ZBUF2YVBQPH2NN6F5XSA3QYQ      0.0
AE2O2C43KTYO4LXXGZWJZLE67GBQ      0.0
AE2YCIHHZH57ABZB7EWDTCF3WPKA      0.0
AE3IGJOPJP6LFXEJTIXFJVSJLILA      0.0
AE3KLVXGZPANXE5XLXYKHTVAZ3FQ      0.0
                                 ... 
AHZ4TADPCXAAIKTFERGG5YB4BNQQ      0.0
AHZGQXTGR3WB6CQR3PP2TB2YPTUA      0.0
AHZGQXTGR3WB6CQR3PP2TB2YPTUA_1    0.0
AHZGQXTGR3WB6CQR3PP2TB2YPTUA_2    0.0
AHZJXRSEQSJ5TKYWWINKEORIBYSA      0.0
Name: AE3335XF4PMHSXKTW5B7N7EALG3Q, Length: 499, dtype: float64


user
AE22IPO5AD7T3QUS6TOPU6T6OL6Q    0.0
AE22Z3RLVIRU6RT5PNRK5CFFNEFQ    0.0
AE23HUJD2RENUFCMHPVVC3F64KRQ    0.0
AE24FFSUQHE3J6NYBICB7V2WHUAA    0.0
AE24VPMWEEQD62YPOG53BW7JCGFA    0.0
                               ... 
AHZXJ4N5GBXLRDEKD37LB6ZZTPWQ    0.0
AHZYBJVSPJO4NMRWIQ4TI4Y42CJA    0.0
AHZZJQYNVZUJNPNQ737ITGEQUB4A    0.0
AHZZNR5FSD5ODQYVFCWFNLHGX55Q    0.0
AHZZPUYPNZQ7QXK55HGGE3Z7TTEA    0.0
Name: AE227RVA23EPOD52V7J7CCRYIHBQ, Length: 7335, dtype: float64

In [None]:
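# Calculate the weights by normalizing the similarities
total_similarity = similarities.sum()
# Guard against division by zero when the target user has no overlap with any other user
weights = similarities / total_similarity if total_similarity > 0 else similarities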
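
One common way to use such weights is to score each item by a similarity-weighted average of the neighbours' ratings, counting only the neighbours who actually rated the item. The sketch below illustrates this on toy data. The rating matrix, the similarity values, and all variable names are assumptions for illustration, and this is not necessarily the prediction rule the project ends up using.

```python
import pandas as pd

# Toy user x item matrix; 0 means "not rated" (same convention as fillna(0) above).
ratings = pd.DataFrame(
    [[5.0, 0.0, 3.0],
     [4.0, 2.0, 0.0],
     [0.0, 5.0, 4.0]],
    index=["target", "u1", "u2"],
    columns=["a", "b", "c"],
)

# Hypothetical similarities between "target" and the other users.
similarities = pd.Series({"u1": 0.75, "u2": 0.25})

total = similarities.sum()
weights = similarities / total if total > 0 else similarities

neighbour_ratings = ratings.loc[weights.index]
rated_mask = (neighbour_ratings > 0).astype(float)

# Weighted sum of ratings, divided by the total weight of the neighbours who rated each item.
numerator = neighbour_ratings.mul(weights, axis=0).sum(axis=0)
denominator = rated_mask.mul(weights, axis=0).sum(axis=0)
predicted = numerator / denominator

# Recommend only items the target user has not rated yet.
unrated = ratings.loc["target"] == 0
recommendations = predicted[unrated].sort_values(ascending=False)
print(recommendations)
```

Dividing by the weight of the neighbours who rated the item avoids treating a missing rating as a rating of 0, which would otherwise pull the scores down.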

### b) Item-Based Collaborative Filtering
Instead of focusing on user similarity, this method finds items that are similar based on user ratings or interactions.

    -> https://datajobs.com/data-science-repo/Recommender-Systems-%5BNetflix%5D.pdf 
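
A minimal sketch of the item-based variant: since items are the columns of the user x item matrix, transposing it before calling `cosine_similarity` yields item-to-item similarities. The toy matrix and variable names are assumptions for illustration.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Toy user x item matrix; 0 means "not rated".
ratings = pd.DataFrame(
    [[5.0, 4.0, 0.0],
     [4.0, 5.0, 0.0],
     [0.0, 0.0, 3.0]],
    index=["u1", "u2", "u3"],
    columns=["a", "b", "c"],
)

# Items are columns, so transpose to compare items instead of users.
item_sim = pd.DataFrame(
    cosine_similarity(ratings.T),
    index=ratings.columns,
    columns=ratings.columns,
)

# Items most similar to "a", excluding "a" itself.
similar_to_a = item_sim["a"].drop("a").sort_values(ascending=False)
print(similar_to_a)
```

Items "a" and "b" are rated similarly by the same users and come out as highly similar, while "c" shares no rater with "a" and gets a similarity of 0.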

# II- Content based filtering 

# III - Hybrid Recommender Systems 