## Library Import
This task uses the Turi Create package (`turicreate`) from Apple Inc. The library runs only on macOS or Linux.

In [51]:
import turicreate as tc 
import pandas as pd
import numpy as np

## Data Import

The customer ID, the product ID, and the review score are used for this task.

In [52]:
cols=["customer_unique_id", "product_id", "review_score"]
df= pd.read_csv("python_merged.csv", usecols=cols) #use only columns in cols

## Data cleaning 
Rows with null values and duplicate rows are removed.

In [53]:
df.dropna(inplace=True) 
df.drop_duplicates(inplace=True, keep="first")

## Data Manipulation
The review scores are first min-max normalized to the [0, 1] range, then centered and divided by a standard deviation.

In [54]:
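df_norm = df.copy()

# min-max normalize the review score to the [0, 1] range
column = 'review_score'
df_norm[column] = (df_norm[column] - df_norm[column].min()) / (df_norm[column].max() - df_norm[column].min())
# center the rescaled scores and divide by a standard deviation
# NOTE: the divisor is the std of the original column (df), not of the rescaled one (df_norm),
# so the result has zero mean but not unit variance
df_norm[column] = (df_norm[column] - df_norm[column].mean()) / df[column].std()

# view the transformed data
display(df_norm)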

|  | product_id | review_score | customer_unique_id |
|---|---|---|---|
| 0 | f3c2d01a84c947b078e32bbef0718962 | -0.387068 | 4854e9b3feff728c13ee5fc7d1547e92 |
| 1 | 5a6b04657a4c5ee34285d1e4619a96b4 | -0.015951 | 830d5b7aaa3b6f1e9ad63703bec97d23 |
| 4 | e2a1d45a73dc7f5a7f9236b043431b89 | 0.169608 | 87776adb449c551e74c13fc34f036105 |
| 5 | 817e1c2d22418c36386406ccacfa53e8 | 0.169608 | 9f302d00dd3e18ed3745778184b4f0fe |
| 6 | 9e93b2c4cb5eea05e75a481c129b104d | -0.572627 | 3f4f614c632af7fc7508462a7cb55ac2 |
| ... | ... | ... | ... |
| 118138 | 17e18b0c88a853dd6de3e48a7cfa9d9a | -0.572627 | 56b6eede1b10925212f054a7ba614796 |
| 118139 | 6aa063e063f2ab982b471e58afe06d72 | -0.387068 | 101375bf617fd60c9eee42f98d9a73d6 |
| 118140 | 282b126b2354516c5f400154398f616d | 0.169608 | b030929cf3b8c3370ea8c611f9ccb32e |
| 118141 | 96ea060e41bdecc64e2de00b97068975 | 0.169608 | 3977f83a14549e6265bcded84e92ee80 |
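
To make the two transformations concrete, here is a minimal standard-library sketch on a toy list of scores (the values and the helper names `min_max` and `z_score` are illustrative assumptions, not part of the notebook). It also shows that a z-score computed after min-max scaling equals the z-score of the raw values, provided the mean and standard deviation are taken from the same column.

```python
from statistics import mean, stdev

def min_max(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center on the mean and divide by the sample standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

scores = [5, 4, 1, 3, 5, 2]          # toy review scores on a 1-5 scale
scaled = min_max(scores)             # values in [0, 1]
standardized = z_score(scaled)       # z-score of the rescaled values
direct = z_score(scores)             # z-score of the raw values

# The z-score is unchanged by a positive linear rescaling,
# so both routes give the same numbers (up to rounding).
print(all(abs(a - b) < 1e-9 for a, b in zip(standardized, direct)))
```

`stdev` is the sample standard deviation, which matches the default of pandas `Series.std()`.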


## Model Preparation

The dataframe is first split into training and test sets in an 80/20 proportion. Each part is then converted into an SFrame, the tabular data format of Turi Create. An SFrame is backed by disk storage rather than held entirely in RAM, so it can handle datasets larger than the available memory.

In [55]:
def split_data(data):
    '''
    Splits dataset into training and test set.
    
    Args:
        data (pandas.DataFrame)
        
    Returns:
        train_data (tc.SFrame)
        test_data (tc.SFrame)
    '''
    train, test = train_test_split(data, test_size = .2)
    train_data = tc.SFrame(train)
    test_data = tc.SFrame(test)
    return train_data, test_data

In [56]:
from sklearn.model_selection import train_test_split
train_data, test_data = split_data(df) #the function is called
train_data_norm, test_data_norm = split_data(df_norm)
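
The split above is random, so the two calls to `split_data` do not place the same rows in the test set, and results change between runs. A minimal sketch of a reproducible 80/20 split, using only the standard library (the `split_rows` helper, the seed value, and the toy rows are assumptions made for illustration):

```python
import random

def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows with a fixed seed and cut it into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# toy (customer, product, score) rows
rows = [("user%d" % i, "product%d" % (i % 7), 1 + i % 5) for i in range(50)]
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))
```

With scikit-learn, passing the `random_state` argument to `train_test_split` plays the same role as `seed` here.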

## Modelling

First, the number of products to recommend is set to 5.

In [57]:
user_id = 'customer_unique_id'
item_id = 'product_id'
users_to_recommend = list(df[user_id])
n_rec = 5 # number of items to recommend
n_display = 30 # to display the first few rows in an output dataset

Next, a function is defined that builds the selected recommender and prints its recommendations. Three methods were selected:

- **Popularity method**: a baseline that ranks products by their aggregate review score across all users, so the ranking is not personalized
- **Cosine Similarity**: an item-based collaborative filtering method
- **Pearson Correlation**: another item-based collaborative filtering method
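
As a rough illustration of the two similarity measures, the sketch below compares two products through the scores given by the same three users. It is a simplified, assumed setup (toy vectors, hypothetical helper names) and not Turi Create's internal implementation.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equally long rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered vectors.

    Undefined (division by zero) if either vector is constant.
    """
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)

# scores given to two products by the same three users (toy data)
item_x = [5, 3, 4]
item_y = [4, 2, 3]

cos_xy = cosine_similarity(item_x, item_y)
pearson_xy = pearson_correlation(item_x, item_y)
print(round(cos_xy, 4), round(pearson_xy, 4))
```

Because Pearson correlation centers each vector first, it ignores a constant offset between the two products' scores, while cosine similarity does not.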

In [58]:
def model(train_data, name, user_id, item_id, target, users_to_recommend, n_rec, n_display):
    if name == 'popularity':
        model = tc.popularity_recommender.create(train_data, 
                                                    user_id=user_id, 
                                                    item_id=item_id, 
                                                    target=target)
    elif name == 'cosine':
        model = tc.item_similarity_recommender.create(train_data, 
                                                    user_id=user_id, 
                                                    item_id=item_id, 
                                                    target=target, 
                                                    similarity_type='cosine')
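    elif name == 'pearson':
        model = tc.item_similarity_recommender.create(train_data, 
                                                    user_id=user_id, 
                                                    item_id=item_id, 
                                                    target=target, 
                                                    similarity_type='pearson')
    else:
        # fail early instead of reaching an undefined `model` below
        raise ValueError(f"Unknown model name: {name!r}")

    # recommend the top n_rec items per user and print the first n_display rows
    recom = model.recommend(users=users_to_recommend, k=n_rec)
    recom.print_rows(n_display)
    return model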

## Run models on train data

The output shows the recommendations for each customer: five products, ranked from first to fifth.
Six outputs are produced: three for the original dataset and three for the transformed dataset.
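
A popularity recommender ranks products by a global score, which helps explain why different customers can receive the same list. The following sketch ranks products by mean review score on toy data; the helper name `top_k_by_mean_score` and the tie-breaking by product ID are assumptions for illustration and may differ from what Turi Create does.

```python
from collections import defaultdict

def top_k_by_mean_score(ratings, k):
    """Return the k items with the highest mean score (ties broken by item ID)."""
    totals = defaultdict(lambda: [0.0, 0])   # item -> [sum of scores, count]
    for _user, item, score in ratings:
        totals[item][0] += score
        totals[item][1] += 1
    means = {item: total / count for item, (total, count) in totals.items()}
    return sorted(means, key=lambda item: (-means[item], item))[:k]

# toy (customer, product, score) rows
ratings = [
    ("u1", "a", 5), ("u2", "a", 4),
    ("u1", "b", 5),
    ("u3", "c", 2), ("u2", "c", 3),
]
top2 = top_k_by_mean_score(ratings, 2)
print(top2)
```

Note that a product with a single 5-star review ("b") outranks one with more reviews and a slightly lower mean ("a"), a known weakness of ranking by mean score alone.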

In [59]:
name = 'popularity'
target = 'review_score'
popularity = model(train_data, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+-------+------+
|       customer_unique_id      |           product_id          | score | rank |
+-------------------------------+-------------------------------+-------+------+
| 4854e9b3feff728c13ee5fc7d1... | 6ed93af03d1f53308d3a9c6555... |  5.0  |  1   |
| 4854e9b3feff728c13ee5fc7d1... | f4851ffef385d506603396041d... |  5.0  |  2   |
| 4854e9b3feff728c13ee5fc7d1... | 18047039004204f52054f74b95... |  5.0  |  3   |
| 4854e9b3feff728c13ee5fc7d1... | 97868e08c02c9f4f6b49ebe5a4... |  5.0  |  4   |
| 4854e9b3feff728c13ee5fc7d1... | 73abf6f6dfa98d1d04760f4cc1... |  5.0  |  5   |
| 830d5b7aaa3b6f1e9ad63703be... | 6ed93af03d1f53308d3a9c6555... |  5.0  |  1   |
| 830d5b7aaa3b6f1e9ad63703be... | f4851ffef385d506603396041d... |  5.0  |  2   |
| 830d5b7aaa3b6f1e9ad63703be... | 18047039004204f52054f74b95... |  5.0  |  3   |
| 830d5b7aaa3b6f1e9ad63703be... | 97868e08c02c9f4f6b49ebe5a4... |  5.0  |  4   |
| 830d5b7aaa3b6f1e9ad63703be

In [60]:
name = 'popularity'
target = 'review_score'
pop_norm = model(train_data_norm, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+
|       customer_unique_id      |           product_id          |
+-------------------------------+-------------------------------+
| 4854e9b3feff728c13ee5fc7d1... | 6c50c87e36d8641d67fbcceed5... |
| 4854e9b3feff728c13ee5fc7d1... | d0349534ab9b46b98d13a53051... |
| 4854e9b3feff728c13ee5fc7d1... | f196248e8b5d060cca414a664e... |
| 4854e9b3feff728c13ee5fc7d1... | c0350d6ac413eda4641bf92ab6... |
| 4854e9b3feff728c13ee5fc7d1... | 3215010238fcd9cab6ba7d2b81... |
| 830d5b7aaa3b6f1e9ad63703be... | 6c50c87e36d8641d67fbcceed5... |
| 830d5b7aaa3b6f1e9ad63703be... | d0349534ab9b46b98d13a53051... |
| 830d5b7aaa3b6f1e9ad63703be... | f196248e8b5d060cca414a664e... |
| 830d5b7aaa3b6f1e9ad63703be... | c0350d6ac413eda4641bf92ab6... |
| 830d5b7aaa3b6f1e9ad63703be... | 3215010238fcd9cab6ba7d2b81... |
| 87776adb449c551e74c13fc34f... | 6c50c87e36d8641d67fbcceed5... |
| 87776adb449c551e74c13fc34f... | d0349534ab9b46b98d13a53051... |
| 87776adb

In [61]:
name="cosine"
target="review_score"
cosine= model(train_data, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+
|       customer_unique_id      |           product_id          |
+-------------------------------+-------------------------------+
| 4854e9b3feff728c13ee5fc7d1... | 36f60d45225e60c7da4558b070... |
| 4854e9b3feff728c13ee5fc7d1... | ca349c4d87c594b4995fcc4fa9... |
| 4854e9b3feff728c13ee5fc7d1... | c9058c144a2fef1c5b35f31945... |
| 4854e9b3feff728c13ee5fc7d1... | 0bcc3eeca39e1064258aa1e932... |
| 4854e9b3feff728c13ee5fc7d1... | 705b4a0072a3ecf575b497f895... |
| 830d5b7aaa3b6f1e9ad63703be... | 36f60d45225e60c7da4558b070... |
| 830d5b7aaa3b6f1e9ad63703be... | ca349c4d87c594b4995fcc4fa9... |
| 830d5b7aaa3b6f1e9ad63703be... | c9058c144a2fef1c5b35f31945... |
| 830d5b7aaa3b6f1e9ad63703be... | 0bcc3eeca39e1064258aa1e932... |
| 830d5b7aaa3b6f1e9ad63703be... | 705b4a0072a3ecf575b497f895... |
| 87776adb449c551e74c13fc34f... | 36f60d45225e60c7da4558b070... |
| 87776adb449c551e74c13fc34f... | ca349c4d87c594b4995fcc4fa9... |
| 87776adb

In [63]:
name="cosine"
target="review_score"
cosine_norm= model(train_data_norm, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+
|       customer_unique_id      |           product_id          |
+-------------------------------+-------------------------------+
| 4854e9b3feff728c13ee5fc7d1... | 0b33e28a934b68a0475650499b... |
| 4854e9b3feff728c13ee5fc7d1... | efe572a7ee9cf0317bbcd4ec8b... |
| 4854e9b3feff728c13ee5fc7d1... | e4b0f340d2552f744b4c2d48df... |
| 4854e9b3feff728c13ee5fc7d1... | 4fbee589d3377144a90a338e4a... |
| 4854e9b3feff728c13ee5fc7d1... | 57c806e92eec9168ff2065cc65... |
| 830d5b7aaa3b6f1e9ad63703be... | 0b33e28a934b68a0475650499b... |
| 830d5b7aaa3b6f1e9ad63703be... | efe572a7ee9cf0317bbcd4ec8b... |
| 830d5b7aaa3b6f1e9ad63703be... | e4b0f340d2552f744b4c2d48df... |
| 830d5b7aaa3b6f1e9ad63703be... | 4fbee589d3377144a90a338e4a... |
| 830d5b7aaa3b6f1e9ad63703be... | 57c806e92eec9168ff2065cc65... |
| 87776adb449c551e74c13fc34f... | 0b33e28a934b68a0475650499b... |
| 87776adb449c551e74c13fc34f... | efe572a7ee9cf0317bbcd4ec8b... |
| 87776adb

In [62]:
name="pearson"
target="review_score"
pearson= model(train_data, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+-------+------+
|       customer_unique_id      |           product_id          | score | rank |
+-------------------------------+-------------------------------+-------+------+
| 4854e9b3feff728c13ee5fc7d1... | 6ed93af03d1f53308d3a9c6555... |  5.0  |  1   |
| 4854e9b3feff728c13ee5fc7d1... | f4851ffef385d506603396041d... |  5.0  |  2   |
| 4854e9b3feff728c13ee5fc7d1... | 18047039004204f52054f74b95... |  5.0  |  3   |
| 4854e9b3feff728c13ee5fc7d1... | 97868e08c02c9f4f6b49ebe5a4... |  5.0  |  4   |
| 4854e9b3feff728c13ee5fc7d1... | 73abf6f6dfa98d1d04760f4cc1... |  5.0  |  5   |
| 830d5b7aaa3b6f1e9ad63703be... | 6ed93af03d1f53308d3a9c6555... |  5.0  |  1   |
| 830d5b7aaa3b6f1e9ad63703be... | f4851ffef385d506603396041d... |  5.0  |  2   |
| 830d5b7aaa3b6f1e9ad63703be... | 18047039004204f52054f74b95... |  5.0  |  3   |
| 830d5b7aaa3b6f1e9ad63703be... | 97868e08c02c9f4f6b49ebe5a4... |  5.0  |  4   |
| 830d5b7aaa3b6f1e9ad63703be

In [64]:
name="pearson"
target="review_score"
pearson_norm= model(train_data_norm, name, user_id, item_id, target, users_to_recommend, n_rec, n_display)

+-------------------------------+-------------------------------+
|       customer_unique_id      |           product_id          |
+-------------------------------+-------------------------------+
| 4854e9b3feff728c13ee5fc7d1... | 5ca6a40fbc611451486b6525c6... |
| 4854e9b3feff728c13ee5fc7d1... | 50571d0f0f8ffbeec1c64cefee... |
| 4854e9b3feff728c13ee5fc7d1... | 1e1172a8b6ae12312c22ccf336... |
| 4854e9b3feff728c13ee5fc7d1... | cc711d05cc2a7eaabf0629dbda... |
| 4854e9b3feff728c13ee5fc7d1... | 1ba7948f3aa31cdaf78a4885f3... |
| 830d5b7aaa3b6f1e9ad63703be... | 5ca6a40fbc611451486b6525c6... |
| 830d5b7aaa3b6f1e9ad63703be... | 50571d0f0f8ffbeec1c64cefee... |
| 830d5b7aaa3b6f1e9ad63703be... | 1e1172a8b6ae12312c22ccf336... |
| 830d5b7aaa3b6f1e9ad63703be... | cc711d05cc2a7eaabf0629dbda... |
| 830d5b7aaa3b6f1e9ad63703be... | 1ba7948f3aa31cdaf78a4885f3... |
| 87776adb449c551e74c13fc34f... | 5ca6a40fbc611451486b6525c6... |
| 87776adb449c551e74c13fc34f... | 50571d0f0f8ffbeec1c64cefee... |
| 87776adb

## Evaluation on the test set


In [65]:
models_w_counts = [popularity, cosine, pearson]
models_w_norm = [pop_norm, cosine_norm, pearson_norm]
names_w_counts = ['Popularity Model on Review_score', 'Cosine Similarity on Review Score', 'Pearson Similarity on Review Score']
names_w_norm = ['Popularity Model on Scaled Reviews', 'Cosine Similarity on Scaled Review', 'Pearson Similarity on Scaled Review']

In [66]:
eval_counts = tc.recommender.util.compare_models(test_data, models_w_counts, model_names=names_w_counts)

eval_norm = tc.recommender.util.compare_models(test_data_norm, models_w_norm, model_names=names_w_norm)

PROGRESS: Evaluate model Popularity Model on Review_score



Precision and recall summary statistics by cutoff
+--------+------------------------+-----------------------+
| cutoff |     mean_precision     |      mean_recall      |
+--------+------------------------+-----------------------+
|   1    |          0.0           |          0.0          |
|   2    |          0.0           |          0.0          |
|   3    | 1.6829067164807066e-05 | 5.048720149442118e-05 |
|   4    | 1.2621800373605294e-05 | 5.048720149442118e-05 |
|   5    | 1.0097440298884248e-05 | 5.048720149442118e-05 |
|   6    | 8.414533582403511e-06  | 5.048720149442118e-05 |
|   7    | 7.212457356345875e-06  | 5.048720149442118e-05 |
|   8    | 6.310900186802647e-06  | 5.048720149442118e-05 |
|   9    | 5.6096890549356836e-06 | 5.048720149442118e-05 |
|   10   | 5.048720149442139e-06  | 5.048720149442118e-05 |
+--------+------------------------+-----------------------+
[10 rows x 3 columns]


Overall RMSE: 1.4900980518250058

Per User RMSE (best)
+-----------------------------


Precision and recall summary statistics by cutoff
+--------+-----------------------+----------------------+
| cutoff |     mean_precision    |     mean_recall      |
+--------+-----------------------+----------------------+
|   1    |  0.005452617761397492 | 0.005217010821090185 |
|   2    |  0.003660322108345536 | 0.007009306474142136 |
|   3    |  0.002675821679204326 | 0.007690883694316823 |
|   4    | 0.0023981420709850076 | 0.00923074333989666  |
|   5    | 0.0020295855000757295 | 0.009735615354840894 |
|   6    | 0.0017670520523047407 | 0.010164756567543455 |
|   7    |  0.00153625341690167  | 0.01031621817202673  |
|   8    | 0.0013757762407229797 | 0.010568654179498836 |
|   9    |  0.001234131592085849 | 0.010669628582487677 |
|   10   | 0.0011258645933255955 | 0.010821090186970932 |
+--------+-----------------------+----------------------+
[10 rows x 3 columns]


Overall RMSE: 4.282568995080432

Per User RMSE (best)
+-------------------------------+----------------------+---


Precision and recall summary statistics by cutoff
+--------+------------------------+-----------------------+
| cutoff |     mean_precision     |      mean_recall      |
+--------+------------------------+-----------------------+
|   1    |          0.0           |          0.0          |
|   2    |          0.0           |          0.0          |
|   3    |          0.0           |          0.0          |
|   4    |          0.0           |          0.0          |
|   5    |          0.0           |          0.0          |
|   6    |          0.0           |          0.0          |
|   7    |          0.0           |          0.0          |
|   8    | 6.310900186802631e-06  | 5.048720149442105e-05 |
|   9    | 5.6096890549356785e-06 | 5.048720149442116e-05 |
|   10   | 5.048720149442119e-06  | 5.048720149442108e-05 |
+--------+------------------------+-----------------------+
[10 rows x 3 columns]


Overall RMSE: 2.428166957507819

Per User RMSE (best)
+------------------------------


Precision and recall summary statistics by cutoff
+--------+------------------------+-----------------------+
| cutoff |     mean_precision     |      mean_recall      |
+--------+------------------------+-----------------------+
|   1    | 0.0001008166145780825  | 0.0001008166145780825 |
|   2    | 0.0001008166145780825  |  0.000201633229156165 |
|   3    | 0.0001344221527707764  |  0.00040326645831233  |
|   4    | 0.00011341869140034283 | 0.0004536747656013713 |
|   5    | 0.00011089827603589066 | 0.0005544913801794518 |
|   6    | 0.00010081661457808254 | 0.0006048996874684941 |
|   7    | 0.00012242017484481434 | 0.0008569412239137011 |
|   8    | 0.00013232180663373327 | 0.0010585744530698662 |
|   9    | 0.00011761938367442962 | 0.0010585744530698655 |
|   10   | 0.00011593910676479494 | 0.0011593910676479494 |
+--------+------------------------+-----------------------+
[10 rows x 3 columns]


Overall RMSE: 0.27494936913444645

Per User RMSE (best)
+----------------------------


Precision and recall summary statistics by cutoff
+--------+------------------------+-----------------------+
| cutoff |     mean_precision     |      mean_recall      |
+--------+------------------------+-----------------------+
|   1    | 0.0012602076822260339  |  0.001209799374936991 |
|   2    | 0.0010081661457808269  | 0.0019407198306280917 |
|   3    | 0.0008569412239137013  |  0.002470007057163023 |
|   4    | 0.0006931142252243168  |  0.002671640286319185 |
|   5    | 0.0005847363645528776  |  0.002822865208186312 |
|   6    | 0.0004956816883422395  | 0.0028732735154753533 |
|   7    | 0.0004320712053346394  | 0.0029236818227643914 |
|   8    | 0.0003780623046678098  |  0.002923681822764396 |
|   9    | 0.0003528581510232886  |  0.003074906744631518 |
|   10   | 0.00033269482810767165 |  0.003226131666498644 |
+--------+------------------------+-----------------------+
[10 rows x 3 columns]


Overall RMSE: 0.2484541891668454

Per User RMSE (best)
+-----------------------------


Precision and recall summary statistics by cutoff
+--------+------------------------+-----------------------+
| cutoff |     mean_precision     |      mean_recall      |
+--------+------------------------+-----------------------+
|   1    |          0.0           |          0.0          |
|   2    |          0.0           |          0.0          |
|   3    |          0.0           |          0.0          |
|   4    | 1.2602076822260312e-05 | 5.040830728904125e-05 |
|   5    | 1.0081661457808212e-05 | 5.040830728904125e-05 |
|   6    |  8.40138454817354e-06  | 5.040830728904125e-05 |
|   7    | 7.201186755577316e-06  | 5.040830728904125e-05 |
|   8    | 6.301038411130156e-06  | 5.040830728904125e-05 |
|   9    |  5.60092303211569e-06  | 5.040830728904125e-05 |
|   10   | 5.0408307289041125e-06 | 5.040830728904125e-05 |
+--------+------------------------+-----------------------+
[10 rows x 3 columns]


Overall RMSE: 0.2749796642774072

Per User RMSE (best)
+-----------------------------
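
To help read the precision and recall columns above, here is a minimal sketch of both metrics at a cutoff `k` for a single user. The function name and the toy data are assumptions; the tables report the mean of such values over all test users.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommendations for one user.

    recommended: ranked list of item IDs
    relevant:    set of item IDs the user has in the test set
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

recommended = ["p1", "p2", "p3", "p4", "p5"]   # ranked recommendations (toy data)
relevant = {"p3", "p9"}                         # items found in the test set

for k in (1, 3, 5):
    print(k, precision_recall_at_k(recommended, relevant, k))
```

Once the only hit has been found, recall stays flat while precision shrinks as `k` grows, the same pattern visible in several of the tables above.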

**Original Dataset**:

- **Popularity Model** *RMSE*: 1.4901
- **Cosine Similarity** *RMSE*: 4.2826
- **Pearson Correlation** *RMSE*: 2.4282

**Normalized Dataset**:

- **Popularity Model** *RMSE*: 0.2749
- **Cosine Similarity** *RMSE*: 0.2485
- **Pearson Correlation** *RMSE*: 0.2750

The lowest RMSE is obtained by the collaborative filtering model with cosine similarity on the normalized and standardized data. The two datasets use different score scales, so RMSE values are directly comparable only within the same dataset.
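
RMSE is expressed in the units of the target, so it shrinks when the target is rescaled. The sketch below, on toy numbers, shows that a linear rescaling of the scores multiplies the RMSE by the scaling factor; the `factor` and `offset` values are illustrative assumptions (they map a 1-5 scale onto [0, 1]), not the exact transformation applied to the data.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [5, 4, 1, 3]              # toy review scores
predicted = [4.5, 4.0, 2.0, 3.5]   # toy predictions on the same scale

factor, offset = 0.25, -0.25       # hypothetical rescaling: (x - 1) / 4

def rescale(value):
    return factor * value + offset

rmse_original = rmse(actual, predicted)
rmse_rescaled = rmse([rescale(a) for a in actual], [rescale(p) for p in predicted])
print(rmse_original, rmse_rescaled)
```

Dividing a rescaled RMSE by the scaling factor brings it back to the original units, which is one way to compare models trained on differently scaled targets.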


The final recommendations are produced with the model that achieved the lowest RMSE, retrained on the full transformed dataset.

In [67]:
final_model = tc.item_similarity_recommender.create(tc.SFrame(df_norm), 
                                            user_id=user_id, 
                                            item_id=item_id, 
                                            target='review_score', similarity_type='cosine')
recom = final_model.recommend(users=users_to_recommend, k=n_rec)

recom.print_rows(n_display)

+-------------------------------+-------------------------------+
|       customer_unique_id      |           product_id          |
+-------------------------------+-------------------------------+
| 4854e9b3feff728c13ee5fc7d1... | ce49d8be84e535ca2a5eb0e76c... |
| 4854e9b3feff728c13ee5fc7d1... | e4b0f340d2552f744b4c2d48df... |
| 4854e9b3feff728c13ee5fc7d1... | efe572a7ee9cf0317bbcd4ec8b... |
| 4854e9b3feff728c13ee5fc7d1... | 0b33e28a934b68a0475650499b... |
| 4854e9b3feff728c13ee5fc7d1... | 370c500401de58da2a25c8cab2... |
| 830d5b7aaa3b6f1e9ad63703be... | ce49d8be84e535ca2a5eb0e76c... |
| 830d5b7aaa3b6f1e9ad63703be... | e4b0f340d2552f744b4c2d48df... |
| 830d5b7aaa3b6f1e9ad63703be... | efe572a7ee9cf0317bbcd4ec8b... |
| 830d5b7aaa3b6f1e9ad63703be... | 0b33e28a934b68a0475650499b... |
| 830d5b7aaa3b6f1e9ad63703be... | 370c500401de58da2a25c8cab2... |
| 87776adb449c551e74c13fc34f... | ce49d8be84e535ca2a5eb0e76c... |
| 87776adb449c551e74c13fc34f... | e4b0f340d2552f744b4c2d48df... |
| 87776adb

The SFrame produced by Turi Create is now converted into a pandas DataFrame that shows, for each user, the recommended products with their similarity-based score and rank.

In [68]:
df_rec = recom.to_dataframe()
print(df_rec.shape)
df_rec.head()

(503190, 4)


|  | customer_unique_id | product_id | score | rank |
|---|---|---|---|---|
| 0 | 4854e9b3feff728c13ee5fc7d1547e92 | ce49d8be84e535ca2a5eb0e76c09c348 | 0.002964 | 1 |
| 1 | 4854e9b3feff728c13ee5fc7d1547e92 | e4b0f340d2552f744b4c2d48dffbc0e2 | 0.002444 | 2 |
| 2 | 4854e9b3feff728c13ee5fc7d1547e92 | efe572a7ee9cf0317bbcd4ec8b98a088 | 0.002422 | 3 |
| 3 | 4854e9b3feff728c13ee5fc7d1547e92 | 0b33e28a934b68a0475650499b9ae509 | 0.00223 | 4 |
| 4 | 4854e9b3feff728c13ee5fc7d1547e92 | 370c500401de58da2a25c8cab27527cc | 0.002097 | 5 |


A function is created that joins each customer's five recommendations into a single row, separated by the bar symbol (`|`), to obtain a clean dataset with one row per customer.

In [84]:
def create_output(model, users_to_recommend, n_rec, print_csv=True):
    # get the top n_rec recommendations per user as a pandas DataFrame
    recommendation = model.recommend(users=users_to_recommend, k=n_rec)
    df_rec = recommendation.to_dataframe()
    # join each user's recommended product IDs into one '|'-separated string
    df_rec['recommendedProducts'] = df_rec.groupby([user_id])[item_id] \
        .transform(lambda x: '|'.join(x.astype(str)))
    # keep one row per customer, sorted and indexed by customer ID
    df_output = df_rec[['customer_unique_id', 'recommendedProducts']].drop_duplicates() \
        .sort_values('customer_unique_id').set_index('customer_unique_id')
    if print_csv:
        df_output.to_csv('recommendation.csv')
    return df_output

In [85]:
df_output = create_output(cosine_norm, users_to_recommend, n_rec, print_csv=True)
df_output.head()

(94007, 1)


| customer_unique_id | recommendedProducts |
|---|---|
| 0000366f3b9a7992bf8c76cfdf3221e2 | cfe6e9c01d0bbb5df9a75f0e3286baa9\|aa3cf7c656b2a... |
| 0000b849f77a49e4a4ce2b2a4ca5be3f | 0b33e28a934b68a0475650499b9ae509\|efe572a7ee9cf... |
| 0000f46a3911fa3c0805444483337064 | 0b33e28a934b68a0475650499b9ae509\|efe572a7ee9cf... |
| 0000f6ccb0745a6a4b88665a16c9f078 | 0b33e28a934b68a0475650499b9ae509\|efe572a7ee9cf... |
| 0004aac84e0df4da2b147fca70cf8255 | 0b33e28a934b68a0475650499b9ae509\|efe572a7ee9cf... |
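
The grouping step can be sketched without pandas as follows. This is a simplified illustration on toy rows (the `join_recommendations` helper and the IDs are assumptions), showing how ranked product IDs are collapsed into one `|`-separated string per customer.

```python
from collections import defaultdict

def join_recommendations(rows, sep="|"):
    """Collapse (customer, product, rank) rows into one joined string per customer."""
    per_user = defaultdict(list)
    # sort by customer, then by rank, so products are joined in ranked order
    for user, product, _rank in sorted(rows, key=lambda r: (r[0], r[2])):
        per_user[user].append(product)
    return {user: sep.join(products) for user, products in per_user.items()}

# toy recommendation rows: (customer, product, rank)
rows = [
    ("u2", "pB", 2), ("u1", "pA", 1),
    ("u2", "pC", 1), ("u1", "pD", 2),
]
joined = join_recommendations(rows)
print(joined)
```

In pandas, `df_rec.groupby(user_id)[item_id].agg('|'.join)` yields one joined string per customer directly, which would avoid the `transform` plus `drop_duplicates` combination.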


The next step is a function that, given a customer ID, returns the products recommended to that customer. If the customer is not present in the dataset, it prints **"Customer not found."** and returns the ID unchanged.

In [2]:
def customer_recomendation(customer_id):
    # look up the joined recommendations by customer ID in df_output
    if customer_id not in df_output.index:
        print('Customer not found.')
        return customer_id
    return df_output.loc[customer_id]

In [87]:
customer_recomendation("44b6bbfea26596437062a38c8e6bcec1")

recommendedProducts    cfe6e9c01d0bbb5df9a75f0e3286baa9|aa3cf7c656b2a...
Name: 44b6bbfea26596437062a38c8e6bcec1, dtype: object

In [88]:
customer_recomendation("miro")

Customer not found.


'miro'
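
As a possible variation, the lookup could return the recommendations as a list of product IDs and use `None` for unknown customers, so that callers can tell the two cases apart without parsing printed text. The sketch below assumes a plain dictionary from customer ID to joined string (for example, what `df_output['recommendedProducts'].to_dict()` would give); the function name and toy IDs are hypothetical.

```python
def lookup_recommendations(recommendations, customer_id, sep="|"):
    """Return the recommended product IDs as a list, or None if the customer is unknown."""
    joined = recommendations.get(customer_id)
    if joined is None:
        return None
    return joined.split(sep)

# toy mapping: customer ID -> '|'-separated product IDs
recommendations = {"u1": "pA|pD", "u2": "pC|pB"}

found = lookup_recommendations(recommendations, "u1")
missing = lookup_recommendations(recommendations, "miro")
print(found, missing)
```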