# <span style="color:orange">A:</span> Introduction

A central issue for Olist is that most customers purchase only one product per order, as can be seen from the graph below.

<img src="orderitems.png" width="500">

Upselling is commonly used in traditional commerce to address problems of this sort: sales representatives suggest additional purchases to customers. For instance, if a customer buys a couch, the representative might show them chairs that go well with it to see whether they will buy something extra. [[1]](https://en.wikipedia.org/wiki/Upselling)

In ecommerce, no sales representatives are present to make such suggestions, and the company instead relies solely on the natural behaviour of the customer.

However, using design techniques and advanced analytics, ecommerce websites can nudge customers into buying extra items without having a sales representative present [[2]](https://link.springer.com/article/10.1007/s12599-016-0453-1). One such technique is the use of *recommender systems*; well-known examples are how Netflix recommends movies and how Amazon recommends products [[3]](https://dl.acm.org/doi/pdf/10.1145/336992.337035).

In this report, several recommender systems will be constructed and tested, and finally combined into a hybrid recommender system to evaluate how well such a system could perform for Olist.

Before diving into the different recommender systems, the KPIs will be described. The recommender systems will then be created according to the outline that follows the KPI section.

# <span style="color:orange">B:</span> KPIs

The KPIs for this project focus solely on revenue:

$$Revenue = Quantity \times Price$$

It is assumed that Olist gets a percentage cut of each product sold. This leaves room for two basic KPIs for the recommender systems:
1. Can the recommender system increase the quantity purchased on the website, e.g. by making customers purchase more per order?
1. Can the recommender system increase the price of items purchased, e.g. by recommending more expensive products?

The focus of the recommender systems here will be solely on (1); therefore the KPI can simply be expressed as:

$$\mu_{Quantity_{new}}-\mu_{Quantity_{old}}$$

i.e. the difference between the average number of items per order after and before the recommender systems are introduced, so that a positive value indicates an improvement. However, as it will not be possible to test the systems live, a simplified KPI will be used instead: ***the number of orders for which a recommendation can be made***. This KPI can then be used to guesstimate how many additional products, $Q$, will be sold.
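
As a rough illustration of how the two KPIs relate, the sketch below computes both on toy data. The orders, the set `has_recommendation` and the `acceptance_rate` are hypothetical values chosen for the example; none of them come from the Olist data.

```python
from statistics import mean

# Hypothetical toy orders: order id -> list of purchased product ids
orders_old = {"o1": ["p1"], "o2": ["p2"], "o3": ["p3", "p4"], "o4": ["p1"]}

# Hypothetical set of orders for which some recommender returned a suggestion
has_recommendation = {"o1", "o3"}

# Simplified KPI: share of orders for which a recommendation can be made
coverage = len(has_recommendation) / len(orders_old)

# Quantity KPI under an assumed acceptance rate: each accepted
# recommendation is assumed to add exactly one item to the order
acceptance_rate = 0.5  # assumption, would have to be measured in a live test
mu_old = mean(len(items) for items in orders_old.values())
extra_items = acceptance_rate * len(has_recommendation)  # estimate of Q
mu_new = mu_old + extra_items / len(orders_old)

print(f"coverage={coverage:.2f}, mu_old={mu_old:.2f}, mu_new={mu_new:.2f}")
```

The acceptance rate is the unknown that a live test would supply; without it, the coverage figure is the only part that can be measured offline.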

# <span style="color:orange">C:</span> Outline

0. **Descriptive analysis**
    1. Products dataset
    1. OrderItems dataset
    1. Orders dataset
    1. Customers dataset
    1. Reviews dataset
1. **Association rule mining** (Non-personalized)
    1. Model building
        1. Read data
        1. Format data
        1. Build model
        1. Investigate results and evaluate
        1. Build recommender function
    1. Conclusion
1. **Content-based filtering** (Personalized)
    1. Model building
        1. Read data
        1. Format data
        1. Build model
        1. Investigate results and evaluate
        1. Build recommender function
    1. Conclusion
1. **Item-item collaborative filtering** (Personalized)
    1. Model building
        1. Read data
        1. Format data
        1. Build model
        1. Investigate results and evaluate
        1. Build recommender function
    1. Conclusion
1. **Hybrid recommender system**
    1. Building the system
    1. Evaluation
1. **Discussion**
1. **Conclusion**
1. **References**

***Data Set:*** https://www.kaggle.com/datasets/olistbr/brazilian-ecommerce

# <span style="color:blue">0:</span> Descriptive analysis

Here the datasets used will be introduced, and a descriptive analysis will showcase some basic statistics from each of them.

In [None]:
!pip install apyori

In [None]:
import warnings
warnings.filterwarnings("ignore")

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from apyori import apriori
from sklearn.linear_model import LogisticRegression
import seaborn as sns
import plotly.graph_objects as go
sns.set(style='darkgrid', palette='muted', color_codes=True)

The construction of the recommender systems relies on 5 datasets.
1. Products: Which contains information about each product listed on Olist such as category and size
2. Order items: Which contains information about products purchased in each order
3. Orders: Which connects a customer to an order
4. Customers: Which contains basic geographical information on each customer
5. Reviews: Which contains reviews related to an order

The datasets are related as shown in the image below, where a green box indicates that the dataset has been used in the analysis and a red cross means that the dataset is not used.

<img src="images/architecture.png" width="500">

First the datasets will be loaded.

In [None]:
products = pd.read_csv("olist_products_dataset.csv")
orderItems = pd.read_csv("olist_order_items_dataset.csv")
orders = pd.read_csv("olist_orders_dataset.csv")
customers = pd.read_csv("olist_customers_dataset.csv")
reviews = pd.read_csv("olist_order_reviews_dataset.csv")
products_translation = pd.read_csv("product_category_name_translation.csv")

## <span style="color:blue">0.1:</span> Products dataset

The products dataset contains 32,341 products. The product features consist of dimensions, category, and information about the description and the number of photos.

In [None]:
products.head()

Investigating the product categories, as done below, shows that a wide variety of products is being sold, from computer parts to watches, car parts and furniture. This variety indicates a good opportunity for recommending additional purchases.

In [None]:
# Count number of products in each category and get english names
products = products.merge(products_translation[['product_category_name_english', 'product_category_name']], on = 'product_category_name')
productsList = products.groupby('product_category_name_english')['product_category_name_english'].count()
productsList = productsList.sort_values()

# Plot categories
fig, axs = plt.subplots(figsize = [13,7])
plt.setp(axs.xaxis.get_majorticklabels(), rotation=90)

# Bar chart of the number of products per category
axs.set_title("Products in each category", fontsize = 16)
axs.bar(productsList.index.to_list(), productsList.to_list())
axs.set_ylabel('Number of products',  fontsize=16)

The remainder of the dataset primarily relates to product dimensions, description length and number of photos. A quick statistical investigation shows quite a lot of variety. Together with the number of different categories, this indicates that even though the product features are few, it should still be possible to construct an MVP recommender system.

In [None]:
products.describe()

## <span style="color:blue">0.2:</span> OrderItems dataset

The order items dataset consists of 112,650 rows, one per item in an order, so an order with several products spans several rows. Each row has an associated price and freight value.

In [None]:
orderItems.head()

Investigating how many products have been purchased per order reveals that almost every order contains just 1 product, a fact that was already shown in the introduction. The downside is that this significantly limits some types of recommender systems, as little information about connections between products is available.

In [None]:
# First count number of items pr order id
order_id_counts = orderItems.groupby('order_id')['order_item_id'].count()
# get number of orders pr size
sizes = order_id_counts.value_counts()
# then plot and print
plt.figure(figsize=(10,5))
plt.title("Total orders vs. number of items per order", fontsize=16)
plt.xlabel("Number of items per order", fontsize=16)
plt.ylabel("Total orders", fontsize=16)
sizes.plot.bar()

Looking at the price of the items, the products range from as low as ~1 DKK to as high as ~8,000 DKK, with a mean of ~140 DKK. Prices thus vary a lot, but most products are relatively cheap. This suggests a high potential for upselling, as lower-cost products are generally easier to upsell.

In [None]:
orderItems.describe()

## <span style="color:blue">0.3:</span> Orders dataset

In this analysis, the orders dataset is only used to connect an order with a customer and a review, although delivery information and time series data are also available.

In [None]:
orders.head()

## <span style="color:blue">0.4:</span> Customers dataset

The customers dataset contains geographic information about customers, but in this analysis it is only used to get orders and reviews per unique customer.

In [None]:
customers.head()

When looking at how many orders each customer has made during the period covered by the dataset (ca. 2 years), it becomes apparent that not only do most customers purchase just a single item per order, they also only purchase once on the platform in total. This means that Olist's customers are not only purchasing too few items per order, they are also purchasing far too infrequently. This cements the need for a good recommender system, so that customers can be helped not only to buy more per order but also to buy more often, for instance through emails with personal recommendations as part of the marketing strategy.

In [None]:
# Get how many orders each customer makes
customersCount = customers.merge(orders[['order_id', 'customer_id']], on = 'customer_id')
customersCount = customersCount.groupby('customer_unique_id')['order_id'].nunique()
customersCount = customersCount.value_counts()

plt.figure(figsize=(10,5))
plt.title("Number of customers vs. orders per customer", fontsize=16)
plt.xlabel("Orders per customer", fontsize=16)
plt.ylabel("Number of customers", fontsize=16)
customersCount.plot.bar()

## <span style="color:blue">0.5:</span> Reviews dataset

Finally, the reviews dataset contains a review (if provided) for each order. Note that reviews relate to orders and not to the products themselves (although most orders contain just one product).

In [None]:
reviews.head()

Comparing the number of reviews with the number of orders shows, somewhat surprisingly, that almost every order has an associated review score. This is good, as some recommender systems rely on the review scores, such as the content-based filtering shown later.

In [None]:
print("Number of orders in total: ", orders['order_id'].nunique())
print("Number of reviews in total: ", reviews['review_id'].nunique())

# <span style="color:blue">1:</span> Association rule mining

*(Non-personalized)*

The first recommender system created will be based on association rule mining and the Apriori algorithm [[4]](http://www.vldb.org/conf/1994/P487.PDF). The algorithm mines for frequent itemsets in the dataset and seeks to establish association rules between items. It is often used in traditional market basket analysis, where it finds rules such as: "If the customer has put milk in the basket, then the customer is likely to also buy eggs". This can be used to provide recommendations on Olist based on the customer's current basket.

The algorithm is very flexible, as it adapts to what the customer is currently purchasing and relies on general rather than customer-specific rules. The goal is to provide a widget that eases the shopping experience. For instance, if the customer puts a new phone in their basket, an instant pop-up could say "Do you want a cover with your phone?" and then recommend a specific cover. This is an example of a rule that the algorithm could find if customers who purchase phones also often purchase a cover. To the customer this would appear as a very natural addition, and the likelihood of the customer following a good rule is generally quite high.

The model will be built using the [Apyori [5]](https://pypi.org/project/apyori/) package, which is an implementation of the algorithm in Python.

## <span style="color:blue">1.1:</span> Model building

Here the model will be built.

### <span style="color:blue">1.1.1:</span> Read data

First, read the data.

In [None]:
path = "olist_order_items_dataset.csv"
df = pd.read_csv(path)
df.head()

### <span style="color:blue">1.1.2:</span> Format data

The data will be reformatted to fit the apyori implementation of the Apriori algorithm. The format required is a list of lists, where each inner list corresponds to one order and contains the items purchased in that order.

In [None]:
# Only order_id and product_id are needed
df = df[['order_id','product_id']]

# Only keep orders with multiple items
orders = df.groupby('order_id')['product_id'].count()
orders = orders[orders > 1].index
df = df[df['order_id'].isin(orders)]

In [None]:
# Convert dataframe to list of lists
orders = df.groupby('order_id')['product_id'].apply(list)
orders = orders.to_numpy()
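
For illustration, the same reshaping can be sketched in plain Python on a few hypothetical rows (the order and product ids below are made up). The sketch also shows a side effect of counting rows per order: an order that contains the same product twice passes the filter, although it cannot contribute a rule between two different products.

```python
from collections import defaultdict

# Hypothetical (order_id, product_id) rows, mimicking the order items table
rows = [("o1", "pA"), ("o2", "pB"), ("o2", "pC"), ("o3", "pA"), ("o3", "pA")]

# Group product ids by order
baskets = defaultdict(list)
for order_id, product_id in rows:
    baskets[order_id].append(product_id)

# Keep only orders with more than one item row, as in the cells above
transactions = [items for items in baskets.values() if len(items) > 1]
print(transactions)  # [['pB', 'pC'], ['pA', 'pA']]
```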

### <span style="color:blue">1.1.3:</span> Build model

The choice of parameters in the apriori algorithm is important for two primary reasons:

1. It helps reduce the dataset in a meaningful way to reduce runtime (runtimes can easily be many hours or even days)
2. It helps produce a list containing only meaningful results

The parameters can either be tuned as one would tune a traditional ML model, or chosen based on domain knowledge. The parameters below have been chosen based on domain knowledge, as traditional parameter tuning does not provide any additional value for this particular dataset.

The parameters chosen are:
* **Minimum support:** 2/len(orders)
    * This means the model only looks at itemsets that exist in at least two separate orders. This value might seem low, but as can be seen from the distribution of items per order, setting it any higher would remove almost all data.
* **Minimum confidence:** 0.25
    * Since items are so rarely purchased together, the confidence threshold has been set to 25%, meaning that of the orders containing item A, at least 25% must also contain item B.
* **Minimum lift:** 1
    * Due to the limitations of the dataset, all rules with a lift of at least 1 are considered. As can be seen from the results, this value has little meaning.
* **Minimum rule length:** 2
    * There must be at least 2 products per rule. It is very unlikely that rules longer than 2 can be constructed with this dataset.
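
The three measures used as thresholds above can be computed by hand on a small example. The sketch below uses the standard definitions of support, confidence and lift on hypothetical toy baskets; the helper functions are written for this illustration and are not part of apyori.

```python
# Toy baskets (hypothetical); each set is one order
baskets = [
    {"milk", "eggs"},
    {"milk", "eggs", "bread"},
    {"milk"},
    {"bread"},
    {"eggs", "bread"},
]

def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Estimated P(consequent in basket | antecedent in basket)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to the baseline popularity of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)

rule = ({"milk"}, {"eggs"})
print(support(rule[0] | rule[1]), confidence(*rule), lift(*rule))
```

Here milk and eggs occur together in 2 of 5 baskets (support 0.4), eggs are present in 2 of the 3 milk baskets (confidence 2/3), and the lift of 10/9 says the pair occurs only slightly more often than if the two items were bought independently.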

In [None]:
# Determine association rules based on parameters
association_rules = apriori(orders, min_support = (2/len(orders)), min_confidence = 0.25, min_lift = 1, min_length = 2)

With the model built, a dataframe is constructed to investigate the resulting rules more closely. Note that rules of length > 2 are not included. Due to the downward closure principle, information regarding recommendations for the current basket will not be lost by excluding rules of length > 2, but the function can be simplified, which improves speed.

In [None]:
# Pass results to list
association_rules = list(association_rules)

# Initiate
item1 = []
item2 = []
support = []
confidence = []
lift = []

# Extract values
for rule in association_rules:

    # Skip anything that is not a pair, so that all lists stay aligned
    if len(rule[0]) != 2:
        continue

    # Extract items in rule
    items = list(rule[0])
    item1.append(items[0])
    item2.append(items[1])

    # Extract support, confidence and lift figures (first ordered statistic)
    support.append(rule[1])
    confidence.append(rule[2][0][2])
    lift.append(rule[2][0][3])

# Construct association rules dataframe
df_ar = pd.DataFrame(list(zip(item1, item2, support, confidence, lift)),
               columns =['Item 1', 'Item 2', 'Support', 'Confidence', 'Lift'])

df_ar.head()
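
The downward closure argument used when excluding longer rules can be checked on a small example: if a set of three products is frequent, every pair inside it must be frequent as well. The sketch below uses hypothetical transactions and a hand-written counting helper; it only covers the support side of the argument and says nothing about whether the pair rules also pass the confidence threshold.

```python
from itertools import combinations
from collections import Counter

# Hypothetical toy transactions
transactions = [
    {"phone", "cover", "charger"},
    {"phone", "cover", "charger"},
    {"phone", "cover"},
    {"charger"},
]
min_count = 2  # an itemset must occur in at least two orders

def frequent_itemsets(size):
    """Return all itemsets of the given size that occur in >= min_count orders."""
    counts = Counter(
        frozenset(combo)
        for basket in transactions
        for combo in combinations(sorted(basket), size)
    )
    return {itemset for itemset, n in counts.items() if n >= min_count}

pairs = frequent_itemsets(2)
triples = frequent_itemsets(3)

# Downward closure: every pair inside a frequent triple is itself frequent
closure_holds = all(
    frozenset(pair) in pairs
    for triple in triples
    for pair in combinations(triple, 2)
)
print(len(pairs), len(triples), closure_holds)
```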

### <span style="color:blue">1.1.4:</span> Investigate results

#### <span style="color:blue">1.1.4.1:</span> Results investigation

The results can be investigated with some basic statistical measures

In [None]:
# Investigate the results
df_ar.describe()

Investigating the summary statistics of the results from the Apriori algorithm reveals some concerning patterns. They are discussed below for each column.

**Support**

The key figures can be converted to a number of orders by multiplying the support by the number of orders:

$$\text{min} \cdot \text{len(orders)} = 0.000204 \cdot 9803 \approx 2 $$
$$\text{mean} \cdot \text{len(orders)} = 0.000299 \cdot 9803 \approx 2.93 $$
$$50\% \cdot \text{len(orders)} = 0.000204 \cdot 9803 \approx 2 $$
$$75\% \cdot \text{len(orders)} = 0.000306 \cdot 9803 \approx 3 $$
$$\text{max} \cdot \text{len(orders)} = 0.003468 \cdot 9803 \approx 34 $$

From this it becomes apparent that most item pairs are only bought together a few times, resulting in very little support for each rule.

**Confidence:** Confidence, on the other hand, is rather high with a mean of 0.64. This is a natural consequence of the low support: when an item appears in only a handful of orders, a pair that occurs twice already accounts for a large share of them.

**Lift:** Because the support of each individual item is so low while confidence is high, lift is often massively inflated and takes much higher values than would be expected with more data.
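
The inflation of lift can be made concrete with a small worked example. The scenario below is assumed for illustration: two products that each occur in exactly two of the 9803 orders, and always together. Only the number of orders is taken from the support figures above.

```python
n_orders = 9803  # number of multi-item orders, as used in the support figures above

# Assumed scenario: products A and B each occur in exactly two orders,
# and always together
support_a = 2 / n_orders
support_b = 2 / n_orders
support_ab = 2 / n_orders

confidence_a_to_b = support_ab / support_a   # B is always bought with A
lift_a_to_b = confidence_a_to_b / support_b  # equals n_orders / 2

print(round(support_ab, 6), confidence_a_to_b, lift_a_to_b)
```

Two co-purchases are enough to produce a confidence of 1 and a lift of roughly 4901.5, which illustrates why lift values of this magnitude say more about how rare the products are than about the strength of the association.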


#### <span style="color:blue">1.1.4.2:</span> Product investigation

Below, some arbitrarily chosen rules from the algorithm will be investigated to get an idea of the nature of the rules.

In [None]:
df_ar.head()

The first rule outlined below suggests that a customer who buys one item from the "Perfume" category will also buy another item from that same category. This seems logical, as a customer might buy two scents from the same collection, or buy a soap and a balsam together as a pair.

In [None]:
products = pd.read_csv("olist_products_dataset.csv")
products[products['product_id'] == '005030ef108f58b46b78116f754d8d38']

In [None]:
products[products['product_id'] == '51250f90d798d377a1928e8a4e2e9ae1']

The other rule likewise shows that buying one item in "Console games" makes it likely that the customer will buy another item within that same category. This also seems logical, as a customer might buy a console together with a controller or a game.

In [None]:
products[products['product_id'] == '0129d1e9b29d3fe6833cc922374cd252']

In [None]:
products[products['product_id'] == '9f14f401fce2d9ddd450e7ce60cb7f5f']

Since the recommendations are based on actual purchases, there seems to be some logic behind the rules.

### <span style="color:blue">1.1.5:</span> Build recommender function

The association rule mining model will be part of a larger system, but first a function for this part will be constructed.

In [None]:
def recommendationARM(items):

    # Collect all rules whose first item is in the basket
    # (DataFrame.append was removed in pandas 2.0, so filter with isin instead)
    recommendations = df_ar[df_ar['Item 1'].isin(items)]

    if recommendations.empty:
        recommendations = "Failed"
    else:
        # Extract top 3 best recommendations based on lift
        recommendations = recommendations.sort_values(by=['Lift'], ascending = False)
        recommendations = recommendations[['Item 2', 'Lift']].iloc[0:3].to_numpy()

    return recommendations

The function takes a list of items as input and returns the (up to) 3 most likely items the customer will purchase based on their current basket together with the lift for the rule. Example:

In [None]:
recommendation = recommendationARM(['d199462b8361a3e047e64595bce5b6ea', '014a8a503291921f7b004a5215bb3c36'])
recommendation

The user has items 'd199462b8361a3e047e64595bce5b6ea' and '014a8a503291921f7b004a5215bb3c36' in their basket, and the system recommends three items: 'fc2d17e60bdd3349a25f3ca6a802a91e' with lift 4901.5, 'dd6a505f83dd3c6326aa9856519e0978' with lift 408.458 and '14dffa241a078aeaebaef48a49e807ca' with lift 272.305.

How good these recommendations are is unfortunately impossible to assess without having more information about the products.
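
Since the product ids are opaque, the lookup logic of the function may be easier to follow on a toy rule table with readable names. The rules, lift values and the function `recommend_from_rules` below are hypothetical and only mirror the steps of `recommendationARM` in plain Python.

```python
# Hypothetical rule table: (first item, recommended item, lift)
rules = [
    ("phone", "cover", 12.0),
    ("phone", "charger", 8.5),
    ("phone", "headset", 3.0),
    ("phone", "stand", 1.5),
    ("laptop", "mouse", 9.0),
]

def recommend_from_rules(basket, top_n=3):
    """Return up to `top_n` (recommended item, lift) pairs for items in the basket."""
    matches = [(rhs, lift) for lhs, rhs, lift in rules if lhs in basket]
    if not matches:
        return "Failed"  # same sentinel value as recommendationARM
    return sorted(matches, key=lambda m: m[1], reverse=True)[:top_n]

print(recommend_from_rules(["phone", "laptop"]))
print(recommend_from_rules(["sofa"]))
```

Like the function above, this sketch does not filter out recommended items that are already in the basket; whether that matters in practice would depend on the rules found.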

## <span style="color:blue">1.2:</span> Conclusion

In conclusion, the Apriori algorithm does not work well for the current dataset. However, this does not mean that it is unusable. The algorithm has already found some logical links between items, and these can provide a basis for the recommender engine. Additionally, as the company succeeds in increasing the number of items per order, the algorithm can be re-run and new rules extracted, continuously improving the performance of this part of the recommender engine.

# <span style="color:blue">2:</span> Content-based filtering

*(Personalized)*

With the first recommender system focusing on current items in the basket, what happens when the customer does not currently have an item in the basket? In that case, a recommender system using the user, instead of an item, as an input could be used.

Content-based filtering works by looking at what items a user has previously liked and disliked, and uses the features of those products to create a prediction of whether a user will like or dislike a new item with some probability. Essentially, a binary classification algorithm is constructed for each user, which here will be done using a simple logistic regression as proof of concept [[6]](https://en.wikipedia.org/wiki/Recommender_system#Content-based_filtering).

As was seen in the association rule mining, the algorithm performed poorly due to the lack of multiple items per order. Similarly, it can be expected that content-based filtering will perform badly due to the lack of multiple orders per customer. However, by focusing on customers with multiple orders, a proof of concept can still be constructed.

## <span style="color:blue">2.1:</span> Model building

### <span style="color:blue">2.1.1:</span> Read data

First, read the data needed.

In [None]:
# Import datasets
products = pd.read_csv("olist_products_dataset.csv")
orderItems = pd.read_csv("olist_order_items_dataset.csv")
orders = pd.read_csv("olist_orders_dataset.csv")
customers = pd.read_csv("olist_customers_dataset.csv")
reviews = pd.read_csv("olist_order_reviews_dataset.csv")

### <span style="color:blue">2.1.2:</span> Format data

The format required for the product data is a dataframe with products as indices and product features as columns.

In [None]:
# OneHotEncode category data and set index
products = pd.get_dummies(products, columns=["product_category_name"], prefix='', prefix_sep='')
products = products.set_index('product_id')
products.dropna(inplace=True)

In [None]:
products.head()

The format required for the customers dataframe is a dataframe with products as indices and customers as columns, where NaN = product not purchased by the customer, 0 = product purchased but not liked, and 1 = product purchased and liked. Since this type of model only works for customers with multiple purchases, the dataset is reduced accordingly. The dataframe construction can be seen below.

In [None]:
# Merge dataframes
customers = customers[['customer_id', 'customer_unique_id']].merge(orders[['order_id', 'customer_id']], on = 'customer_id')

# Include only customers with multiple purchases
red = customers.groupby('customer_unique_id')['order_id'].nunique()
red.sort_values(ascending=False, inplace=True)
red = red[red > 1]
red = red.index.tolist()
customers = customers[customers['customer_unique_id'].isin(red)]

# Merge dataframes
customers = customers.merge(orderItems[['order_id', 'product_id']], on = 'order_id')
customers = customers.merge(reviews[['review_score', 'order_id']], on = 'order_id')

# Keep only unique customer ID, product ID and review score and redefine index
customers = customers[['customer_unique_id', 'product_id', 'review_score']]
customers = customers.set_index('product_id')

# Uncomment this line to work on a 10% sample instead of the entire dataset
# customers = customers.sample(round(0.1*len(customers)), random_state=1)

# OneHotEncode customers in new dataframe
dummies = pd.get_dummies(customers['customer_unique_id'])

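# Set values for each customer/product combination to the review score
# (cast to float first, since recent pandas versions return boolean dummies)
dummies = dummies.astype(float).mul(customers['review_score'].to_numpy(), axis=0)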

# Remove duplicates in the index
customers = dummies.groupby(level=0).max()

# Set reindex to same as products dataframe
customers = customers.reindex(products.index)

# Replace all 0's with NaN = customer has not purchased the product
customers = customers.replace(0, np.nan)

# Convert reviews such that 1-2 = 0 (not like) and 3-5 = 1 (like)
customers[(customers == 2) | (customers == 1)] = 0
customers[customers >= 3] = 1

# Remove customers that do not have both a like and a dislike
# (nunique ignores NaN, so a value below 2 means only likes or only dislikes)
customers = customers.loc[:, customers.nunique() >= 2]

In [None]:
customers.head()

The products dataframe and the customers dataframe now have matching indices (product ids): the products dataframe holds the features of each product, and the customers dataframe holds the likes and dislikes of each customer.
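
The mapping from review scores to labels, and the filter on customers with both a like and a dislike, can be illustrated without pandas. The rows below are hypothetical; the thresholds (1-2 = dislike, 3-5 = like) follow the cell above.

```python
# Hypothetical (customer, product, review_score) rows
rows = [
    ("c1", "pA", 5), ("c1", "pB", 1),
    ("c2", "pA", 4), ("c2", "pC", 3),
    ("c3", "pB", 2),
]

def to_label(score):
    """Map a 1-5 review score to dislike (0) or like (1)."""
    return 0 if score <= 2 else 1

# customer -> {product: label}; products not purchased are simply absent,
# which corresponds to NaN in the dataframe
labels = {}
for customer, product, score in rows:
    labels.setdefault(customer, {})[product] = to_label(score)

# Keep only customers with at least one like and one dislike,
# since a binary classifier needs examples of both classes
trainable = {c: p for c, p in labels.items() if set(p.values()) == {0, 1}}
print(trainable)
```

In this toy example only `c1` survives the filter, which mirrors how strongly the requirement of both a like and a dislike reduces the number of customers that can be modelled.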

### <span style="color:blue">2.1.3:</span> Build model

First, extract user product purchases with features as a list (features) and a list of products for each user with 1 or 0 indicating whether the user has liked the purchased product or not (labels). This part is inspired by [[7]](https://cn.inside.dtu.dk/cnnet/filesharing/download/4d379354-2162-4c5c-bdd6-5b732f3a7bc2)

In [None]:
# Extract lengths for iteration
n_customers = len(customers.columns)
n_products = len(products)
n_features = len(products.columns)

In [None]:
# For each customer, extract each product that user has bought together with all information about that product
X_customers = [products.iloc[customers[i].notna().values].values for i in list(customers)]
# For each customer, extract the like/dislike labels of the products that customer has bought
Y_customers = [customers[customers[i].notna()][i].values for i in list(customers)]

Then, for each customer, train a logistic regression model on the features of the purchased products and the corresponding like/dislike labels. This creates a unique customer profile model that can be used to predict the probability that the customer will like a product they have not purchased before.

In [None]:
lr_hyperparams = {'penalty':'l2', 'C':1000, 'solver':'lbfgs'}
customer_profiles = [LogisticRegression(**lr_hyperparams) for i in range(n_customers)]
customer_profiles = [model.fit(X_customers[i], Y_customers[i]) for i, model in enumerate(customer_profiles)]

### <span style="color:blue">2.1.4:</span> Investigate results

The results will be investigated both with regards to general parameters and user specific recommendations.

#### <span style="color:blue">2.1.4.1:</span> Parameter investigation

First the parameters for each user will be investigated to get a general feeling of the users.

In [None]:
# Initialize empty dataframe
paramCheck = products.copy()
paramCheck = paramCheck.reset_index(drop=True)
paramCheck = paramCheck[0:0]
paramCheck.insert(0, 'Overall bias', "")
paramCheck.insert(0, 'Customer', "")

# Add customers to dataframe
for i in range(len(customer_profiles)):
    vals = customer_profiles[i].coef_[0][:].tolist()
    vals.insert(0, customer_profiles[i].intercept_[0])
    vals.insert(0, list(customers)[i])
    paramCheck.loc[i] = vals

paramCheck.head()

Some example observations:

It can be seen, for instance, that users 2 and 3 are on average slightly more positive than users 0, 1 and 4. However, the bias is very low and in practice means nothing.

It can also be seen that customer 0 dislikes long product names, while customer 1 dislikes long descriptions but likes longer names, etc.

The different features can be investigated more generally through a simple statistical description:

In [None]:
paramCheck.describe()

Here it can be seen that the bias is generally quite low, indicating relatively fair users, and that customers on average dislike long names and descriptions but like many pictures.

#### <span style="color:blue">2.1.4.2:</span> User investigation

Here the model predictions will be evaluated for the users.

In [None]:
# Initialize empty dataframe
userCheck = pd.DataFrame(columns=['Customer', 'Item 1', 'Item 2', 'Item 3', 'Item 4', 'Item 5', 'Item 6', 'Item 7', 'Item 8', 'Item 9', 'Item 10'])

# Get feature values and product names
X = products.values
product_names = products.index.values

# Add each customer and top 10 recommendations to dataframe
for i in range(len(customer_profiles)):
    theList = []
    prob = customer_profiles[i].predict_proba(X)
    product_prob = list(zip(product_names, prob))
    product_prob.sort(reverse=True, key=lambda x: x[1][1])
    for j in range(10):
        theList.append(product_prob[j][1][1])
    theList.insert(0,list(customers)[i])
    userCheck.loc[i] = theList

userCheck.head()

A quick investigation shows that most customers are predicted to have a 100% chance of liking their top 10 product recommendations. This hints at a poorly performing model due to the lack of data on each customer.

The dataframe can be investigated more generally using describe():

In [None]:
userCheck.describe()

Here it can be seen that, when considering the top 10 recommendations for each user, almost every recommendation is projected at 100% probability. This confirms the suspicion that the model is performing poorly due to the lack of data on the customers: customers rarely have multiple orders, which makes profiling them in this manner difficult.
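
One possible mechanism behind the saturated probabilities can be sketched in isolation. This is an illustration under simplifying assumptions, not a diagnosis of the fitted models: a single feature, no intercept, and only two training points (one liked product at x=+1, one disliked at x=-1). With so few points the classes are perfectly separable, and a weak L2 penalty (large `C`, using the same objective form as scikit-learn, 0.5*w² + C*log-loss) lets the weight grow until predictions are close to 0 or 1.

```python
import math

def fitted_weight(C):
    """Weight minimising 0.5*w**2 + C*sum(log-loss) for two training points:
    a liked product with feature x=+1 and a disliked one with x=-1."""
    def gradient(w):
        # Derivative of 0.5*w**2 + 2*C*log(1 + exp(-w)) with respect to w
        return w - 2 * C / (1 + math.exp(w))
    lo, hi = 0.0, 50.0          # gradient is negative at lo and positive at hi
    for _ in range(100):        # bisection on the monotonically increasing gradient
        mid = (lo + hi) / 2
        if gradient(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def like_probability(w, x):
    """Predicted probability of a like for a product with feature value x."""
    return 1 / (1 + math.exp(-w * x))

for C in (1, 1000):
    w = fitted_weight(C)
    print(C, round(w, 2), round(like_probability(w, 1.0), 3), round(like_probability(w, 3.0), 3))
```

Under these assumptions, `C=1000` pushes the predicted like-probability for the liked product above 0.99, whereas `C=1` keeps it around two thirds. Stronger regularisation or more purchases per customer would both be ways to obtain less extreme probabilities.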

#### <span style="color:blue">2.1.4.3:</span> Product investigation

Here, the customer '01ea7dfdac01a4e8fbe2902b73510b20' (arbitrarily chosen) will be investigated to see the nature of the recommendations.

As can be seen below, the customer has previously purchased something in the category 'Computer' which the customer did not like, and something within 'Kitchen' which the customer liked.

In [None]:
productsT = pd.read_csv("olist_products_dataset.csv")
productsT = productsT.merge(products_translation[['product_category_name_english', 'product_category_name']], on = 'product_category_name')
customers['01ea7dfdac01a4e8fbe2902b73510b20'].sort_values()

In [None]:
# Dislike
productsT[productsT['product_id'] == 'ba9943916136c9959e24f71bdae28faa']

In [None]:
# Like
productsT[productsT['product_id'] == 'cdad30e46a6fc84785a525ad6b5cf748']

Looking into the recommendations for this customer, it can be seen that a product within 'Kitchen', a product within 'Office' and a product within 'Cool stuff' have been recommended. From the limited information available, the recommendations do seem to have some logic behind them, but they are hard to evaluate properly without better information.

In [None]:
customer = '01ea7dfdac01a4e8fbe2902b73510b20'
recommendation = []
i = list(customers).index(customer)
X =  products.values
product_names = products.index.values
prob = customer_profiles[i].predict_proba(X)
product_prob = list(zip(product_names, prob))
product_prob.sort(reverse=True, key=lambda x: x[1][1])
for i in range(3):
    recommendation.append([product_prob[i][0], product_prob[i][1][1]])
recommendation

In [None]:
productsT[productsT['product_id'] == 'cdad30e46a6fc84785a525ad6b5cf748']

In [None]:
productsT[productsT['product_id'] == '61d980017b9f55ac2451dbf0bdb849f2']

In [None]:
productsT[productsT['product_id'] == '0d94e21c8156ffa3dcb77cb1d456ef80']

### <span style="color:blue">2.1.5:</span> Build recommender function

Despite the poor performance (or rather, too good performance) of the model, the recommendation function will still be constructed as a proof of concept, since the model could improve dramatically as customers purchase more products over time.

In [None]:
def recommendationCBF(customer):

    recommendation = []

    # Check if customer has a model associated
    if customer in list(customers):

        # Get model number of customer
        i = list(customers).index(customer)

        # Get values and names of products
        X =  products.values
        product_names = products.index.values

        # Predict probability of customer liking each product
        prob = customer_profiles[i].predict_proba(X)

        # Store product names and probability in sorted list
        product_prob = list(zip(product_names, prob))
        product_prob.sort(reverse=True, key=lambda x: x[1][1])

        # Recommend top 3 items
        for i in range(3):
            recommendation.append([product_prob[i][0], product_prob[i][1][1]])
    else:
        return "Failed"

    return recommendation

The function simply takes a unique_user_id as input and returns the top 3 recommendations for that user, based on what products that user has previously liked and disliked. As an example, recommendations for the user '01ea7dfdac01a4e8fbe2902b73510b20' have been extracted below.

In [None]:
recommendation = recommendationCBF('01ea7dfdac01a4e8fbe2902b73510b20')
recommendation

Here it can be seen that with very high probability, the user will like the three products.

## <span style="color:blue">2.2:</span> Conclusion

The conclusion here is much like that of the Apriori algorithm, namely that content-based filtering does not work well with the data currently available. As customers place more orders, the model might improve rapidly. For now, however, it only works for customers with multiple unique orders, so the recommendations currently reach just a fraction of the total customers.

# <span style="color:blue">3:</span> Item-Item Collaborative filtering

*(Personalized)*

Seeing how association rules and user-based recommendations both provided somewhat unsatisfactory results, a final option is available in terms of recommender systems.

In the field of 'Collaborative filtering', user-user recommendations and item-item recommendations are the two possibilities. User-user finds users that are similar to a given user in terms of the ratings they have given to items, and provides recommendations based on the idea that *'users that have agreed in the past are likely to agree in the future'*. However, as was seen, due to the sparsity of user information this seems a bad road to take.

Another option is using item-item recommendations. Here, instead of finding similar users, similar items are found. That is, items that have received similar ratings are deemed similar. Due to the large number of items in the database, this seems a good road to take. However, a word of caution is that since Olist contains so many different types of items, this could lead to bad recommendations.

## <span style="color:blue">3.1:</span> Model building

Time to build the model.

### <span style="color:blue">3.1.1:</span> Read data

In [None]:
products_IICF = pd.read_csv("olist_products_dataset.csv")
orderItems = pd.read_csv("olist_order_items_dataset.csv")
orders = pd.read_csv("olist_orders_dataset.csv")
customers_IICF = pd.read_csv("olist_customers_dataset.csv")
reviews = pd.read_csv("olist_order_reviews_dataset.csv")

### <span style="color:blue">3.1.2:</span> Format data

The first step is to construct a dataframe with users as indices and products as columns. Each user should have NaN for a product they have not purchased, and have their review for a product they have purchased. Note that only customers with multiple purchases have been included to reduce the size of the dataset.

In [None]:
# Merge dataframes
customers_IICF = customers_IICF[['customer_id', 'customer_unique_id']].merge(orders[['order_id', 'customer_id']], on = 'customer_id')
customers_IICF = customers_IICF.merge(orderItems[['order_id', 'product_id']], on = 'order_id')

# Keep only customers with multiple orders (comment out this block to include all customers)
red = customers_IICF.groupby('customer_unique_id')['order_id'].nunique()
red.sort_values(ascending=False, inplace=True)
red = red[red > 1]
red = red.index.tolist()
customers_IICF = customers_IICF[customers_IICF['customer_unique_id'].isin(red)]

# Merge dataframes
customers_IICF = customers_IICF.merge(reviews[['review_score', 'order_id']], on = 'order_id')

# Keep only unique customer ID, product ID and review score and redefine index
customers_IICF = customers_IICF[['customer_unique_id', 'product_id', 'review_score']]
customers_IICF = customers_IICF.set_index('customer_unique_id')

# customers = customers.sample(round(0.1*len(customers)), random_state=1)

# OneHotEncode products in new dataframe
dummies = pd.get_dummies(customers_IICF['product_id'])

# Set values for each customer/product combination to the review score
dummies.values[dummies != 0] = customers_IICF['review_score']

# Remove duplicates in the index
customers_IICF = dummies.groupby(level=0).max()

# Replace all 0's with NaN = Customer have not purchased product
customers_IICF = customers_IICF.replace(0, np.nan)
customers_IICF
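
The same customer-by-product layout can also be produced with `pivot_table`. The sketch below is only an illustration on made-up toy data (the identifiers and scores are assumptions, not Olist records); it shows the target shape, with `NaN` for products a customer has not purchased and the highest score kept when a customer reviewed the same product more than once.

```python
import numpy as np
import pandas as pd

# Toy long-format data: one row per (customer, product, review score).
# The identifiers and scores are made up for illustration.
toy = pd.DataFrame({
    "customer_unique_id": ["u1", "u1", "u2", "u2", "u3"],
    "product_id":         ["p1", "p2", "p1", "p1", "p3"],
    "review_score":       [5, 3, 4, 2, 1],
})

# Customers as rows, products as columns. Duplicate (customer, product)
# pairs keep the highest score; pairs that never occurred are left as NaN.
toy_matrix = toy.pivot_table(
    index="customer_unique_id",
    columns="product_id",
    values="review_score",
    aggfunc="max",
)
print(toy_matrix)
```

Because review scores range from 1 to 5, no separate step is needed to turn zeros into `NaN` with this approach.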

### <span style="color:blue">3.1.3:</span> Build model

Now that the initial dataframe is constructed, the next step is to create a correlation-matrix between the products. This has been done below.

In [None]:
# Correlation matrix
correlations = customers_IICF.corr()
correlations
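
One reason sparse data is problematic for this step is that a Pearson correlation computed from only two shared ratings is always exactly +1 or -1. The sketch below uses made-up toy ratings to illustrate this, and shows how the `min_periods` argument of `DataFrame.corr` could be used to blank out pairs with too few co-ratings. The threshold of 3 is an arbitrary assumption chosen for the example.

```python
import numpy as np
import pandas as pd

# Toy ratings: rows are customers, columns are products (values are made up)
ratings = pd.DataFrame(
    {
        "p1": [5.0, 4.0, 1.0, np.nan],
        "p2": [4.0, 5.0, 2.0, np.nan],
        "p3": [np.nan, 2.0, 3.0, 5.0],
    },
    index=["u1", "u2", "u3", "u4"],
)

# Default behaviour: every pair with enough data to compute a value is kept,
# so p1/p3 (only two shared customers) gets a "perfect" correlation of -1
corr_all = ratings.corr()

# Require at least three shared customers per product pair
corr_min3 = ratings.corr(min_periods=3)

print(corr_all.round(2))
print(corr_min3.round(2))
```

With the threshold in place, only the p1/p2 pair (three shared customers) keeps a correlation, while the pairs involving p3 become `NaN`.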

The resulting dataframe is extremely sparse; however, it can still be used as a proof of concept. The model will give recommendations based on a single item, i.e. if a customer is currently on that item's page, these could be listed as "You might also like..." or similar.

First, a list of the current customer's purchased and non-purchased items will be constructed, such that recommendations are only made for items not previously purchased. This part is inspired by [[7]](https://cn.inside.dtu.dk/cnnet/filesharing/download/4d379354-2162-4c5c-bdd6-5b732f3a7bc2).

In [None]:
K = 3
customer_id = 'ff8892f7c26aa0446da53d01b18df463'

In [None]:
nonPurchased = customers_IICF.loc[customer_id]
nonPurchased = nonPurchased[nonPurchased.isnull()]

The result is an incredibly sparse list, as the customer has most likely only purchased a few products from the platform. The model will be built on the 3 nearest neighbours using the correlations from above.

In [None]:
# Loop over each non-purchased items in the customer's list.
for product_id, val in nonPurchased.items():

    # Initialize ratings and weights
    rating = 0
    weights_sum = 0

    # Get correlations of non-purchased product and sort to get k closest neighbours
    neighbours_corr = correlations[product_id].sort_values(ascending=False)[1: K+1]

    # Get mean rating of the product
    product_mean = customers_IICF[product_id].mean()

    # Get mean ratings of the neighbours
    neighbours_ratings = customers_IICF[neighbours_corr.index].transpose()
    neighbours_means = neighbours_ratings.mean(axis=1)

    # Loop over each neighbour of the product
    for neighbour_id, row in neighbours_ratings.iterrows():

        # If NaN, check next neighbour
        if np.isnan(row[customer_id]):
            continue

        # Else, calculate subparts of score calculation
        else:
            rating += neighbours_corr[neighbour_id] * (row[customer_id] - neighbours_means[neighbour_id])
            weights_sum += abs(neighbours_corr[neighbour_id])

    # Calculate score: normalise the weighted deviations (if any), add the product mean
    # and store the score for every non-purchased product
    if weights_sum > 0:
        rating /= weights_sum
    rating += product_mean
    nonPurchased.at[product_id] = rating

# Return k best scores
nonPurchased.sort_values(ascending=False)[0:K]

The result is three recommendations: the products the user has not yet bought that score highest based on their similarity to what the user has bought previously.
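
The score for a single customer/product pair can also be written as a small function. The sketch below assumes the usual mean-centred item-item formulation (the product's mean rating plus the similarity-weighted deviations of the customer's ratings on the neighbouring products); `predict_score` and the toy data are hypothetical names and values introduced only for illustration. Unlike the loop above, it removes the product itself by label and drops `NaN` correlations before picking the neighbours.

```python
import numpy as np
import pandas as pd

def predict_score(ratings, correlations, customer_id, product_id, k=3):
    """Mean-centred, similarity-weighted score for one customer/product pair."""
    # k most correlated *other* products; NaN correlations are ignored
    neighbours = (
        correlations[product_id]
        .drop(product_id)
        .dropna()
        .sort_values(ascending=False)
        .head(k)
    )
    product_mean = ratings[product_id].mean()

    weighted_dev, weights_sum = 0.0, 0.0
    for neighbour_id, sim in neighbours.items():
        r = ratings.at[customer_id, neighbour_id]
        if np.isnan(r):
            continue  # customer has not rated this neighbour
        weighted_dev += sim * (r - ratings[neighbour_id].mean())
        weights_sum += abs(sim)

    if weights_sum > 0:
        return product_mean + weighted_dev / weights_sum
    return product_mean  # no usable neighbours: fall back to the product mean

# Toy ratings (rows: customers, columns: products); values are made up
toy_ratings = pd.DataFrame(
    {
        "p1": [5.0, 4.0, 1.0, np.nan],
        "p2": [4.0, 5.0, 2.0, 4.0],
        "p3": [np.nan, np.nan, np.nan, 2.0],
    },
    index=["u1", "u2", "u3", "u4"],
)
toy_corr = toy_ratings.corr()

score_with_neighbour = predict_score(toy_ratings, toy_corr, "u4", "p1")
score_fallback = predict_score(toy_ratings, toy_corr, "u1", "p3")
print(score_with_neighbour, score_fallback)
```

In the toy data, customer `u4` rated the neighbouring product `p2` slightly above its mean, so the score for `p1` ends up slightly above the mean of `p1`. Product `p3` has no usable neighbours, so its score is simply its mean rating, which is the situation most Olist products are expected to be in.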

### <span style="color:blue">3.1.4:</span> Investigate results and evaluate

To investigate the results, a dataset with each user in the index and the scores of their top three items as columns will be constructed. Which items are recommended is not important here, as the dataset lacks the features needed for any meaningful investigation; instead, the evaluation looks at the recommendation scores. Note that only the first 100 customers are investigated, as the runtime is very long.

In [None]:
# Initialize empty dataframe
userCheck = pd.DataFrame(columns=['Customer', 'Item 1: Score', 'Item 2: Score', 'Item 3: Score'])

K = 3
for i in range(100):

    # Get customer non purchased items
    customer_id = customers_IICF.index[i]
    nonPurchased = customers_IICF.loc[customer_id]
    nonPurchased = nonPurchased[nonPurchased.isnull()]


    # Loop over each non-purchased items in the customer's list.
    for product_id, val in nonPurchased.items():

        rating = 0
        weights_sum = 0

        # Get correlations of non-purchased product and sort to get k closest neighbours
        neighbours_corr = correlations[product_id].sort_values(ascending=False)[1: K+1]

        # Get mean rating of the product
        product_mean = customers_IICF[product_id].mean()

        # Get mean ratings of the neighbours
        neighbours_ratings = customers_IICF[neighbours_corr.index].transpose()
        neighbours_means = neighbours_ratings.mean(axis=1)

        # Loop over each neighbour of the product
        for neighbour_id, row in neighbours_ratings.iterrows():

            # If NaN, check next neighbour
            if np.isnan(row[customer_id]):
                continue

            # Else, calculate subparts of score calculation
            else:
                rating += neighbours_corr[neighbour_id] * (row[customer_id] - neighbours_means[neighbour_id])
                weights_sum += abs(neighbours_corr[neighbour_id])

        # Calculate score: normalise the weighted deviations (if any), add the product mean
        # and store the score for every non-purchased product
        if weights_sum > 0:
            rating /= weights_sum
        rating += product_mean
        nonPurchased.at[product_id] = rating

    # Return k best scores
    appender = nonPurchased.sort_values(ascending=False)[0:K].to_list()
    appender.insert(0,customer_id)
    userCheck.loc[i] = appender

userCheck.head()

In [None]:
userCheck.describe()

Investigating the results of the recommendations once again shows that, since each product is bought so rarely, traditional recommendations like these become close to useless: the score is simply 5 for each item, most likely due to the lack of samples. However, without more information about the products themselves to do a qualitative investigation, the results are hard to deem good or bad.

Looking into the items recommended, a specific look is taken at user '004288347e5e88a27ded2bb23747066c' (arbitrarily chosen).

In [None]:
K = 3
customer_id = '004288347e5e88a27ded2bb23747066c'
nonPurchased = customers_IICF.loc[customer_id]
nonPurchased = nonPurchased[nonPurchased.isnull()]
# Loop over each non-purchased items in the customer's list.
for product_id, val in nonPurchased.items():

    # Initialize ratings and weights
    rating = 0
    weights_sum = 0

    # Get correlations of non-purchased product and sort to get k closest neighbours
    neighbours_corr = correlations[product_id].sort_values(ascending=False)[1: K+1]

    # Get mean rating of the product
    product_mean = customers_IICF[product_id].mean()

    # Get mean ratings of the neighbours
    neighbours_ratings = customers_IICF[neighbours_corr.index].transpose()
    neighbours_means = neighbours_ratings.mean(axis=1)

    # Loop over each neighbour of the product
    for neighbour_id, row in neighbours_ratings.iterrows():

        # If NaN, check next neighbour
        if np.isnan(row[customer_id]):
            continue
        # Else, calculate subparts of score calculation
        else:
            rating += neighbours_corr[neighbour_id] * (row[customer_id] - neighbours_means[neighbour_id])
            weights_sum += abs(neighbours_corr[neighbour_id])

    # Calculate score: normalise the weighted deviations (if any), add the product mean
    # and store the score for every non-purchased product
    if weights_sum > 0:
        rating /= weights_sum
    rating += product_mean
    nonPurchased.at[product_id] = rating

# Return k best scores
nonPurchased.sort_values(ascending=False)[0:K]

The customer has previously purchased the products '6e1b14d3cbb5fb3a2c00351007127dfd' and 'a2bd2eae20998a24c22b110334928b02', as can be seen below. The products belong to the categories "Cool stuff" and "Luggage accessories".

In [None]:
customers_IICF[customers_IICF.index == customer_id].count().sort_values(ascending = False)

In [None]:
productsT = products_IICF.merge(products_translation[['product_category_name_english', 'product_category_name']], on = 'product_category_name')


In [None]:
productsT[productsT['product_id'] == '6e1b14d3cbb5fb3a2c00351007127dfd']

In [None]:
productsT[productsT['product_id'] == 'a2bd2eae20998a24c22b110334928b02']

The recommended items for this customer can be seen below:

In [None]:
productsT[productsT['product_id'] == 'ffd4bf4306745865e5692f69bd237893']

In [None]:
productsT[productsT['product_id'] == '6562dd6467621f6b5ab019a96d4ee8c7']

In [None]:
productsT[productsT['product_id'] == '6d4529d03dc71d5aa90fd51b4633fdaa']

Here it can be seen that the recommendations fall in three different categories: "Fashion", "Auto" and "Stationery". The categories do not seem very logically connected, further indicating that the dataset is too sparse for good recommendations using this technique.

### <span style="color:blue">3.1.5:</span> Build recommender function

The recommender function simply takes a user as input and reports back the items most similar to products this user has previously liked.

In [None]:
def recommendationIICF(customer_id):

    K = 3
    if customer_id in customers_IICF.index:
        nonPurchased = customers_IICF.loc[customer_id]
        nonPurchased = nonPurchased[nonPurchased.isnull()]

        # Loop over each non-purchased items in the customer's list.
        for product_id, val in nonPurchased.items():

            rating = 0
            weights_sum = 0

            # Get correlations of non-purchased product and sort to get k closest neighbours
            neighbours_corr = correlations[product_id].sort_values(ascending=False)[1: K+1]

            # Get mean rating of the product
            product_mean = customers_IICF[product_id].mean()

            # Get mean ratings of the neighbours
            neighbours_ratings = customers_IICF[neighbours_corr.index].transpose()
            neighbours_means = neighbours_ratings.mean(axis=1)

            # Loop over each neighbour of the product
            for neighbour_id, row in neighbours_ratings.iterrows():

                # If NaN, check next neighbour
                if np.isnan(row[customer_id]):
                    continue
                # Else, calculate subparts of score calculation
                else:
                    rating += neighbours_corr[neighbour_id] * (row[customer_id] - neighbours_means[neighbour_id])
                    weights_sum += abs(neighbours_corr[neighbour_id])

            # Normalise the weighted deviations (if any), add the product mean
            # and store the score for every non-purchased product
            if weights_sum > 0:
                rating /= weights_sum
            rating += product_mean
            nonPurchased.at[product_id] = rating

        # Return k best scores as [product_id, score] pairs
        scores = nonPurchased.sort_values(ascending=False)[0:K]
        scores = [[product_id, score] for product_id, score in scores.items()]

    else:
        return "Failed"

    return scores

Here, for example, providing the function with user '004288347e5e88a27ded2bb23747066c' returns the following products and scores:

In [None]:
recommendations = recommendationIICF('004288347e5e88a27ded2bb23747066c')
recommendations

Here it can be seen that this person gets recommended product 'ffd4bf4306745865e5692f69bd237893', product '6562dd6467621f6b5ab019a96d4ee8c7' and product '6d4529d03dc71d5aa90fd51b4633fdaa'.

## <span style="color:blue">3.2:</span> Conclusion

Similar to the other systems constructed, the item-item collaborative filtering (IICF) also seems to suffer from the nature of the dataset, which seems too sparse for proper recommendations. Additionally, due to the lack of features available, the results are incredibly hard to evaluate properly. When investigating the products, however, no logical connection between the user's purchases and the recommendations seemed to be present.

The upside for IICF however, is that it is able to provide recommendations for all customers in the database. This is a vast improvement from the previous models that were only able to provide recommendations for a limited subset of the customers/items.

# <span style="color:blue">4:</span> Hybrid Recommender Engine

Most recommender systems today use a combination of many methods to always have a good recommendation on hand. This is known as 'Hybrid recommender systems' and can be done in many different ways. Common for all of them is that they combine different individual models into one unified model.

Some techniques for hybridization include [[8]](https://en.wikipedia.org/wiki/Recommender_system#Hybrid_recommender_systems):
* **Weighted:** Combine the scores from different models using weights and choose the best
* **Switching:** Choose among systems in a lexicographic order
* **Mixed:** Present all different recommendations together
* **Feature combination:** Combine features from multiple recommenders into a master recommender and give single recommendation

More exist, and each has its own advantages. Here, *'Weighted'* and *'Feature combination'* will be hard to carry out due to the lack of information available to investigate the predictions made, which makes weighting hard and feature combination even harder.

*'Switching'* and *'Mixed'* seem better options. *'Mixed'* could be used as the recommendations are very different in nature and can be used in different parts of the Olist website. *'Switching'* could be used to provide the best possible recommendation at all times according to some rule.

Before deciding, a recap of the three systems will be made:
1. **Association rule mining:** This system provides a direct recommendation for an additional buy when the customer puts a product in his/her basket. This system only works for items with associated rules.
1. **Content-based filtering:** This system looks at previous likes/dislikes of the customer and recommends products similar to the ones liked by the customer. This system only works for customers with multiple purchases.
1. **Item-item based Collaborative filtering:** This system looks at correlation between items, and recommends an item to a user if it has a high correlation with items previously bought by the user. This system works for all users.

When investigating the different algorithms, it became clear that (1) gave the most logical recommendations but worked only for a few products, (2) gave somewhat logical recommendations but worked only for a few customers, and (3) gave somewhat illogical recommendations but worked for all customers.

As such, using a combination of *'Switching'* and *'Mixed'* seems like a good option: *'Switching'* is used to always prioritize (1) over (2)/(3), and (2) over (3). However, as the recommender systems provide recommendations of a different nature, it would also be beneficial to get all recommendations from the system so they can be used in different areas. The system will be built as a simple prototype below, and then evaluated in terms of the KPI's.

## <span style="color:blue">4.1:</span> Building the engine

In [None]:
# NB! Function uses customer_unique_id

def recommendationHYBRID(customer, items=None):

    best = None

    # Order recommendations as ARM > CBF > IICF:
    # start with the lowest priority and let higher priorities overwrite it

    ## IICF
    IICFrec = recommendationIICF(customer)
    if IICFrec != "Failed":
        best = IICFrec

    ## CBF
    CBFrec = recommendationCBF(customer)
    if CBFrec != "Failed":
        best = CBFrec

    ## Association rule mining (only possible when basket items are given)
    ARMrec = "Failed"
    if items is not None:
        ARMrec = recommendationARM(items)
        if ARMrec != "Failed":
            best = ARMrec

    return best, ARMrec, CBFrec, IICFrec

The engine is an example of how simply one can combine multiple systems into a stronger-performing ensemble. By using a simple lexicographic ordering, the system outputs the best available recommendation, where the order has been determined by the previous investigations. Additionally, the system provides access to every recommendation made by the individual systems.
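
The switching rule itself does not depend on the three specific recommenders. As a minimal sketch, assuming each recommender signals failure with the string "Failed" as the functions in this report do, the priority logic could be written generically as follows. The helper name `switch` and the example outputs are hypothetical and only stand in for real recommendations.

```python
FAILED = "Failed"  # failure sentinel used by the recommender functions in this report

def switch(recommendations):
    """Return (source name, recommendation) for the first non-failed entry.

    `recommendations` is an ordered list of (name, result) pairs,
    highest priority first.
    """
    for name, result in recommendations:
        if result is not None and result != FAILED:
            return name, result
    return None, None

# Hypothetical outputs standing in for the ARM, CBF and IICF recommenders
example = [
    ("ARM", FAILED),
    ("CBF", [["product_a", 0.97]]),
    ("IICF", [["product_b", 5.0]]),
]
print(switch(example))
```

Here ARM fails, so the CBF recommendation is chosen even though IICF also produced a result. Returning the source name alongside the recommendation makes it possible to track which engine produced each suggestion.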

Below is an example of naively going over 100 different products and customers.

In [None]:
productsIT = pd.read_csv("olist_products_dataset.csv")
customersIT = pd.read_csv("olist_customers_dataset.csv")

In [None]:
for i in range(100):
    a = recommendationHYBRID(customersIT['customer_unique_id'].iloc[i], productsIT['product_id'].iloc[i])
    print(a)
    if a[0] != None:
        b = a

The results can then easily be accessed by index, i.e. b[0] is the best recommendation and b[1] to b[3] are the ARM, CBF and IICF recommendations respectively. Note that scores have been included such that recommendations can always be evaluated. Although all scores are currently near perfect, cut-off thresholds could be introduced as the system evolves.

In [None]:
b[0]

In the printout above it can be seen that IICF does not find a result for every customer. This is because the model has only been trained on customers with more than one purchase to limit runtimes, as it takes far too long to run on a normal laptop with all customers.

## <span style="color:blue">4.2:</span> Evaluating the results compared to KPI's

The hybrid recommender engine is able to put out three different recommendations. For the KPI estimation it will be assumed, for simplicity, that the recommendations are mutually exclusive. Note that the IICF model was built only for a subset of the total data, but works for the entire set of customers as shown below.

In [None]:
print("Number of recommendations from ARM:  ", len(df_ar))
print("Number of recommendations from CBF:  ", len(customers.columns))
print("Number of recommendations from IICF: ", len(customersIT))

It is assumed that given an ARM recommendation a customer is 10% likely to buy the additional product, given a CBF recommendation the customer is 5% likely to buy the product, and given an IICF recommendation the customer is 1% likely to buy the product. These numbers should be adjusted according to domain knowledge provided by Olist.

The average price of the products is 120, as calculated in the descriptive analysis. This leads to the following increases in Quantity and Revenue:

In [None]:
print("Quantity increase from ARM: ", len(df_ar)*0.1)
print("Quantity increase from CBF: ", len(customers.columns)*0.05)
print("Quantity increase from IICF: ", len(customersIT)*0.01)

In [None]:
print("Revenue increase from ARM: ", round(len(df_ar)*0.1*120), "R$")
print("Revenue increase from CBF: ", round(len(customers.columns)*0.05*120), "R$")
print("Revenue increase from IICF: ", round(len(customersIT)*0.01*120), "R$")

Such that the total increase is:

In [None]:
print("Total revenue increase: ", 2280+2172+119329, "R$")

(equivalent to 7,376,431 Pakistani rupees, or about 73 lakh)

which when compared to an approximation of existing revenue:

In [None]:
print("Current revenue approximation: ", round(len(customersIT)*120), "R$")

Is an increase in revenue of approximately:

In [None]:
print("Revenue increase: ", round((((11932920+123781)/11932920)-1)*100,2),"%")

Which can be visualized with contributions from the different engines as can be seen below:

In [None]:
fig = go.Figure(go.Waterfall(
    measure = ["absolute", "relative", "relative", "relative", "total"],
    x = ["Revenue old", "ARM", "CBF", "IICF", "Revenue new"],
    textposition = "outside",
    text = ["+11932920", "+2280", "+2172", "+119329", "Total"],
    y = [11932920, 2280, 2172, 119329, 0],
    connector = {"line":{"color":"rgb(63, 63, 63)"}},
))

fig.update_layout(
        title = "Increase in revenue from implementing the Hybrid Recommender system",
        showlegend = False,
        yaxis=dict(range=[10500000,12500000])
)

fig.show()

Here it becomes apparent that even though ARM and CBF are the best recommender systems, IICF is still able to contribute a significant revenue boost due to the sheer number of customers.

Furthermore, as orders increase and customers buy multiple products and purchase multiple times on Olist, these systems can be re-trained and improved further.

However, it is important to evaluate all measures using Olist's domain knowledge to verify the assumptions made herein.
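
The arithmetic in this section can be reproduced from the three recommendation counts and the assumed conversion rates, without hard-coding the intermediate totals. The sketch below uses the counts reported in this report (190 ARM rules, 362 CBF customers, 99441 customers for IICF); the variable names are introduced here for illustration only.

```python
AVG_PRICE = 120  # R$, average product price from the descriptive analysis

# (number of recommendations, assumed conversion rate) per engine
engines = {
    "ARM": (190, 0.10),
    "CBF": (362, 0.05),
    "IICF": (99441, 0.01),
}

# Expected extra revenue per engine: count * conversion rate * average price
uplift = {name: round(n * rate * AVG_PRICE) for name, (n, rate) in engines.items()}
total_uplift = sum(uplift.values())

# Baseline approximation: one average-priced purchase per customer
baseline = engines["IICF"][0] * AVG_PRICE
pct_increase = round(total_uplift / baseline * 100, 2)

print(uplift, total_uplift, baseline, pct_increase)
```

Keeping the conversion rates in one place makes it straightforward to rerun the estimate once Olist provides more realistic figures.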

# <span style="color:orange">D:</span> Discussion

There are many points of discussion in this technical report, the most important of which are listed below.

1. **Most customers only use Olist once**

As was discovered in the descriptive analysis, most customers only use Olist once. This impacts systems that rely on user information to make recommendations, such as CBF and IICF. This can either be due to a too limited date range, or simply because Olist has a very special business model. Either way, this is something that must be looked into in order to increase the performance of the systems.

2. **Most customers buy only one product**

Similarly to customers only using Olist once, the majority of customers also purchase only a single product. This means that not only is customer information very sparse, product information is also very sparse due to the excessive number of products on the platform. This further reduces the performance of the recommender systems, especially ARM, but the CBF and IICF systems are also impacted. Using recommender systems can be part of the solution to making people buy more, and the systems might improve themselves once implemented.

3. **Features are too superficial**

A big issue, especially for the CBF system, is that the product features are very superficial and do not tell much about the product itself. If the product description were available, an SVD and NLP model could be used to extract more detailed information about the products, which would significantly enhance predictions using CBF.

4. **Reviews are per order not per product**

One big issue when using CBF and IICF is that Olist has reviews not per product but per order. This means that the reviews may not be directly transferable to the product, as has been assumed in this report. This issue once again shows that the Olist data pose some serious limits on the performance of the systems, which is why they can at most be considered MVPs (minimum viable products) or POCs (proofs of concept).

5. **Systems have not been evaluated using test/training splits**

Although traditionally best practice, test/training splits have not been used to evaluate the systems in this report. This is due to the fact that initial evaluations showed how *randomly* the systems behaved due to lack of data and sparsity. As such, it was concluded that test/training splits and more traditional evaluation techniques would only lead to wrong conclusions.

6. **Systems are quite 'advanced'**

The systems proposed here are quite advanced. However, due to the limitations posed by the Olist dataset, Olist might be better off using some very basic and naive recommendations initially, before implementing more advanced systems like these. One such example could be to simply look at the product the user is currently viewing and find similar products using a very basic similarity construct. For example, if the user is viewing a Huawei phone, another Android phone and an Apple iPhone might be listed as recommendations to give the customer more options. This type of recommendation system was, however, not implemented, as it is hypothesized not to directly increase products per order since it simply suggests alternatives.

7. **Lack of domain knowledge**

The final point of discussion is the lack of domain knowledge available. Olist operates in Brazil, which might have a very different culture than what is known here in Karachi. Additionally, not much knowledge was available about how Olist operates on a general level. As such, any recommendations and analysis herein should preferably be reviewed together with domain experts from Olist.
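
The naive "similar products" idea from point 6 could be sketched as follows. This is only an illustration under assumptions: the catalogue, its categories and prices are made up, and "same category, closest price" is just one possible basic similarity construct, not something taken from the Olist data or evaluated in this report.

```python
# Hypothetical catalogue: (product_id, category, price). Not taken from the Olist data.
catalogue = [
    ("phone_a", "telephony", 900.0),
    ("phone_b", "telephony", 950.0),
    ("phone_c", "telephony", 300.0),
    ("couch_a", "furniture", 920.0),
]

def similar_products(product_id, catalogue, k=2):
    """Return up to k other products from the same category, closest in price first."""
    by_id = {pid: (cat, price) for pid, cat, price in catalogue}
    category, price = by_id[product_id]
    candidates = [
        (abs(p - price), pid)
        for pid, cat, p in catalogue
        if cat == category and pid != product_id
    ]
    return [pid for _, pid in sorted(candidates)[:k]]

print(similar_products("phone_a", catalogue))
```

A rule like this needs no purchase history at all, which is why it could serve as a starting point while customer and product data remain sparse.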

# <span style="color:orange">E:</span> Conclusion

In this report, three different recommender systems were constructed and finally combined into a hybrid recommender system using the *Switching* and *Mixed* techniques.

The three systems constructed are:
1. Association rule mining
1. Content-based filtering
1. Item-item collaborative filtering

The association rule mining (ARM) found 190 products for which it could construct a recommendation rule. The content-based filtering (CBF) found 362 users for which it could provide customized recommendations. Finally, item-item based collaborative filtering (IICF) found 99441 users for which it could provide recommendations.

This resulted in an estimated 1.04% revenue increase, assuming an average price of 120 R$ and customer willingness to follow recommendations of ARM: 10\%, CBF: 5\% and IICF: 1\%.

The systems perform relatively poorly with the current data, but now that these proof-of-concept models have been constructed, they can easily be improved by adding additional data. Additionally, as customers keep purchasing more, the models will grow and become better as a result.

Therefore, it is recommended that Olist start evaluating the recommendations more in-depth, and start deploying them with extensive supervision to further develop the systems in a real-world environment.

# <span style="color:orange">F:</span> References

1. https://en.wikipedia.org/wiki/Upselling
1. https://link.springer.com/article/10.1007/s12599-016-0453-1
1. https://dl.acm.org/doi/pdf/10.1145/336992.337035
1. http://www.vldb.org/conf/1994/P487.PDF
1. https://pypi.org/project/apyori/
1. https://en.wikipedia.org/wiki/Recommender_system#Content-based_filtering
1. https://cn.inside.dtu.dk/cnnet/filesharing/download/4d379354-2162-4c5c-bdd6-5b732f3a7bc2
1. https://en.wikipedia.org/wiki/Recommender_system#Hybrid_recommender_systems
