# Association Rule - Apriori and ECLAT 

This notebook trains association rule models (Apriori and ECLAT) to find the items most often bought together by customers of a French supermarket over one week. Each of the 7501 rows of the dataset lists the items bought by a single customer during that week.

These algorithms identify products that customers tend to buy together. The results can be used to generate product recommendations and to inform product placement strategy.

In [None]:
!pip install apyori




In [None]:
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [None]:
# Data Loading
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header = None)

In [None]:
dataset

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
1,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
2,chutney,,,,,,,,,,,,,,,,,,,
3,turkey,avocado,,,,,,,,,,,,,,,,,,
4,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7496,butter,light mayo,fresh bread,,,,,,,,,,,,,,,,,
7497,burgers,frozen vegetables,eggs,french fries,magazines,green tea,,,,,,,,,,,,,,
7498,chicken,,,,,,,,,,,,,,,,,,,
7499,escalope,green tea,,,,,,,,,,,,,,,,,,


In [None]:
# Collecting each customer's basket into a list of lists
# (empty cells are read as NaN, so str() turns them into the string 'nan')
transactions = []
for i in range(0, len(dataset)):
    transactions.append([str(dataset.values[i, j]) for j in range(0, 20)])

In [None]:
transactions[:2]

[['shrimp',
  'almonds',
  'avocado',
  'vegetables mix',
  'green grapes',
  'whole weat flour',
  'yams',
  'cottage cheese',
  'energy drink',
  'tomato juice',
  'low fat yogurt',
  'green tea',
  'honey',
  'salad',
  'mineral water',
  'salmon',
  'antioxydant juice',
  'frozen smoothie',
  'spinach',
  'olive oil'],
 ['burgers',
  'meatballs',
  'eggs',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan']]
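
The 'nan' strings in the second basket come from empty CSV cells: pandas reads them as NaN and `str()` turns them into text, so the rule miner would treat 'nan' as if it were a real product. A minimal sketch of one way to drop them is shown below, assuming missing cells arrive as float NaN while real items are strings. `drop_missing` and the sample `rows` are hypothetical names used only for illustration.

```python
import math

def drop_missing(row):
    """Keep only real item names, discarding NaN padding from empty cells."""
    return [str(item) for item in row
            if not (isinstance(item, float) and math.isnan(item))]

# Hypothetical rows shaped like the notebook's data: strings padded with NaN.
nan = float("nan")
rows = [
    ["burgers", "meatballs", "eggs", nan, nan],
    ["chutney", nan, nan, nan, nan],
]

cleaned = [drop_missing(row) for row in rows]
print(cleaned)  # [['burgers', 'meatballs', 'eggs'], ['chutney']]
```

With the notebook's DataFrame, the same helper could presumably be applied to each row of `dataset.values.tolist()`. The rules reported later in this notebook were mined without this cleanup, which is why 'nan' shows up in some of them.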

### Apriori implementation using the apyori library
Source: https://github.com/ymoch/apyori

The goal of this part is to find which products are bought together more often than other combinations, using the Apriori algorithm.

We will apply a few transformations to load the results into a DataFrame and make them easier to inspect.

In [None]:
# Inspecting elements
transactions[:3]

[['shrimp',
  'almonds',
  'avocado',
  'vegetables mix',
  'green grapes',
  'whole weat flour',
  'yams',
  'cottage cheese',
  'energy drink',
  'tomato juice',
  'low fat yogurt',
  'green tea',
  'honey',
  'salad',
  'mineral water',
  'salmon',
  'antioxydant juice',
  'frozen smoothie',
  'spinach',
  'olive oil'],
 ['burgers',
  'meatballs',
  'eggs',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan'],
 ['chutney',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan']]

We do not want to set the minimum confidence too high, because some items appear together not because they are associated but because each of them is purchased very often. If people buy mineral water and toilet paper together, it is simply because everyone buys them, not because buying mineral water makes anyone need toilet paper. Lift corrects for this by comparing how often items occur together with what their individual popularity would predict.
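
To make the difference between confidence and lift concrete, here is a small worked sketch on made-up baskets (the ten transactions and the `metrics` helper are illustrative assumptions, not part of the dataset). It uses the usual definitions: support is the share of baskets containing the itemset, confidence is the share of baskets with the base items that also contain the added items, and lift is confidence divided by the support of the added items.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def metrics(base, add, transactions):
    """Return (support, confidence, lift) for the rule base -> add."""
    n = len(transactions)
    n_both = count(base | add, transactions)
    n_base = count(base, transactions)
    n_add = count(add, transactions)
    support = n_both / n
    confidence = n_both / n_base
    # Equivalent to confidence / (n_add / n), written to keep the arithmetic exact.
    lift = (n_both * n) / (n_base * n_add)
    return support, confidence, lift

# Made-up baskets: two very popular items and one rare but tightly linked pair.
toy = (
    [{"mineral water", "toilet paper"}] * 4
    + [{"mineral water", "toilet paper", "pasta", "escalope"}] * 2
    + [{"mineral water"}] * 2
    + [{"toilet paper"}] * 2
)

print(metrics({"mineral water"}, {"toilet paper"}, toy))  # (0.6, 0.75, 0.9375)
print(metrics({"pasta"}, {"escalope"}, toy))              # (0.2, 1.0, 5.0)
```

The popular pair has high support and a confidence of 0.75, yet its lift is below 1, meaning the items co-occur slightly less often than chance would suggest. The rare pair has low support but a lift of 5, which is the kind of rule a `min_lift` threshold is meant to keep.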

In [None]:
# Training Apriori on the dataset
# The hyperparameters chosen for this run are:
# min_support: items bought at least 3 times a day * 7 days (one week) / 7500 customers = 0.0028, rounded to 0.003
# min_confidence: at least 20%; min_lift: at least 3 (anything lower is too weak)
# min_length: we want at least 2 items per association; a single item in the result is pointless
# (note: apyori appears to read only max_length, so min_length may be silently ignored)

from apyori import apriori
rules = apriori(transactions, min_support = 0.003, min_confidence = 0.2, min_lift = 3, min_length = 2)
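
The `min_support` arithmetic from the comments can be reproduced as a quick check. This is only a sketch of the stated rule of thumb; the variable names are illustrative, and 7501 is the row count given in the introduction.

```python
purchases_per_day = 3      # threshold chosen in the comments above
days = 7                   # one week of data
n_transactions = 7501      # rows in the dataset

min_support = purchases_per_day * days / n_transactions
print(round(min_support, 4))  # 0.0028
print(round(min_support, 3))  # 0.003, the value passed to apriori
```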

In [None]:
# Materialising the rules generator into a list so it can be inspected
results = list(rules)

In [None]:
results

[RelationRecord(items=frozenset({'light cream', 'chicken'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)]),
 RelationRecord(items=frozenset({'escalope', 'mushroom cream sauce'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)]),
 RelationRecord(items=frozenset({'escalope', 'pasta'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)]),
 RelationRecord(items=frozenset({'honey', 'fromage blanc'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0

In [None]:
# Collecting the lift and the item list of each rule
# (only the first ordered statistic of each record is used)
lift = []
association = []
for record in results:
    lift.append(record.ordered_statistics[0].lift)
    association.append(list(record.items))
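
A record can carry more than one ordered statistic (different ways of splitting the itemset into a base and an added part), and the loop above keeps only the first. A hedged sketch of flattening every statistic into its own row follows. To stay self-contained it uses namedtuple stand-ins that mirror the field names visible in the printed output; `flatten` is a hypothetical helper, and with apyori installed the real `results` list could be passed instead.

```python
from collections import namedtuple

# Stand-ins mirroring the field names shown in the printed results.
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])

def flatten(records):
    """Return one dict per (record, ordered statistic) pair."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append({
                "base": sorted(stat.items_base),
                "add": sorted(stat.items_add),
                "support": record.support,
                "confidence": stat.confidence,
                "lift": stat.lift,
            })
    return rows

# First record from the printed results, retyped by hand.
sample = [
    RelationRecord(
        items=frozenset({"light cream", "chicken"}),
        support=0.004532728969470737,
        ordered_statistics=[
            OrderedStatistic(
                items_base=frozenset({"light cream"}),
                items_add=frozenset({"chicken"}),
                confidence=0.29059829059829057,
                lift=4.84395061728395,
            )
        ],
    )
]

rows = flatten(sample)
print(rows[0]["base"], "->", rows[0]["add"], round(rows[0]["lift"], 2))
# ['light cream'] -> ['chicken'] 4.84
```

The resulting list of dicts can be passed to `pd.DataFrame(rows)` to get one column per metric, which also keeps the direction of each rule (base versus added items) visible.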

### Visualizing results in a dataframe

In [None]:
rank = pd.DataFrame([association, lift]).transpose()
rank.columns = ['Association', 'Lift']

In [None]:
# Show the 10 highest lift scores
rank.sort_values('Lift', ascending=False).head(10)

Unnamed: 0,Association,Lift
150,"[soup, frozen vegetables, milk, mineral water,...",7.98718
97,"[soup, frozen vegetables, mineral water, milk]",7.98718
149,"[frozen vegetables, milk, olive oil, mineral w...",6.12827
96,"[olive oil, frozen vegetables, mineral water, ...",6.12827
132,"[olive oil, nan, mineral water, whole wheat pa...",6.12827
59,"[olive oil, whole wheat pasta, mineral water]",6.11586
50,"[spaghetti, ground beef, tomato sauce]",5.53597
122,"[spaghetti, nan, ground beef, tomato sauce]",5.53597
28,"[honey, nan, fromage blanc]",5.17882
3,"[honey, fromage blanc]",5.16427


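In this table, the association with the highest lift is "soup, frozen vegetables, mineral water, milk" (lift ≈ 7.99); among three-item combinations, "olive oil, whole wheat pasta, mineral water" is the strongest (lift ≈ 6.12). Lift measures how much more often items occur together than their individual frequencies would predict, so these are the most strongly associated combinations for the week, not necessarily the most frequently bought ones. Entries containing "nan" are artifacts of the empty cells that were converted to the string 'nan' when the transactions were built.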