In [1]:
!pip install apyori



# Eclat for Market Basket Analysis

Using the Eclat algorithm to identify combinations of items that are frequently bought together in a grocery store.

For the purposes of this notebook, Eclat is treated as a simplified version of Apriori: it ranks item sets by support alone, instead of using all three measures (support, confidence and lift). It can therefore be run through the same `apriori` call, keeping only the support column when presenting the results.
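To make the three measures concrete, here is a minimal sketch that computes them by hand on a made-up four-basket example. The helper names (`support`, `confidence`, `lift`) and the toy data are illustrative assumptions, not part of the notebook or of `apyori`.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, base, add):
    """Of the transactions containing `base`, the share that also contain `add`."""
    both = set(base) | set(add)
    return support(transactions, both) / support(transactions, base)


def lift(transactions, base, add):
    """Confidence of base -> add relative to the overall support of `add`."""
    return confidence(transactions, base, add) / support(transactions, add)


# Toy baskets, invented for illustration only
toy = [
    ["pasta", "shrimp"],
    ["pasta", "escalope"],
    ["pasta", "shrimp", "olive oil"],
    ["olive oil"],
]

pair_support = support(toy, ["pasta", "shrimp"])          # 2 of 4 baskets
pair_confidence = confidence(toy, ["pasta"], ["shrimp"])  # 2 of the 3 pasta baskets
pair_lift = lift(toy, ["pasta"], ["shrimp"])
print(pair_support, pair_confidence, pair_lift)
```

Support is the only one of the three that does not depend on a direction (base versus added item), which is why it can be reported for an item pair without choosing a rule.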

In [2]:
import pandas as pd
import numpy as np
from apyori import apriori

## Preprocessing the data 

In [3]:
dataset=pd.read_csv('Market_Basket_Optimisation.csv',header=None)
dataset.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
1,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
2,chutney,,,,,,,,,,,,,,,,,,,
3,turkey,avocado,,,,,,,,,,,,,,,,,,
4,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,


In [4]:
dataset.shape

(7501, 20)

Each row of the dataset lists the items bought together in one transaction, i.e. one customer's basket. Rows with fewer than 20 items are padded with missing values.

Converting the DataFrame into a list of lists, the input format expected by the `apriori` function from `apyori`.

In [5]:
transactions=[]
n_rows,n_cols=dataset.shape
for i in range(n_rows):
    # str() turns missing cells (NaN) into the literal string 'nan'
    transactions.append([str(dataset.values[i,j]) for j in range(n_cols)])
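The conversion above stringifies every cell, so padded cells end up in each transaction as the item `'nan'`. If that placeholder is not wanted, one option is to drop missing cells while building the lists. The sketch below is an alternative, not what this notebook runs; `rows_to_transactions` is a hypothetical helper, and the sample rows are a shortened stand-in for the real data.

```python
import math


def rows_to_transactions(rows):
    """Turn padded rows into item lists, dropping missing cells."""
    transactions = []
    for row in rows:
        items = []
        for cell in row:
            # pandas represents empty CSV cells as float NaN
            if cell is None:
                continue
            if isinstance(cell, float) and math.isnan(cell):
                continue
            text = str(cell).strip()
            if text:  # also skip blank strings
                items.append(text)
        transactions.append(items)
    return transactions


# Shortened stand-in for dataset.values.tolist()
sample_rows = [
    ["burgers", "meatballs", "eggs", float("nan")],
    ["chutney", float("nan"), float("nan"), float("nan")],
]
cleaned = rows_to_transactions(sample_rows)
print(cleaned)
```

With the real data this would be called as `rows_to_transactions(dataset.values.tolist())`. The outputs shown in this notebook were produced with the `'nan'` strings left in.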

In [6]:
transactions[:2]

[['shrimp',
  'almonds',
  'avocado',
  'vegetables mix',
  'green grapes',
  'whole weat flour',
  'yams',
  'cottage cheese',
  'energy drink',
  'tomato juice',
  'low fat yogurt',
  'green tea',
  'honey',
  'salad',
  'mineral water',
  'salmon',
  'antioxydant juice',
  'frozen smoothie',
  'spinach',
  'olive oil'],
 ['burgers',
  'meatballs',
  'eggs',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan',
  'nan']]

## Training the Eclat model

In [7]:
from apyori import apriori
rules=apriori(transactions,min_support=0.003,min_confidence=0.2,min_lift=3,min_length=2,max_length=2)

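The records found are listed below. Note that the `min_confidence` and `min_lift` thresholds passed above still filter the candidates, even though only support is kept in the later steps.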

In [8]:
results=list(rules)
results

[RelationRecord(items=frozenset({'light cream', 'chicken'}), support=0.004532728969470737, ordered_statistics=[OrderedStatistic(items_base=frozenset({'light cream'}), items_add=frozenset({'chicken'}), confidence=0.29059829059829057, lift=4.84395061728395)]),
 RelationRecord(items=frozenset({'mushroom cream sauce', 'escalope'}), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset({'mushroom cream sauce'}), items_add=frozenset({'escalope'}), confidence=0.3006993006993007, lift=3.790832696715049)]),
 RelationRecord(items=frozenset({'pasta', 'escalope'}), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset({'pasta'}), items_add=frozenset({'escalope'}), confidence=0.3728813559322034, lift=4.700811850163794)]),
 RelationRecord(items=frozenset({'honey', 'fromage blanc'}), support=0.003332888948140248, ordered_statistics=[OrderedStatistic(items_base=frozenset({'fromage blanc'}), items_add=frozenset({'honey'}), confidence=0

## Visualization of Results

In [9]:
lhs,rhs,support=[],[],[]
for result in results:
    lhs.append(tuple(result[2][0][0])[0])
    rhs.append(tuple(result[2][0][1])[0])
    support.append(result[1])
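The positional indexing above (`result[2][0][0]`, `result[1]`) can also be written with the field names that appear in the printed records (`items`, `support`, `ordered_statistics`, `items_base`, `items_add`). The sketch below uses stand-in namedtuples with those field names so that it runs without `apyori` installed; `to_rows` is a hypothetical helper, and the sample record is copied from the first entry of the output above.

```python
from collections import namedtuple

# Stand-ins that mirror the field names shown in the printed output
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"]
)
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"]
)


def to_rows(records):
    """Build (item1, item2, support) tuples from the first statistic of each record."""
    rows = []
    for record in records:
        first = record.ordered_statistics[0]
        item1 = next(iter(first.items_base))  # single item on the left-hand side
        item2 = next(iter(first.items_add))   # single item on the right-hand side
        rows.append((item1, item2, record.support))
    return rows


sample = [
    RelationRecord(
        items=frozenset({"light cream", "chicken"}),
        support=0.004532728969470737,
        ordered_statistics=[
            OrderedStatistic(
                items_base=frozenset({"light cream"}),
                items_add=frozenset({"chicken"}),
                confidence=0.29059829059829057,
                lift=4.84395061728395,
            )
        ],
    )
]
rows = to_rows(sample)
print(rows)
```

Like the original loop, this assumes each side of the rule holds exactly one item, which is the case here because the item sets are limited to pairs.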

In [10]:
columns=list(zip(lhs,rhs,support))

Converting the rows into a DataFrame

In [11]:
result_df=pd.DataFrame(columns)
result_df.columns=['Item1','Item2','Support']
result_df

| | Item1 | Item2 | Support |
|---|---|---|---|
| 0 | light cream | chicken | 0.004533 |
| 1 | mushroom cream sauce | escalope | 0.005733 |
| 2 | pasta | escalope | 0.005866 |
| 3 | fromage blanc | honey | 0.003333 |
| 4 | herb & pepper | ground beef | 0.015998 |
| 5 | tomato sauce | ground beef | 0.005333 |
| 6 | light cream | olive oil | 0.0032 |
| 7 | whole wheat pasta | olive oil | 0.007999 |
| 8 | pasta | shrimp | 0.005066 |


Sorting the item pairs by support, highest first

In [12]:
result_df=result_df.sort_values('Support',ascending=False)
result_df

| | Item1 | Item2 | Support |
|---|---|---|---|
| 4 | herb & pepper | ground beef | 0.015998 |
| 7 | whole wheat pasta | olive oil | 0.007999 |
| 2 | pasta | escalope | 0.005866 |
| 1 | mushroom cream sauce | escalope | 0.005733 |
| 5 | tomato sauce | ground beef | 0.005333 |
| 8 | pasta | shrimp | 0.005066 |
| 0 | light cream | chicken | 0.004533 |
| 3 | fromage blanc | honey | 0.003333 |
| 6 | light cream | olive oil | 0.0032 |


The item pairs are the same ones an Apriori run with these thresholds would return; only their order differs, because they are ranked here by support.
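For reference, Eclat is usually described in terms of a vertical data layout: each item is mapped to the set of transaction ids that contain it, and the support of an item set is obtained by intersecting those sets. The following is a minimal sketch of that idea, not the procedure used in this notebook; the function name `eclat`, its parameters and the toy baskets are all illustrative assumptions.

```python
from collections import defaultdict


def eclat(transactions, min_support, max_length=2):
    """Return {frozenset(items): support} for item sets meeting min_support."""
    n = len(transactions)
    min_count = min_support * n

    # Vertical layout: item -> set of ids of the transactions containing it
    tidlists = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidlists[item].add(tid)

    frequent = {}

    def extend(prefix, prefix_tids, candidates):
        # Depth-first search: grow the prefix by one candidate item at a time
        for i, (item, tids) in enumerate(candidates):
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) < min_count:
                continue  # too rare, and so is every superset
            itemset = prefix + (item,)
            frequent[frozenset(itemset)] = len(new_tids) / n
            if len(itemset) < max_length:
                extend(itemset, new_tids, candidates[i + 1:])

    singles = sorted(
        ((item, tids) for item, tids in tidlists.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), None, singles)
    return frequent


# Toy baskets, invented for illustration only
toy = [
    ["pasta", "shrimp"],
    ["pasta", "escalope"],
    ["pasta", "shrimp", "olive oil"],
    ["olive oil"],
]
result = eclat(toy, min_support=0.5)
for itemset, sup in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

Only support is computed here; no confidence or lift threshold is applied, so a run on the grocery data would not necessarily return the same pairs as the filtered `apriori` call above.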

### Since Apriori considers support, confidence and lift, whereas Eclat considers only support, Apriori is the better-suited method for association rule learning.