**Apriori Algorithm** in Python: Market Basket Analysis

**Problem Statement**: The manager of a retail store wants to find association rules among six items, to learn which items are most often bought together so that those items can be placed together to increase sales.

In [None]:
!pip install apyori

Collecting apyori
  Downloading apyori-1.1.2.tar.gz (8.6 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: apyori
  Building wheel for apyori (setup.py) ... done
  Created wheel for apyori: filename=apyori-1.1.2-py3-none-any.whl size=5954 sha256=8474d8839e9b41e1ca5042010eda54e8626be58deee6629f197a594c192f3187
  Stored in directory: /root/.cache/pip/wheels/77/3d/a6/d317a6fb32be58a602b1e8c6b5d6f31f79322da554cad2a5ea
Successfully built apyori
Installing collected packages: apyori
Successfully installed apyori-1.1.2


In [None]:
import numpy as np
import pandas as pd
from apyori import apriori

In [None]:
store_data = pd.read_csv('/Day1.csv',header=None)

In [None]:
print(store_data)

       0      1      2       3     4      5
0   Wine  Chips  Bread  Butter  Milk  Apple
1   Wine    NaN  Bread  Butter  Milk    NaN
2    NaN    NaN  Bread  Butter  Milk    NaN
3    NaN  Chips    NaN     NaN   NaN  Apple
4   Wine  Chips  Bread  Butter  Milk  Apple
5   Wine  Chips    NaN     NaN  Milk    NaN
6   Wine  Chips  Bread  Butter   NaN  Apple
7   Wine  Chips    NaN     NaN  Milk    NaN
8   Wine    NaN  Bread     NaN   NaN  Apple
9   Wine    NaN  Bread  Butter  Milk    NaN
10   NaN  Chips  Bread  Butter   NaN  Apple
11  Wine    NaN    NaN  Butter  Milk  Apple
12  Wine  Chips  Bread  Butter  Milk    NaN
13  Wine    NaN  Bread     NaN  Milk  Apple
14  Wine    NaN  Bread  Butter  Milk  Apple
15  Wine  Chips  Bread  Butter  Milk  Apple
16   NaN  Chips  Bread  Butter  Milk  Apple
17   NaN  Chips    NaN  Butter  Milk  Apple
18  Wine  Chips  Bread  Butter  Milk  Apple
19  Wine    NaN  Bread  Butter  Milk  Apple
20  Wine  Chips  Bread     NaN  Milk  Apple
21   NaN  Chips    NaN     NaN  

In [None]:
#find the shape of the dataset
store_data.shape

(22, 6)

**Convert the pandas DataFrame into a list of lists**

In [None]:
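# Build one list of item names per transaction (row).
# Note: str() turns missing values (NaN) into the literal string 'nan'.
records = []
n_rows, n_cols = store_data.shape
for i in range(n_rows):
  records.append([str(store_data.values[i, j]) for j in range(n_cols)])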

In [None]:
records

[['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple'],
 ['Wine', 'nan', 'Bread', 'Butter', 'Milk', 'nan'],
 ['nan', 'nan', 'Bread', 'Butter', 'Milk', 'nan'],
 ['nan', 'Chips', 'nan', 'nan', 'nan', 'Apple'],
 ['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple'],
 ['Wine', 'Chips', 'nan', 'nan', 'Milk', 'nan'],
 ['Wine', 'Chips', 'Bread', 'Butter', 'nan', 'Apple'],
 ['Wine', 'Chips', 'nan', 'nan', 'Milk', 'nan'],
 ['Wine', 'nan', 'Bread', 'nan', 'nan', 'Apple'],
 ['Wine', 'nan', 'Bread', 'Butter', 'Milk', 'nan'],
 ['nan', 'Chips', 'Bread', 'Butter', 'nan', 'Apple'],
 ['Wine', 'nan', 'nan', 'Butter', 'Milk', 'Apple'],
 ['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'nan'],
 ['Wine', 'nan', 'Bread', 'nan', 'Milk', 'Apple'],
 ['Wine', 'nan', 'Bread', 'Butter', 'Milk', 'Apple'],
 ['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple'],
 ['nan', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple'],
 ['nan', 'Chips', 'nan', 'Butter', 'Milk', 'Apple'],
 ['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple
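The `'nan'` strings in `records` stand for missing items, and a rule miner that sees plain strings will most likely treat `'nan'` as just another product. A minimal sketch of filtering them out is shown here on a three-row sample copied from the output above; the name `cleaned_records` is hypothetical and not part of the original notebook. Because no transaction is removed, the support of genuine itemsets should be unaffected.

```python
# Three sample transactions copied from the records output above.
records = [
    ['Wine', 'Chips', 'Bread', 'Butter', 'Milk', 'Apple'],
    ['Wine', 'nan', 'Bread', 'Butter', 'Milk', 'nan'],
    ['nan', 'nan', 'Bread', 'Butter', 'Milk', 'nan'],
]

# Keep only real item names; 'nan' is the string produced by str(NaN).
# cleaned_records is a hypothetical name used for this sketch only.
cleaned_records = [[item for item in row if item != 'nan'] for row in records]

for row in cleaned_records:
    print(row)
```

The same comprehension could be applied to the full `records` list before passing it to `apriori`.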

**Build the Apriori Model**

In [None]:
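#Building the first Apriori Model
#Note: apyori documents min_support, min_confidence, min_lift and max_length;
#min_length is not among them, so it likely has no effect here.
association_rules = apriori(records, min_support=0.50, min_confidence=0.7, min_lift=1.2, min_length=2)
association_results = list(association_rules)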

In [None]:
#print the number of rules
print(len(association_results))

1


In [None]:
#have a glance at the rules
print(association_results)

[RelationRecord(items=frozenset({'Bread', 'Milk', 'Butter'}), support=0.5, ordered_statistics=[OrderedStatistic(items_base=frozenset({'Butter'}), items_add=frozenset({'Bread', 'Milk'}), confidence=0.7333333333333334, lift=1.241025641025641), OrderedStatistic(items_base=frozenset({'Bread', 'Milk'}), items_add=frozenset({'Butter'}), confidence=0.8461538461538461, lift=1.241025641025641)])]
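The raw `RelationRecord` output is hard to read, so it may help to flatten it into one line per rule. The sketch below is illustrative only: to stay runnable without apyori installed, it rebuilds the printed result with stand-in namedtuples that mirror the field names visible in the output above, and `flatten_rules` is a hypothetical helper, not part of apyori.

```python
from collections import namedtuple

# Stand-ins mirroring the field names seen in the printed output (assumption:
# the real apyori objects expose the same attributes).
RelationRecord = namedtuple("RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple("OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])

# The single result printed above, copied by hand.
association_results = [
    RelationRecord(
        items=frozenset({"Bread", "Milk", "Butter"}),
        support=0.5,
        ordered_statistics=[
            OrderedStatistic(frozenset({"Butter"}), frozenset({"Bread", "Milk"}),
                             0.7333333333333334, 1.241025641025641),
            OrderedStatistic(frozenset({"Bread", "Milk"}), frozenset({"Butter"}),
                             0.8461538461538461, 1.241025641025641),
        ],
    )
]

def flatten_rules(results):
    """Return one dict per directed rule (antecedent -> consequent)."""
    rows = []
    for record in results:
        for stat in record.ordered_statistics:
            rows.append({
                "antecedent": sorted(stat.items_base),
                "consequent": sorted(stat.items_add),
                "support": record.support,
                "confidence": stat.confidence,
                "lift": stat.lift,
            })
    return rows

rules = flatten_rules(association_results)
for rule in rules:
    print(f"{rule['antecedent']} -> {rule['consequent']}: "
          f"support={rule['support']:.3f}, "
          f"confidence={rule['confidence']:.3f}, lift={rule['lift']:.3f}")
```

Read this way, the one itemset yields two directed rules that share the same support and lift but differ in confidence.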


**The support value for the rule is 0.5.** This number is calculated by dividing the number of transactions containing 'Milk', 'Bread', and 'Butter' (11) by the total number of transactions (22).

**The confidence of the rule {'Bread', 'Milk'} → {'Butter'} is 0.846**, which shows that of all the transactions that contain both 'Milk' and 'Bread', 84.6% also contain 'Butter'.

**The lift of 1.241 tells us that 'Butter' is 1.241 times more likely to be bought by customers who buy both 'Milk' and 'Bread' than its default likelihood of sale.**
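As a sanity check, the three metrics can be reproduced by hand. The sketch below assumes counts tallied manually from the printed `store_data` table (they are not produced by the notebook itself) and applies the standard definitions: support = count(itemset) / N, confidence = count(itemset) / count(antecedent), lift = confidence / support(consequent).

```python
# Counts tallied by hand from the printed table (an assumption of this sketch).
n_transactions = 22        # total rows, from store_data.shape
n_bread_milk_butter = 11   # rows containing Bread, Milk and Butter
n_bread_milk = 13          # rows containing both Bread and Milk
n_butter = 15              # rows containing Butter

# Rule: {Bread, Milk} -> {Butter}
support = n_bread_milk_butter / n_transactions
confidence = n_bread_milk_butter / n_bread_milk
lift = confidence / (n_butter / n_transactions)

print(f"support    = {support:.3f}")     # 0.500
print(f"confidence = {confidence:.3f}")  # 0.846
print(f"lift       = {lift:.3f}")        # 1.241
```

These values match the `support`, `confidence`, and `lift` fields reported for the rule above.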