# OVERVIEW

## Market Basket Analysis with Apriori
**Market Basket Analysis** is the process of discovering frequent item sets in a large transactional database. Put another way, frequent item set mining leads to the discovery of associations and correlations among items.
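
To make "frequent item sets" concrete, the following minimal sketch counts how often each pair of items appears together in a handful of baskets. The `baskets` data, the `min_count` threshold, and the `frequent_pairs` helper are made up for illustration; they are not part of the dataset or of apyori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, for illustration only (not the grocery dataset).
baskets = [
    {"MILK", "BREAD", "BISCUIT"},
    {"BREAD", "MILK"},
    {"BREAD", "TEA"},
    {"TEA", "BISCUIT"},
]

def frequent_pairs(baskets, min_count):
    """Count co-occurring item pairs and keep those seen at least min_count times."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair a canonical order, so (A, B) and (B, A) share one key
        counts.update(combinations(sorted(basket), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}

pairs = frequent_pairs(baskets, min_count=2)
print(pairs)
```

Apriori generalizes this idea to item sets of any size, and avoids counting candidates whose subsets are already known to be infrequent.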

#### Load Library

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from apyori import apriori

#### Importing the Dataset

In [2]:
supermarket = pd.read_csv('GroceryStoreDataSet.csv', index_col=False)
supermarket

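| | ITEM1 | ITEM2 | ITEM3 | ITEM4 |
|---|---|---|---|---|
| 0 | MILK | BREAD | BISCUIT | |
| 1 | BREAD | MILK | BISCUIT | CORNFLAKES |
| 2 | BREAD | TEA | BOURNVITA | |
| 3 | JAM | MAGGI | BREAD | MILK |
| 4 | MAGGI | TEA | BISCUIT | |
| 5 | BREAD | TEA | BOURNVITA | |
| 6 | MAGGI | TEA | CORNFLAKES | |
| 7 | MAGGI | BREAD | TEA | BISCUIT |
| 8 | JAM | MAGGI | BREAD | TEA |
| 9 | BREAD | MILK | | |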


In [3]:
num_records = len(supermarket)
num_records

20

#### Data Preprocessing

In [4]:
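# Build one list of item names per transaction (row).
records = []
for i in range(0, num_records):
    # range(0, 3) reads only the first three columns of each row.
    # Empty cells are NaN, and str() turns them into the literal string 'nan',
    # which apriori then treats as an ordinary item.
    records.append([str(supermarket.values[i,j])
                  for j in range(0,3)])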
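
Because `str()` converts missing cells into the item `'nan'`, one possible alternative is to drop missing values while building the transactions. The sketch below is a variation on the loop above, not the notebook's own method: `sample` is a hypothetical three-row frame shaped like the table shown earlier, and `to_transactions` is an illustrative helper name.

```python
import pandas as pd

# Hypothetical sample shaped like the dataset; None marks an empty cell.
sample = pd.DataFrame({
    "ITEM1": ["MILK", "BREAD", "BREAD"],
    "ITEM2": ["BREAD", "MILK", "TEA"],
    "ITEM3": ["BISCUIT", "BISCUIT", "BOURNVITA"],
    "ITEM4": [None, "CORNFLAKES", None],
})

def to_transactions(frame):
    """Return one list of item names per row, skipping missing cells."""
    transactions = []
    for row in frame.itertuples(index=False):
        # pd.notna() is False for both None and NaN, so empty cells are dropped
        transactions.append([str(value) for value in row if pd.notna(value)])
    return transactions

transactions = to_transactions(sample)
print(transactions)
```

Running Apriori on transactions built this way would give different rules from the ones reported below, since `'nan'` would no longer count as an item and every column would be included.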

### Applying Apriori
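Specify the parameters passed to the `apriori` function:
- The list of transactions (`records`)
- `min_support`: the minimum fraction of transactions that must contain the item set
- `min_confidence`: the minimum confidence a rule must reach
- `min_lift`: the minimum lift a rule must reach
- `min_length`: the minimum number of items that you want in your rules, typically 2. Note that apyori appears to read only `max_length` from its keyword arguments, so `min_length` may be silently ignored.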

In [5]:
association_rules = apriori(records,
                           min_support=0.05,
                           min_confidence=0.5,
                           min_lift=3,
                           min_length=2)
association_results = list(association_rules)
print(len(association_results))

12
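
The three thresholds used above can also be computed by hand, which helps when choosing values for them. The sketch below uses the standard definitions: support is the fraction of transactions containing an item set, the confidence of a rule A -> B is support(A and B) / support(A), and lift is that confidence divided by support(B). The `toy` transactions and the helper names are assumptions made for illustration; they are not the grocery dataset or apyori functions.

```python
# Hypothetical transactions, for illustration only.
toy = [
    ["JAM", "MAGGI", "BREAD"],
    ["MAGGI", "TEA"],
    ["BREAD", "MILK"],
    ["BREAD", "TEA"],
]

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, base, add):
    """Estimated P(add | base): support of the union over support of the base."""
    return support(transactions, set(base) | set(add)) / support(transactions, base)

def lift(transactions, base, add):
    """How much more often add appears with base than it appears overall."""
    return confidence(transactions, base, add) / support(transactions, add)

print(support(toy, {"JAM", "MAGGI"}))       # 1 of 4 transactions
print(confidence(toy, {"JAM"}, {"MAGGI"}))  # every JAM basket also has MAGGI
print(lift(toy, {"JAM"}, {"MAGGI"}))        # confidence divided by support of MAGGI
```

A lift above 1 suggests the two sides of the rule occur together more often than independence would predict, which is why `min_lift=3` acts as a fairly strict filter.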


In [6]:
print(association_results[0])

RelationRecord(items=frozenset({'MAGGI', 'JAM'}), support=0.1, ordered_statistics=[OrderedStatistic(items_base=frozenset({'JAM'}), items_add=frozenset({'MAGGI'}), confidence=1.0, lift=4.0)])
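
The printed record shows named fields, so the same values can be read either by position or by name. The sketch below rebuilds the record above with stand-in `namedtuple` types that copy the field names from that output, so it runs without apyori installed; with apyori available, the objects returned by `apriori` would be used directly.

```python
from collections import namedtuple

# Stand-ins that mirror the field names in the printed record (assumption:
# they behave like the objects apyori returns for the purposes of this sketch).
OrderedStatistic = namedtuple("OrderedStatistic", "items_base items_add confidence lift")
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")

record = RelationRecord(
    items=frozenset({"MAGGI", "JAM"}),
    support=0.1,
    ordered_statistics=[
        OrderedStatistic(frozenset({"JAM"}), frozenset({"MAGGI"}), 1.0, 4.0)
    ],
)

stat = record.ordered_statistics[0]

# Positional and named access refer to the same values.
same_support = record[1] == record.support
same_confidence = record[2][0][2] == stat.confidence
same_lift = record[2][0][3] == stat.lift

# items_base is the "if" side of the rule, items_add the "then" side.
print(sorted(stat.items_base), "->", sorted(stat.items_add))
```

Named access makes the direction of the rule explicit: here the base is `JAM` and the added item is `MAGGI`.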


In [7]:
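results = []
for item in association_results:
    # item[0] is the frozenset of items that make up the rule
    pair = item[0]
    items = [x for x in pair]

    # Label the rule with the first two items of the set. A frozenset has no
    # defined order, so the arrow may not match the direction stored in
    # items_base / items_add.
    value0 = items[0] + " -> " + items[1]
    # item[1] is the support of the item set
    value1 = str(item[1])
    # item[2][0] is the first OrderedStatistic; fields [2] and [3]
    # are its confidence and lift
    value2 = str(item[2][0][2])
    value3 = str(item[2][0][3])

    rows = (value0, value1, value2, value3)
    results.append(rows)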
    
labels = ['Rule','Support', 'Confidence', 'Lift']
supermarket_suggestion = pd.DataFrame.from_records(results, columns=labels)
supermarket_suggestion

| | Rule | Support | Confidence | Lift |
|---|---|---|---|---|
| 0 | MAGGI -> JAM | 0.1 | 1.0 | 4.0 |
| 1 | nan -> MILK | 0.05 | 1.0 | 5.0 |
| 2 | BISCUIT -> BREAD | 0.1 | 0.6666666666666667 | 3.333333333333333 |
| 3 | BISCUIT -> COCK | 0.1 | 1.0 | 6.666666666666667 |
| 4 | BISCUIT -> MAGGI | 0.05 | 1.0 | 3.333333333333333 |
| 5 | BOURNVITA -> BREAD | 0.1 | 0.6666666666666667 | 3.333333333333333 |
| 6 | BOURNVITA -> SUGER | 0.05 | 1.0 | 3.333333333333333 |
| 7 | MAGGI -> BREAD | 0.1 | 1.0 | 4.0 |
| 8 | BREAD -> nan | 0.05 | 1.0 | 5.0 |
| 9 | SUGER -> CORNFLAKES | 0.05 | 1.0 | 3.333333333333333 |
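
The rule labels in this table come from iterating over an unordered set, so the arrow does not necessarily point from the base items to the added items (the first record printed earlier has base `JAM` and added item `MAGGI`). A possible direction-aware variant is sketched below. It uses stand-in `namedtuple` types with the field names from that printed record so it runs without apyori; `rules_to_rows` is an illustrative helper, not a library function.

```python
from collections import namedtuple

# Stand-ins mirroring the printed field names; with apyori installed, the
# records returned by apriori() would be passed in instead.
OrderedStatistic = namedtuple("OrderedStatistic", "items_base items_add confidence lift")
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")

def rules_to_rows(records):
    """Return one (rule, support, confidence, lift) tuple per ordered statistic."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            if not stat.items_base:
                # Skip statistics whose base side is empty, if any are present
                continue
            base = ", ".join(sorted(stat.items_base))
            add = ", ".join(sorted(stat.items_add))
            # Keep the metrics numeric so the table can be sorted or filtered later
            rows.append((base + " -> " + add, record.support, stat.confidence, stat.lift))
    return rows

example = RelationRecord(
    items=frozenset({"MAGGI", "JAM"}),
    support=0.1,
    ordered_statistics=[
        OrderedStatistic(frozenset({"JAM"}), frozenset({"MAGGI"}), 1.0, 4.0)
    ],
)
rows = rules_to_rows([example])
print(rows)
```

The resulting rows can be passed to `pd.DataFrame.from_records(rows, columns=labels)` exactly as in the cell above, with `labels = ['Rule', 'Support', 'Confidence', 'Lift']`.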
