# Applying Apriori for Market-Basket Analysis

The Apriori algorithm was proposed by Agrawal and Srikant in 1994.
The association rules it produces are evaluated with three measures:

1) Support

2) Confidence

3) Lift

## Support

Support measures how popular an item is. It is the number of transactions containing the item divided by the total number of transactions.

#### Example

Support(diaper) = (Transactions containing (diaper))/(Total Transactions)
Support(diaper) = 150 / 1000 = 15 %
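
The same calculation can be sketched in a few lines of Python. The toy transaction list below is hypothetical: it is built only so that the counts match the example (150 of 1000 transactions contain a diaper), and the helper name `support` is an illustrative choice, not part of any library.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


# Hypothetical data chosen to reproduce the counts used in the example
transactions = (
    [["milk", "diaper"]] * 30   # both items
    + [["milk"]] * 90           # milk only
    + [["diaper"]] * 120        # diaper only
    + [["bread"]] * 760         # neither
)

print(support(transactions, ["diaper"]))  # 0.15
```

Passing a list with several items, such as `["milk", "diaper"]`, gives the support of the whole itemset rather than of a single item.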

## Confidence

Confidence is the likelihood that item B is also bought when item A is bought. It is the number of transactions in which A and B are bought together divided by the number of transactions in which A is bought.

#### Example

The confidence that a customer who purchases milk also purchases a diaper:

Confidence(milk → diaper) = (Transactions containing both (milk and diaper))/(Transactions containing milk)
Confidence(milk → diaper) = 30 / 120 = 25 %
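
As a rough sketch, confidence can be computed directly from a list of transactions. The data is again hypothetical and constructed to match the counts in the example (120 transactions with milk, 30 of them also with a diaper); `confidence` is an illustrative helper name.

```python
def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    return n_both / n_antecedent


# Hypothetical data chosen to reproduce the counts used in the example
transactions = (
    [["milk", "diaper"]] * 30
    + [["milk"]] * 90
    + [["diaper"]] * 120
    + [["bread"]] * 760
)

print(confidence(transactions, ["milk"], ["diaper"]))  # 0.25
print(confidence(transactions, ["diaper"], ["milk"]))  # 0.2
```

The two printed values differ because confidence depends on the direction of the rule: the denominator is the count of the antecedent, not of the consequent.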


## Lift

Lift measures how much more likely B is to be sold when A is sold, compared with how often B is sold overall.
Lift(A → B) is Confidence(A → B) divided by Support(B).

#### Example

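Lift(milk → diaper) = (Confidence (milk → diaper))/(Support (diaper))
Lift(milk → diaper) = 25 / 15 = 1.67
A lift of 1.67 means that a customer who buys milk is about 1.67 times as likely to buy a diaper as a customer picked at random. A lift above 1 indicates that the two items are bought together more often than would be expected if they were independent.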
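
The arithmetic can be checked with a short sketch that works from the raw counts used in the examples. The function name `lift` and its parameter names are illustrative assumptions.

```python
def lift(n_total, n_a, n_b, n_both):
    """Lift(A -> B) = Confidence(A -> B) / Support(B), from raw transaction counts."""
    confidence_ab = n_both / n_a   # share of A-transactions that also contain B
    support_b = n_b / n_total      # share of all transactions that contain B
    return confidence_ab / support_b


# Counts from the examples: 1000 transactions, 120 with milk,
# 150 with diaper, 30 with both
milk_to_diaper = lift(n_total=1000, n_a=120, n_b=150, n_both=30)
diaper_to_milk = lift(n_total=1000, n_a=150, n_b=120, n_both=30)

print(round(milk_to_diaper, 2))  # 1.67
print(round(diaper_to_milk, 2))  # 1.67
```

Unlike confidence, lift is symmetric: swapping the roles of A and B gives the same value, because both forms reduce to Support(A and B) / (Support(A) × Support(B)).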

## Association rule by Lift

# 1. Importing libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from apyori import apriori

# 2. Importing the dataset

In [2]:
groceries_df = pd.read_csv('groceries - groceries.csv', encoding='latin-1')

In [3]:
groceries_df.head()

Unnamed: 0,Item(s),Item 1,Item 2,Item 3,Item 4,Item 5,Item 6,Item 7,Item 8,Item 9,...,Item 23,Item 24,Item 25,Item 26,Item 27,Item 28,Item 29,Item 30,Item 31,Item 32
0,4,citrus fruit,semi-finished bread,margarine,ready soups,,,,,,...,,,,,,,,,,
1,3,tropical fruit,yogurt,coffee,,,,,,,...,,,,,,,,,,
2,1,whole milk,,,,,,,,,...,,,,,,,,,,
3,4,pip fruit,yogurt,cream cheese,meat spreads,,,,,,...,,,,,,,,,,
4,4,other vegetables,whole milk,condensed milk,long life bakery product,,,,,,...,,,,,,,,,,


In [4]:
groceries_df = groceries_df.drop(['Item(s)'], axis = 1)

In [5]:
num_records = len(groceries_df)
print(num_records)

9835


# 3. Data Preprocessing

In [6]:
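# Convert each row to a list of strings; empty cells (NaN) become the string 'nan'
# and are filtered out again when the results are displayed
records = []
for i in range(0, num_records):
    records.append([str(groceries_df.values[i, j]) for j in range(0, 32)])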
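
Because the loop above turns empty cells into the string `'nan'`, that string is treated as an item during mining. One possible alternative, shown here only as a sketch, is to drop missing cells before mining. The helper `clean_transactions` is hypothetical, and using it in place of the loop above would change the mined rules, so the results shown later would no longer match exactly.

```python
import math


def clean_transactions(rows):
    """Drop missing cells (None or float NaN) from each row and skip empty rows."""
    cleaned = []
    for row in rows:
        items = [
            str(x) for x in row
            if not (x is None or (isinstance(x, float) and math.isnan(x)))
        ]
        if items:
            cleaned.append(items)
    return cleaned


# Small hand-made rows standing in for the DataFrame contents
rows = [
    ["citrus fruit", "margarine", float("nan")],
    ["whole milk", float("nan"), float("nan")],
    [float("nan"), float("nan"), float("nan")],
]

print(clean_transactions(rows))
# [['citrus fruit', 'margarine'], ['whole milk']]
```

With the notebook's data, the input could be obtained with something like `groceries_df.values.tolist()`, assuming the empty cells are stored as float NaN.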

# 4. Using Apriori from the apyori library

In [7]:
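# Note: apyori does not appear to read min_length (it documents max_length), so this argument likely has no effect
association_rules = apriori(records, min_support=0.0045, min_confidence=0.20, min_lift=3, min_length=2)
association_results = list(association_rules)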

The first rule

In [8]:
print(association_results[0])

RelationRecord(items=frozenset({'whipped/sour cream', 'baking powder'}), support=0.004575495678698526, ordered_statistics=[OrderedStatistic(items_base=frozenset({'baking powder'}), items_add=frozenset({'whipped/sour cream'}), confidence=0.25862068965517243, lift=3.607850330154072)])
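
The printed record is a nested structure, and reading it by field name is easier to follow than indexing by position. The sketch below uses stand-in namedtuples that mirror only the field names visible in the output above, so that it runs without apyori installed; `describe` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-ins mirroring the field names in the printed record; with apyori
# installed, real records would come from association_results instead
RelationRecord = namedtuple("RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple("OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])

record = RelationRecord(
    items=frozenset({"whipped/sour cream", "baking powder"}),
    support=0.004575495678698526,
    ordered_statistics=[
        OrderedStatistic(
            items_base=frozenset({"baking powder"}),
            items_add=frozenset({"whipped/sour cream"}),
            confidence=0.25862068965517243,
            lift=3.607850330154072,
        )
    ],
)


def describe(record):
    """Return one readable line per rule (antecedent -> consequent) in the record."""
    lines = []
    for stat in record.ordered_statistics:
        lhs = ", ".join(sorted(stat.items_base))
        rhs = ", ".join(sorted(stat.items_add))
        lines.append(
            f"{lhs} -> {rhs} (support={record.support:.4f}, "
            f"confidence={stat.confidence:.4f}, lift={stat.lift:.4f})"
        )
    return lines


for line in describe(record):
    print(line)
```

Here `items_base` is the antecedent (A) and `items_add` is the consequent (B), so this record reads as "baking powder → whipped/sour cream".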


# 5. Display results

In [9]:
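results = []
for item in association_results:
    # Drop the 'nan' placeholder created for empty cells during preprocessing
    items = [x for x in item.items if x != 'nan']

    # frozenset order is arbitrary, so Item 1 / Item 2 do not indicate rule direction
    value0 = str(items[0])
    value1 = str(items[1])

    # Truncate each number to 7 characters for display
    value2 = str(item.support)[:7]

    # Only the first ordered statistic of each record is reported
    value3 = str(item.ordered_statistics[0].confidence)[:7]
    value4 = str(item.ordered_statistics[0].lift)[:7]

    rows = (value0, value1, value2, value3, value4)
    results.append(rows)

labels = ['Item 1', 'Item 2', 'Support', 'Confidence', 'Lift']
item_suggestion = pd.DataFrame.from_records(results, columns=labels)
pd.set_option('display.max_rows', item_suggestion.shape[0] + 1)
# item_suggestion = item_suggestion.drop_duplicates(subset=['Item 1', 'Item 2'], keep=False)
# Styler.hide_index() was removed in pandas 2.0; on pandas older than 1.4 use .style.hide_index()
item_suggestion.style.hide(axis="index")
# item_suggestion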

Item 1,Item 2,Support,Confidence,Lift
whipped/sour cream,baking powder,0.00457,0.25862,3.60785
root vegetables,beef,0.01738,0.33139,3.04036
berries,whipped/sour cream,0.00904,0.27217,3.79688
liquor,bottled beer,0.00467,0.42201,5.24059
bottled beer,red/blush wine,0.00488,0.25396,3.15375
sugar,flour,0.00498,0.28654,8.46311
root vegetables,herbs,0.00701,0.43124,3.95647
sliced cheese,sausage,0.00701,0.2863,3.04743
whipped/sour cream,baking powder,0.00457,0.25862,3.61297
root vegetables,beef,0.01728,0.32945,3.0254
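
The last two rows repeat earlier item pairs with slightly different values, which would be consistent with itemsets that also contained the `'nan'` placeholder. As a minimal sketch, the rows shown above can be de-duplicated by item pair and ranked by lift in plain Python. The helper `top_rules` is hypothetical, and keeping the first row seen for each pair is an arbitrary choice made for illustration.

```python
# Rows copied from the table above: (item 1, item 2, support, confidence, lift)
rows = [
    ("whipped/sour cream", "baking powder", 0.00457, 0.25862, 3.60785),
    ("root vegetables", "beef", 0.01738, 0.33139, 3.04036),
    ("berries", "whipped/sour cream", 0.00904, 0.27217, 3.79688),
    ("liquor", "bottled beer", 0.00467, 0.42201, 5.24059),
    ("bottled beer", "red/blush wine", 0.00488, 0.25396, 3.15375),
    ("sugar", "flour", 0.00498, 0.28654, 8.46311),
    ("root vegetables", "herbs", 0.00701, 0.43124, 3.95647),
    ("sliced cheese", "sausage", 0.00701, 0.2863, 3.04743),
    ("whipped/sour cream", "baking powder", 0.00457, 0.25862, 3.61297),
    ("root vegetables", "beef", 0.01728, 0.32945, 3.0254),
]


def top_rules(rows, n=3):
    """Keep the first row seen for each item pair, then sort by lift, highest first."""
    seen = set()
    unique = []
    for row in rows:
        pair = frozenset(row[:2])  # order-independent key for the item pair
        if pair not in seen:
            seen.add(pair)
            unique.append(row)
    return sorted(unique, key=lambda r: r[4], reverse=True)[:n]


for item1, item2, support, confidence, lift in top_rules(rows):
    print(f"{item1} + {item2}: lift={lift}")
```

On these rows, the pair with the highest lift is sugar and flour (8.46311), followed by liquor and bottled beer (5.24059).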
