### Question
Market Basket Analysis: Apriori Algorithm <br>
Dataset: Order1.csv <br>
The dataset has 38,765 rows of purchase orders made by customers of
grocery stores. <br>
These orders can be analysed, and association rules can
be generated from them, using Market Basket Analysis with algorithms such as the Apriori
algorithm. <br>
Follow these steps: <br>
1. Data Pre-processing
2. Generate the list of transactions from the dataset
3. Train Apriori on the dataset
4. Visualize the list of datasets
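
As a rough illustration of what step 3 involves, the level-wise search behind Apriori can be sketched in plain Python. This is a minimal sketch on a made-up four-basket example, not the `mlxtend` implementation used later in this notebook; the function name `apriori_sketch` and the toy transactions are assumptions introduced only for illustration.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item of the itemset
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items
    items = {item for b in baskets for item in b}
    current = {frozenset([i]): support(frozenset([i])) for i in items}
    current = {c: s for c, s in current.items() if s >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: merge frequent (k-1)-itemsets into candidates of size k
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        current = {c: support(c) for c in candidates}
        current = {c: s for c, s in current.items() if s >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy baskets (not taken from Order1.csv)
transactions = [
    ["whole milk", "yogurt"],
    ["whole milk", "beef"],
    ["whole milk", "yogurt", "beef"],
    ["yogurt"],
]
frequent = apriori_sketch(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The prune step is what keeps the search tractable: a candidate is only counted if all of its smaller subsets already passed the support threshold.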

In [7]:
# Import necessary libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules

In [8]:
# Load the dataset
df = pd.read_csv('Order1.csv')

In [9]:
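# Create a basket df: one row per member, one column per item,
# each cell holding how many times that member bought the item
basket = (
    df.groupby(['Member_number', 'itemDescription'])['Date']
    .count()
    .unstack()
    .reset_index()
    .fillna(0)
    .set_index('Member_number')
)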

In [10]:
# Visualize Basket df
basket.head(5)

| Member_number | Instant food products | UHT-milk | abrasive cleaner | artif. sweetener | baby cosmetics | bags | baking powder | bathroom cleaner | beef | berries | ... | turkey | vinegar | waffles | whipped/sour cream | whisky | white bread | white wine | whole milk | yogurt | zwieback |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 1.0 | 0.0 |
| 1001 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 2.0 | 0.0 | 0.0 |
| 1002 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 1003 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1004 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 |
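
To see on a small scale what the `groupby` / `unstack` pivot produces, here is a minimal sketch on a made-up three-row sample. The column names match those used in this notebook; the sample rows and the variable names `sample`, `counts` and `purchased` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample with the same column names as Order1.csv
sample = pd.DataFrame({
    "Member_number": [1000, 1000, 1001],
    "Date": ["01-01-2015", "02-01-2015", "01-01-2015"],
    "itemDescription": ["whole milk", "whole milk", "beef"],
})

# Count purchases per (member, item), then spread items into columns;
# member/item pairs that never occur become NaN and are filled with 0
counts = (
    sample.groupby(["Member_number", "itemDescription"])["Date"]
    .count()
    .unstack()
    .fillna(0)
)

# Collapse counts into a purchased / not purchased flag
purchased = counts >= 1

print(counts)
print(purchased)
```

Note that this layout merges all of a member's visits into a single basket, so two items bought on different dates still count as co-purchased.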


In [11]:
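# Encode as purchased (True) or not purchased (False).
# A vectorised comparison avoids DataFrame.applymap, which is deprecated in recent pandas,
# and gives apriori the boolean input it prefers.
basket = basket >= 1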

In [14]:
# Generate the list of transactions from the dataset
# (kept for inspection; apriori below consumes the encoded DataFrame directly)
transactions = basket.apply(lambda row: row.index[row == 1].tolist(), axis=1).tolist()

# Train Apriori on the dataset
frequent_itemsets = apriori(basket, min_support=0.02, use_colnames=True)  # Adjust min_support as needed

# Generate association rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
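
The metric columns that `association_rules` reports can be reproduced by hand from three support values. The sketch below uses the standard definitions of confidence, lift, leverage and conviction; the helper name `rule_metrics` is an assumption introduced here, and the input numbers are the rounded supports this notebook prints for the rule (UHT-milk) -> (bottled water), so results agree only to about four decimal places.

```python
def rule_metrics(support_ab, support_a, support_b):
    """Compute rule metrics for A -> B from the three support values."""
    confidence = support_ab / support_a              # P(B | A)
    lift = confidence / support_b                    # > 1 means A and B co-occur more than by chance
    leverage = support_ab - support_a * support_b    # difference from independence
    # Conviction is infinite when the rule never fails (confidence == 1)
    conviction = (1 - support_b) / (1 - confidence) if confidence < 1 else float("inf")
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Rounded supports for (UHT-milk) -> (bottled water) as printed in this notebook
metrics = rule_metrics(support_ab=0.021293, support_a=0.078502, support_b=0.213699)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

Because lift is symmetric in A and B, the reversed rule (bottled water) -> (UHT-milk) has the same lift but a different confidence.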



In [15]:
# Display frequent itemsets
print("Frequent Itemsets:")
print(frequent_itemsets.head(5))

# Display association rules
print("\nAssociation Rules:")
print(rules.head(5))

Frequent Itemsets:
    support         itemsets
0  0.078502       (UHT-milk)
1  0.031042  (baking powder)
2  0.119548           (beef)
3  0.079785        (berries)
4  0.062083      (beverages)

Association Rules:
          antecedents         consequents  antecedent support  \
0          (UHT-milk)     (bottled water)            0.078502   
1     (bottled water)          (UHT-milk)            0.213699   
2          (UHT-milk)  (other vegetables)            0.078502   
3  (other vegetables)          (UHT-milk)            0.376603   
4          (UHT-milk)        (rolls/buns)            0.078502   

   consequent support   support  confidence      lift  leverage  conviction  \
0            0.213699  0.021293    0.271242  1.269268  0.004517    1.078960   
1            0.078502  0.021293    0.099640  1.269268  0.004517    1.023477   
2            0.376603  0.038994    0.496732  1.318979  0.009430    1.238697   
3            0.078502  0.038994    0.103542  1.318979  0.009430    1.027933   
4