## Association Rule Learning

In machine learning, association rule learning is a method for finding interesting relationships between variables in a large dataset. It is used mainly by supermarkets and general-purpose e-commerce websites to identify patterns in which products are sold together. More formally, it extracts strong rules from a large database using a chosen measure of interestingness.
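
Support, confidence, and lift are the usual measures of interestingness. The sketch below computes them on a few made-up baskets. The item names and the helpers `support`, `confidence`, and `lift` are illustrative assumptions, not part of the dataset or of mlxtend.

```python
# Toy baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's own support; 1.0 means independence."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

print(support({"bread", "milk"}, baskets))       # 2 of 4 baskets
print(confidence({"bread"}, {"milk"}, baskets))  # 2 of the 3 bread baskets
print(lift({"bread"}, {"milk"}, baskets))        # below 1: slightly negative association
```

A rule counts as "strong" when these values clear thresholds chosen by the analyst.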



In [1]:
''' importing libraries '''
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori,association_rules

In [2]:
''' reading dataset '''
data_frame = pd.read_csv('https://raw.githubusercontent.com/aryan-jadon/Market-Basket-Item-Assignment/main/Dataset/TRAIN-ARULES.csv')
# Strip commas from product names
data_frame.product_name = ["".join(i.split(",")) for i in data_frame.product_name]
# Unique order IDs; each order is treated as one transaction
orders = data_frame.order_id.unique()


In [5]:
''' checking shape '''
orders.shape

(1418,)

In [6]:
''' data transformation '''
# Build one list of product names per order
transactions = [data_frame[data_frame.order_id == i]["product_name"].tolist() for i in orders]
# One-hot encode: one row per order, one boolean column per product
trans_encoder = TransactionEncoder().fit(transactions)
transactions = trans_encoder.transform(transactions)
transactions = pd.DataFrame(transactions, columns=trans_encoder.columns_)
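
To show what the encoding step produces, here is a minimal pure-Python sketch of the same idea on three hypothetical baskets. It is not mlxtend's implementation. The sorted column order is an assumption of the sketch, although it matches the alphabetical columns in the output below.

```python
# Hypothetical baskets; the notebook builds the real ones from order_id / product_name.
baskets = [["milk", "bread"], ["bread"], ["eggs", "milk"]]

# One column per distinct item, in sorted order.
columns = sorted({item for basket in baskets for item in basket})

# One row per basket: True where the basket contains that column's item.
encoded = [[item in basket for item in columns] for basket in baskets]

print(columns)
for row in encoded:
    print(row)
```

With a few thousand distinct products and only a handful of items per order, almost every cell is `False`, which is why the preview of the real table is nearly all `False`.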

In [7]:
transactions.head(5)

Unnamed: 0,0% Fat Free Organic Milk,0% Greek Blueberry on the Bottom Yogurt,0% Greek Strained Yogurt,1 Apple + 1 Pear Fruit Bar,1 Liter,1 Step Kashmir Spinach Indian Cuisine,1% Lowfat Milk,1% Milk,100 Calorie Per Bag Popcorn,100% Australian Tea Tree Oil,100% Carrot Juice,100% Florida Orange Juice,100% Grated Parmesan Cheese,100% Guava Juice,100% Juice,100% Juice Apple Juice,100% Juice Variety Pack,100% Lactose Free Milk,100% Lactose Free Reduced Fat Calcium Enriched Milk,100% Mango Juice,100% Mighty Mango Juice Smoothie,100% Natural Diced Tomatoes,100% Natural Spring Water,100% Natural Tomato Sauce,100% Orange Juice No Pulp,100% Organic Diced Tomatoes,100% Plant Protein Beastley Sliders,100% Pomegranate Juice,100% Premium Select Not From Concentrate Pure Prune Juice,100% Pure Apple Juice,100% Pure Corn Starch,100% Pure Pumpkin,100% Pure Vegetable Oil,100% Raw Coconut Water,100% Recycled 2 Ply Jumbo Paper Towel Roll,100% Recycled Aluminum Foil,100% Recycled Bath Tissue Rolls,100% Recycled Bathroom Tissue,100% Recycled Paper Towels,100% Whole Grain Corn Meal,...,XL Emerald White Seedless Grapes,XL Pick-A-Size Paper Towel Rolls,Yellow Bell Pepper,Yellow Corn Meal,Yellow Corn Taco Shells,Yellow Corn Tortilla Chips,Yellow Corn Tortillas,Yellow Enriched & Degerminated Corn Meal,Yellow Grape Tomatoes,Yellow Onions,Yellow Potato,Yellow Straightneck Squash,YoBaby Blueberry Apple Yogurt,YoKids Blueberry & Strawberry/Vanilla Yogurt,YoKids Squeeze! Organic Strawberry Flavor Yogurt,YoKids Squeezers Organic Low-Fat Yogurt Strawberry,YoKids Strawberry Banana/Strawberry Yogurt,Yobaby Organic Plain Yogurt,Yoghurt Blueberry,Yogurt Lowfat Strawberry,Yogurt Organic Lowfat Strawberry,Yogurt Pretzels,Yogurt Strained Low-Fat Coconut,Yotoddler Organic Pear Spinach Mango Yogurt,Yukon Gold Potatoes 5lb Bag,Z Bar Protein Peanut Butter Chocolate Protein Snack Bar,ZBar Organic Chocolate Brownie Energy Snack,ZBar Protein Chocolate Mint Protein Bar,Zen Tea,Zero Calorie Cola,Zero Calorie Cola Soda,Zero Calorie Lemon Lime Soda,Zero Calorie Tonic Water,Zero Go-Go Mixed Berry Vitamin Water,Zero Soda,Zero Vitamin Water,Zero XXX Acai Blueberry Pomegranate,Zucchini Noodles,Zucchini Squash,smartwater® Electrolyte Enhanced Water
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [8]:
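''' using apriori algorithm '''
# Keep itemsets that appear in at least 1% of the orders
freq_itemset_using_apriori = apriori(transactions, min_support=0.01, use_colnames=True)
# Derive rules from the frequent itemsets, filtered on confidence (threshold left at its default)
association_rules_using_apriori = association_rules(freq_itemset_using_apriori, metric="confidence")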
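
With 1418 orders, `min_support = 0.01` means an itemset must appear in at least 15 orders (0.01 × 1418 = 14.18). `association_rules` is called without `min_threshold`, so mlxtend's documented default of 0.8 should apply to the confidence filter. The level-wise search behind Apriori can be sketched in plain Python as follows. `apriori_sketch` is a hypothetical teaching helper on toy baskets, not mlxtend's implementation, and it makes no attempt at efficiency.

```python
from itertools import combinations

def apriori_sketch(baskets, min_support):
    """Return {frozenset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent

# Toy baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori_sketch(baskets, min_support=0.5)
for itemset, sup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search tractable: `{bread, butter, milk}` is never counted because its subset `{butter, milk}` already failed the support threshold.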

In [9]:
print("Association Rules using Apriori Method:- \n")
# Print each rule as "antecedents -> consequents", numbered from 1
for index, rule in association_rules_using_apriori.iterrows():
    print(f"Rule {index + 1}: {tuple(rule.antecedents)} -> {tuple(rule.consequents)}")

Association Rules using Apriori Method:- 

Rule 1: ('Authentic French Brioche',) -> ('Petit Suisse Fruit',)
Rule 2: ('Petit Suisse Fruit',) -> ('Authentic French Brioche',)
Rule 3: ('Oatmeal Crème Pies',) -> ('Cran-Apple Juice Drink',)
Rule 4: ('Cran-Apple Juice Drink',) -> ('Oatmeal Crème Pies',)
Rule 5: ('Sparkling Water Natural Mango Essenced',) -> ('Dark Chocolate Minis',)
Rule 6: ('Grade A Extra Large Eggs',) -> ('Natural Lime Flavor Sparkling Mineral Water',)
Rule 7: ('Lemon Sparkling Water',) -> ('Orange Sparkling Water',)
Rule 8: ('Light Oaked Chardonnay',) -> ('Natural Lime Flavor Sparkling Mineral Water',)
Rule 9: ('Reduced Fat Milk',) -> ('Natural Artesian Bottled Water',)
Rule 10: ('Organic Graham Crunch Cereal',) -> ('Organic Heritage Flakes Cereal',)
Rule 11: ('Sparkling Water Natural Mango Essenced',) -> ('Organic Pink Lemonade Bunny Fruit Snacks',)
Rule 12: ('Sparkling Water Natural Mango Essenced',) -> ('Peach-Pear Sparkling Water',)
Rule 13: ('Zero Calorie Cola',) -> 

In [10]:
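''' reading data '''
data_frame_test = pd.read_csv('https://raw.githubusercontent.com/aryan-jadon/Market-Basket-Item-Assignment/main/Dataset/testarules.csv')
# The test file has Item1..Item5 columns, not product_name; strip commas from every text cell
data_frame_test = data_frame_test.apply(lambda col: col.map(lambda v: v.replace(",", "") if isinstance(v, str) else v))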



### Checking Test Dataset

In [12]:
data_frame_test

Unnamed: 0,Item1,Item2,Item3,Item4,Item5
0,Dark Chocolate Minis,Organic Pink Lemonade Bunny Fruit Snacks,Peach-Pear Sparkling Water,,


### Checking Support for Items Present in the Test Dataset

In [11]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Dark Chocolate Minis'} ]

Unnamed: 0,support,itemsets
26,0.021157,(Dark Chocolate Minis)


In [14]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Organic Pink Lemonade Bunny Fruit Snacks'} ]

Unnamed: 0,support,itemsets
97,0.020451,(Organic Pink Lemonade Bunny Fruit Snacks)


In [15]:
freq_itemset_using_apriori[ freq_itemset_using_apriori['itemsets'] == {'Peach-Pear Sparkling Water'} ]

Unnamed: 0,support,itemsets
116,0.016925,(Peach-Pear Sparkling Water)
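
Since support is the share of orders containing an item, the three values above can be turned back into approximate order counts using the 1418 orders reported by `orders.shape`. This is a small arithmetic sketch. The displayed supports are rounded, so the counts are rounded to the nearest integer.

```python
n_orders = 1418  # from orders.shape earlier in the notebook

# Supports copied from the three lookups above.
supports = {
    "Dark Chocolate Minis": 0.021157,
    "Organic Pink Lemonade Bunny Fruit Snacks": 0.020451,
    "Peach-Pear Sparkling Water": 0.016925,
}

# support = (orders containing the item) / (total orders), so multiply back and round.
order_counts = {item: round(s * n_orders) for item, s in supports.items()}

for item, count in order_counts.items():
    print(f"{item}: about {count} of {n_orders} orders")
```

All three items clear the `min_support = 0.01` cut-off used above, which corresponds to at least 15 orders.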
