# 🛒 Market Basket Analysis with Association Rules

In this notebook, we'll apply frequent itemset mining (Apriori here; FP-Growth is an alternative) to the grouped transaction data and generate association rules to uncover product relationships.


## 📦 Load Preprocessed Transactions

We'll start by loading the grouped transactional data generated in the previous notebook.


In [9]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

# Load the grouped transactions
df = pd.read_csv("../data/market_basket_grouped.csv")

# Preview structure
df.head()


|   | TransactionID | Item |
|---|---|---|
| 0 | T0001 | ['Bread'] |
| 1 | T0002 | ['Eggs', 'Tomatoes', 'Butter'] |
| 2 | T0003 | ['Beef'] |
| 3 | T0004 | ['Apples', 'Bread', 'Beef', 'Chicken', 'Milk'] |
| 4 | T0005 | ['Tomatoes', 'Bread', 'Eggs', 'Bananas', 'Appl... |


## 🧹 Prepare Data for Itemset Mining

Each `Item` value is read from the CSV as a string, so we first parse it back into a Python list, then convert the lists into a one-hot encoded format using `TransactionEncoder`.


In [10]:
from ast import literal_eval

# Convert stringified lists into actual lists
df['Item'] = df['Item'].apply(literal_eval)

# One-hot encode the transactions
te = TransactionEncoder()
te_ary = te.fit_transform(df['Item'])
df_encoded = pd.DataFrame(te_ary, columns=te.columns_)

df_encoded.head()


|   | Apples | Bananas | Beef | Bread | Butter | Cheese | Chicken | Eggs | Milk | Tomatoes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | True | False | False | False | False | False | False |
| 1 | False | False | False | False | True | False | False | True | False | True |
| 2 | False | False | True | False | False | False | False | False | False | False |
| 3 | True | False | True | True | False | False | True | False | True | False |
| 4 | True | True | False | True | False | False | False | True | False | True |
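
To show what the encoding step produces, here is a minimal sketch that builds the same kind of boolean table with plain pandas instead of `TransactionEncoder`. The `transactions` list holds only the first three baskets from the preview, and the variable names are illustrative assumptions.

```python
import pandas as pd

# The first three transactions from the preview, as plain Python lists.
transactions = [
    ["Bread"],
    ["Eggs", "Tomatoes", "Butter"],
    ["Beef"],
]

# Collect every distinct item, sorted so the column order is deterministic.
items = sorted({item for basket in transactions for item in basket})

# One row per transaction, one boolean column per item.
encoded = pd.DataFrame(
    [[item in basket for item in items] for basket in transactions],
    columns=items,
)
print(encoded)
```

Each row is `True` only in the columns for items that the basket contains, which is the layout `apriori` expects as input.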


## 📊 Generate Frequent Itemsets

Using the Apriori algorithm, we'll identify itemsets that appear in at least 20% of transactions (`min_support=0.2`).


In [11]:
from mlxtend.frequent_patterns import apriori

frequent_itemsets = apriori(df_encoded, min_support=0.2, use_colnames=True)
frequent_itemsets.sort_values(by="support", ascending=False).head()


|   | support | itemsets |
|---|---|---|
| 9 | 0.344 | (Tomatoes) |
| 2 | 0.338 | (Beef) |
| 8 | 0.32 | (Milk) |
| 1 | 0.32 | (Bananas) |
| 6 | 0.318 | (Chicken) |
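
To make the support threshold concrete, here is a minimal sketch that computes support by hand on the five transactions visible in the encoded preview earlier in this notebook. The `support` helper and the `baskets` list are illustrative names, not part of mlxtend, and five rows are far too few to reproduce the values in the table above.

```python
# The five baskets from the encoded preview (not the full dataset).
baskets = [
    ["Bread"],
    ["Butter", "Eggs", "Tomatoes"],
    ["Beef"],
    ["Apples", "Beef", "Bread", "Chicken", "Milk"],
    ["Apples", "Bananas", "Bread", "Eggs", "Tomatoes"],
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(itemset <= set(basket) for basket in baskets)
    return hits / len(baskets)


print(support({"Bread"}, baskets))            # 3 of 5 baskets -> 0.6
print(support({"Apples", "Bread"}, baskets))  # 2 of 5 baskets -> 0.4

# An itemset is "frequent" when its support reaches the chosen minimum.
min_support = 0.2
print(support({"Beef", "Chicken"}, baskets) >= min_support)  # 1 of 5 -> True
```

For single items, the same numbers come from the column means of the one-hot table (`df_encoded.mean()`), since the mean of a boolean column is the share of `True` values.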


## 🔗 Generate Association Rules

We'll now derive rules from the frequent itemsets, keeping those with a lift of at least 1.0 and sorting them by lift in descending order.


In [12]:
from mlxtend.frequent_patterns import association_rules

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)
rules = rules.sort_values(by="lift", ascending=False)

rules.head()


|   | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | representativity | leverage | conviction | zhangs_metric | jaccard | certainty | kulczynski |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
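
The output above shows only a header row, which suggests that no rules were generated. A rule needs an itemset with at least two items, so one plausible cause (not confirmed by the output shown) is that no multi-item itemset reached the 0.2 support threshold. The sketch below checks this on a hypothetical stand-in for `frequent_itemsets` that holds only the five rows displayed earlier; in the notebook, the same two checks could be run on the real DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's `frequent_itemsets`, holding only
# the five rows visible in the Apriori output.
frequent_itemsets = pd.DataFrame(
    {
        "support": [0.344, 0.338, 0.320, 0.320, 0.318],
        "itemsets": [
            frozenset({"Tomatoes"}),
            frozenset({"Beef"}),
            frozenset({"Milk"}),
            frozenset({"Bananas"}),
            frozenset({"Chicken"}),
        ],
    }
)

# Count itemsets by size; rules can only come from itemsets of size >= 2.
sizes = frequent_itemsets["itemsets"].apply(len)
print(sizes.value_counts().sort_index())

# Rough sanity check: if the two most frequent items were independent,
# their joint support would be the product of their individual supports.
top_two = frequent_itemsets["support"].nlargest(2)
independent_pair_support = top_two.prod()
print(round(independent_pair_support, 3))  # 0.344 * 0.338 -> 0.116
```

If the largest itemset size is 1, re-running `apriori` with a lower `min_support` is the usual remedy. The right value depends on the data, so it is worth trying a few and inspecting how many multi-item itemsets appear.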


## 💾 Save Rules to CSV

Export the generated association rules to a CSV file for further use or dashboarding.


In [13]:
rules.to_csv("../data/market_basket_rules.csv", index=False)
print("Rules exported successfully.")

Rules exported successfully.
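
One caveat when exporting: `association_rules` stores antecedents and consequents as `frozenset` objects, and pandas writes them to CSV as their text form (for example `frozenset({'Eggs'})`), which is awkward to read in a dashboard. A minimal sketch of one way to flatten them first, using a hypothetical one-row `rules` table with made-up values and an in-memory buffer in place of the real file:

```python
import io

import pandas as pd

# Hypothetical one-row rules table with the same column types that
# `association_rules` returns (frozensets on both sides of the rule).
rules = pd.DataFrame(
    {
        "antecedents": [frozenset({"Eggs"})],
        "consequents": [frozenset({"Tomatoes"})],
        "support": [0.4],
        "confidence": [1.0],
        "lift": [2.5],
    }
)

# Work on a copy so the original frozensets stay available for filtering.
export = rules.copy()
for col in ("antecedents", "consequents"):
    export[col] = export[col].apply(lambda items: ", ".join(sorted(items)))

# Write to an in-memory buffer here; the notebook would pass a file path.
buffer = io.StringIO()
export.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Sorting the items before joining keeps the exported text stable across runs, since set iteration order is not guaranteed.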


## ⭐ Optional: Filter Top Rules

We can optionally narrow the rules to those with lift above 1.2 and confidence above 0.6.


In [14]:
top_rules = rules[(rules['lift'] > 1.2) & (rules['confidence'] > 0.6)]
top_rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]


|   | antecedents | consequents | support | confidence | lift |
|---|---|---|---|---|---|
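
To make the two thresholds concrete, here is a minimal sketch that computes confidence and lift by hand for the five baskets visible in the encoded preview earlier in this notebook. The helper functions are illustrative and not part of mlxtend, and the resulting numbers describe only those five rows, not the full dataset.

```python
# The five baskets from the encoded preview (not the full dataset).
baskets = [
    ["Bread"],
    ["Butter", "Eggs", "Tomatoes"],
    ["Beef"],
    ["Apples", "Beef", "Bread", "Chicken", "Milk"],
    ["Apples", "Bananas", "Bread", "Eggs", "Tomatoes"],
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(basket) for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to how often the consequent occurs overall."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


# Eggs -> Tomatoes: both items appear together in 2 of 5 baskets.
print(round(confidence({"Eggs"}, {"Tomatoes"}, baskets), 3))  # 1.0
print(round(lift({"Eggs"}, {"Tomatoes"}, baskets), 3))        # 2.5

# Bread -> Apples: Bread is in 3 baskets, 2 of which also hold Apples.
print(round(confidence({"Bread"}, {"Apples"}, baskets), 3))   # 0.667
print(round(lift({"Bread"}, {"Apples"}, baskets), 3))         # 1.667
```

A lift above 1 means the two sides of the rule occur together more often than they would if they were independent, so the filter above keeps rules that are both reliable (confidence above 0.6) and noticeably better than chance (lift above 1.2).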
