## What’s Inside the Dataset:

Each row is a customer’s purchase (like a shopping basket).

Each column holds one item from that purchase; baskets with fewer items leave the remaining columns empty.

We have 7501 transactions and up to 20 items per transaction.

Some transactions are short (like one item), some are massive (like a feast).
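
The counts quoted above could be checked with a summary along these lines. This is a minimal sketch: `basket_size_summary` is a hypothetical helper, and `sample_transactions` is a made-up miniature sample standing in for the real file, so the printed numbers are not those of the actual dataset.

```python
from collections import Counter

def basket_size_summary(transactions):
    """Return (number of baskets, smallest size, largest size, size histogram)."""
    sizes = [len(basket) for basket in transactions]
    return len(sizes), min(sizes), max(sizes), Counter(sizes)

# Hypothetical miniature sample, NOT the real data
sample_transactions = [
    ["mineral water"],
    ["spaghetti", "ground beef", "olive oil"],
    ["eggs", "burgers"],
    ["milk"],
]

n_baskets, smallest, largest, histogram = basket_size_summary(sample_transactions)
print(n_baskets, smallest, largest)   # number of baskets, min and max basket size
print(sorted(histogram.items()))      # (basket size, how many baskets have that size)
```

On the real data, the same function would take the list of baskets built from the CSV, and should report 7501 baskets with at most 20 items each if the description above holds.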

In [2]:
!pip install mlxtend

from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder
import pandas as pd

# Toy dataset
toy_dataset = [['Skirt', 'Sneakers', 'Scarf', 'Pants', 'Hat'],
               ['Sunglasses', 'Skirt', 'Sneakers', 'Pants', 'Hat'],
               ['Dress', 'Sandals', 'Scarf', 'Pants', 'Heels'],
               ['Dress', 'Necklace', 'Earrings', 'Scarf', 'Hat', 'Heels'],
               ['Earrings', 'Skirt', 'Scarf', 'Shirt', 'Pants']]

# Step 1: Transform to one-hot encoded dataframe

te = TransactionEncoder()
te_ary = te.fit(toy_dataset).transform(toy_dataset)
df_toy = pd.DataFrame(te_ary, columns=te.columns_)

# Step 2: Run Apriori
frequent_items = apriori(df_toy, min_support=0.3, use_colnames=True)

# Step 3: Generate Association Rules
rules = association_rules(frequent_items, metric="lift", min_threshold=1.0)
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])


       antecedents               consequents  support  confidence      lift
0          (Heels)                   (Dress)      0.4    1.000000  2.500000
1          (Dress)                   (Heels)      0.4    1.000000  2.500000
2          (Scarf)                   (Dress)      0.4    0.500000  1.250000
3          (Dress)                   (Scarf)      0.4    1.000000  1.250000
4          (Scarf)                (Earrings)      0.4    0.500000  1.250000
..             ...                       ...      ...         ...       ...
61  (Skirt, Pants)           (Hat, Sneakers)      0.4    0.666667  1.666667
62           (Hat)  (Sneakers, Skirt, Pants)      0.4    0.666667  1.666667
63      (Sneakers)       (Hat, Skirt, Pants)      0.4    1.000000  2.500000
64         (Skirt)    (Hat, Sneakers, Pants)      0.4    0.666667  1.666667
65         (Pants)    (Hat, Sneakers, Skirt)      0.4    0.500000  1.250000

[66 rows x 5 columns]


## What the Above Gives Us

Which items are often bought together.

Which pairings have high lift, meaning the items appear together more often than their individual popularity would predict.

Ideas for bundles, shelf placement, or promotions.

In [3]:
df = pd.read_csv("Market_Basket_Optimisation.csv", header=None)

# Convert to a list of baskets, dropping the empty (NaN) padding cells
transactions = [row.dropna().tolist() for _, row in df.iterrows()]

# Encode transactions
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df_encoded = pd.DataFrame(te_ary, columns=te.columns_)

# Run Apriori
frequent_itemsets = apriori(df_encoded, min_support=0.02, use_colnames=True)

# Generate association rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

# Show the rules with the highest lift
rules_sorted = (
    rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]
    .sort_values(by='lift', ascending=False)
)

print(rules_sorted.head(10))


            antecedents          consequents   support  confidence      lift
64          (spaghetti)        (ground beef)  0.039195    0.225115  2.291162
65        (ground beef)          (spaghetti)  0.039195    0.398915  2.291162
87          (olive oil)          (spaghetti)  0.022930    0.348178  1.999758
86          (spaghetti)          (olive oil)  0.022930    0.131700  1.999758
79      (mineral water)               (soup)  0.023064    0.096756  1.914955
78               (soup)      (mineral water)  0.023064    0.456464  1.914955
53               (milk)  (frozen vegetables)  0.023597    0.182099  1.910382
52  (frozen vegetables)               (milk)  0.023597    0.247552  1.910382
0             (burgers)               (eggs)  0.028796    0.330275  1.837830
1                (eggs)            (burgers)  0.028796    0.160237  1.837830


## How to Interpret These Rules

### Each rule tells a story:

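| Antecedents | Consequents | Lift | Meaning |
|---|---|---|---|
| {mineral water} | {chocolate} | 1.4 | Shoppers who buy mineral water are about 1.4x as likely to buy chocolate as the average shopper |
| {spaghetti} | {ground beef} | 2.29 | Spaghetti buyers are about 2.3x as likely to buy ground beef as the average shopper (the top rule in the output above) |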
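
To make the three metrics concrete, they can be recomputed by hand on the toy baskets from the earlier cell. The sketch below uses plain Python rather than mlxtend; `support` and `rule_metrics` are hypothetical helper names introduced here for illustration. Support is the share of baskets containing an itemset, confidence is support(A and B) / support(A), and lift is confidence / support(B).

```python
# Toy baskets from the earlier cell, each item listed once per basket
toy_dataset = [
    ['Skirt', 'Sneakers', 'Scarf', 'Pants', 'Hat'],
    ['Sunglasses', 'Skirt', 'Sneakers', 'Pants', 'Hat'],
    ['Dress', 'Sandals', 'Scarf', 'Pants', 'Heels'],
    ['Dress', 'Necklace', 'Earrings', 'Scarf', 'Hat', 'Heels'],
    ['Earrings', 'Skirt', 'Scarf', 'Shirt', 'Pants'],
]

def support(itemset, transactions):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(basket) for basket in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), transactions)
    s_ante = support(antecedent, transactions)
    s_cons = support(consequent, transactions)
    confidence = s_both / s_ante   # how often the consequent appears given the antecedent
    lift = confidence / s_cons     # confidence relative to the consequent's baseline share
    return s_both, confidence, lift

heels_dress = rule_metrics({'Heels'}, {'Dress'}, toy_dataset)
scarf_dress = rule_metrics({'Scarf'}, {'Dress'}, toy_dataset)
print(heels_dress)
print(scarf_dress)
```

The two results should agree with rows 0 and 2 of the toy output: Heels and Dress always appear together (confidence 1.0, lift 2.5), while Scarf appears in four of the five baskets, so its rule toward Dress is weaker (confidence 0.5, lift 1.25).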


## Business Plan for the Supermarket

### Based on the top rules:

Create product bundles: "Water + Chocolate" and "Spaghetti + Ground Beef" offers.

Use cross-promotions: Place chocolate near the water aisle and ground beef next to the pasta.

Targeted ads: Recommend paired products on receipts or in apps ("Others also bought...").

Shelf layout optimization: Encourage impulse buys by placing high-lift partner items next to each other or near checkout.
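
Because lift is symmetric, every pair appears twice in the rule table (A -> B and B -> A). One possible way to turn the top-10 output into a short list of bundle candidates is to collapse each pair and keep the direction with the higher confidence, which is the more useful direction for an "Others also bought..." recommendation. The sketch below is an illustration in plain Python; `bundle_candidates` is a hypothetical helper, and the rows are copied from the printed output above.

```python
# Rows copied from the top-10 output above:
# (antecedent, consequent, support, confidence, lift)
top_rules = [
    ("spaghetti", "ground beef", 0.039195, 0.225115, 2.291162),
    ("ground beef", "spaghetti", 0.039195, 0.398915, 2.291162),
    ("olive oil", "spaghetti", 0.022930, 0.348178, 1.999758),
    ("spaghetti", "olive oil", 0.022930, 0.131700, 1.999758),
    ("mineral water", "soup", 0.023064, 0.096756, 1.914955),
    ("soup", "mineral water", 0.023064, 0.456464, 1.914955),
    ("milk", "frozen vegetables", 0.023597, 0.182099, 1.910382),
    ("frozen vegetables", "milk", 0.023597, 0.247552, 1.910382),
    ("burgers", "eggs", 0.028796, 0.330275, 1.837830),
    ("eggs", "burgers", 0.028796, 0.160237, 1.837830),
]

def bundle_candidates(rules):
    """Keep one rule per item pair (the higher-confidence direction), ranked by lift."""
    best = {}
    for rule in rules:
        antecedent, consequent, _support, confidence, _lift = rule
        pair = frozenset((antecedent, consequent))
        # Replace the stored rule only if this direction is more confident
        if pair not in best or confidence > best[pair][3]:
            best[pair] = rule
    return sorted(best.values(), key=lambda r: r[4], reverse=True)

candidates = bundle_candidates(top_rules)
for antecedent, consequent, _support, confidence, lift in candidates:
    print(f"Offer {consequent} to buyers of {antecedent}: "
          f"confidence {confidence:.2f}, lift {lift:.2f}")
```

This leaves five distinct pairs. Whether a pair is worth a promotion still depends on margins and stock, which the rules themselves do not capture.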