In [None]:
## Title: Market Basket Analysis
## Objective: Identify products that are frequently purchased together at Daily Grind Coffee.
## These associations can inform product bundling, cross-selling strategies, and targeted promotions.

In [6]:
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# 1) LOAD AND PREPARE DATA
# Load dataset
df = pd.read_csv("coffee_shop_sales_simulated.csv")

# Create a transaction-level dataset: one row per transaction, one column per product
basket = (
    df.groupby(['transaction_id', 'product_name'])['quantity']
    .sum()
    .unstack(fill_value=0)
)
# Convert quantities to booleans (True if the product appears in the transaction)
basket = basket > 0


# 2) APPLY APRIORI
# Find frequent itemsets (present in at least 5% of transactions)
frequent_itemsets = apriori(basket, min_support=0.05, use_colnames=True)

# Generate association rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

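# Every rule from association_rules already has a non-empty antecedent and a
# non-empty consequent, so each rule spans at least 2 items. This filter is a
# safeguard only and does not remove any rows.
rules = rules[rules['antecedents'].apply(lambda x: len(x) > 0) &
              rules['consequents'].apply(lambda x: len(x) > 0)]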

# Print the first 10 frequent itemsets
print("Frequent Itemsets:")
print(frequent_itemsets.head(10))
# Sort rules by confidence and print the top 10
print("Top Association Rules:")
print(rules.sort_values("confidence", ascending=False).head(10))

Frequent Itemsets:
    support                        itemsets
0  0.498399              (Blueberry Muffin)
1  0.188275                    (Cappuccino)
2  0.200080                    (Chai Latte)
3  0.501601                     (Croissant)
4  0.204882                      (Espresso)
5  0.205082                     (Green Tea)
6  0.201681                         (Latte)
7  0.095438  (Blueberry Muffin, Cappuccino)
8  0.102241  (Blueberry Muffin, Chai Latte)
9  0.103041    (Blueberry Muffin, Espresso)
Top Association Rules:
          antecedents         consequents  antecedent support  \
8             (Latte)         (Croissant)            0.201681   
3        (Chai Latte)  (Blueberry Muffin)            0.200080   
6         (Green Tea)         (Croissant)            0.205082   
1        (Cappuccino)  (Blueberry Muffin)            0.188275   
5          (Espresso)  (Blueberry Muffin)            0.204882   
9         (Croissant)             (Latte)            0.501601   
7         (Croissan

In [None]:
## Reflection: This analysis showed that bakery items (Croissants and Blueberry Muffins) frequently co-occur with drinks such as
## Cappuccinos, Lattes, and Espressos, but the associations are weak: lift is close to 1.0, meaning these pairs appear together about as often
## as independent purchases would predict. The exercise also highlighted the importance of preparing data for the right analytical context.
## Because the original dataset lacked multi-item baskets, I engineered a simulated version to demonstrate Market Basket Analysis.
## The project strengthened my skills in data preparation, association rule mining, and business interpretation, showing how even modest
## product pairings can inform bundling and promotion strategies in retail.
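
## Worked check (a minimal sketch, not part of the original run): the lift of a rule can be recomputed by hand from the supports
## printed in the "Frequent Itemsets" output above. The three support values below are copied from that output; the helper names
## `confidence` and `lift` are illustrative and are not mlxtend functions. For Cappuccino -> Blueberry Muffin this gives a
## confidence of about 0.51 and a lift of about 1.02, consistent with the weak associations noted in the reflection.

```python
# Supports copied from the "Frequent Itemsets" output above.
support_muffin = 0.498399       # (Blueberry Muffin)
support_cappuccino = 0.188275   # (Cappuccino)
support_both = 0.095438         # (Blueberry Muffin, Cappuccino)


def confidence(support_ab: float, support_a: float) -> float:
    """Share of transactions containing A that also contain B."""
    return support_ab / support_a


def lift(support_ab: float, support_a: float, support_b: float) -> float:
    """Observed co-occurrence divided by the co-occurrence expected
    if A and B were purchased independently (1.0 means no association)."""
    return support_ab / (support_a * support_b)


# Rule: Cappuccino -> Blueberry Muffin
rule_confidence = confidence(support_both, support_cappuccino)
rule_lift = lift(support_both, support_cappuccino, support_muffin)

print(f"confidence = {rule_confidence:.3f}")
print(f"lift       = {rule_lift:.3f}")
```

## Because lift is symmetric, the reverse rule (Blueberry Muffin -> Cappuccino) has the same lift; only its confidence differs.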