# Association Rule Mining using FP-Growth

This notebook demonstrates how to extract frequent itemsets and association rules from transaction data using the FP-Growth algorithm, inspired by the Kaggle notebook by Mohammed Derouiche.

---

**Dataset**: Cleaned version of the [UCI Online Retail Dataset](https://archive.ics.uci.edu/ml/datasets/online+retail)  
**Goal**: Generate association rules in the form `antecedents → consequents` to be used in a recommendation system API.


In [1]:
# Import necessary libraries
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth, association_rules

In [2]:
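# Load the transactions: one row per basket, items stored as a comma-separated string
df = pd.read_csv("../data/transactions_fpgrowth.csv")
# Split each basket string into a list of item names
transactions = df["items"].str.split(",")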

In [3]:
# One-hot encode the baskets: one boolean column per item, one row per transaction
te = TransactionEncoder()
te_data = te.fit_transform(transactions)
df_trans = pd.DataFrame(te_data, columns=te.columns_)
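
To make the encoding step concrete, here is a minimal pure-Python sketch of the kind of boolean matrix that one-hot encoding produces. The item names are hypothetical and not taken from the dataset, and the sketch only approximates what `TransactionEncoder` does (it ignores sparse output and other options).

```python
# Toy baskets with hypothetical item names (not from the retail dataset)
toy_transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter"],
]

# Column order: the sorted set of all items seen in any basket
columns = sorted({item for basket in toy_transactions for item in basket})

# One boolean row per basket: True where the basket contains the column's item
encoded = [[item in basket for item in columns] for basket in toy_transactions]

print(columns)
for row in encoded:
    print(row)
```

Each row can then be read as "which items were in this basket", which is the input shape that frequent-itemset mining works on.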

In [4]:
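# Keep itemsets that appear in at least 1% of all transactions
frequent_itemsets = fpgrowth(df_trans, min_support=0.01, use_colnames=True)
# Derive rules and keep those with lift >= 1
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)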
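
The support, confidence, and lift values behind these thresholds can be worked out by hand on a small example. The sketch below uses hypothetical baskets and a hypothetical helper named `support`; it illustrates the standard definitions rather than how mlxtend computes them internally.

```python
# Toy baskets with hypothetical item names (not from the retail dataset)
toy_transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)

antecedent, consequent = {"bread"}, {"milk"}

supp_a = support(antecedent, toy_transactions)                # 3 of 4 baskets
supp_c = support(consequent, toy_transactions)                # 3 of 4 baskets
supp_ac = support(antecedent | consequent, toy_transactions)  # 2 of 4 baskets

# confidence(A -> C) = support(A and C) / support(A)
confidence = supp_ac / supp_a
# lift(A -> C) = confidence(A -> C) / support(C)
lift = confidence / supp_c

print(f"confidence={confidence:.3f}, lift={lift:.3f}")
```

In this toy data the rule `bread → milk` has a lift below 1, so a filter on lift with a minimum threshold of 1 would discard it.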

In [5]:
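# Keep only the columns needed downstream and use singular column names
rules_tidy = rules[["antecedents", "consequents", "confidence"]].rename(
    columns={"antecedents": "antecedent", "consequents": "consequent"}
)
# Each side of a rule is a frozenset that may hold several items; join the sorted
# items so multi-item rules stay intact instead of collapsing to one arbitrary item
rules_tidy["antecedent"] = rules_tidy["antecedent"].apply(lambda s: ",".join(sorted(s)))
rules_tidy["consequent"] = rules_tidy["consequent"].apply(lambda s: ",".join(sorted(s)))
# Strongest rules first
rules_tidy = rules_tidy.sort_values("confidence", ascending=False)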

In [6]:
rules_tidy.to_csv("../data/rules.csv", index=False)

In [7]:
print(f"Saved {len(rules_tidy)} rules to ../data/rules.csv")

Saved 952 rules to ../data/rules.csv
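
As a rough illustration of how a recommendation API might consume the exported file, the sketch below indexes rules by antecedent and returns the highest-confidence consequents. The rows, the function names `load_rules` and `recommend`, and the `top_n` parameter are all hypothetical; only the column names match the CSV written above. An in-memory string stands in for the file so the example runs on its own.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using the same column layout as rules.csv
sample_csv = io.StringIO(
    "antecedent,consequent,confidence\n"
    "tea cup,saucer,0.82\n"
    "tea cup,tea pot,0.64\n"
    "lunch bag,lunch box,0.71\n"
)

def load_rules(handle):
    """Index rules by antecedent, strongest consequent first."""
    rules_by_antecedent = defaultdict(list)
    for row in csv.DictReader(handle):
        rules_by_antecedent[row["antecedent"]].append(
            (row["consequent"], float(row["confidence"]))
        )
    for consequents in rules_by_antecedent.values():
        consequents.sort(key=lambda pair: pair[1], reverse=True)
    return rules_by_antecedent

def recommend(item, rules_by_antecedent, top_n=3):
    """Return up to `top_n` consequents for `item`, or an empty list if none."""
    return [name for name, _ in rules_by_antecedent.get(item, [])[:top_n]]

rules_index = load_rules(sample_csv)
print(recommend("tea cup", rules_index))
```

Building the index once at startup keeps each lookup to a dictionary access, which is one reasonable design for a small rule set like this one.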
