
## **Association Rule Mining (Unsupervised Algorithm)**

Association rule mining is an unsupervised learning method used to **find patterns or relationships between items** in large datasets.

Think of it like:
“If a customer buys **A**, they also usually buy **B**.”

It doesn’t need labels. It only looks at **co-occurrence patterns**.

---

## **Where is it used?**

* Market basket analysis (Amazon: “Frequently bought together”)
* Website click patterns
* Medical symptoms that co-occur
* Fraud detection (certain patterns appear together)

---

## **Key Terms:**

### **1. Itemset**

A group of items.
Example: `{Milk, Bread, Butter}`

### **2. Support**

The fraction of all transactions that contain the itemset.
Think of it as *popularity*.

Example:
If 30 out of 100 transactions contain milk,
**Support(milk) = 30/100 = 0.30**
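To make the definition concrete, here is a minimal sketch that computes support by direct counting. The `transactions` list and the `support` helper are made up for illustration; they are not part of any library.

```python
# Toy transactions, invented for illustration (not the dataset used later).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]


def support(itemset, transactions):
    """Return the fraction of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    # A transaction counts as a hit only if it contains the whole itemset.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"milk"}, transactions))           # milk is in 4 of 5 transactions
print(support({"milk", "bread"}, transactions))  # both are in 3 of 5 transactions
```

Support applies to itemsets of any size, so the same helper works for a single item and for a pair.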

### **3. Confidence**

Of the transactions that contain **A**, the fraction that also contain **B**.

Example:
Confidence(Milk → Bread) =
Number of transactions containing both milk and bread / Number of transactions containing milk

It estimates how likely B is to be bought given that A is bought.
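The ratio above can be sketched in a few lines of plain Python. The toy data and the helper names `count` and `confidence` are assumptions for illustration, and the sketch assumes the antecedent appears in at least one transaction.

```python
# Toy transactions, invented for illustration (not the dataset used later).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


# Milk appears in 4 transactions, and 3 of those also contain bread.
print(confidence({"milk"}, {"bread"}, transactions))
# Confidence is not symmetric: the reverse rule has its own denominator.
print(confidence({"bread"}, {"milk"}, transactions))
```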

### **4. Lift**

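Compares how often A and B occur together with how often they would occur together if they were independent:

Lift(A → B) = Confidence(A → B) / Support(B)

* Lift > 1 → A and B occur together more often than expected (positive association)
* Lift = 1 → A and B are independent
* Lift < 1 → A and B occur together less often than expected (negative association)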
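As a rough illustration, lift can be computed from raw counts, since confidence divided by the consequent's support simplifies to `n * count(A and B) / (count(A) * count(B))`. The toy data and the `lift` helper below are made up for this sketch.

```python
# Toy transactions, invented for illustration (not the dataset used later).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def lift(antecedent, consequent, transactions):
    """Lift of antecedent -> consequent, i.e. confidence / support(consequent).

    Assumes both sides occur in at least one transaction.
    """
    n = len(transactions)
    both = count(set(antecedent) | set(consequent), transactions)
    return (both * n) / (count(antecedent, transactions) * count(consequent, transactions))


# Every butter basket also has bread, which is more than bread's base rate.
print(lift({"butter"}, {"bread"}, transactions))  # above 1
# Milk buyers pick up bread slightly less often than shoppers overall.
print(lift({"milk"}, {"bread"}, transactions))    # below 1
```

Unlike confidence, lift is symmetric: swapping the antecedent and consequent gives the same value.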

---

## **How rules look**

A common rule format is:

```
A → B
```

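Meaning:
Transactions that contain A tend to also contain B. A is called the *antecedent* and B the *consequent*. The rule describes co-occurrence, not cause and effect.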

Example:

```
{Diapers} → {Beer}
```

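This is the classic textbook example of a surprising rule. It is usually told as a real retail discovery, although accounts of where and how it was found vary.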

---

## **Famous algorithms used**

* **Apriori**
* **FP-Growth**

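Both find the frequent itemsets (those whose support meets a minimum threshold); the rules are then generated from those itemsets in a separate step. Apriori builds candidate itemsets level by level and counts each one, while FP-Growth compresses the transactions into a tree (the FP-tree) and avoids explicit candidate generation, which usually makes it faster on large datasets.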
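To show the level-by-level idea behind Apriori, here is a small pure-Python sketch. It is a teaching aid under simplifying assumptions (tiny in-memory data, no optimizations), not the implementation used by any library; the toy data and the `frequent_itemsets` function are made up for illustration.

```python
from itertools import combinations

# Toy transactions, invented for illustration (not the dataset used later).
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]


def frequent_itemsets(transactions, min_support):
    """Apriori-style search: return {itemset: support} for all frequent itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: single items that meet the support threshold.
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]

    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: a candidate can only be frequent if all of its
        # (k-1)-subsets are frequent (the Apriori property).
        previous = set(current)
        candidates = [
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        ]
        # Count step: keep the candidates that meet the threshold.
        current = [c for c in candidates if support(c) >= min_support]
    return result


found = frequent_itemsets(transactions, min_support=0.4)
for itemset, sup in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search manageable: here no three-item candidate survives it, so no three-item set is ever counted against the data.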



```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, fpgrowth, association_rules

# -----------------------------------------
# 1. Load the dataset
# -----------------------------------------
df = pd.read_csv("Market_Basket_Optimisation.csv", header=None)

# Each row = one transaction (items stored across columns)
transactions = df.apply(lambda row: row.dropna().tolist(), axis=1)

print("Transactions:")
print(transactions.head())

# -----------------------------------------
# 2. One-hot encode the data
# -----------------------------------------
te = TransactionEncoder()
te_data = te.fit(transactions).transform(transactions)
df_encoded = pd.DataFrame(te_data, columns=te.columns_)

# -----------------------------------------
# 3. APRIORI
# -----------------------------------------
print("\n======================================== APRIORI ================================================================")

# This dataset is sparse → use low min_support
frequent_apriori = apriori(df_encoded, min_support=0.01, use_colnames=True)

print("Frequent itemsets:\n", frequent_apriori.head())

rules_apriori = association_rules(frequent_apriori, metric="lift", min_threshold=1)

print("\nTop Association Rules (Apriori):")
print(rules_apriori.sort_values("lift", ascending=False).head())

# -----------------------------------------
# 4. FP-GROWTH
# -----------------------------------------
print("\n==================================================== FP-GROWTH ================================================================")

frequent_fpgrowth = fpgrowth(df_encoded, min_support=0.01, use_colnames=True)

print("Frequent itemsets:\n", frequent_fpgrowth.head())

rules_fpgrowth = association_rules(frequent_fpgrowth, metric="lift", min_threshold=1)

print("\nTop Association Rules (FP-Growth):")
print(rules_fpgrowth.sort_values("lift", ascending=False).head())
```


Transactions:
0    [shrimp, almonds, avocado, vegetables mix, gre...
1                           [burgers, meatballs, eggs]
2                                            [chutney]
3                                    [turkey, avocado]
4    [mineral water, milk, energy bar, whole wheat ...
dtype: object

Frequent itemsets:
     support          itemsets
0  0.020397         (almonds)
1  0.033329         (avocado)
2  0.010799  (barbecue sauce)
3  0.014265       (black tea)
4  0.011465      (body spray)

Top Association Rules (Apriori):
                    antecedents                 consequents  \
215             (herb & pepper)               (ground beef)   
214               (ground beef)             (herb & pepper)   
385               (ground beef)  (mineral water, spaghetti)   
384  (mineral water, spaghetti)               (ground beef)   
394  (mineral water, spaghetti)                 (olive oil)   

     antecedent support  consequent support   support  confidence      lift  \
215 