<a href="https://colab.research.google.com/github/silwalprabin/data-mining-and-machine-learning/blob/main/W10_Apriori_FP_Growth_Google_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Association Rule Mining using Apriori & FP-Growth

This Google Colab notebook demonstrates the **Apriori** and **FP-Growth** algorithms on a small sample transactional dataset.

Topics covered:
- Association Rule Mining concepts
- Apriori algorithm
- FP-Growth algorithm
- Association rule generation


## 1. Install Required Libraries

In [1]:
!pip install mlxtend pandas



## 2. Sample Transactional Dataset

In [2]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam']
]

# One-hot encode the transactions: one boolean column per item, one row per transaction
te = TransactionEncoder()
te_array = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_array, columns=te.columns_)
df

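|   | Bread | Butter | Jam | Milk |
|---|-------|--------|-----|------|
| 0 | True | True | False | True |
| 1 | True | True | False | False |
| 2 | True | False | False | True |
| 3 | False | True | False | True |
| 4 | True | True | True | False |
| 5 | True | True | True | True |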
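
To make the encoding step concrete, here is a minimal plain-Python sketch of the same one-hot idea. It is not how `TransactionEncoder` is implemented; the names `columns`, `encoded`, and `item_support` are illustrative only.

```python
# Plain-Python sketch of one-hot encoding a transaction list (illustrative only).
transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]

# Sorted list of distinct items; these become the column names.
columns = sorted({item for t in transactions for item in t})

# One row per transaction, one boolean per column.
encoded = [[item in t for item in columns] for t in transactions]

# Support of a single item is the fraction of rows where its column is True.
item_support = {
    col: sum(row[i] for row in encoded) / len(encoded)
    for i, col in enumerate(columns)
}

print(columns)
for row in encoded:
    print(row)
print(item_support)
```

Each printed row should line up with the corresponding row of the table above, and the per-item supports are simply column means of the boolean matrix.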


## 3. Apriori Algorithm
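**Steps:**
1. Generate candidate itemsets level by level (single items, then pairs, then triples, and so on)
2. Keep only the candidates that meet the minimum support threshold; these are the frequent itemsets
3. Generate association rules from the frequent itemsets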
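
The first two steps can be sketched in plain Python as follows. This is a teaching sketch under simple assumptions (small data, itemsets as `frozenset`s), not the mlxtend implementation; `apriori_sketch` is a hypothetical helper name.

```python
from itertools import combinations

transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]


def apriori_sketch(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= row for row in rows) / n

    # Level 1 candidates are the individual items.
    candidates = {frozenset([item]) for row in rows for item in row}
    frequent = {}
    k = 1
    while candidates:
        # Step 2: keep only candidates that meet the support threshold.
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Step 1 for the next level: join frequent k-itemsets into
        # (k+1)-item candidates, then prune any candidate that has an
        # infrequent k-item subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


frequent = apriori_sketch(transactions, min_support=0.3)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 6))
```

The pruning step is what distinguishes Apriori from plain enumeration: a candidate such as {Bread, Jam, Milk} is never counted because its subset {Jam, Milk} is already known to be infrequent.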


In [3]:
from mlxtend.frequent_patterns import apriori, association_rules

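# Keep itemsets that occur in at least 30% of transactions; report item names instead of column indices
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)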
frequent_itemsets

|   | support | itemsets |
|---|---------|----------|
| 0 | 0.833333 | (Bread) |
| 1 | 0.833333 | (Butter) |
| 2 | 0.333333 | (Jam) |
| 3 | 0.666667 | (Milk) |
| 4 | 0.666667 | (Bread, Butter) |
| 5 | 0.333333 | (Bread, Jam) |
| 6 | 0.5 | (Bread, Milk) |
| 7 | 0.333333 | (Butter, Jam) |
| 8 | 0.5 | (Milk, Butter) |
| 9 | 0.333333 | (Bread, Butter, Jam) |


### Apriori Association Rules

In [4]:
# Keep only rules whose confidence is at least 0.6
rules_apriori = association_rules(frequent_itemsets, metric='confidence', min_threshold=0.6)
rules_apriori[['antecedents', 'consequents', 'support', 'confidence', 'lift']]

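|   | antecedents | consequents | support | confidence | lift |
|---|-------------|-------------|---------|------------|------|
| 0 | (Bread) | (Butter) | 0.666667 | 0.8 | 0.96 |
| 1 | (Butter) | (Bread) | 0.666667 | 0.8 | 0.96 |
| 2 | (Jam) | (Bread) | 0.333333 | 1.0 | 1.2 |
| 3 | (Bread) | (Milk) | 0.5 | 0.6 | 0.9 |
| 4 | (Milk) | (Bread) | 0.5 | 0.75 | 0.9 |
| 5 | (Jam) | (Butter) | 0.333333 | 1.0 | 1.2 |
| 6 | (Milk) | (Butter) | 0.5 | 0.75 | 0.9 |
| 7 | (Butter) | (Milk) | 0.5 | 0.6 | 0.9 |
| 8 | (Bread, Jam) | (Butter) | 0.333333 | 1.0 | 1.2 |
| 9 | (Butter, Jam) | (Bread) | 0.333333 | 1.0 | 1.2 |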
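
To see where the numbers in a row come from, the three metrics can be recomputed by hand. The sketch below uses the standard definitions: support(A ∪ C) is the fraction of transactions containing both sides, confidence is support(A ∪ C) / support(A), and lift is confidence / support(C). The helper names `support` and `rule_metrics` are illustrative, not part of mlxtend.

```python
transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]
rows = [set(t) for t in transactions]
n = len(rows)


def support(items):
    # Fraction of transactions that contain all the given items.
    items = set(items)
    return sum(items <= row for row in rows) / n


def rule_metrics(antecedent, consequent):
    # Returns (support, confidence, lift) for the rule antecedent -> consequent.
    sup_rule = support(set(antecedent) | set(consequent))
    confidence = sup_rule / support(antecedent)
    lift = confidence / support(consequent)
    return sup_rule, confidence, lift


sup, conf, lift = rule_metrics({'Bread'}, {'Butter'})
print(round(sup, 6), round(conf, 6), round(lift, 6))  # 0.666667 0.8 0.96
```

A lift below 1 (as for Bread → Butter) means the two sides co-occur slightly less often than they would if independent; a lift above 1 (as for Jam → Bread) means they co-occur more often.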


## 4. FP-Growth Algorithm
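**Advantages over Apriori:** no candidate generation, typically faster, and more scalable, because the transactions are compressed into a prefix tree (the FP-tree) that is built with two passes over the data and then mined directly.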
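
As a rough illustration of the data structure behind FP-Growth, the sketch below builds an FP-tree for the sample transactions. It covers only tree construction, not the recursive mining of conditional trees, and it is not the mlxtend implementation. `FPNode`, `build_fp_tree`, and `show` are hypothetical names, and `min_count=2` is an assumption standing in for a 0.3 support threshold on six transactions.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, a count, and child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Pass 1: count items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}

    def order(item):
        # Most frequent first; ties broken alphabetically for determinism.
        return (-frequent[item], item)

    # Pass 2: insert each transaction's frequent items in that order,
    # sharing prefixes with earlier transactions.
    root = FPNode(None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in frequent), key=order):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root


def show(node, depth=0):
    # Print the tree with indentation proportional to depth.
    for child in sorted(node.children.values(), key=lambda c: c.item):
        print('  ' * depth + f'{child.item}:{child.count}')
        show(child, depth + 1)


transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]
root = build_fp_tree(transactions, min_count=2)
show(root)
```

Because five of the six transactions start with Bread once items are ordered by frequency, they share a single Bread branch, which is where the compression comes from.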

In [5]:
from mlxtend.frequent_patterns import fpgrowth

# Same threshold as the Apriori run, so the two results are directly comparable
frequent_itemsets_fp = fpgrowth(df, min_support=0.3, use_colnames=True)
frequent_itemsets_fp

|   | support | itemsets |
|---|---------|----------|
| 0 | 0.833333 | (Bread) |
| 1 | 0.833333 | (Butter) |
| 2 | 0.666667 | (Milk) |
| 3 | 0.333333 | (Jam) |
| 4 | 0.666667 | (Bread, Butter) |
| 5 | 0.5 | (Milk, Butter) |
| 6 | 0.5 | (Bread, Milk) |
| 7 | 0.333333 | (Bread, Milk, Butter) |
| 8 | 0.333333 | (Butter, Jam) |
| 9 | 0.333333 | (Bread, Jam) |
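
Apriori and FP-Growth are both exact methods, so for the same data and threshold they should return the same frequent itemsets, possibly in a different row order. As an independent cross-check, the brute-force sketch below enumerates every possible itemset and keeps those with support of at least 0.3. It is only practical for tiny datasets like this one; the name `reference` is illustrative.

```python
from itertools import combinations

transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]
rows = [frozenset(t) for t in transactions]
items = sorted(set().union(*rows))
min_support = 0.3

# Enumerate every non-empty itemset and keep the frequent ones.
reference = {}
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        s = sum(frozenset(combo) <= row for row in rows) / len(rows)
        if s >= min_support:
            reference[frozenset(combo)] = s

for itemset, s in sorted(reference.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 6))
```

On this dataset the enumeration finds two frequent three-item sets, {Bread, Butter, Jam} (transactions 5 and 6) and {Bread, Butter, Milk} (transactions 1 and 6), each with support 2/6. If a displayed table shows only one of them, the display was most likely truncated rather than the algorithms disagreeing.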


### FP-Growth Association Rules

In [6]:
rules_fp = association_rules(frequent_itemsets_fp, metric='confidence', min_threshold=0.6)
rules_fp[['antecedents', 'consequents', 'support', 'confidence', 'lift']]

|   | antecedents | consequents | support | confidence | lift |
|---|-------------|-------------|---------|------------|------|
| 0 | (Bread) | (Butter) | 0.666667 | 0.8 | 0.96 |
| 1 | (Butter) | (Bread) | 0.666667 | 0.8 | 0.96 |
| 2 | (Milk) | (Butter) | 0.5 | 0.75 | 0.9 |
| 3 | (Butter) | (Milk) | 0.5 | 0.6 | 0.9 |
| 4 | (Bread) | (Milk) | 0.5 | 0.6 | 0.9 |
| 5 | (Milk) | (Bread) | 0.5 | 0.75 | 0.9 |
| 6 | (Bread, Milk) | (Butter) | 0.333333 | 0.666667 | 0.8 |
| 7 | (Milk, Butter) | (Bread) | 0.333333 | 0.666667 | 0.8 |
| 8 | (Jam) | (Butter) | 0.333333 | 1.0 | 1.2 |
| 9 | (Jam) | (Bread) | 0.333333 | 1.0 | 1.2 |
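
Rule generation does not depend on which algorithm produced the frequent itemsets: each frequent itemset with two or more items is split into every antecedent/consequent pair, and a rule is kept if its confidence meets the threshold. The sketch below illustrates that idea in plain Python. It is not the mlxtend `association_rules` implementation; `generate_rules` is a hypothetical helper, and exact fractions are used only to avoid floating-point edge cases at the 0.6 boundary.

```python
from fractions import Fraction
from itertools import combinations

transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Bread'],
    ['Milk', 'Butter'],
    ['Bread', 'Butter', 'Jam'],
    ['Milk', 'Bread', 'Butter', 'Jam'],
]
rows = [frozenset(t) for t in transactions]
items = sorted(set().union(*rows))


def support(itemset):
    # Exact fraction of transactions containing the itemset.
    return Fraction(sum(itemset <= row for row in rows), len(rows))


# Frequent itemsets by brute force (a stand-in for apriori/fpgrowth output).
supports = {}
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        s = support(frozenset(combo))
        if s >= Fraction(3, 10):
            supports[frozenset(combo)] = s


def generate_rules(supports, min_confidence):
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue
        # Try every non-empty proper subset as the antecedent.
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so both lookups succeed.
                confidence = sup / supports[antecedent]
                if confidence >= min_confidence:
                    lift = confidence / supports[consequent]
                    rules.append((antecedent, consequent, sup, confidence, lift))
    return rules


rules = generate_rules(supports, min_confidence=Fraction(3, 5))
for a, c, sup, conf, lift in rules:
    print(sorted(a), '->', sorted(c), float(sup), float(conf), float(lift))
```

Rules such as {Bread, Milk} → {Butter} and {Bread, Jam} → {Butter} both come out of this procedure, since each is derived from a different frequent three-item set.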


## 5. Apriori vs FP-Growth Comparison

| Feature | Apriori | FP-Growth |
|-------|--------|----------|
| Candidate Generation | Yes | No |
| Data scans | One per itemset size | Two (to build the FP-tree) |
| Speed | Typically slower | Typically faster |
| Best for | Small datasets | Large datasets |