

# 🌟 **FP-Growth Algorithm – Short Definition**

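FP-Growth (Frequent Pattern Growth) is an algorithm for finding **frequent itemsets** (combinations of items that occur together in many transactions). Unlike Apriori, it does **not generate and test candidate itemsets** level by level; it compresses the transactions into a tree and mines that tree instead. 🚀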

---

## 📘 **Key Details You Should Know:**

### 🧠 **Purpose:**

* Used in **Market Basket Analysis** 🛒
  (e.g., What items are bought together?)

### ⚙️ **How It Works:**

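1. **Build an FP-Tree** 🌳:
   Scan the transactions once to count each item and drop items below the minimum support. Scan them a second time and insert each transaction, with its items sorted by descending frequency, into a prefix tree so that transactions with a shared prefix share nodes.
2. **Mine the Tree** 🔍:
   For each frequent item, collect the prefix paths that lead to it (its conditional pattern base), build a smaller conditional FP-tree from those paths, and recurse. Patterns are read off the tree **without candidate generation**.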
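
The tree-building step can be sketched in plain Python. This is a minimal illustration, not mlxtend's implementation: the `FPNode` class, the `build_fp_tree` and `render` helpers, and the toy baskets are all hypothetical names and data chosen for the example.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree: an item, a count, and child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Pass 1: count how many transactions contain each item.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= min_count}

    # Pass 2: insert each transaction, keeping only frequent items,
    # ordered by descending count (ties broken alphabetically).
    root = FPNode(None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for child in sorted(node.children.values(), key=lambda n: n.item):
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical toy baskets (not from the Instacart data).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
root, frequent = build_fp_tree(transactions, min_count=3)
print("\n".join(render(root)))
```

With these baskets, the 15 frequent-item occurrences are stored in 9 tree nodes, because baskets that start with the same items share a path.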

### ✅ **Why It's Better Than Apriori:**

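* **No candidate generation**: Apriori builds candidate itemsets level by level and rescans the data to count each level; FP-Growth does not ❌🔁
* **Usually faster** on large datasets, because it scans the data only twice ⚡
* **Often uses less memory**, since transactions with shared prefixes are stored once (a sparse dataset with few shared prefixes can still produce a large tree) 🧠💾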
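
A back-of-the-envelope sketch shows why skipping candidate generation matters. Apriori's second level pairs up every frequent single item, so the number of 2-item candidates grows quadratically. The function name `candidate_pairs` and the item counts below are illustrative assumptions, not figures from the dataset.

```python
from math import comb


def candidate_pairs(n_frequent_items):
    """Number of 2-item candidates Apriori forms from n frequent single items."""
    return comb(n_frequent_items, 2)


for n in (10, 100, 1000):  # hypothetical numbers of frequent items
    print(n, candidate_pairs(n))
```

Each of those candidates has to be counted against the data, which is the work FP-Growth avoids.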

### 📏 **Important Terms:**

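* **Frequent Itemset**: A set of items whose support (the fraction of transactions containing all of them) meets the minimum support. 📦
* **Minimum Support**: The smallest support an itemset needs to count as frequent; for example, `min_support=0.01` means at least 1% of transactions. 📊
* **Conditional Pattern Base**: For a given item, the prefix paths in the FP-tree that lead to that item, each with its count. It is the input for mining the patterns that contain that item. 🧭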
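
These terms can be made concrete on a handful of baskets. The sketch below assumes toy data and two hypothetical helpers, `support` and `conditional_pattern_base`. It works directly on the frequency-ordered transactions rather than on a real tree, which gives the same prefix paths because identical prefixes share one tree path.

```python
from collections import Counter

# Hypothetical toy baskets (not from the Instacart data).
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def conditional_pattern_base(item, transactions, min_support):
    """Prefix paths that precede `item` in the frequency-ordered baskets."""
    n = len(transactions)
    counts = Counter(i for t in transactions for i in set(t))
    frequent = {i: c for i, c in counts.items() if c / n >= min_support}
    base = Counter()
    for t in transactions:
        # Same ordering an FP-tree would use: descending count, then name.
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        if item in ordered:
            prefix = tuple(ordered[:ordered.index(item)])
            if prefix:
                base[prefix] += 1
    return dict(base)


print(support(["bread", "milk"], transactions))
print(conditional_pattern_base("beer", transactions, min_support=0.6))
```

Here `{"bread", "milk"}` appears in 3 of the 5 baskets, so it is frequent at `min_support=0.6`, while `eggs` and `cola` fall below the threshold and never enter a prefix path.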

### 📌 **Applications:**

* Product recommendation 🛍️
* Fraud detection 🚨
* Web usage mining 🌐



## 📝 **Summary:**

> FP-Growth finds the items that often appear together in transactions by compressing the data into an FP-tree and mining that tree directly, without generating candidate itemsets. 🧩✅




In [1]:
# Load the order lines and the product catalogue.
import pandas as pd

ORDERS = pd.read_csv(r"C:\Users\Nagesh Agrawal\OneDrive\Desktop\6_MACHINE LEARNING\1_DATASETS\association rules\order_products__prior.csv")
PRODUCTS = pd.read_csv(r"C:\Users\Nagesh Agrawal\OneDrive\Desktop\6_MACHINE LEARNING\1_DATASETS\association rules\products.csv")

# Attach product details to each order line; how="left" keeps every order line.
DATA = pd.merge(ORDERS, PRODUCTS, on="product_id", how="left")
DATA

| | order_id | product_id | add_to_cart_order | reordered | product_name | aisle_id | department_id |
|---|---|---|---|---|---|---|---|
| 0 | 2 | 33120 | 1 | 1 | Organic Egg Whites | 86 | 16 |
| 1 | 2 | 28985 | 2 | 1 | Michigan Organic Kale | 83 | 4 |
| 2 | 2 | 9327 | 3 | 0 | Garlic Powder | 104 | 13 |
| 3 | 2 | 45918 | 4 | 1 | Coconut Butter | 19 | 13 |
| 4 | 2 | 30035 | 5 | 0 | Natural Sweetener | 17 | 13 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 32434484 | 3421083 | 39678 | 6 | 1 | Free & Clear Natural Dishwasher Detergent | 74 | 17 |
| 32434485 | 3421083 | 11352 | 7 | 0 | Organic Mini Sandwich Crackers Peanut Butter | 78 | 19 |
| 32434486 | 3421083 | 4600 | 8 | 0 | All Natural French Toast Sticks | 52 | 1 |
| 32434487 | 3421083 | 24852 | 9 | 1 | Banana | 24 | 4 |


In [2]:
# Group by order_id and collect each order's product names into a list.
BASKET = DATA.groupby('order_id')['product_name'].apply(list)
BASKET  # 3,214,874 unique order_ids, each with a list of one or more product names

order_id
2          [Organic Egg Whites, Michigan Organic Kale, Ga...
3          [Total 2% with Strawberry Lowfat Greek Straine...
4          [Plain Pre-Sliced Bagels, Honey/Lemon Cough Dr...
5          [Bag of Organic Bananas, Just Crisp, Parmesan,...
6          [Cleanse, Dryer Sheets Geranium Scent, Clean D...
                                 ...                        
3421079                                      [Moisture Soap]
3421080    [Organic Whole Milk, Vanilla Bean Ice Cream, O...
3421081    [Hint of Lime Flavored Tortilla Chips, Classic...
3421082    [Fresh 99% Lean Ground Turkey, Original Spray,...
3421083    [Freeze Dried Mango Slices, Purple Carrot & bl...
Name: product_name, Length: 3214874, dtype: object

In [3]:
# Work on a subset: the first 1,000 orders.
BASKET = BASKET[:1000]

In [4]:
BASKET

order_id
2       [Organic Egg Whites, Michigan Organic Kale, Ga...
3       [Total 2% with Strawberry Lowfat Greek Straine...
4       [Plain Pre-Sliced Bagels, Honey/Lemon Cough Dr...
5       [Bag of Organic Bananas, Just Crisp, Parmesan,...
6       [Cleanse, Dryer Sheets Geranium Scent, Clean D...
                              ...                        
1043    [Jalapeno Peppers, Limes, Organic Lemon, Red P...
1044    [Gluten Free Brown Rice Bread, Tortillas, Brow...
1045    [Seedless Small Watermelon, Organic Strawberri...
1046        [Coconut Water, Semi-Sweet Chocolate Morsels]
1047    [Blueberry Pint, Cheese Enchilada Meal, Enchil...
Name: product_name, Length: 1000, dtype: object

In [5]:
# One-hot encode the baskets: one row per order, one boolean column per product.
from mlxtend.preprocessing import TransactionEncoder

TE = TransactionEncoder()
TE_DATA = TE.fit(BASKET).transform(BASKET)
TE_DATA = pd.DataFrame(TE_DATA, columns=TE.columns_)
TE_DATA

| | 0% Fat Blueberry Greek Yogurt | 0% Fat Free Organic Milk | 0% Fat Organic Greek Vanilla Yogurt | 0% Greek Strained Yogurt | 0% Milkfat Greek Plain Yogurt | 0% Milkfat Greek Yogurt Honey | 1 % Lowfat Milk | 1 Apple + 1 Pear Fruit Bar | 1 Ply Paper Towels | 1% Low Fat Milk | ... | Zero Vitamin Water | Zinfandel | Zucchini Noodles | from Concentrate Mango Nectar | gel hand wash sea minerals | smartwater® Electrolyte Enhanced Water | vitaminwater® XXX Acai Blueberry Pomegranate | with Crispy Almonds Cereal | with Olive Oil Mayonnaise | with Olive Oil Mayonnaise Dressing |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 1 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 2 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 3 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 4 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 996 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 997 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
| 998 | False | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
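
To make the encoding step less of a black box, here is a minimal pure-Python sketch of the same idea: collect every distinct item, sort the names into columns, and mark each basket with True/False per column. The function `one_hot_encode` and the three toy baskets are hypothetical; this illustrates the concept and is not how `TransactionEncoder` is implemented internally.

```python
def one_hot_encode(baskets):
    """Return (columns, rows): sorted item names and one boolean row per basket."""
    columns = sorted({item for basket in baskets for item in basket})
    rows = []
    for basket in baskets:
        present = set(basket)
        rows.append([item in present for item in columns])
    return columns, rows


# Hypothetical toy baskets.
baskets = [
    ["Banana", "Limes"],
    ["Banana"],
    ["Limes", "Carrots"],
]
columns, rows = one_hot_encode(baskets)
print(columns)
for row in rows:
    print(row)
```

With roughly a thousand baskets and thousands of distinct products, almost every cell is False, which is why the preview above shows nothing but False values.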


In [8]:
# Mine itemsets that appear in at least 1% of the orders (10 of the 1,000 baskets).
from mlxtend.frequent_patterns import fpgrowth

FREQUENT_PATTERNS = fpgrowth(TE_DATA, min_support=0.01, use_colnames=True)
FREQUENT_PATTERNS

| | support | itemsets |
|---|---|---|
| 0 | 0.025 | (Carrots) |
| 1 | 0.019 | (Michigan Organic Kale) |
| 2 | 0.015 | (Organic Egg Whites) |
| 3 | 0.072 | (Organic Baby Spinach) |
| 4 | 0.016 | (Organic Ginger Root) |
| ... | ... | ... |
| 141 | 0.011 | (Organic Yellow Onion, Bag of Organic Bananas) |
| 142 | 0.010 | (Cucumber Kirby, Banana) |
| 143 | 0.015 | (Banana, Organic Fuji Apple) |
| 144 | 0.013 | (Strawberries, Banana) |
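
Supports like these are the raw material for association rules. The sketch below shows how confidence and lift follow from them, using plain arithmetic rather than mlxtend. The helper names `confidence` and `lift` are illustrative. The spinach support comes from the table above and the pair support from the rules output that follows. The Banana support of 0.163 is an assumption: that row is elided from the table, and the value is inferred from the rules output (0.024 / 0.147239).

```python
def confidence(support_both, support_antecedent):
    """Share of antecedent baskets that also contain the consequent."""
    return support_both / support_antecedent


def lift(support_both, support_antecedent, support_consequent):
    """Observed co-occurrence divided by what independence would predict."""
    return support_both / (support_antecedent * support_consequent)


# Supports as fractions of the 1,000 baskets.
s_spinach = 0.072  # (Organic Baby Spinach), from the table above
s_banana = 0.163   # (Banana): ASSUMED, inferred from the rules output
s_both = 0.024     # (Organic Baby Spinach, Banana), from the rules output

conf_spinach_to_banana = confidence(s_both, s_spinach)
lift_spinach_banana = lift(s_both, s_spinach, s_banana)
print(round(conf_spinach_to_banana, 6))  # 0.333333
print(round(lift_spinach_banana, 3))     # 2.045
```

A lift above 1 means the two items appear together more often than they would if they were bought independently; a lift below 1 means less often.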


In [9]:
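# Derive rules from the frequent itemsets. min_threshold=0.5 on lift is permissive:
# it also keeps rules with lift < 1 (items that co-occur less than independence predicts).
from mlxtend.frequent_patterns import association_rules
rules = association_rules(FREQUENT_PATTERNS, metric="lift", min_threshold=0.5)
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])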


                 antecedents               consequents  support  confidence  \
0     (Organic Baby Spinach)                  (Banana)    0.024    0.333333   
1                   (Banana)    (Organic Baby Spinach)    0.024    0.147239   
2     (Organic Baby Spinach)  (Bag of Organic Bananas)    0.010    0.138889   
3   (Bag of Organic Bananas)    (Organic Baby Spinach)    0.010    0.082645   
4   (Bag of Organic Bananas)    (Organic Hass Avocado)    0.023    0.190083   
5     (Organic Hass Avocado)  (Bag of Organic Bananas)    0.023    0.333333   
6     (Organic Baby Spinach)    (Organic Hass Avocado)    0.010    0.138889   
7     (Organic Hass Avocado)    (Organic Baby Spinach)    0.010    0.144928   
8     (Organic Strawberries)    (Organic Hass Avocado)    0.016    0.210526   
9     (Organic Hass Avocado)    (Organic Strawberries)    0.016    0.231884   
10  (Bag of Organic Bananas)     (Organic Raspberries)    0.016    0.132231   
11     (Organic Raspberries)  (Bag of Organic Banana