# FP-Growth Algorithm

### Frequent Itemset Pattern
Reference: Manoj, Joseph R. (May 9, 2022). FP-Growth Algorithm: Frequent Itemset Pattern. Retrieved from https://www.kaggle.com/code/rjmanoj/fp-growth-algorithm-frequent-itemset-pattern/notebook

In [1]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

### 1. Insert data
The dataset below lists five transactions, each recording the items purchased together.

In [2]:
dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

### 2. Encode the dataset
Encoding converts the transaction list into a Boolean table with one row per transaction and one column per item, where True means the item was purchased in that transaction.

In [3]:
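te = TransactionEncoder()
# Learn the unique items, then encode each transaction as a row of Booleans.
te_ary = te.fit(dataset).transform(dataset)
# Wrap the Boolean array in a DataFrame with one column per item.
df = pd.DataFrame(te_ary, columns=te.columns_)
df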

| | Apple | Corn | Dill | Eggs | Ice cream | Kidney Beans | Milk | Nutmeg | Onion | Unicorn | Yogurt |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | True | False | True | True | True | True | False | True |
| 1 | False | False | True | True | False | True | False | True | True | False | True |
| 2 | True | False | False | True | False | True | True | False | False | False | False |
| 3 | False | True | False | False | False | True | True | False | False | True | True |
| 4 | False | True | False | True | True | True | False | False | True | False | False |
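
To make the encoding step concrete, here is a minimal pure-Python sketch of the same idea. It is a hand-rolled stand-in for illustration, not how TransactionEncoder is implemented; the names `columns` and `encoded` are hypothetical, and the alphabetical column order is an assumption chosen to match the table above.

```python
dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

# Unique items across all transactions, sorted alphabetically (assumed order).
columns = sorted({item for transaction in dataset for item in transaction})

# One row per transaction: True where the item occurs in that transaction.
# A repeated item (the second 'Onion' in the last transaction) still yields one True.
encoded = [[item in transaction for item in columns] for transaction in dataset]

print(columns)
for row in encoded:
    print(row)
```

Each printed row should line up with the corresponding row of the encoded table.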


### 3. Create Frequent Pattern Set
A frequent pattern set contains every itemset whose support is greater than or equal to the minimum support.

The support of an itemset is the fraction of transactions that contain it. The minimum support is the threshold an itemset must reach to be identified as frequent.

The frequent items are stored in descending order of their frequencies.

In this example, an itemset must appear in at least 3 of the 5 transactions, so the minimum support is 3/5 = 0.6.
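
The counting behind this threshold can be sketched by hand for single items. The snippet below is an illustration under stated assumptions, not the mlxtend implementation: `min_count`, `frequent_items`, and `ordered_transactions` are hypothetical names, and ties in frequency are broken alphabetically, which is an arbitrary choice.

```python
from collections import Counter

dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

n_transactions = len(dataset)
min_count = 3                              # 3 of 5 transactions
min_support = min_count / n_transactions   # 3/5

# Count each item at most once per transaction; set() drops the repeated 'Onion'.
item_counts = Counter(item for transaction in dataset for item in set(transaction))

# Keep items that reach the threshold, most frequent first (ties broken by name).
frequent_items = sorted(
    (item for item, count in item_counts.items() if count >= min_count),
    key=lambda item: (-item_counts[item], item),
)

# Rewrite each transaction with only its frequent items, in that global order.
rank = {item: position for position, item in enumerate(frequent_items)}
ordered_transactions = [
    sorted((item for item in set(transaction) if item in rank), key=rank.get)
    for transaction in dataset
]

for item in frequent_items:
    print(item, item_counts[item], item_counts[item] / n_transactions)
```

Items such as Nutmeg and Corn appear in only 2 transactions, so they fall below the threshold and are dropped from the ordered transactions.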

In [4]:
from mlxtend.frequent_patterns import fpgrowth

fpgrowth(df, min_support=0.6)

| | support | itemsets |
|---|---|---|
| 0 | 1.0 | (5) |
| 1 | 0.8 | (3) |
| 2 | 0.6 | (10) |
| 3 | 0.6 | (8) |
| 4 | 0.6 | (6) |
| 5 | 0.8 | (3, 5) |
| 6 | 0.6 | (10, 5) |
| 7 | 0.6 | (8, 3) |
| 8 | 0.6 | (8, 5) |
| 9 | 0.6 | (8, 3, 5) |
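
The integers in the itemsets column are positions in the encoded DataFrame's columns. As a small sketch, they can be translated back to item names by hand; the `columns` list below is copied from the encoded table (the values held by `te.columns_`), and `index_itemsets` and `named_itemsets` are hypothetical names used only for this illustration.

```python
# Column names in the order shown in the encoded table.
columns = ['Apple', 'Corn', 'Dill', 'Eggs', 'Ice cream', 'Kidney Beans',
           'Milk', 'Nutmeg', 'Onion', 'Unicorn', 'Yogurt']

# Itemsets copied from the output above, as tuples of column indices.
index_itemsets = [(5,), (3,), (10,), (8,), (6,),
                  (3, 5), (10, 5), (8, 3), (8, 5), (8, 3, 5)]

# Translate each column index into its item name.
named_itemsets = [tuple(columns[i] for i in itemset) for itemset in index_itemsets]

for indices, names in zip(index_itemsets, named_itemsets):
    print(indices, '->', names)
```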


### 4. Find Frequent Pattern

Passing use_colnames=True makes fpgrowth report item names instead of column indices.

In [5]:
fpgrowth(df, min_support=0.6, use_colnames=True)

| | support | itemsets |
|---|---|---|
| 0 | 1.0 | (Kidney Beans) |
| 1 | 0.8 | (Eggs) |
| 2 | 0.6 | (Yogurt) |
| 3 | 0.6 | (Onion) |
| 4 | 0.6 | (Milk) |
| 5 | 0.8 | (Eggs, Kidney Beans) |
| 6 | 0.6 | (Yogurt, Kidney Beans) |
| 7 | 0.6 | (Eggs, Onion) |
| 8 | 0.6 | (Onion, Kidney Beans) |
| 9 | 0.6 | (Eggs, Onion, Kidney Beans) |


Each itemset listed above is purchased together in at least 3 of the 5 transactions.
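
As a sanity check, the support of any itemset can be recomputed directly from the raw transactions. The sketch below does this by plain counting, without FP-Growth; `support` and `listed_itemsets` are hypothetical names introduced for illustration, and the itemsets are copied from the output above.

```python
dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

# Treat each transaction as a set so repeated items count once.
transactions = [set(transaction) for transaction in dataset]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= transaction for transaction in transactions) / len(transactions)


listed_itemsets = [
    ('Kidney Beans',), ('Eggs',), ('Yogurt',), ('Onion',), ('Milk',),
    ('Eggs', 'Kidney Beans'), ('Yogurt', 'Kidney Beans'), ('Eggs', 'Onion'),
    ('Onion', 'Kidney Beans'), ('Eggs', 'Onion', 'Kidney Beans'),
]

for itemset in listed_itemsets:
    print(itemset, support(itemset))

# Every transaction with Milk also contains Kidney Beans, so this pair
# reaches the same 3-of-5 count even though it is not in the list above.
print(('Milk', 'Kidney Beans'), support(('Milk', 'Kidney Beans')))
```

Counting this way scans every transaction for every itemset, which is fine for five transactions but is the cost that FP-Growth is designed to avoid on larger data.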