### Looi Kah Fung P-COM0049/22
Statement 
Ever since Amazon went online in 1995, the e-commerce powerhouse has
undergone a slew of changes, despite being led by the same man, Jeff Bezos, during
the ensuing two-plus decades. When the Seattle-based company first launched its
website, all it sold was books.
But Amazon never really stopped changing the inventory it sold. Bezos said he
wanted his store to become the world’s largest, so he worked hard toward meeting
that goal, whether that meant offering new products, launching Amazon Prime, or
launching Amazon Instant Video. The list goes on and on. Today, Amazon sells more
than 200 million products to customers all over the world.
During the year-end sale of 2019, Amazon produced an analysis of the
association rules in its sales transactions. The analysis was done to help the company
improve its business for the following year. A snapshot of 9 transactions is shown in Table 1.
Table 1: Sales transactions

| Transaction ID | Items bought |
|---|---|
| 1001 | Backpack, air purifier, t-shirt |
| 1038 | Air purifier, cup |
| 1040 | Air purifier, socks |
| 1024 | Backpack, air purifier, cup |
| 1033 | Backpack, socks |
| 1034 | Air purifier, socks |
| 1042 | Backpack, socks |
| 1052 | Backpack, air purifier, socks, t-shirt |
| 1051 | Backpack, air purifier, socks |

In [1]:
#install mlxtend package 
!pip install mlxtend

Collecting mlxtend
  Downloading mlxtend-0.22.0-py2.py3-none-any.whl (1.4 MB)
Installing collected packages: mlxtend
Successfully installed mlxtend-0.22.0


In [6]:
# store each transaction as a list of item names, giving a list of lists.
# The same data could instead be built from a dictionary or loaded from a CSV file.
transactions = [
    ['backpack', 'air purifier', 't-shirt'],
    ['air purifier','cup'],
    ['air purifier', 'socks'],
    ['backpack', 'air purifier', 'cup'],
    ['backpack', 'socks'],
    ['air purifier', 'socks'],
    ['backpack', 'socks'],
    ['backpack', 'air purifier', 'socks', 't-shirt'],
    ['backpack', 'air purifier', 'socks'],
]

In [13]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

# instantiate a transaction encoder
my_transactionencoder = TransactionEncoder()

# fit the transaction encoder on the list of transactions (learns the distinct items)
my_transactionencoder.fit(transactions)

# transform the list of transactions into a boolean array of encoded transactions
encoded_transactions = my_transactionencoder.transform(transactions)

# convert the array of encoded transactions into a dataframe, one column per item
encoded_transactions_df = pd.DataFrame(encoded_transactions, columns=my_transactionencoder.columns_)
encoded_transactions_df

Unnamed: 0,air purifier,backpack,cup,socks,t-shirt
0,True,True,False,False,True
1,True,False,True,False,False
2,True,False,False,True,False
3,True,True,True,False,False
4,False,True,False,True,False
5,True,False,False,True,False
6,False,True,False,True,False
7,True,True,False,True,True
8,True,True,False,True,False
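
To make the encoding step concrete, here is a minimal sketch of the same one-hot transformation in plain Python, without mlxtend. It is only an illustration of the idea, not the library's implementation; the names `items` and `encoded` are chosen here for the example.

```python
# Same nine transactions as in the cell above
transactions = [
    ['backpack', 'air purifier', 't-shirt'],
    ['air purifier', 'cup'],
    ['air purifier', 'socks'],
    ['backpack', 'air purifier', 'cup'],
    ['backpack', 'socks'],
    ['air purifier', 'socks'],
    ['backpack', 'socks'],
    ['backpack', 'air purifier', 'socks', 't-shirt'],
    ['backpack', 'air purifier', 'socks'],
]

# Distinct items in sorted order; these play the role of the column names
items = sorted({item for basket in transactions for item in basket})

# One row per transaction, one True/False flag per item
encoded = [[item in basket for item in items] for basket in transactions]

print(items)
for row in encoded:
    print(row)
```

Each printed row should line up with the corresponding row of the dataframe above, since the columns are in the same alphabetical order.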


In [33]:
# 9 transactions in total; minimum support count = 4
# so the support threshold is 4/9 (about 44.4%)
min_support = 4/len(transactions)

# compute the frequent itemsets using fpgrowth
from mlxtend.frequent_patterns import fpgrowth
frequent_itemsets = fpgrowth(encoded_transactions_df, min_support=min_support, use_colnames=True)


In [34]:
frequent_itemsets

Unnamed: 0,support,itemsets
0,0.777778,(air purifier)
1,0.666667,(backpack)
2,0.666667,(socks)
3,0.444444,"(air purifier, backpack)"
4,0.444444,"(backpack, socks)"
5,0.444444,"(air purifier, socks)"
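
As a cross-check on the table above, the frequent itemsets can also be found by brute force: count every item combination in every transaction and keep those that reach the minimum count of 4. This is a sketch of exhaustive counting, not of the FP-growth algorithm itself, and it is only practical for a dataset this small. The names `counts`, `min_count`, and `frequent` are chosen for the example.

```python
from collections import Counter
from itertools import combinations

transactions = [
    ['backpack', 'air purifier', 't-shirt'],
    ['air purifier', 'cup'],
    ['air purifier', 'socks'],
    ['backpack', 'air purifier', 'cup'],
    ['backpack', 'socks'],
    ['air purifier', 'socks'],
    ['backpack', 'socks'],
    ['backpack', 'air purifier', 'socks', 't-shirt'],
    ['backpack', 'air purifier', 'socks'],
]

# Count how many transactions contain each possible item combination
counts = Counter()
for basket in transactions:
    basket_items = sorted(set(basket))
    for size in range(1, len(basket_items) + 1):
        for combo in combinations(basket_items, size):
            counts[combo] += 1

# Keep combinations seen in at least 4 of the 9 transactions
min_count = 4
frequent = {
    combo: n / len(transactions)
    for combo, n in counts.items()
    if n >= min_count
}

for combo, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(combo, round(sup, 6))
```

Comparing counts rather than fractions avoids any floating-point edge case at the threshold.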


In [35]:
# Compute the association rules based on the frequent itemsets
from mlxtend.frequent_patterns import association_rules

# compute the association rules, keeping those with confidence >= 0.5
res = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.5)


In [36]:
res

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction,zhangs_metric
0,(air purifier),(backpack),0.777778,0.666667,0.444444,0.571429,0.857143,-0.074074,0.777778,-0.428571
1,(backpack),(air purifier),0.666667,0.777778,0.444444,0.666667,0.857143,-0.074074,0.666667,-0.333333
2,(backpack),(socks),0.666667,0.666667,0.444444,0.666667,1.0,0.0,1.0,0.0
3,(socks),(backpack),0.666667,0.666667,0.444444,0.666667,1.0,0.0,1.0,0.0
4,(air purifier),(socks),0.777778,0.666667,0.444444,0.571429,0.857143,-0.074074,0.777778,-0.428571
5,(socks),(air purifier),0.666667,0.777778,0.444444,0.666667,0.857143,-0.074074,0.666667,-0.333333
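
The `leverage`, `conviction`, and `zhangs_metric` columns can be reproduced from the supports alone. The sketch below uses the commonly cited definitions of these metrics and assumes they are what the table reports; the helper names `support` and `extra_metrics` are hypothetical and not part of mlxtend.

```python
transactions = [
    ['backpack', 'air purifier', 't-shirt'],
    ['air purifier', 'cup'],
    ['air purifier', 'socks'],
    ['backpack', 'air purifier', 'cup'],
    ['backpack', 'socks'],
    ['air purifier', 'socks'],
    ['backpack', 'socks'],
    ['backpack', 'air purifier', 'socks', 't-shirt'],
    ['backpack', 'air purifier', 'socks'],
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    return sum(wanted <= set(basket) for basket in transactions) / len(transactions)

def extra_metrics(antecedent, consequent, transactions):
    """Return (leverage, conviction, zhang) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    s_a = support(antecedent, transactions)
    s_c = support(consequent, transactions)
    s_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = s_ac / s_a

    # Leverage: observed joint support minus the support expected under independence
    leverage = s_ac - s_a * s_c

    # Conviction: infinite when the rule never fails (confidence == 1)
    conviction = float('inf') if confidence == 1 else (1 - s_c) / (1 - confidence)

    # Zhang's metric: leverage normalised to the range [-1, 1]
    denom = max(s_ac * (1 - s_a), s_a * (s_c - s_ac))
    zhang = leverage / denom if denom else 0.0
    return leverage, conviction, zhang

leverage, conviction, zhang = extra_metrics({'air purifier'}, {'backpack'}, transactions)
print(round(leverage, 6), round(conviction, 6), round(zhang, 6))
```

Rounded to six places, the three values for air purifier -> backpack should agree with row 0 of the rules table.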


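Rule - the association rule is bidirectional (backpack <-> socks): both directions have the same confidence (0.667)
1. support - fraction of transactions in which the products co-occur
2. confidence - conditional probability of the RHS given the LHS, i.e. support(LHS and RHS) / support(LHS)
3. lift - strength of association, i.e. confidence / support(RHS); a lift of 1 means the LHS and RHS occur independently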
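
The three metrics listed above can be checked by hand for the backpack -> socks rule. This is a minimal sketch in plain Python; the helper `support` is defined here for illustration and is not an mlxtend function.

```python
transactions = [
    ['backpack', 'air purifier', 't-shirt'],
    ['air purifier', 'cup'],
    ['air purifier', 'socks'],
    ['backpack', 'air purifier', 'cup'],
    ['backpack', 'socks'],
    ['air purifier', 'socks'],
    ['backpack', 'socks'],
    ['backpack', 'air purifier', 'socks', 't-shirt'],
    ['backpack', 'air purifier', 'socks'],
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    return sum(wanted <= set(basket) for basket in transactions) / len(transactions)

antecedent, consequent = {'backpack'}, {'socks'}

# Support of the rule: both sides appear together
rule_support = support(antecedent | consequent, transactions)

# Confidence: P(consequent | antecedent)
confidence = rule_support / support(antecedent, transactions)

# Lift: confidence relative to how common the consequent is on its own
lift = confidence / support(consequent, transactions)

print(round(rule_support, 6), round(confidence, 6), round(lift, 6))
```

Swapping `antecedent` and `consequent` gives the same three numbers here, because backpack and socks have the same individual support (6 of 9 transactions).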

Link https://towardsdatascience.com/the-fp-growth-algorithm-1ffa20e839b8
Link 2 https://hands-on.cloud/implementation-of-fp-growth-algorithm-using-python/#h-implementation-of-fp-growth-algorithm-using-python
Link 3 https://coderspacket.com/implementing-fp-growth-algorithm-in-machine-learning-using-python