# Sean Pharris
# March 16, 2022
# Association Rules and Lift Analysis


## A1.  Can we accurately recommend items to customers with the data we have?

## A2.  The goal of this analysis is to accurately recommend other items to customers based on the data we have.


## B1.  Market basket explanation:  
### In market basket analysis we utilize Association Rules to determine the relationships between different items. The key terms and metrics are:  

### Antecedent - The item (or itemset) on the "if" side of a rule
    * Scenario Example: the rule "if milk is purchased, then bread is purchased."
    * Milk is the Antecedent

### Consequent - The item (or itemset) on the "then" side of a rule
    * Scenario Example: the rule "if milk is purchased, then bread is purchased."
    * Bread is the Consequent

### Support - the ratio of transactions that contain a particular item or itemset
    * Scenario 1: the number of transactions with the particular item / number of total transactions
    * Example: transactions including milk / all transactions = 8/10 = .8 (Ratio)
       * Worded as "Support for Milk is .8"
       
    * Scenario 2: the number of transactions containing both items / number of total transactions
    * Example: transactions including both milk and bread / all transactions = 2/10 = .2 (Ratio)
       * Worded as "Support for Milk & Bread is .2"

### Confidence - The probability that Y is purchased given that X is purchased
    * Support(X & Y) / Support(X)
        * Scenario Example: Support(Milk & Bread) / Support(Milk) = .2 / .8 = .25
        * This tells us the probability of a bread purchase given that we have purchased milk
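
To make the support and confidence definitions concrete, here is a minimal sketch on ten toy transactions. The data are hypothetical (not from the telecom data set) and were chosen so the results match the confidence example above; the helper names `support` and `confidence` are illustrative only.

```python
# Toy transactions (hypothetical): 8 contain milk, 2 of those also contain bread.
transactions = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk"}, {"milk"}, {"milk"},
    {"milk"}, {"milk"}, {"milk"},
    {"bread"},
    {"eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(antecedent & consequent) / Support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(support({"milk"}, transactions))                # 0.8
print(support({"milk", "bread"}, transactions))       # 0.2
print(confidence({"milk"}, {"bread"}, transactions))  # 0.25
```

Note that support is always divided by the total number of transactions, while confidence is divided by the number of transactions that contain the antecedent.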

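### Lift - How much more often X and Y are purchased together than expected if they were independent
    * Support(X & Y) / (Support(X) * Support(Y))
    * Ranges from 0 to infinity
    * A lift above 1 suggests a positive association, a lift of 1 suggests independence, and a lift below 1 suggests a negative association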
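
As a rough illustration of how lift is read, the sketch below computes it from three support values and labels the result relative to 1. The support values and the helper names `lift` and `describe_lift` are hypothetical.

```python
def lift(support_xy, support_x, support_y):
    """Support(X & Y) / (Support(X) * Support(Y))."""
    return support_xy / (support_x * support_y)

def describe_lift(value, tol=1e-9):
    """Label a lift value relative to 1 (the value expected under independence)."""
    if abs(value - 1.0) <= tol:
        return "independent"
    return "positive association" if value > 1.0 else "negative association"

# Hypothetical supports: (Support(X & Y), Support(X), Support(Y))
examples = {
    "together less often than expected": (0.20, 0.80, 0.30),
    "exactly as often as expected": (0.10, 0.20, 0.50),
    "together more often than expected": (0.06, 0.20, 0.10),
}

for label, (s_xy, s_x, s_y) in examples.items():
    value = lift(s_xy, s_x, s_y)
    print(f"{label}: lift = {value:.3f} ({describe_lift(value)})")
```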

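### Leverage - Similar to lift but easier to interpret
    * Leverage(X then Y) = Support(X & Y) - Support(X) * Support(Y)
    * A leverage of 0 suggests independence; positive values suggest the items are purchased together more often than expected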

### Conviction - More complicated and less intuitive than leverage    
    * Conviction(X then Y) = (Support(X) * Support(Not Y)) / Support(X & Not Y)
    * Support(Not Y) can be interpreted as 1 - Support(Y)
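
The leverage and conviction formulas above can be sketched directly from three support values. This is a minimal illustration with hypothetical numbers; it relies on the identity Support(X & Not Y) = Support(X) - Support(X & Y), and the function names are my own.

```python
import math

def leverage(support_xy, support_x, support_y):
    """Support(X & Y) - Support(X) * Support(Y)."""
    return support_xy - support_x * support_y

def conviction(support_xy, support_x, support_y):
    """(Support(X) * Support(Not Y)) / Support(X & Not Y)."""
    support_not_y = 1.0 - support_y
    # Transactions with X but without Y: Support(X) - Support(X & Y).
    support_x_not_y = support_x - support_xy
    if support_x_not_y == 0:
        # X never occurs without Y, so the rule never fails.
        return math.inf
    return support_x * support_not_y / support_x_not_y

# Hypothetical supports: X in 40% of transactions, Y in 50%, both in 30%.
print(round(leverage(0.3, 0.4, 0.5), 4))    # 0.1
print(round(conviction(0.3, 0.4, 0.5), 4))  # 2.0
```

A conviction above 1 means the rule "X then Y" fails less often than it would if X and Y were independent.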

### Zhang's Metric (Association & Dissociation) - Takes values between -1 and +1; a value of +1 indicates perfect association and a value of -1 indicates perfect dissociation
    * Zhang(A then B) = (Confidence(A then B) - Confidence(Not A then B)) / Max(Confidence(A then B), Confidence(Not A then B))
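
The sketch below evaluates the Zhang formula above from support values, using Confidence(Not A then B) = (Support(B) - Support(A & B)) / (1 - Support(A)). The inputs are hypothetical and the function name `zhang` is illustrative; it assumes 0 < Support(A) < 1.

```python
def zhang(support_ab, support_a, support_b):
    """Zhang's metric for the rule 'A then B', computed from supports.

    Assumes 0 < support_a < 1 so that both confidences are defined.
    """
    conf_a_b = support_ab / support_a
    conf_not_a_b = (support_b - support_ab) / (1.0 - support_a)
    denominator = max(conf_a_b, conf_not_a_b)
    if denominator == 0:
        # B never occurs at all, so there is no association to measure.
        return 0.0
    return (conf_a_b - conf_not_a_b) / denominator

# A and B always appear together: perfect association.
print(zhang(0.4, 0.4, 0.4))  # 1.0
# A and B never appear together: perfect dissociation.
print(zhang(0.0, 0.4, 0.3))  # -1.0
# A and B are independent (0.5 * 0.4 = 0.2).
print(zhang(0.2, 0.5, 0.4))  # 0.0
```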

### Expected outcome: a set of rules, each with at least an antecedent and a consequent, that we can use to recommend items in the data set

In [1]:
import numpy as np
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/telecommarketbasket/D212 Task 3 Market Basket (Churn) Data Consideration and Dictionary.pdf
/kaggle/input/telecommarketbasket/teleco_market_basket.csv


In [2]:
df = pd.read_csv('/kaggle/input/telecommarketbasket/teleco_market_basket.csv')
df.head()

Unnamed: 0,Item01,Item02,Item03,Item04,Item05,Item06,Item07,Item08,Item09,Item10,Item11,Item12,Item13,Item14,Item15,Item16,Item17,Item18,Item19,Item20
0,,,,,,,,,,,,,,,,,,,,
1,Logitech M510 Wireless mouse,HP 63 Ink,HP 65 ink,nonda USB C to USB Adapter,10ft iPHone Charger Cable,HP 902XL ink,Creative Pebble 2.0 Speakers,Cleaning Gel Universal Dust Cleaner,Micro Center 32GB Memory card,YUNSONG 3pack 6ft Nylon Lightning Cable,TopMate C5 Laptop Cooler pad,Apple USB-C Charger cable,HyperX Cloud Stinger Headset,TONOR USB Gaming Microphone,Dust-Off Compressed Gas 2 pack,3A USB Type C Cable 3 pack 6FT,HOVAMP iPhone charger,SanDisk Ultra 128GB card,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses
2,,,,,,,,,,,,,,,,,,,,
3,Apple Lightning to Digital AV Adapter,TP-Link AC1750 Smart WiFi Router,Apple Pencil,,,,,,,,,,,,,,,,,
4,,,,,,,,,,,,,,,,,,,,


### B2.  An example of a transaction is below.

In [3]:
df.iloc[1]

Item01               Logitech M510 Wireless mouse
Item02                                  HP 63 Ink
Item03                                  HP 65 ink
Item04                 nonda USB C to USB Adapter
Item05                  10ft iPHone Charger Cable
Item06                               HP 902XL ink
Item07               Creative Pebble 2.0 Speakers
Item08        Cleaning Gel Universal Dust Cleaner
Item09              Micro Center 32GB Memory card
Item10    YUNSONG 3pack 6ft Nylon Lightning Cable
Item11               TopMate C5 Laptop Cooler pad
Item12                  Apple USB-C Charger cable
Item13               HyperX Cloud Stinger Headset
Item14                TONOR USB Gaming Microphone
Item15             Dust-Off Compressed Gas 2 pack
Item16             3A USB Type C Cable 3 pack 6FT
Item17                      HOVAMP iPhone charger
Item18                   SanDisk Ultra 128GB card
Item19       FEEL2NICE 5 pack 10ft Lighning cable
Item20        FEIYOLD Blue light Blocking Glasses


### B3.  Assumption:
* As long as items have been purchased in the past, we are able to recommend other items. If no items have been purchased, we assume that nothing will be recommended.
 


### C1.  See code for data transformation below.  
* For cleaned data set see 'item_transactions_data.csv'

In [4]:
df.shape

(15002, 20)

In [5]:
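# Flatten each row into a list of its purchased items; empty cells and blank rows are dropped.
transactions = df.stack().dropna().groupby(level=0).apply(list).tolist()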
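
To show what the stack-and-group step does, here is a small sketch on a hypothetical four-row frame that mimics the blank rows in the raw file. The column values are made up. The explicit `.dropna()` keeps the result independent of whether `stack()` drops missing values by default, which I believe differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows 0 and 2 are blank, as in the raw data.
toy = pd.DataFrame({
    "Item01": [np.nan, "mouse", np.nan, "router"],
    "Item02": [np.nan, "ink", np.nan, np.nan],
})

# stack() turns the wide frame into one long Series indexed by (row, column);
# grouping on the row level then collects each row's items into a list.
toy_transactions = (
    toy.stack()
       .dropna()
       .groupby(level=0)
       .apply(list)
       .tolist()
)

print(toy_transactions)  # [['mouse', 'ink'], ['router']]
```

Blank rows contribute no items, so they disappear from the transaction list entirely.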

In [6]:
encoder = TransactionEncoder().fit(transactions)
item_transactions = encoder.transform(transactions)
item_transactions_df = pd.DataFrame(item_transactions, columns = encoder.columns_)
item_transactions_df.head()

Unnamed: 0,10ft iPHone Charger Cable,10ft iPHone Charger Cable 2 Pack,3 pack Nylon Braided Lightning Cable,3A USB Type C Cable 3 pack 6FT,5pack Nylon Braided USB C cables,ARRIS SURFboard SB8200 Cable Modem,Anker 2-in-1 USB Card Reader,Anker 4-port USB hub,Anker USB C to HDMI Adapter,Apple Lightning to Digital AV Adapter,...,hP 65 Tri-color ink,iFixit Pro Tech Toolkit,iPhone 11 case,iPhone 12 Charger cable,iPhone 12 Pro case,iPhone 12 case,iPhone Charger Cable Anker 6ft,iPhone SE case,nonda USB C to USB Adapter,seenda Wireless mouse
0,True,False,False,True,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,True,False
1,False,False,False,False,False,False,False,False,False,True,...,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False


In [7]:
item_transactions_df.to_csv('item_transactions_data.csv')

### C2.  See the code below, which generates the frequent itemsets with the Apriori algorithm.

In [8]:
frequently_purchased = apriori(item_transactions_df, min_support=.0034, use_colnames=True)
frequently_purchased.head()

Unnamed: 0,support,itemsets
0,0.009065,(10ft iPHone Charger Cable)
1,0.050527,(10ft iPHone Charger Cable 2 Pack)
2,0.005199,(3 pack Nylon Braided Lightning Cable)
3,0.042528,(3A USB Type C Cable 3 pack 6FT)
4,0.019064,(5pack Nylon Braided USB C cables)
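
For intuition about what `apriori` is doing, the following is a simplified pure-Python sketch of the level-wise idea: keep the frequent single items, build larger candidates only from smaller frequent itemsets, and discard any candidate below the minimum support. It is not mlxtend's implementation, and the toy transactions and the name `apriori_sketch` are hypothetical.

```python
from itertools import combinations

def apriori_sketch(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]
frequent = apriori_sketch(toy, min_support=0.4)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With a minimum support of 0.4, all single items and all pairs survive, while the triple `{a, b, c}` (support 0.2) is dropped.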


### C3.  See code below.

In [9]:
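# Note: this assignment rebinds the imported association_rules function to the resulting
# DataFrame, so re-running this cell raises a TypeError unless the import cell is re-run first.
association_rules = association_rules(frequently_purchased, metric='confidence', min_threshold=0.1)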

In [10]:
association_rules.head()

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
0,(Anker USB C to HDMI Adapter),(10ft iPHone Charger Cable 2 Pack),0.068391,0.050527,0.006932,0.101365,2.006162,0.003477,1.056572
1,(10ft iPHone Charger Cable 2 Pack),(Anker USB C to HDMI Adapter),0.050527,0.068391,0.006932,0.137203,2.006162,0.003477,1.079755
2,(10ft iPHone Charger Cable 2 Pack),(Apple Lightning to Digital AV Adapter),0.050527,0.087188,0.006266,0.124011,1.422329,0.00186,1.042035
3,(10ft iPHone Charger Cable 2 Pack),(Apple Pencil),0.050527,0.179709,0.009065,0.17942,0.998387,-1.5e-05,0.999647
4,(10ft iPHone Charger Cable 2 Pack),(Apple USB-C Charger cable),0.050527,0.132116,0.007066,0.139842,1.058479,0.00039,1.008982


### C4.  Below are the top 3 rules by support, confidence, and lift, respectively. Their summaries follow in section D.

In [11]:
association_rules.sort_values('support', ascending=False).head(3)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
361,(Dust-Off Compressed Gas 2 pack),(VIVO Dual LCD Monitor Desk mount),0.238368,0.17411,0.059725,0.250559,1.439085,0.018223,1.102008
362,(VIVO Dual LCD Monitor Desk mount),(Dust-Off Compressed Gas 2 pack),0.17411,0.238368,0.059725,0.343032,1.439085,0.018223,1.159314
312,(Dust-Off Compressed Gas 2 pack),(HP 61 ink),0.238368,0.163845,0.05266,0.220917,1.348332,0.013604,1.073256


In [12]:
association_rules.sort_values('confidence', ascending=False).head(3)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
2004,"(SanDisk Ultra 64GB card, Screen Mom Screen Cl...",(Dust-Off Compressed Gas 2 pack),0.005733,0.238368,0.003733,0.651163,2.731752,0.002366,2.183344
751,"(10ft iPHone Charger Cable 2 Pack, Nylon Braid...",(Dust-Off Compressed Gas 2 pack),0.007999,0.238368,0.005066,0.633333,2.656954,0.003159,2.077178
763,"(10ft iPHone Charger Cable 2 Pack, Stylus Pen ...",(Dust-Off Compressed Gas 2 pack),0.006799,0.238368,0.004266,0.627451,2.632276,0.002645,2.04438


In [13]:
association_rules.sort_values('lift', ascending=False).head(3)

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
811,"(Dust-Off Compressed Gas 2 pack, Anker 2-in-1 ...",(FEIYOLD Blue light Blocking Glasses),0.009599,0.065858,0.003866,0.402778,6.115863,0.003234,1.564145
139,(Apple Lightning to USB cable),(Falcon Dust Off Compressed Gas),0.015598,0.059992,0.004533,0.290598,4.843951,0.003597,1.325072
813,(Anker 2-in-1 USB Card Reader),"(Dust-Off Compressed Gas 2 pack, FEIYOLD Blue ...",0.029463,0.027596,0.003866,0.131222,4.755044,0.003053,1.119277
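
As a sanity check on the metric definitions from B1, the sketch below recomputes confidence, lift, leverage, and conviction for rule 361 (Dust-Off Compressed Gas 2 pack then VIVO Dual LCD Monitor Desk mount) from the three support values shown in the support table above. Because the inputs are already rounded, the results should agree with the table only to about four decimal places.

```python
# Values copied from rule 361 in the table sorted by support.
antecedent_support = 0.238368   # Dust-Off Compressed Gas 2 pack
consequent_support = 0.17411    # VIVO Dual LCD Monitor Desk mount
rule_support = 0.059725         # both items in the same transaction

confidence = rule_support / antecedent_support
lift = rule_support / (antecedent_support * consequent_support)
leverage = rule_support - antecedent_support * consequent_support
# Equivalent to (Support(X) * Support(Not Y)) / Support(X & Not Y).
conviction = (1 - consequent_support) / (1 - confidence)

print(f"confidence = {confidence:.4f}")  # table: 0.250559
print(f"lift       = {lift:.4f}")        # table: 1.439085
print(f"leverage   = {leverage:.4f}")    # table: 0.018223
print(f"conviction = {conviction:.4f}")  # table: 1.102008
```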


### D.

### 1.  Summary:
* Support is the ratio of transactions that contain the items in a rule. Sorted by support, the top rules pair the Dust-Off Compressed Gas 2 pack with the VIVO Dual LCD Monitor Desk mount (support 0.060, in both directions) and with HP 61 ink (support 0.053). These are the most frequently co-purchased items and would be recommended in that order.
* Confidence is the probability that the consequent is purchased given that the antecedent is purchased. The three highest-confidence rules (0.65, 0.63, and 0.63) all have the Dust-Off Compressed Gas 2 pack as the consequent, so it would be the most highly recommended item.
* Lift measures how much more often the antecedent and consequent are purchased together than would be expected if they were independent. The top three rules by lift have values of 6.1, 4.8, and 4.8, all well above 1, which indicates a strong positive association.

### 2.  Analysis Results:
* After conducting the analysis, we could put the results to practical use, such as recommending items to customers on the telecom company's website based on their user profile and cart items. These items could be recommended during shopping or at checkout, as long as the customer has already shown interest in other items.
* An example would be based off of the results above:
    * If the customer has purchased a (Dust-Off Compressed Gas 2 pack) in the past or has one in their cart, we could recommend (VIVO Dual LCD Monitor Desk mount).
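
One possible way to turn rules into cart recommendations is sketched below. The four rules and their confidence and lift values are copied from the tables in C4, but the `recommend` function, its ranking by lift, and the dictionary layout are my own assumptions rather than part of the analysis above.

```python
# Rules copied from the C4 tables (rows 361, 362, 312, and 139).
rules = [
    {"antecedents": {"Dust-Off Compressed Gas 2 pack"},
     "consequents": {"VIVO Dual LCD Monitor Desk mount"},
     "confidence": 0.250559, "lift": 1.439085},
    {"antecedents": {"VIVO Dual LCD Monitor Desk mount"},
     "consequents": {"Dust-Off Compressed Gas 2 pack"},
     "confidence": 0.343032, "lift": 1.439085},
    {"antecedents": {"Dust-Off Compressed Gas 2 pack"},
     "consequents": {"HP 61 ink"},
     "confidence": 0.220917, "lift": 1.348332},
    {"antecedents": {"Apple Lightning to USB cable"},
     "consequents": {"Falcon Dust Off Compressed Gas"},
     "confidence": 0.290598, "lift": 4.843951},
]

def recommend(cart, rules, top_n=3):
    """Suggest items whose rule antecedents are all in the cart, ranked by lift."""
    cart = set(cart)
    matches = [
        r for r in rules
        if r["antecedents"] <= cart and not r["consequents"] <= cart
    ]
    matches.sort(key=lambda r: (r["lift"], r["confidence"]), reverse=True)
    picks = []
    for r in matches:
        for item in sorted(r["consequents"]):
            if item not in cart and item not in picks:
                picks.append(item)
    return picks[:top_n]

print(recommend({"Dust-Off Compressed Gas 2 pack"}, rules))
# ['VIVO Dual LCD Monitor Desk mount', 'HP 61 ink']
print(recommend(set(), rules))
# []
```

An empty cart returns no suggestions, which matches the assumption in B3 that nothing is recommended when no items have been purchased.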

### 3.  Recommended course of action:
* I would recommend that the telecom company implement a recommendation system on its website, using the methods above, to boost sales of items that are frequently purchased together.

### E.  Panopto video
https://wgu.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=da46cdfa-a393-46f4-a190-ae5f00b59a44

### F.  Third-party code
DataCamp (2022). Market Basket Analysis. DataCamp. https://campus.datacamp.com/courses/market-basket-analysis-in-python