# Part I: Research Question

## A.  Describe the purpose of your data mining report by doing the following:

### A1.  Propose one question relevant to a real-world organizational situation that you will answer using market basket analysis.

*What product pairings have the highest lift values in the dataset?*

### A2.  Define one goal of the data analysis. Ensure your goal is reasonable within the scope of the selected scenario and is represented in the available data.

One goal of the data analysis is to determine which item pairing has the highest lift value within the dataset.

# Part II: Market Basket Justification

## B.  Explain the reasons for using market basket analysis by doing the following: 

### B1.  Explain how market basket analyzes the selected data set. Include expected outcomes.

Market basket analysis uses the Apriori algorithm to iteratively find the item combinations that occur most often in a dataset (a Pandas dataframe in this case), and then scores the resulting pairings by a user-selected criterion: *Lift*, *Confidence*, or *Support*.  Each criterion answers a slightly different question about the data, so the choice depends on what the user wants to learn from the analysis.

Functions from the *mlxtend* library (used alongside Pandas) first determine which itemsets in the encoded dataframe meet a user-specified minimum support, using the *apriori* function. The dataframe of frequent itemsets created in that step is then turned into a dataframe of association rules using the *association_rules* function from the same library, keeping only the rules that meet a minimum value of the chosen metric.

The outcome after using these functions is a dataframe that displays the two items that make up a particular pairing, along with the values for *Lift*, *Confidence* and *Support* (among other metrics).  An example can be seen in the following cell.

In [27]:
lift_rules_table[1:2]

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction,zhangs_metric
1,(Dust-Off Compressed Gas 2 pack),(10ft iPHone Charger Cable 2 Pack),0.238368,0.050527,0.023064,0.096756,1.914955,0.01102,1.051182,0.62733
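
To make the three criteria concrete, the following is a minimal sketch that computes them by hand. The transactions and item names are made up for illustration and are not taken from the report's dataset; the helper functions are hypothetical and are not part of *mlxtend*.

```python
# Toy transactions (hypothetical data, not from the report's dataset).
transactions = [
    {"cable", "gas"},
    {"cable", "gas", "mount"},
    {"gas"},
    {"mount"},
    {"cable", "mount"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's baseline support."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


sup = support({"cable", "gas"}, transactions)
conf = confidence({"cable"}, {"gas"}, transactions)
lft = lift({"cable"}, {"gas"}, transactions)
print(f"support={sup:.3f}, confidence={conf:.3f}, lift={lft:.3f}")
```

A lift above 1 suggests the two items appear together more often than they would if purchases were independent, which is why lift is the criterion used for the research question here.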


### B2.  Provide one example of transactions in the data set.

The 4th row of the original dataframe indicates that a customer purchased the following:
* Item 01: Apple Lightning to Digital AV Adapter
* Item 02: TP-Link AC1750 Smart WiFi Router
* Item 03: Apple Pencil
    
As these were the only items purchased by the customer, Items 04-20 are not applicable (and are listed as *NaN* in the dataframe itself, as seen below).

In [28]:
df_orig[3:4]

Unnamed: 0_level_0,Item02,Item03,Item04,Item05,Item06,Item07,Item08,Item09,Item10,Item11,Item12,Item13,Item14,Item15,Item16,Item17,Item18,Item19,Item20
Item01,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1
Apple Lightning to Digital AV Adapter,TP-Link AC1750 Smart WiFi Router,Apple Pencil,,,,,,,,,,,,,,,,,


### B3.  Summarize one assumption of market basket analysis.

One assumption of market basket analysis is that there exists a quantifiable probability that certain items are bought together by customers (Kadlaskar, 2021).

# Part III: Data Preparation and Analysis

## C.  Prepare and perform market basket analysis by doing the following:

### C1.  Transform the data set to make it suitable for market basket analysis. Include a copy of the cleaned data set.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set() 

import numpy as np

In [2]:
# import csv

df = pd.read_csv("C:/Users/nick_/_WGU/D212/Task III dataset/Churn/teleco_market_basket.csv", index_col=0)


In [3]:
# set notebook up to display all col names

pd.set_option('display.max_columns', None) 

In [4]:
# take look at first handful of rows

df.head(5)

Unnamed: 0_level_0,Item02,Item03,Item04,Item05,Item06,Item07,Item08,Item09,Item10,Item11,Item12,Item13,Item14,Item15,Item16,Item17,Item18,Item19,Item20
Item01,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1
,,,,,,,,,,,,,,,,,,,
Logitech M510 Wireless mouse,HP 63 Ink,HP 65 ink,nonda USB C to USB Adapter,10ft iPHone Charger Cable,HP 902XL ink,Creative Pebble 2.0 Speakers,Cleaning Gel Universal Dust Cleaner,Micro Center 32GB Memory card,YUNSONG 3pack 6ft Nylon Lightning Cable,TopMate C5 Laptop Cooler pad,Apple USB-C Charger cable,HyperX Cloud Stinger Headset,TONOR USB Gaming Microphone,Dust-Off Compressed Gas 2 pack,3A USB Type C Cable 3 pack 6FT,HOVAMP iPhone charger,SanDisk Ultra 128GB card,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses
,,,,,,,,,,,,,,,,,,,
Apple Lightning to Digital AV Adapter,TP-Link AC1750 Smart WiFi Router,Apple Pencil,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,


In [5]:
# create a copy of the original df in case we need to reference the original values later

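df_orig = df.copy()  # .copy() makes an independent dataframe; plain assignment would only create a second name for df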

In [6]:
# create a set of col names in case we need to reference those later

original_dataset_cols_set = set(df_orig.columns.tolist()) 

##### **Identification and treatment of null values**

In [7]:
df = df.reset_index() # thank you https://pynative.com/pandas-reset-index/
df

Unnamed: 0,Item01,Item02,Item03,Item04,Item05,Item06,Item07,Item08,Item09,Item10,Item11,Item12,Item13,Item14,Item15,Item16,Item17,Item18,Item19,Item20
0,,,,,,,,,,,,,,,,,,,,
1,Logitech M510 Wireless mouse,HP 63 Ink,HP 65 ink,nonda USB C to USB Adapter,10ft iPHone Charger Cable,HP 902XL ink,Creative Pebble 2.0 Speakers,Cleaning Gel Universal Dust Cleaner,Micro Center 32GB Memory card,YUNSONG 3pack 6ft Nylon Lightning Cable,TopMate C5 Laptop Cooler pad,Apple USB-C Charger cable,HyperX Cloud Stinger Headset,TONOR USB Gaming Microphone,Dust-Off Compressed Gas 2 pack,3A USB Type C Cable 3 pack 6FT,HOVAMP iPhone charger,SanDisk Ultra 128GB card,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses
2,,,,,,,,,,,,,,,,,,,,
3,Apple Lightning to Digital AV Adapter,TP-Link AC1750 Smart WiFi Router,Apple Pencil,,,,,,,,,,,,,,,,,
4,,,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14997,Falcon Dust Off Compressed Gas,,,,,,,,,,,,,,,,,,,
14998,,,,,,,,,,,,,,,,,,,,
14999,HP 63XL Ink,Apple USB-C Charger cable,,,,,,,,,,,,,,,,,,
15000,,,,,,,,,,,,,,,,,,,,


In [8]:
# identify col(s) with null values
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15002 entries, 0 to 15001
Data columns (total 20 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Item01  7501 non-null   object
 1   Item02  5747 non-null   object
 2   Item03  4389 non-null   object
 3   Item04  3345 non-null   object
 4   Item05  2529 non-null   object
 5   Item06  1864 non-null   object
 6   Item07  1369 non-null   object
 7   Item08  981 non-null    object
 8   Item09  654 non-null    object
 9   Item10  395 non-null    object
 10  Item11  256 non-null    object
 11  Item12  154 non-null    object
 12  Item13  87 non-null     object
 13  Item14  47 non-null     object
 14  Item15  25 non-null     object
 15  Item16  8 non-null      object
 16  Item17  4 non-null      object
 17  Item18  4 non-null      object
 18  Item19  3 non-null      object
 19  Item20  1 non-null      object
dtypes: object(20)
memory usage: 2.3+ MB


In [9]:
# view # rows before filtering out nulls

df.shape

(15002, 20)

In [10]:
# filter out rows where Item01 is null (the blank rows between transactions)

df = df[df["Item01"].notna()] # thanks Dr. Kamara
df

Unnamed: 0,Item01,Item02,Item03,Item04,Item05,Item06,Item07,Item08,Item09,Item10,Item11,Item12,Item13,Item14,Item15,Item16,Item17,Item18,Item19,Item20
1,Logitech M510 Wireless mouse,HP 63 Ink,HP 65 ink,nonda USB C to USB Adapter,10ft iPHone Charger Cable,HP 902XL ink,Creative Pebble 2.0 Speakers,Cleaning Gel Universal Dust Cleaner,Micro Center 32GB Memory card,YUNSONG 3pack 6ft Nylon Lightning Cable,TopMate C5 Laptop Cooler pad,Apple USB-C Charger cable,HyperX Cloud Stinger Headset,TONOR USB Gaming Microphone,Dust-Off Compressed Gas 2 pack,3A USB Type C Cable 3 pack 6FT,HOVAMP iPhone charger,SanDisk Ultra 128GB card,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses
3,Apple Lightning to Digital AV Adapter,TP-Link AC1750 Smart WiFi Router,Apple Pencil,,,,,,,,,,,,,,,,,
5,UNEN Mfi Certified 5-pack Lightning Cable,,,,,,,,,,,,,,,,,,,
7,Cat8 Ethernet Cable,HP 65 ink,,,,,,,,,,,,,,,,,,
9,Dust-Off Compressed Gas 2 pack,Screen Mom Screen Cleaner kit,Moread HDMI to VGA Adapter,HP 62XL Tri-Color ink,Apple USB-C Charger cable,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14993,SanDisk 32GB Ultra SDHC card,Vsco 70 pack stickers,SanDisk 128GB microSDXC card,,,,,,,,,,,,,,,,,
14995,Apple Lightning to Digital AV Adapter,Nylon Braided Lightning to USB cable,Apple Pencil,USB 2.0 Printer cable,ARRIS SURFboard SB8200 Cable Modem,Apple USB-C Charger cable,,,,,,,,,,,,,,
14997,Falcon Dust Off Compressed Gas,,,,,,,,,,,,,,,,,,,
14999,HP 63XL Ink,Apple USB-C Charger cable,,,,,,,,,,,,,,,,,,


In [11]:
# view # rows after filtering out nulls

df.shape

(7501, 20)
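
As a cross-check on the filtering step above, the sketch below compares the approach used in this report (keeping rows where `Item01` is present) with dropping only rows that are entirely null. The small dataframe `toy` is a hypothetical stand-in for the raw file, in which every other row is blank.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the raw file: every other row is entirely blank.
toy = pd.DataFrame(
    {
        "Item01": [np.nan, "HP 65 ink", np.nan, "Apple Pencil"],
        "Item02": [np.nan, "Cat8 Ethernet Cable", np.nan, np.nan],
    }
)

# Approach used in the report: keep rows whose first item is present.
by_first_item = toy[toy["Item01"].notna()]

# Alternative: drop only the rows in which every column is null.
by_all_null = toy.dropna(how="all")

# Both approaches keep the same rows when blank rows are blank in every column.
print(by_first_item.equals(by_all_null))
```

The two approaches agree only as long as no transaction has a missing `Item01` but a value in a later column, which the `df.info()` counts above suggest is the case for this dataset.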

#### Transforming data into a format suitable for Apriori Algorithm

Import mlxtend and relevant module

In [12]:
import mlxtend
from mlxtend.preprocessing import TransactionEncoder

Convert df into list of lists

In [13]:
import time
start = time.time()

In [14]:
rows = []
for i in range(df.shape[0]):
    rows.append([str(df.values[i, j]) for j in range(df.shape[1])]) # thanks Dr. Kamara 
    if i % 500 == 0:
        print(i)

0
500
1000
1500
2000
2500
3000
3500
4000
4500
5000
5500
6000
6500
7000
7500


In [15]:
end = time.time()

lapsed = round(end-start,2)

print(f"Time lapsed: {lapsed} seconds")

Time lapsed: 188.59 seconds
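
The loop above takes about three minutes, most likely because `df.values` is evaluated again for every cell. It also turns missing values into the string `"nan"`, which is why a `nan` column appears after encoding. A possible alternative, sketched here on a hypothetical stand-in dataframe named `toy`, converts the data to an array once and skips missing cells:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the filtered dataframe `df` from the cells above.
toy = pd.DataFrame(
    {
        "Item01": ["HP 65 ink", "Apple Pencil"],
        "Item02": ["Cat8 Ethernet Cable", np.nan],
        "Item03": [np.nan, np.nan],
    }
)

# Convert to a NumPy array once, then keep only the non-null cells of each row.
values = toy.to_numpy()
rows_no_nan = [[item for item in row if pd.notna(item)] for row in values]

print(rows_no_nan)
```

With transaction lists built this way, no `nan` item should reach the encoder, so the later step that drops the `nan` column would likely become unnecessary.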


In [16]:
# feed list into TransactionEncoder

encoder = TransactionEncoder()
data_array = encoder.fit(rows).transform(rows)

In [17]:
# return output to new df

df_encoded = pd.DataFrame(data_array, columns= encoder.columns_)
df_encoded

Unnamed: 0,10ft iPHone Charger Cable,10ft iPHone Charger Cable 2 Pack,3 pack Nylon Braided Lightning Cable,3A USB Type C Cable 3 pack 6FT,5pack Nylon Braided USB C cables,ARRIS SURFboard SB8200 Cable Modem,Anker 2-in-1 USB Card Reader,Anker 4-port USB hub,Anker USB C to HDMI Adapter,Apple Lightning to Digital AV Adapter,Apple Lightning to USB cable,Apple Magic Mouse 2,Apple Pencil,Apple Pencil 2nd Gen,Apple Power Adapter Extension Cable,Apple USB-C Charger cable,AutoFocus 1080p Webcam,BENGOO G90000 headset,Blue Light Blocking Glasses,Blue Light Blocking Glasses 2pack,Brother Genuine High Yield Toner Cartridge,Cat 6 Ethernet Cable 50ft,Cat8 Ethernet Cable,CicTsing MM057 2.4G Wireless Mouse,Cleaning Gel Universal Dust Cleaner,Creative Pebble 2.0 Speakers,DisplayPort ot HDMI adapter,Dust-Off Compressed Gas,Dust-Off Compressed Gas 2 pack,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses,Falcon Dust Off Compressed Gas,HOVAMP Mfi 6pack Lightning Cable,HOVAMP iPhone charger,HP 61 2 pack ink,HP 61 Tri-color ink,HP 61 ink,HP 62XL Tri-Color ink,HP 62XL ink,HP 63 Ink,HP 63 Tri-color ink,HP 63XL Ink,HP 63XL Tri-color ink,HP 64 Tri-Color ink,HP 64 ink,HP 65 ink,HP 902XL ink,HP 952 ink,HP ENVY 5055 printer,HP952XL ink,HooToo USB C Hub,HyperX Cloud Stinger Headset,Jelly Comb 2.4G Slim Wireless mouse,Leader Desk Pad Protector,Logitech M510 Wireless mouse,Logitech MK270 Wireless Keyboard/Mouse,Logitech MK345 Wireless combo,Logitech USB H390 headset,M.2 Screw kit,Mfi-Certified Lightning to USB A Cable,Micro Center 32GB Memory card,Microsot Surface Dock 2,Moread HDMI to VGA Adapter,Mpow HC6 USB Headset,NETGEAR CM500 Cable Modem,NETGEAR Nighthawk WiFi Router,NETGEAR Orbi Home Mesh WiFi System,Nylon Braided Lightning to USB cable,PS4 Headset,Premium Nylon USB Cable,RUNMUS Gaming Headset,SAMSUNG 128GB card,SAMSUNG 256 GB card,SAMSUNG EVO 32GB card,SAMSUNG EVO 64GB card,Sabrent 4-port USB 3.0 hub,SanDisk 128GB Ultra microSDXC card,SanDisk 128GB card,SanDisk 128GB microSDXC card,SanDisk 32GB Ultra SDHC card,SanDisk 32GB card,SanDisk Extreme 128GB card,SanDisk Extreme 256GB card,SanDisk Extreme 32GB 2pack card,SanDisk Extreme Pro 128GB card,SanDisk Extreme Pro 64GB card,SanDisk Ultra 128GB card,SanDisk Ultra 256GB card,SanDisk Ultra 400GB card,SanDisk Ultra 64GB card,Screen Mom Screen Cleaner kit,Stylus Pen for iPad,Syntech USB C to USB Adapter,TONOR USB Gaming Microphone,TP-Link AC1750 Smart WiFi Router,TP-Link AC4000 WiFi router,TopMate C5 Laptop Cooler pad,UNEN Mfi Certified 5-pack Lightning Cable,USB 2.0 Printer cable,USB C to USB Male Adapter,USB Type C Cable,USB Type C to USB-A Charger cable,VIVO Dual LCD Monitor Desk mount,VicTsing Mouse Pad,VicTsing Wireless mouse,Vsco 70 pack stickers,Webcam with Microphone,XPOWER A-2 Air Pump blower,YUNSONG 3pack 6ft Nylon Lightning Cable,hP 65 Tri-color ink,iFixit Pro Tech Toolkit,iPhone 11 case,iPhone 12 Charger cable,iPhone 12 Pro case,iPhone 12 case,iPhone Charger Cable Anker 6ft,iPhone SE case,nan,nonda USB C to USB Adapter,seenda Wireless mouse
0,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,True,False,False,True,True,True,False,False,True,False,False,False,False,False,True,False,False,False,False,False,True,True,False,False,False,False,True,False,False,True,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False
1,False,False,False,False,False,False,False,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7496,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False
7497,False,False,False,False,False,True,False,False,False,True,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
7498,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
7499,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False
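
The encoded table above has one boolean column per distinct item and one row per transaction. The sketch below reproduces that idea in plain Python on hypothetical transactions. It is a rough stand-in for illustration, not the `TransactionEncoder` implementation.

```python
# Hypothetical transactions, already stripped of missing values.
transactions = [
    ["HP 65 ink", "Cat8 Ethernet Cable"],
    ["Apple Pencil"],
    ["Apple Pencil", "HP 65 ink"],
]

# One column per distinct item, in sorted order.
columns = sorted({item for basket in transactions for item in basket})

# One boolean row per transaction: True where the basket contains the item.
encoded = [[item in basket for item in columns] for basket in transactions]

for row in encoded:
    print(dict(zip(columns, row)))
```

This layout is what the Apriori step needs: the support of an item is simply the share of `True` values in its column.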


In [18]:
# remove the "nan" column (created when missing values were converted to the string "nan")

cleaned_df_encoded = df_encoded.drop(["nan"], axis=1)
cleaned_df_encoded

Unnamed: 0,10ft iPHone Charger Cable,10ft iPHone Charger Cable 2 Pack,3 pack Nylon Braided Lightning Cable,3A USB Type C Cable 3 pack 6FT,5pack Nylon Braided USB C cables,ARRIS SURFboard SB8200 Cable Modem,Anker 2-in-1 USB Card Reader,Anker 4-port USB hub,Anker USB C to HDMI Adapter,Apple Lightning to Digital AV Adapter,Apple Lightning to USB cable,Apple Magic Mouse 2,Apple Pencil,Apple Pencil 2nd Gen,Apple Power Adapter Extension Cable,Apple USB-C Charger cable,AutoFocus 1080p Webcam,BENGOO G90000 headset,Blue Light Blocking Glasses,Blue Light Blocking Glasses 2pack,Brother Genuine High Yield Toner Cartridge,Cat 6 Ethernet Cable 50ft,Cat8 Ethernet Cable,CicTsing MM057 2.4G Wireless Mouse,Cleaning Gel Universal Dust Cleaner,Creative Pebble 2.0 Speakers,DisplayPort ot HDMI adapter,Dust-Off Compressed Gas,Dust-Off Compressed Gas 2 pack,FEEL2NICE 5 pack 10ft Lighning cable,FEIYOLD Blue light Blocking Glasses,Falcon Dust Off Compressed Gas,HOVAMP Mfi 6pack Lightning Cable,HOVAMP iPhone charger,HP 61 2 pack ink,HP 61 Tri-color ink,HP 61 ink,HP 62XL Tri-Color ink,HP 62XL ink,HP 63 Ink,HP 63 Tri-color ink,HP 63XL Ink,HP 63XL Tri-color ink,HP 64 Tri-Color ink,HP 64 ink,HP 65 ink,HP 902XL ink,HP 952 ink,HP ENVY 5055 printer,HP952XL ink,HooToo USB C Hub,HyperX Cloud Stinger Headset,Jelly Comb 2.4G Slim Wireless mouse,Leader Desk Pad Protector,Logitech M510 Wireless mouse,Logitech MK270 Wireless Keyboard/Mouse,Logitech MK345 Wireless combo,Logitech USB H390 headset,M.2 Screw kit,Mfi-Certified Lightning to USB A Cable,Micro Center 32GB Memory card,Microsot Surface Dock 2,Moread HDMI to VGA Adapter,Mpow HC6 USB Headset,NETGEAR CM500 Cable Modem,NETGEAR Nighthawk WiFi Router,NETGEAR Orbi Home Mesh WiFi System,Nylon Braided Lightning to USB cable,PS4 Headset,Premium Nylon USB Cable,RUNMUS Gaming Headset,SAMSUNG 128GB card,SAMSUNG 256 GB card,SAMSUNG EVO 32GB card,SAMSUNG EVO 64GB card,Sabrent 4-port USB 3.0 hub,SanDisk 128GB Ultra microSDXC card,SanDisk 128GB card,SanDisk 128GB microSDXC card,SanDisk 32GB Ultra SDHC card,SanDisk 32GB card,SanDisk Extreme 128GB card,SanDisk Extreme 256GB card,SanDisk Extreme 32GB 2pack card,SanDisk Extreme Pro 128GB card,SanDisk Extreme Pro 64GB card,SanDisk Ultra 128GB card,SanDisk Ultra 256GB card,SanDisk Ultra 400GB card,SanDisk Ultra 64GB card,Screen Mom Screen Cleaner kit,Stylus Pen for iPad,Syntech USB C to USB Adapter,TONOR USB Gaming Microphone,TP-Link AC1750 Smart WiFi Router,TP-Link AC4000 WiFi router,TopMate C5 Laptop Cooler pad,UNEN Mfi Certified 5-pack Lightning Cable,USB 2.0 Printer cable,USB C to USB Male Adapter,USB Type C Cable,USB Type C to USB-A Charger cable,VIVO Dual LCD Monitor Desk mount,VicTsing Mouse Pad,VicTsing Wireless mouse,Vsco 70 pack stickers,Webcam with Microphone,XPOWER A-2 Air Pump blower,YUNSONG 3pack 6ft Nylon Lightning Cable,hP 65 Tri-color ink,iFixit Pro Tech Toolkit,iPhone 11 case,iPhone 12 Charger cable,iPhone 12 Pro case,iPhone 12 case,iPhone Charger Cable Anker 6ft,iPhone SE case,nonda USB C to USB Adapter,seenda Wireless mouse
0,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,True,False,False,True,True,True,False,False,True,False,False,False,False,False,True,False,False,False,False,False,True,True,False,False,False,False,True,False,False,True,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,False
1,False,False,False,False,False,False,False,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7496,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False
7497,False,False,False,False,False,True,False,False,False,True,False,False,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
7498,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
7499,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [19]:
cleaned_df_encoded.shape

(7501, 119)

Save output to csv file

In [20]:
cleaned_df_encoded.to_csv("C:/Users/nick_/_WGU/D212/Task III dataset/Churn/market_basket_clean.csv")
print("Successfully saved")

Successfully saved
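
Because `to_csv` writes the dataframe index as an unnamed first column by default, reloading the saved file needs a little care. The sketch below shows a round trip through an in-memory buffer. The two-row dataframe `encoded` is a hypothetical stand-in for `cleaned_df_encoded`, and the buffer replaces the file path only so that the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for cleaned_df_encoded.
encoded = pd.DataFrame({"Apple Pencil": [True, False], "HP 65 ink": [False, True]})

buffer = io.StringIO()
encoded.to_csv(buffer)  # the index is written as an unnamed first column
buffer.seek(0)

# Treat that first column as the index again when reading the data back.
reloaded = pd.read_csv(buffer, index_col=0)
print(reloaded.dtypes)
```

Passing `index=False` to `to_csv` is another option if the row index does not need to be kept in the saved file.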


### C2.  Execute the code used to generate association rules with the Apriori algorithm. Provide screenshots that demonstrate that the code is error free.

In [21]:
# import relevant libraries

from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules

In [22]:
# find frequent itemsets (support >= 0.02) with Apriori; despite its name, `rules` holds itemsets, not rules

rules = apriori(cleaned_df_encoded, min_support=0.02, use_colnames=True)
rules

Unnamed: 0,support,itemsets
0,0.050527,(10ft iPHone Charger Cable 2 Pack)
1,0.042528,(3A USB Type C Cable 3 pack 6FT)
2,0.029463,(Anker 2-in-1 USB Card Reader)
3,0.068391,(Anker USB C to HDMI Adapter)
4,0.087188,(Apple Lightning to Digital AV Adapter)
...,...,...
98,0.023730,"(USB 2.0 Printer cable, Screen Mom Screen Clea..."
99,0.035462,"(Screen Mom Screen Cleaner kit, VIVO Dual LCD ..."
100,0.020131,"(USB 2.0 Printer cable, Stylus Pen for iPad)"
101,0.025197,"(Stylus Pen for iPad, VIVO Dual LCD Monitor De..."
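
The table above lists every itemset whose support is at least 0.02. To illustrate how a level-wise Apriori search arrives at such a list, here is a simplified sketch in plain Python. The function `frequent_itemsets` and the toy transactions are hypothetical; this is not the *mlxtend* implementation and ignores its performance optimizations.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {frozenset: support}."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: single items that meet the threshold.
    items = {item for basket in baskets for item in basket}
    current = {
        frozenset([item])
        for item in items
        if support(frozenset([item])) >= min_support
    }
    result = {itemset: support(itemset) for itemset in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c
            for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        result.update({c: support(c) for c in current})
        k += 1
    return result


# Toy transactions (hypothetical data, not from the report's dataset).
transactions = [
    ["cable", "gas"],
    ["cable", "gas", "mount"],
    ["gas"],
    ["mount"],
    ["cable", "mount"],
]
found = frequent_itemsets(transactions, min_support=0.4)
for itemset, sup in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 2))
```

The prune step is what keeps the search manageable: an itemset can only be frequent if all of its subsets are frequent, so most larger combinations never need to be counted.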


In [23]:
# create rules table

lift_rules_table = association_rules(rules, metric = "lift", min_threshold=1)
lift_rules_table

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction,zhangs_metric
0,(10ft iPHone Charger Cable 2 Pack),(Dust-Off Compressed Gas 2 pack),0.050527,0.238368,0.023064,0.456464,1.914955,0.011020,1.401255,0.503221
1,(Dust-Off Compressed Gas 2 pack),(10ft iPHone Charger Cable 2 Pack),0.238368,0.050527,0.023064,0.096756,1.914955,0.011020,1.051182,0.627330
2,(Anker USB C to HDMI Adapter),(Dust-Off Compressed Gas 2 pack),0.068391,0.238368,0.024397,0.356725,1.496530,0.008095,1.183991,0.356144
3,(Dust-Off Compressed Gas 2 pack),(Anker USB C to HDMI Adapter),0.238368,0.068391,0.024397,0.102349,1.496530,0.008095,1.037830,0.435627
4,(VIVO Dual LCD Monitor Desk mount),(Anker USB C to HDMI Adapter),0.174110,0.068391,0.020931,0.120214,1.757755,0.009023,1.058905,0.521973
...,...,...,...,...,...,...,...,...,...,...
89,(VIVO Dual LCD Monitor Desk mount),(Screen Mom Screen Cleaner kit),0.174110,0.129583,0.035462,0.203675,1.571779,0.012900,1.093043,0.440468
90,(USB 2.0 Printer cable),(Stylus Pen for iPad),0.170911,0.095054,0.020131,0.117785,1.239135,0.003885,1.025766,0.232768
91,(Stylus Pen for iPad),(USB 2.0 Printer cable),0.095054,0.170911,0.020131,0.211781,1.239135,0.003885,1.051852,0.213256
92,(Stylus Pen for iPad),(VIVO Dual LCD Monitor Desk mount),0.095054,0.174110,0.025197,0.265077,1.522468,0.008647,1.123778,0.379218


### C3.  Provide values for the support, lift, and confidence of the association rules table.

The table below reports the lift, confidence, and support of each association rule, with the objective of finding the pairings with the greatest **Lift** value.

In [24]:
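# sort rules by lift, then confidence, then support (all descending)
rules_table_sorted = lift_rules_table.sort_values(["lift", "confidence", "support"], ascending=[False, False, False])
# NOTE: the short table is built from lift_rules_table, not rules_table_sorted, so its rows stay in index order
lift_rules_table_short = lift_rules_table[["antecedents", "consequents", "lift", "confidence", "support"]]
lift_rules_table_short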

Unnamed: 0,antecedents,consequents,lift,confidence,support
0,(10ft iPHone Charger Cable 2 Pack),(Dust-Off Compressed Gas 2 pack),1.914955,0.456464,0.023064
1,(Dust-Off Compressed Gas 2 pack),(10ft iPHone Charger Cable 2 Pack),1.914955,0.096756,0.023064
2,(Anker USB C to HDMI Adapter),(Dust-Off Compressed Gas 2 pack),1.496530,0.356725,0.024397
3,(Dust-Off Compressed Gas 2 pack),(Anker USB C to HDMI Adapter),1.496530,0.102349,0.024397
4,(VIVO Dual LCD Monitor Desk mount),(Anker USB C to HDMI Adapter),1.757755,0.120214,0.020931
...,...,...,...,...,...
89,(VIVO Dual LCD Monitor Desk mount),(Screen Mom Screen Cleaner kit),1.571779,0.203675,0.035462
90,(USB 2.0 Printer cable),(Stylus Pen for iPad),1.239135,0.117785,0.020131
91,(Stylus Pen for iPad),(USB 2.0 Printer cable),1.239135,0.211781,0.020131
92,(Stylus Pen for iPad),(VIVO Dual LCD Monitor Desk mount),1.522468,0.265077,0.025197
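
The output above is in index order rather than lift order, because the column selection is applied to `lift_rules_table` and not to `rules_table_sorted`. The following is a minimal sketch of the difference, using only the five visible rows (index 0 to 4) copied from the output. The names `sample`, `top3_unsorted` and `top3_sorted` are illustrative, and the result says nothing about the rows that are elided from the display.

```python
import pandas as pd

# Rows 0-4 copied from the rules table output; the full table has more rows.
sample = pd.DataFrame(
    {
        "antecedents": [
            "10ft iPHone Charger Cable 2 Pack",
            "Dust-Off Compressed Gas 2 pack",
            "Anker USB C to HDMI Adapter",
            "Dust-Off Compressed Gas 2 pack",
            "VIVO Dual LCD Monitor Desk mount",
        ],
        "consequents": [
            "Dust-Off Compressed Gas 2 pack",
            "10ft iPHone Charger Cable 2 Pack",
            "Dust-Off Compressed Gas 2 pack",
            "Anker USB C to HDMI Adapter",
            "Anker USB C to HDMI Adapter",
        ],
        "lift": [1.914955, 1.914955, 1.496530, 1.496530, 1.757755],
        "confidence": [0.456464, 0.096756, 0.356725, 0.102349, 0.120214],
        "support": [0.023064, 0.023064, 0.024397, 0.024397, 0.020931],
    }
)

columns = ["antecedents", "consequents", "lift", "confidence", "support"]

# Selecting columns from the unsorted frame keeps the original row order.
top3_unsorted = sample[columns].head(3)

# Selecting columns from the sorted frame ranks rows by lift first.
sample_sorted = sample.sort_values(
    ["lift", "confidence", "support"], ascending=[False, False, False]
)
top3_sorted = sample_sorted[columns].head(3)

print(list(top3_unsorted.index))
print(list(top3_sorted.index))
```

On these five rows, the sorted version promotes the rule at index 4 (lift 1.757755) ahead of the rules at index 2 and 3 (lift 1.496530).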


### C4.  Explain the top three relevant rules generated by the Apriori algorithm. Include a screenshot of the top three relevant rules.

Based on the rules table created above, the three rules examined here are the first three rows of `lift_rules_table_short`.  That table is in index order rather than sorted by lift, so these are not necessarily the three highest-lift rules overall (the rule at index 4, for example, has a lift of 1.757755, which is higher than that of rule 3 below):
1. Purchasers of a *10ft iPhone Charger Cable 2 pack* buying a *Dust-Off Compressed Gas 2 pack*
2. Purchasers of a *Dust-Off Compressed Gas 2 pack* buying a *10ft iPhone Charger Cable 2 pack*
3. Purchasers of an *Anker USB C to HDMI Adapter* buying a *Dust-Off Compressed Gas 2 pack*

In [25]:
lift_rules_table_short.head(3)

Unnamed: 0,antecedents,consequents,lift,confidence,support
0,(10ft iPHone Charger Cable 2 Pack),(Dust-Off Compressed Gas 2 pack),1.914955,0.456464,0.023064
1,(Dust-Off Compressed Gas 2 pack),(10ft iPHone Charger Cable 2 Pack),1.914955,0.096756,0.023064
2,(Anker USB C to HDMI Adapter),(Dust-Off Compressed Gas 2 pack),1.49653,0.356725,0.024397


# Part IV: Data Summary and Implications

## D.  Summarize your data analysis by doing the following:

### D1.  Summarize the significance of support, lift, and confidence from the results of the analysis.

(This analysis assumes that a customer puts the first item into their cart and then the second item within one overall transaction, rather than buying the first item and then buying the second item in a later transaction.  Both interpretations seem viable, so I'm stating mine here in the hope that it clears up any questions about my analysis.)

The significance of support, lift, and confidence from the results of the analysis is as follows: 
1. **Support** 
    * Instances of a customer adding a *10ft iPhone Charger Cable 2 pack* to their cart and then a *Dust-Off Compressed Gas 2 pack* (and vice versa) occur in **2.31%** of all transactions in the dataset
    * Instances of a customer adding an *Anker USB C to HDMI Adapter* to their cart and then a *Dust-Off Compressed Gas 2 pack* (and vice versa) occur in **2.44%** of all transactions in the dataset
    
    * ***Significance: we see that the pairing with the highest lift occurred in roughly 2.3% of all transactions***
    
    
2. **Lift** 
    * Customers adding a *10ft iPhone Charger Cable 2 pack* to their cart are **1.91x** as likely as customers overall to also add a *Dust-Off Compressed Gas 2 pack* to their cart, and vice versa
    * Customers adding a *Dust-Off Compressed Gas 2 pack* to their cart are **1.5x** as likely as customers overall to also add an *Anker USB C to HDMI Adapter* to their cart, and vice versa
    
    * ***Significance: if a customer purchases either a cable charger pack or a compressed gas pack, the likelihood of them purchasing the other item is nearly double what it would otherwise be***
    
    
3. **Confidence** 
    * **45.6%** of all transactions with a *10ft iPhone Charger Cable 2 pack* also contained a *Dust-Off Compressed Gas 2 pack*
    * **9.7%** of transactions with a *Dust-Off Compressed Gas 2 pack* also contained a *10ft iPhone Charger Cable 2 pack*
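    * **35.7%** of transactions with an *Anker USB C to HDMI Adapter* also contained a *Dust-Off Compressed Gas 2 pack*
    * **10.2%** of transactions with a *Dust-Off Compressed Gas 2 pack* also contained an *Anker USB C to HDMI Adapter*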
    * ***Significance: nearly half of all transactions containing a cable charger pack also contain a compressed gas pack***
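
The percentages above can be reproduced from the `antecedent support`, `consequent support` and `support` columns of the rules table. This is a minimal sketch using the values copied from rule 0; the variable names are illustrative, and small differences from the table in the last decimal places come from the rounding of the displayed inputs.

```python
# Values copied from rule 0 of the rules table.
support_a = 0.050527    # antecedent support: 10ft iPhone charger cable 2 pack
support_b = 0.238368    # consequent support: Dust-Off compressed gas 2 pack
support_ab = 0.023064   # share of transactions containing both items

# Confidence: of the transactions containing one item, the share that
# also contain the other. It depends on which item is the antecedent.
confidence_a_to_b = support_ab / support_a
confidence_b_to_a = support_ab / support_b

# Lift: observed co-occurrence relative to what independence would predict.
# It is the same in both directions.
lift = support_ab / (support_a * support_b)

print(f"support:                 {support_ab:.2%}")
print(f"confidence cable -> gas: {confidence_a_to_b:.1%}")
print(f"confidence gas -> cable: {confidence_b_to_a:.1%}")
print(f"lift:                    {lift:.2f}")
```

This also shows why the two directions of the same pairing share one lift and one support value but have very different confidence values: the compressed gas pack appears in far more transactions than the cable pack, so the same overlap is a much smaller share of its transactions.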

### D2.  Discuss the practical significance of your findings from the analysis.

The three rules highlighted above each occur in roughly 2.3% to 2.4% of the transactions in the dataset, and the product pairs in each relationship are such that a customer is at least 1.5 times as likely to buy the second product if the first one is in their cart.  In the case of customers buying a *10ft iPhone Charger Cable 2 pack*, there is nearly a 1 in 2 chance (45.6%) that they will also buy a *Dust-Off Compressed Gas 2 pack*.  

Also of note is that single versions of the cable pack and compressed gas pack were options in the dataset.  This suggests that there is a stronger relationship between the two-packs of these items than there is between a single cable and a single compressed gas unit.

### D3.  Recommend a course of action for the real-world organizational situation from part A1 based on the results from part D1.

In light of the insights gathered here, a logical next step would be to evaluate how products are presented wherever customers shop (in-store or online).  Many online retailers have a *customers also bought* section on each product page.  Given the results seen here, it would make sense to build out something similar so that customers can see other products that previous customers have bought, much as a scarf may be recommended on the product page for a jacket.
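
As an illustration of how such a section might be fed by the rules table, here is a minimal sketch that maps each antecedent to its consequents, ranked by lift. The function `build_also_bought` and the variable `sample_rules` are hypothetical names, and the data is limited to rows 0 to 4 copied from the rules table, so this is not a full recommendation system.

```python
from collections import defaultdict

# (antecedent, consequent, lift) copied from rows 0-4 of the rules table.
sample_rules = [
    ("10ft iPHone Charger Cable 2 Pack", "Dust-Off Compressed Gas 2 pack", 1.914955),
    ("Dust-Off Compressed Gas 2 pack", "10ft iPHone Charger Cable 2 Pack", 1.914955),
    ("Anker USB C to HDMI Adapter", "Dust-Off Compressed Gas 2 pack", 1.496530),
    ("Dust-Off Compressed Gas 2 pack", "Anker USB C to HDMI Adapter", 1.496530),
    ("VIVO Dual LCD Monitor Desk mount", "Anker USB C to HDMI Adapter", 1.757755),
]


def build_also_bought(rules, top_n=3):
    """Map each antecedent to its top_n consequents, highest lift first."""
    by_item = defaultdict(list)
    for antecedent, consequent, lift in rules:
        by_item[antecedent].append((consequent, lift))
    return {
        item: [
            consequent
            for consequent, _ in sorted(pairs, key=lambda p: p[1], reverse=True)[:top_n]
        ]
        for item, pairs in by_item.items()
    }


also_bought = build_also_bought(sample_rules)
print(also_bought["Dust-Off Compressed Gas 2 pack"])
```

Ranking by lift rather than by confidence favours pairings that co-occur more often than chance would predict, instead of simply recommending the most popular items.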

# Part V: Attachments

## E.  Provide a Panopto video recording that includes the presenter and a vocalized demonstration showing all code used, the code being executed, and the results of all code used in the task.

Link can be found [here](https://wgu.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=d6899a07-f4db-4c40-adef-b0d3009b733b)

## F.  Record all web sources you used to acquire data or segments of third-party code to support the application. Ensure the web sources are reliable.

_Cited in cells where sources were used_

## G.  Acknowledge sources, using in-text citations and references, for content that is quoted, paraphrased, or summarized.

_A special thank you to Dr. Kamara for taking the time to record these videos for this Task in particular--they were most helpful!!_

Kadlaskar, A. (2021, October 2). Market basket Analysis | Guide on Market Basket Analysis. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/10/a-comprehensive-guide-on-market-basket-analysis/

Kamara, Dr. K. (n.d.-a). Data Mining II - D212 Task 3. Retrieved December 9, 2023, from https://wgu.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=db85c4f1-0da5-4bde-a1a4-b07c0019d46d

Kamara, Dr. K. (n.d.-b). Data Mining II - D212 Theory. Retrieved December 9, 2023, from https://wgu.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=9541a29b-2f14-4c5d-9d86-af030005bcf6

Kamara, Dr. K. (n.d.-c). Installing mlxtend in anaconda on Window 10. Retrieved December 9, 2023, from https://wgu.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=339038ae-a6fd-415a-8af8-aeee00e3a0d9

Randeniya, M. (2023, April 16). Performing a Market-Basket Analysis Using the Apriori Algorithm — Python. Medium. https://medium.com/@randeniyamalitha08/performing-a-market-basket-analysis-using-the-apriori-algorithm-python-8a11742253dd

Sivek, Dr. S. C. (2020, November 17). Market Basket Analysis 101: Key Concepts. Medium. https://towardsdatascience.com/market-basket-analysis-101-key-concepts-1ddc6876cd00

Understanding Market Basket Analysis in Data Mining. (n.d.). Www.turing.com. Retrieved December 9, 2023, from https://www.turing.com/kb/market-basket-analysis

WebFOCUS 8 Technical Library. (n.d.). Infocenter.informationbuilders.com. Retrieved December 9, 2023, from https://infocenter.informationbuilders.com/wf80/index.jsp?topic=%2Fpubdocs%2FRStat16%2Fsource%2Ftopic49.htm

Working definitions (Sivek, 2020)

Support: (measuring the number of transactions containing a particular product pairing divided by the total number of transactions made)

Lift: (measuring the ratio of the pairing's support to the product of the two items' individual supports, i.e. how much more often the items occur together than would be expected if they were independent)

Confidence: (measuring the ratio of the number of transactions containing both items to the number of transactions containing the first item)


An alternative set of definitions (Understanding Market Basket Analysis in Data Mining, n.d.):

Support: It is the total number of transactions made for a particular product divided by the total number of transactions made.

Confidence: It is the ratio of combined transactions to individual transactions.

Lift: It is the ratio of the confidence percent to the support percent.
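
The two sets of definitions above can be written in a common notation. The following is a sketch in my own notation, not taken from either source: $T$ is the set of all transactions, $N = |T|$, and $A \Rightarrow B$ is a rule with antecedent $A$ and consequent $B$.

$$
\operatorname{supp}(A \cup B) = \frac{|\{\, t \in T : A \cup B \subseteq t \,\}|}{N}
$$

$$
\operatorname{conf}(A \Rightarrow B) = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)}
$$

$$
\operatorname{lift}(A \Rightarrow B) = \frac{\operatorname{conf}(A \Rightarrow B)}{\operatorname{supp}(B)} = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)\,\operatorname{supp}(B)}
$$

The first form of lift matches the "confidence to support" wording, assuming that "support" there means the consequent's support $\operatorname{supp}(B)$. The second form shows that lift is symmetric in $A$ and $B$, which is consistent with the rules table, where both directions of a pairing share the same lift value.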