# Practice Session 04: Basket analysis

<font size="+2" color="blue">Additional results: experiments on cross-department association rules</font>

Author: <font color="blue">Guillem Escriba Molto</font>

E-mail: <font color="blue">guillem.escriba01@estudiant.upf.edu</font>

Date: <font color="blue">20/10/2022</font>

In [1]:
# Installation of apyori
!pip install apyori



In [2]:
import numpy as np  
import matplotlib.pyplot as plt  
import pandas as pd  
import csv
import gzip
from apyori import apriori

# 1. Playing with apyori

In [3]:
# LEAVE AS-IS

def print_apyori_output (association_results, info=False, info_key=False):
    for relation_record in association_results:
        itemset = list(relation_record.items)
        
        # Consider only itemsets of two elements
        if len(itemset) > 1: 
        
            print("Rules involving itemset %s" % itemset)
            support = relation_record.support

            for rules in relation_record.ordered_statistics:
                antecedent = list(rules.items_base)
                consequent = list(rules.items_add)
                
                if info_key:
                    antecedent = [info.loc[x][info_key] for x in antecedent]
                    consequent = [info.loc[x][info_key] for x in consequent]
                
                confidence = rules.confidence
                lift = rules.lift

                print("%s => %s (support=%.4f, confidence=%.2f, lift=%.2f)" %
                      (antecedent, consequent, support, confidence, lift))
            print()

In [4]:
transactions = [
    ['beer', 'chips', 'nuts', 'olives'],
    ['beer', 'chips', 'olives'],
    ['chips', 'nuts' ],
    ['chips', 'olives'],
    ['beer', 'nuts' ],
    ['chips'],
    ['nuts', 'olives'],
    ['beer', 'nuts'],
    ['beer', 'chips', 'olives'], 
    ['beer', 'nuts', 'olives'], 

]
results = list(apriori(transactions, min_support=0.2, min_confidence=1.0, min_lift=1.0))
print_apyori_output(results)

Rules involving itemset ['chips', 'beer', 'olives']
['chips', 'beer'] => ['olives'] (support=0.3000, confidence=1.00, lift=1.67)



In [5]:
transactions = [
    ['beer', 'chips', 'nuts', 'olives'],
    ['beer', 'chips', 'olives'],
    ['beer', 'chips', 'nuts', 'olives'],
    ['chips', 'nuts' ],
    ['beer', 'chips', 'olives'], 
    ['chips', 'olives'],
    ['beer', 'nuts' ],
    ['beer', 'chips', 'nuts', 'olives'],
    ['chips', 'nuts' ],
    ['chips'],
    ['chips', 'nuts' ],
    ['beer', 'chips', 'olives'], 
    ['chips'],
    ['nuts', 'olives'],
    ['beer', 'nuts'],
    ['beer', 'chips', 'olives'], 
    ['beer', 'nuts', 'olives'], 
    ['beer', 'chips', 'nuts', 'olives'],
    ['beer', 'chips', 'nuts', 'olives'],
    ['beer', 'chips', 'olives'],
    ['chips', 'nuts' ],
    ['chips', 'olives'],
    ['beer', 'nuts' ],
    ['chips'],
    ['chips', 'nuts' ],
    ['chips', 'nuts' ],
    ['nuts', 'olives'],
    ['beer', 'nuts'],
    ['beer', 'chips', 'olives'], 
    ['beer', 'nuts', 'olives'], 

]
results = list(apriori(transactions, min_support=0.36, min_confidence=0.76, min_lift=1.35))
print_apyori_output(results)

Rules involving itemset ['chips', 'beer', 'olives']
['chips', 'beer'] => ['olives'] (support=0.3667, confidence=1.00, lift=1.76)
['chips', 'olives'] => ['beer'] (support=0.3667, confidence=0.85, lift=1.49)



<font size="+1">Both rules come from the same itemset {"chips", "beer", "olives"}, so they share the same support: I would count how many transactions contain all three items (11) and divide by the total number of transactions (30), which gives 11/30 ≈ 0.3667. To compute the confidence of a rule A => B, I would divide the number of transactions that contain both A and B by the number of transactions that contain A. For the first rule, A = {"chips", "beer"} appears in 11 transactions and all 11 also contain B = "olives", so the confidence is 11/11 = 1.00: every time "chips" and "beer" were bought together, "olives" were bought too. For the second rule, A = {"chips", "olives"} appears in 13 transactions, 11 of which also contain B = "beer", so the confidence is 11/13 ≈ 0.85. Lastly, the lift is the confidence divided by the support of the consequent B. For the first rule, "olives" appears in 17 transactions, so its support is 17/30 and the lift is (11/11)/(17/30) ≈ 1.76. For the second rule, "beer" also appears in 17 transactions, so the lift is (11/13)/(17/30) ≈ 1.49.</font>
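
The hand computation above can be checked with a few lines of plain Python. This is only a sketch: the helper names `support`, `confidence` and `lift` are introduced here for illustration and are not part of apyori, and the 30 baskets are grouped by identical content (the order of the baskets does not affect the metrics).

```python
from fractions import Fraction

# The same 30 baskets as in the cell above, grouped by identical content
baskets = (
    [{"beer", "chips", "nuts", "olives"}] * 5
    + [{"beer", "chips", "olives"}] * 6
    + [{"chips", "nuts"}] * 6
    + [{"chips", "olives"}] * 2
    + [{"beer", "nuts"}] * 4
    + [{"chips"}] * 3
    + [{"nuts", "olives"}] * 2
    + [{"beer", "nuts", "olives"}] * 2
)

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return Fraction(sum(items <= basket for basket in baskets), len(baskets))

def confidence(antecedent, consequent):
    """support(A and B) / support(A)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)

def lift(antecedent, consequent):
    """confidence(A => B) / support(B)."""
    return confidence(antecedent, consequent) / support(consequent)

for a, b in [({"chips", "beer"}, {"olives"}), ({"chips", "olives"}, {"beer"})]:
    print("%s => %s (support=%.4f, confidence=%.2f, lift=%.2f)" % (
        sorted(a), sorted(b),
        float(support(a | b)), float(confidence(a, b)), float(lift(a, b))))
```

Using `Fraction` keeps the intermediate values exact (11/30, 11/13, 17/30), so the rounding only happens when printing.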

# 2. Load and prepare the shopping baskets

In [6]:
# LEAVE AS-IS

# File names
INPUT_PRODUCTS = "instacart-products.csv"
INPUT_TRANSACTIONS = "instacart-transactions.csv.gz"

# Read into a dataframe
products = pd.read_csv(INPUT_PRODUCTS, delimiter=",")

# Set product_id as index, and drop column aisle_id
products = products.set_index('product_id').drop(columns=['aisle_id'])

products.head(100)

product_id,product_name,department_id
1,Chocolate Sandwich Cookies,19
2,All-Seasons Salt,13
3,Robust Golden Unsweetened Oolong Tea,7
4,Smart Ones Classic Favorites Mini Rigatoni Wit...,1
5,Green Chile Anytime Sauce,13
...,...,...
96,Sprinklez Confetti Fun Organic Toppings,13
97,Organic Chamomile Lemon Tea,7
98,2% Yellow American Cheese,16
99,Local Living Butter Lettuce,4


## 2.1. Select by department

As this file is large and complex, we will focus on one or two departments and try to draw some conclusions about the products in those departments. The following cell, which you should leave as-is, lists some department names.


In [7]:
# LEAVE AS-IS

DEPT_BAKERY = 3
DEPT_VEGGIES = 4
DEPT_ALCOHOL = 5
DEPT_WORLD = 6
DEPT_DRINKS = 7
DEPT_PETS = 8
DEPT_PHARMACY = 11
DEPT_CLEANING = 17
DEPT_BABIES = 18

In [8]:
def select_from_departments(products,product_ids,department_ids): # Main function, without prints, to avoid unnecessary output
    selected_products = [] # Store the products belonging to the selected departments
    for product_id in product_ids: # Iterates for each product
        if products.loc[product_id].department_id in department_ids: # Checks if the product belongs to the selected departments
            selected_products.append(product_id) # Store in a list
    return selected_products

In [9]:
def select_from_departments_print(products,product_ids,department_ids): # Function created to print the results for debugging
    selected_products = []
    print("Test case: \n{}\n\nInput products:".format(product_ids))
    for product_id in product_ids:
        print("{} {} (dept {})".format(product_id,products.loc[product_id].product_name,products.loc[product_id].department_id))
        if products.loc[product_id].department_id in department_ids:
            selected_products.append(product_id)
    print("\nSelected products:")
    for product_id in selected_products:
        print("{} {} (dept {})".format(product_id,products.loc[product_id].product_name,products.loc[product_id].department_id))
    return selected_products

In [10]:
product_ids = [21, 26, 45, 54, 57, 71, 111, 112]
department_ids = [DEPT_PETS, DEPT_CLEANING]
selected_products = select_from_departments_print(products,product_ids,department_ids)

Test case: 
[21, 26, 45, 54, 57, 71, 111, 112]

Input products:
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
45 European Cucumber (dept 4)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
111 Fabric Softener, Geranium Scent (dept 17)
112 Hot Tomatillo Salsa (dept 13)

Selected products:
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
111 Fabric Softener, Geranium Scent (dept 17)


In [11]:
# 1 Department
product_ids = [21, 26, 45, 54, 57, 71, 99, 111, 112]
department_ids = [DEPT_ALCOHOL]
selected_products = select_from_departments_print(products,product_ids,department_ids)

Test case: 
[21, 26, 45, 54, 57, 71, 99, 111, 112]

Input products:
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
45 European Cucumber (dept 4)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
99 Local Living Butter Lettuce (dept 4)
111 Fabric Softener, Geranium Scent (dept 17)
112 Hot Tomatillo Salsa (dept 13)

Selected products:


In [12]:
# 2 Departments
product_ids = [3, 4, 21, 26, 45, 54, 57, 71, 97, 99, 111, 112]
department_ids = [DEPT_ALCOHOL, DEPT_DRINKS]
selected_products = select_from_departments_print(products,product_ids,department_ids)

Test case: 
[3, 4, 21, 26, 45, 54, 57, 71, 97, 99, 111, 112]

Input products:
3 Robust Golden Unsweetened Oolong Tea (dept 7)
4 Smart Ones Classic Favorites Mini Rigatoni With Vodka Cream Sauce (dept 1)
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
45 European Cucumber (dept 4)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
97 Organic Chamomile Lemon Tea (dept 7)
99 Local Living Butter Lettuce (dept 4)
111 Fabric Softener, Geranium Scent (dept 17)
112 Hot Tomatillo Salsa (dept 13)

Selected products:
3 Robust Golden Unsweetened Oolong Tea (dept 7)
97 Organic Chamomile Lemon Tea (dept 7)


In [13]:
# 3 Departments
product_ids = [21, 26, 45, 54, 57, 71, 99, 111, 112]
department_ids = [DEPT_PETS, DEPT_CLEANING, DEPT_VEGGIES]
selected_products = select_from_departments_print(products,product_ids,department_ids)

Test case: 
[21, 26, 45, 54, 57, 71, 99, 111, 112]

Input products:
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
45 European Cucumber (dept 4)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
99 Local Living Butter Lettuce (dept 4)
111 Fabric Softener, Geranium Scent (dept 17)
112 Hot Tomatillo Salsa (dept 13)

Selected products:
21 Small & Medium Dental Dog Treats (dept 8)
26 Fancy Feast Trout Feast Flaked Wet Cat Food (dept 8)
45 European Cucumber (dept 4)
54 24/7 Performance Cat Litter (dept 8)
57 Flat Toothpicks (dept 17)
71 Ultra 7 Inch Polypropylene Traditional Plates (dept 17)
99 Local Living Butter Lettuce (dept 4)
111 Fabric Softener, Geranium Scent (dept 17)
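
The selection above looks up each product in the table one at a time. A possible alternative, sketched here under assumptions, is to build a plain product-to-department mapping once and filter with it. The names `dept_of` and `select_with_mapping` are hypothetical; the small dictionary below only copies the product IDs and departments shown in the test output above. With the real table, such a mapping could be obtained with `products['department_id'].to_dict()`.

```python
# Product ID -> department ID, copied from the test output above (a toy stand-in
# for the full products table)
dept_of = {21: 8, 26: 8, 45: 4, 54: 8, 57: 17, 71: 17, 111: 17, 112: 13}

def select_with_mapping(dept_of, product_ids, department_ids):
    """Keep the product IDs whose department is one of department_ids."""
    wanted = set(department_ids)  # set membership is O(1) on average
    return [p for p in product_ids if dept_of[p] in wanted]

# Same test case as above: pets (8) and cleaning (17)
selected = select_with_mapping(dept_of, [21, 26, 45, 54, 57, 71, 111, 112], [8, 17])
print(selected)
```

The result keeps the input order, like the loop-based version, so both can be compared directly on the test cases.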


## 2.2. Read and filter transactions

In [14]:
def extract_transactions(filename,products,department_ids):
    transactions = []
    n_read = 0
    n_stored = 0
    # Open the compressed file given in the filename argument
    with gzip.open(filename, "rt") as inputfile:

        # Create a CSV reader
        reader = csv.reader(inputfile, delimiter=",")

        # Iterate through the CSV file
        for row in reader:
            n_read += 1
            if n_read%1000 == 0:
                print("\nNumber of transactions read: {}\nNumer of transactions stored: {}/5000 ({}%)".format(n_read,n_stored,round((n_stored/5000)*100,2)))
            # Convert to integers
            items = [int(x) for x in row]
            selected_items = select_from_departments(products,items,department_ids)
            if len(selected_items) > 1:
                transactions.append(selected_items)
                n_stored += 1
            if n_stored == 5000:
                print("\nNumber of transactions read: {}\nNumer of transactions stored: {}/5000 ({}%)\n\nTransactions extracted succesfuly!".format(n_read,n_stored,round((n_stored/5000)*100,2)))
                break
    return transactions
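
The same read-filter-keep pattern can also be written as a generator, which leaves the cap on the number of baskets to the caller. This is a minimal sketch, not the function used in this notebook: `iter_filtered_baskets`, `allowed_ids` and the demo file with made-up product IDs are all assumptions introduced for illustration, and it prints no progress messages.

```python
import csv
import gzip
import os
import tempfile
from itertools import islice

def iter_filtered_baskets(path, allowed_ids, min_items=2):
    """Yield baskets restricted to allowed_ids that keep at least min_items products."""
    with gzip.open(path, "rt", newline="") as inputfile:
        for row in csv.reader(inputfile, delimiter=","):
            items = [int(x) for x in row if int(x) in allowed_ids]
            if len(items) >= min_items:
                yield items

# Tiny demo on a throwaway gzip file (made-up product IDs)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv.gz")
    with gzip.open(demo_path, "wt", newline="") as f:
        csv.writer(f).writerows([[1, 2, 3], [4], [2, 3], [1, 3, 5], [1, 5]])
    # islice plays the role of the fixed cap: stop after 2 stored baskets
    demo = list(islice(iter_filtered_baskets(demo_path, {1, 3, 5}), 2))

print(demo)
```

Here the baskets `[4]` and `[2, 3]` are dropped because fewer than two of their products are in the allowed set, mirroring the `len(selected_items) > 1` condition above.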

In [15]:
department_ids = [DEPT_CLEANING]
transactions_cleaning = extract_transactions(INPUT_TRANSACTIONS,products,department_ids)


Number of transactions read: 1000
Numer of transactions stored: 51/5000 (1.02%)

Number of transactions read: 2000
Numer of transactions stored: 99/5000 (1.98%)

Number of transactions read: 3000
Numer of transactions stored: 142/5000 (2.84%)

Number of transactions read: 4000
Numer of transactions stored: 187/5000 (3.74%)

Number of transactions read: 5000
Numer of transactions stored: 229/5000 (4.58%)

Number of transactions read: 6000
Numer of transactions stored: 285/5000 (5.7%)

Number of transactions read: 7000
Numer of transactions stored: 338/5000 (6.76%)

Number of transactions read: 8000
Numer of transactions stored: 381/5000 (7.62%)

Number of transactions read: 9000
Numer of transactions stored: 434/5000 (8.68%)

Number of transactions read: 10000
Numer of transactions stored: 481/5000 (9.62%)

Number of transactions read: 11000
Numer of transactions stored: 537/5000 (10.74%)

Number of transactions read: 12000
Numer of transactions stored: 582/5000 (11.64%)

Number of tra

## 2.3. Extract association rules and comment on them (DEPT_CLEANING)

In [16]:
results = list(apriori(transactions_cleaning, min_support=0.001, min_confidence=0.5, min_lift=1.0)) 
print_apyori_output(results, products, 'product_name')

Rules involving itemset [18229, 21653]
['Plastic Knives'] => ['Compostable Forks'] (support=0.0016, confidence=0.89, lift=123.46)

Rules involving itemset [31801, 18229]
['Plastic Knives'] => ['9 Inch Plates'] (support=0.0010, confidence=0.56, lift=48.73)

Rules involving itemset [41387, 18229]
['Plastic Knives'] => ['Plastic Spoons'] (support=0.0014, confidence=0.78, lift=111.11)

Rules involving itemset [41387, 21653]
['Compostable Forks'] => ['Plastic Spoons'] (support=0.0036, confidence=0.50, lift=71.43)
['Plastic Spoons'] => ['Compostable Forks'] (support=0.0036, confidence=0.51, lift=71.43)

Rules involving itemset [28626, 43534]
['Foam Bowls'] => ['Foam Plates'] (support=0.0010, confidence=0.62, lift=240.38)

Rules involving itemset [31801, 29474]
['Bowls'] => ['9 Inch Plates'] (support=0.0022, confidence=0.52, lift=45.95)

Rules involving itemset [44643, 31066, 17747]
['Gallon Freezer Bags', 'Plastic Wrap'] => ['Aluminum Foil'] (support=0.0010, confidence=0.62, lift=17.76)

Rul

<font size="+1" >These rules suggest that anyone who buys one kind of disposable cutlery should be recommended the other pieces, since those rules have high confidence. For example, everybody who bought plastic spoons and knives also bought compostable forks. Customers who bought bowls together with compostable forks or spoons also tend to buy the remaining item with high confidence. It is also common to buy aluminum foil together with gallon freezer bags and plastic wrap. There are one-to-one recommendations as well: customers who buy plastic spoons could be recommended compostable forks, and vice versa. There are more rules, but these few examples already show the potential of association rules.</font>
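
Because the support values are very small, it helps to translate the printed metrics back into basket counts before trusting a rule. The sketch below does that arithmetic under one assumption: that the extraction reached its cap of 5000 stored baskets (the progress output above is truncated, so this is not confirmed). The helper name `implied_counts` is hypothetical, and the results are only as precise as the rounded values printed above.

```python
N_BASKETS = 5000  # assumption: the extraction stopped at its 5000-basket cap

def implied_counts(support, confidence, lift, n=N_BASKETS):
    """Back out approximate basket counts from the printed rule metrics.

    support    = count(A and B) / n
    confidence = count(A and B) / count(A)
    lift       = confidence / (count(B) / n)
    """
    both = round(support * n)
    antecedent = round(both / confidence)
    consequent = round(confidence / lift * n)
    return both, antecedent, consequent

# ['Plastic Knives'] => ['Compostable Forks'] (support=0.0016, confidence=0.89, lift=123.46)
print(implied_counts(0.0016, 0.89, 123.46))

# ['Compostable Forks'] => ['Plastic Spoons'] (support=0.0036, confidence=0.50, lift=71.43)
print(implied_counts(0.0036, 0.50, 71.43))
```

Under that assumption, the first rule rests on about 8 baskets out of roughly 9 containing Plastic Knives, so its high confidence and very high lift come from a small number of observations.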

## 2.4. Extract association rules and comment on them (other departments)

In [17]:
department_ids = [DEPT_DRINKS,DEPT_ALCOHOL]
transactions_drinks = extract_transactions(INPUT_TRANSACTIONS,products,department_ids)


Number of transactions read: 1000
Numer of transactions stored: 195/5000 (3.9%)

Number of transactions read: 2000
Numer of transactions stored: 380/5000 (7.6%)

Number of transactions read: 3000
Numer of transactions stored: 608/5000 (12.16%)

Number of transactions read: 4000
Numer of transactions stored: 818/5000 (16.36%)

Number of transactions read: 5000
Numer of transactions stored: 1026/5000 (20.52%)

Number of transactions read: 6000
Numer of transactions stored: 1241/5000 (24.82%)

Number of transactions read: 7000
Numer of transactions stored: 1448/5000 (28.96%)

Number of transactions read: 8000
Numer of transactions stored: 1682/5000 (33.64%)

Number of transactions read: 9000
Numer of transactions stored: 1882/5000 (37.64%)

Number of transactions read: 10000
Numer of transactions stored: 2109/5000 (42.18%)

Number of transactions read: 11000
Numer of transactions stored: 2326/5000 (46.52%)

Number of transactions read: 12000
Numer of transactions stored: 2544/5000 (50.88

In [18]:
results = list(apriori(transactions_drinks, min_support=0.003, min_confidence=0.5, min_lift=1.0))
print_apyori_output(results, products, 'product_name')

Rules involving itemset [196, 46149]
['Zero Calorie Cola'] => ['Soda'] (support=0.0046, confidence=0.55, lift=17.33)

Rules involving itemset [11123, 31231]
['Vitamin Water Zero Rise Orange'] => ['Vitamin Water Zero Squeezed Lemonade'] (support=0.0032, confidence=0.53, lift=72.07)

Rules involving itemset [12576, 39947]
['Kiwi Sandia Sparkling Water'] => ['Blackberry Cucumber Sparkling Water'] (support=0.0054, confidence=0.51, lift=39.19)

Rules involving itemset [44632, 14947, 21709]
['Pure Sparkling Water', 'Sparkling Lemon Water'] => ['Sparkling Water Grapefruit'] (support=0.0040, confidence=0.61, lift=8.30)

Rules involving itemset [44632, 14947, 35221]
['Pure Sparkling Water', 'Lime Sparkling Water'] => ['Sparkling Water Grapefruit'] (support=0.0046, confidence=0.72, lift=9.85)

Rules involving itemset [44632, 21709, 20119]
['Sparkling Lemon Water', 'Sparkling Water Berry'] => ['Sparkling Water Grapefruit'] (support=0.0034, confidence=0.63, lift=8.63)

Rules involving itemset [446

<font size="+1">Thanks to the association rules, we can recommend another flavor, with reasonable confidence, to customers who bought two types of sparkling water: people who bought two of Pure Sparkling Water, Sparkling Lemon Water, Lime Sparkling Water and Sparkling Water Berry also tend to buy Sparkling Water Grapefruit. Customers who buy a soda such as Zero Calorie Cola also frequently buy Soda. Overall there seems to be a pattern in which people who buy one drink also buy another type of the same kind of drink.</font>
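
As an illustration of how such rules could drive a recommendation, the sketch below stores four of the rules printed above as (antecedent, consequent, confidence) triples and looks up which consequents apply to a given basket. The `rules` list and the `recommend` function are hypothetical helpers written for this example, not part of apyori.

```python
# Rules copied from the output above: (antecedent, consequent, confidence)
rules = [
    (frozenset({"Zero Calorie Cola"}), "Soda", 0.55),
    (frozenset({"Pure Sparkling Water", "Sparkling Lemon Water"}),
     "Sparkling Water Grapefruit", 0.61),
    (frozenset({"Pure Sparkling Water", "Lime Sparkling Water"}),
     "Sparkling Water Grapefruit", 0.72),
    (frozenset({"Sparkling Lemon Water", "Sparkling Water Berry"}),
     "Sparkling Water Grapefruit", 0.63),
]

def recommend(basket, rules):
    """Return (product, confidence) pairs suggested by the rules, best first."""
    basket = set(basket)
    best = {}
    for antecedent, consequent, conf in rules:
        # A rule applies when its whole antecedent is in the basket
        # and the consequent has not been bought yet
        if antecedent <= basket and consequent not in basket:
            best[consequent] = max(conf, best.get(consequent, 0.0))
    return sorted(best.items(), key=lambda pair: pair[1], reverse=True)

print(recommend({"Pure Sparkling Water", "Lime Sparkling Water",
                 "Sparkling Lemon Water"}, rules))
print(recommend({"Zero Calorie Cola"}, rules))
```

When several rules point to the same product, the sketch keeps the highest confidence; other choices (for example ranking by lift) would be equally reasonable.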

# EXTRA

In [19]:
def print_apyori_output_difdep(association_results, info=False, info_key=False): # Function that only prints rules of different departments
    for relation_record in association_results:
        itemset = list(relation_record.items)
        
        # Consider only itemsets of two or more elements
        if len(itemset) > 1: 
        
            not_printed = True # Bool var to print only one time
            for rules in relation_record.ordered_statistics:
                antecedent = list(rules.items_base)
                consequent = list(rules.items_add)
                antecedent_dep = []
                diff_dep = False 
                
                for item in antecedent: # Extract departments of antecedents
                    antecedent_dep.append(info.loc[item].department_id)
                
                for item in consequent: # Check whether any consequent item belongs to a department not present among the antecedents
                    department = info.loc[item].department_id
                    if department not in antecedent_dep:
                        diff_dep = True # Found an item from another department: flag the rule and stop checking
                        break
                
                if info_key:
                    antecedent = [info.loc[x][info_key] for x in antecedent]
                    consequent = [info.loc[x][info_key] for x in consequent]
                if diff_dep: # It only will print the relations if they are between different departments
                    if not_printed: # things to print only one time
                        print()
                        print("Rules involving itemset %s" % itemset)
                        support = relation_record.support
                        not_printed = False
                    
                    confidence = rules.confidence
                    lift = rules.lift

                    print("%s => %s (support=%.4f, confidence=%.2f, lift=%.2f)" %
                          (antecedent, consequent, support, confidence, lift))
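
The cross-department test at the core of the function above can be isolated as a small predicate, which makes it easier to check on its own. This is a sketch under assumptions: `is_cross_department` and `dept_of` are hypothetical names, and the department assignments in the toy mapping (3 for bakery, 4 for veggies) are assumed from the product names rather than read from the products table.

```python
def is_cross_department(antecedent, consequent, dept_of):
    """True if some consequent item is in a department absent from the antecedent."""
    antecedent_depts = {dept_of[item] for item in antecedent}
    return any(dept_of[item] not in antecedent_depts for item in consequent)

# Toy mapping with assumed departments: 3 = bakery, 4 = veggies
dept_of = {
    "Organic White English Muffins": 3,
    "Organic Baby Spinach": 4,
    "Bag of Organic Bananas": 4,
}

print(is_cross_department(["Organic White English Muffins"],
                          ["Organic Baby Spinach"], dept_of))
print(is_cross_department(["Organic Baby Spinach"],
                          ["Bag of Organic Bananas"], dept_of))
```

Note that, as in the function above, a rule whose antecedent already mixes both departments is never reported, because every consequent department is then present among the antecedent departments.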
            

In [20]:
department_ids = [DEPT_VEGGIES,DEPT_BAKERY]
transactions_vegbak = extract_transactions(INPUT_TRANSACTIONS,products,department_ids)
print("\n\n INVOLVED RULES\n----------------")
results = list(apriori(transactions_vegbak, min_support=0.001, min_confidence=0.5, min_lift=1.0))
print_apyori_output_difdep(results, products, 'product_name')


Number of transactions read: 1000
Numer of transactions stored: 659/5000 (13.18%)

Number of transactions read: 2000
Numer of transactions stored: 1267/5000 (25.34%)

Number of transactions read: 3000
Numer of transactions stored: 1898/5000 (37.96%)

Number of transactions read: 4000
Numer of transactions stored: 2515/5000 (50.3%)

Number of transactions read: 5000
Numer of transactions stored: 3134/5000 (62.68%)

Number of transactions read: 6000
Numer of transactions stored: 3744/5000 (74.88%)

Number of transactions read: 7000
Numer of transactions stored: 4367/5000 (87.34%)

Number of transactions read: 8000
Numer of transactions stored: 4994/5000 (99.88%)

Number of transactions read: 8010
Numer of transactions stored: 5000/5000 (100.0%)

Transactions extracted succesfuly!


 INVOLVED RULES
----------------

Rules involving itemset [12258, 21903]
['Organic White English Muffins'] => ['Organic Baby Spinach'] (support=0.0010, confidence=0.50, lift=4.55)

Rules involving itemset [13

In [21]:
department_ids = [DEPT_PHARMACY,DEPT_CLEANING, DEPT_WORLD]
transactions_pcw = extract_transactions(INPUT_TRANSACTIONS,products,department_ids)
print("\n\n INVOLVED RULES\n----------------")
results = list(apriori(transactions_pcw, min_support=0.0008, min_confidence=0.1, min_lift=1.0))
print_apyori_output_difdep(results, products, 'product_name')


Number of transactions read: 1000
Numer of transactions stored: 107/5000 (2.14%)

Number of transactions read: 2000
Numer of transactions stored: 207/5000 (4.14%)

Number of transactions read: 3000
Numer of transactions stored: 299/5000 (5.98%)

Number of transactions read: 4000
Numer of transactions stored: 392/5000 (7.84%)

Number of transactions read: 5000
Numer of transactions stored: 492/5000 (9.84%)

Number of transactions read: 6000
Numer of transactions stored: 606/5000 (12.12%)

Number of transactions read: 7000
Numer of transactions stored: 706/5000 (14.12%)

Number of transactions read: 8000
Numer of transactions stored: 796/5000 (15.92%)

Number of transactions read: 9000
Numer of transactions stored: 908/5000 (18.16%)

Number of transactions read: 10000
Numer of transactions stored: 1013/5000 (20.26%)

Number of transactions read: 11000
Numer of transactions stored: 1137/5000 (22.74%)

Number of transactions read: 12000
Numer of transactions stored: 1226/5000 (24.52%)

Nu

In [22]:
department_ids = [DEPT_PETS,DEPT_BABIES]
transactions_petbab = extract_transactions(INPUT_TRANSACTIONS,products,department_ids)
print("\n\n INVOLVED RULES\n----------------")
results = list(apriori(transactions_petbab, min_support=0.0008, min_confidence=0.1, min_lift=1.0))
print_apyori_output_difdep(results, products, 'product_name')


Number of transactions read: 1000
Numer of transactions stored: 31/5000 (0.62%)

Number of transactions read: 2000
Numer of transactions stored: 65/5000 (1.3%)

Number of transactions read: 3000
Numer of transactions stored: 102/5000 (2.04%)

Number of transactions read: 4000
Numer of transactions stored: 138/5000 (2.76%)

Number of transactions read: 5000
Numer of transactions stored: 173/5000 (3.46%)

Number of transactions read: 6000
Numer of transactions stored: 203/5000 (4.06%)

Number of transactions read: 7000
Numer of transactions stored: 238/5000 (4.76%)

Number of transactions read: 8000
Numer of transactions stored: 266/5000 (5.32%)

Number of transactions read: 9000
Numer of transactions stored: 297/5000 (5.94%)

Number of transactions read: 10000
Numer of transactions stored: 335/5000 (6.7%)

Number of transactions read: 11000
Numer of transactions stored: 368/5000 (7.36%)

Number of transactions read: 12000
Numer of transactions stored: 404/5000 (8.08%)

Number of transa

<font size="+1">We tested three combinations. The first is Veggies and Bakery, where there is a clear interaction between departments: people who buy something in Bakery, for example Organic Soft Wheat Bread, commonly also buy something in Veggies, such as Bag of Organic Bananas. We can also recommend, with high confidence, Bananas or a Bag of Organic Bananas to people who bought something organic in Bakery, like Organic Soft Wheat Bread.</font>

<font size="+1">The second test is among the Pharmacy, Cleaning and World departments. This relation is much less obvious than the previous one: there are no rules involving Pharmacy, only a weak relation between detergents and dish soaps on one side and hand soap or bath tissues on the other.</font>

<font size="+1">For the last test, between Pets and Babies, we cannot draw any conclusion other than that people who buy 2nd Foods Ham and Ham Gravy tend to buy Fancy Feast Wet Classic Chicken Feast Cat Food from the Pets department.</font>

<font size="+2" color="#003300">I hereby declare that, except for the code provided by the course instructors, all of my code, report, and figures were produced by myself.</font>