# Association rule learning

In [1]:
%pylab
%matplotlib inline

%config InlineBackend.figure_format = 'retina'

import numpy as np

Using matplotlib backend: Qt5Agg
Populating the interactive namespace from numpy and matplotlib


### Apriori algorithm 

The Apriori algorithm is not implemented in `scikit-learn`, so we have to use another implementation. Here we are going to use the one by Everaldo Aguiar & Reid Johnson (http://nbviewer.jupyter.org/github/cse40647/cse40647/blob/sp.14/10%20-%20Apriori.ipynb), which is provided in the `apriori.py` file.

So, first, let's import the `apriori` module:

In [2]:
import apriori

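To try the algorithm, we are going to use the following market basket dataset (item names are in Spanish: Pan = bread, Leche = milk, Pañales = diapers, Cerveza = beer, Huevos = eggs, Cola = cola):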

In [3]:
dataset = [['Pan', 'Leche'],
           ['Pan', 'Pañales', 'Cerveza', 'Huevos'],
           ['Leche', 'Pañales', 'Cerveza', 'Cola'],
           ['Leche', 'Pan', 'Pañales', 'Cerveza'],
           ['Pañales', 'Pan', 'Leche', 'Cola'],
           ['Pan', 'Leche', 'Pañales'],
           ['Pan', 'Cola']]

In [4]:
# Let's apply the Apriori algorithm (min_support is a fraction of the transactions):
F, support = apriori.apriori(dataset, min_support=0.55, verbose=True)

{Pañales}:  sup = 0.714
{Pan}:  sup = 0.857
{Leche}:  sup = 0.714
{Leche, Pañales}:  sup = 0.571
{Pan, Pañales}:  sup = 0.571
{Pan, Leche}:  sup = 0.571
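
To make the `sup` values above concrete, here is a minimal brute-force sketch that counts supports directly from the transactions. It is not the `apriori.py` implementation; the helper names `itemset_support` and `frequent_itemsets_bruteforce` are hypothetical and only serve as an illustration.

```python
from itertools import combinations

# Same market basket transactions as in the notebook.
dataset = [['Pan', 'Leche'],
           ['Pan', 'Pañales', 'Cerveza', 'Huevos'],
           ['Leche', 'Pañales', 'Cerveza', 'Cola'],
           ['Leche', 'Pan', 'Pañales', 'Cerveza'],
           ['Pañales', 'Pan', 'Leche', 'Cola'],
           ['Pan', 'Leche', 'Pañales'],
           ['Pan', 'Cola']]


def itemset_support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def frequent_itemsets_bruteforce(transactions, min_support):
    """Hypothetical helper: test every candidate itemset, size by size."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            sup = itemset_support(candidate, transactions)
            if sup >= min_support:
                frequent[frozenset(candidate)] = sup
                found_any = True
        # If no itemset of this size is frequent, no larger one can be
        # (every subset of a frequent itemset must itself be frequent).
        if not found_any:
            break
    return frequent


frequent = frequent_itemsets_bruteforce(dataset, min_support=0.55)
for itemset, sup in frequent.items():
    print(sorted(itemset), round(sup, 3))
```

With `min_support=0.55` an itemset must appear in at least 4 of the 7 transactions, which leaves the same six itemsets listed above. Apriori reaches the same result more efficiently by building candidates only from itemsets that are already known to be frequent.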


In [5]:
# Let's generate the rules:
H = apriori.generate_rules(F, support, min_confidence = 0.75, verbose = True)

{Pañales} ---> {Leche}:  conf = 0.8, sup = 0.571
{Leche} ---> {Pañales}:  conf = 0.8, sup = 0.571
{Pañales} ---> {Pan}:  conf = 0.8, sup = 0.571
{Leche} ---> {Pan}:  conf = 0.8, sup = 0.571
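
The `conf` values above follow the usual definition of confidence: the number of transactions containing both sides of the rule divided by the number containing the antecedent. A minimal sketch of that calculation, with a hypothetical helper `rule_confidence` that is not part of `apriori.py`:

```python
# Same market basket transactions as in the notebook.
dataset = [['Pan', 'Leche'],
           ['Pan', 'Pañales', 'Cerveza', 'Huevos'],
           ['Leche', 'Pañales', 'Cerveza', 'Cola'],
           ['Leche', 'Pan', 'Pañales', 'Cerveza'],
           ['Pañales', 'Pan', 'Leche', 'Cola'],
           ['Pan', 'Leche', 'Pañales'],
           ['Pan', 'Cola']]


def rule_confidence(antecedent, consequent, transactions):
    """conf(A -> B) = count(A and B together) / count(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return n_both / n_a


conf_diapers_milk = rule_confidence(['Pañales'], ['Leche'], dataset)  # 4 / 5
conf_bread_milk = rule_confidence(['Pan'], ['Leche'], dataset)        # 4 / 6
print(round(conf_diapers_milk, 3), round(conf_bread_milk, 3))
```

This also shows why `{Pan} ---> {Leche}` is missing from the output: its confidence is 4/6, which is below the `min_confidence` of 0.75, while the reverse rule `{Leche} ---> {Pan}` reaches 4/5.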


### FP-Growth algorithm 

The FP-Growth algorithm is available in the `pyfpgrowth` library, which has to be installed before use:

    conda install pyfpgrowth
    

In [6]:
import pyfpgrowth

In [7]:
# Let's apply the FP-Growth algorithm
patterns = pyfpgrowth.find_frequent_patterns(dataset,
                                             support_threshold = 4) # minimum number of appearances in the dataset
patterns

{('Leche',): 5,
 ('Leche', 'Pan'): 4,
 ('Pan',): 6,
 ('Pan', 'Pañales'): 4,
 ('Pañales',): 5}
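
Note that `support_threshold` here is an absolute count, whereas the Apriori example used a fraction of the transactions. A small sketch of the conversion, assuming we want the count that corresponds to the earlier `min_support` of 0.55 (the helper `absolute_threshold` is hypothetical, not part of `pyfpgrowth`):

```python
import math


def absolute_threshold(min_support, n_transactions):
    """Smallest transaction count whose relative support is >= min_support."""
    return math.ceil(min_support * n_transactions)


n_transactions = 7  # size of the market basket dataset used in this notebook
threshold = absolute_threshold(0.55, n_transactions)
print(threshold, round(threshold / n_transactions, 3))
```

For 7 transactions, 0.55 corresponds to a count of 4, that is, a relative support of 4/7 ≈ 0.571, the same value reported as `sup` for the item pairs in the Apriori section.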

In [8]:
# Let's generate the rules:
rules = pyfpgrowth.generate_association_rules(patterns, confidence_threshold = 0.75)
rules

{('Leche',): (('Pan',), 0.8), ('Pañales',): (('Pan',), 0.8)}