# Part 5 - Association Rule Learning

__Examples__:
- Movie recommendation
- Market Basket optimization

In [11]:
# import the libraries that will be used
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd

import sys
sys.path.insert(0, '/disk1/sousae/Classes/udemy_machineLearning/Machine_Learning_A-Z/Part5_Association_Rule_Learning/')

from apyori import apriori

## Apriori

Apriori relies on three measures. For example, in a movie recommendation system:

1. __Support__
    $$ support(M) = \frac{\text{number of user watchlists containing } M}{\text{total number of user watchlists}} $$
    
2. __Confidence__
    $$ confidence(M_1\rightarrow M_2) = \frac{\text{number of user watchlists containing } M_1 \text{ and } M_2}{\text{number of user watchlists containing } M_1} $$
    
3. __Lift__
    $$ lift(M_1\rightarrow M_2) = \frac{confidence(M_1\rightarrow M_2)}{support(M_2)} $$
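
The three measures above can be checked by hand on a small example. The sketch below uses five made-up watchlists (the data and the helper names `support`, `confidence` and `lift` are assumptions for illustration, not part of the course material or of `apyori`).

```python
# Toy watchlists, invented for illustration only
watchlists = [
    {"M1", "M2"},
    {"M1", "M2", "M3"},
    {"M1"},
    {"M2", "M3"},
    {"M3"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(base, add, transactions):
    """support(base and add) / support(base)."""
    both = set(base) | set(add)
    return support(both, transactions) / support(base, transactions)

def lift(base, add, transactions):
    """confidence(base -> add) / support(add)."""
    return confidence(base, add, transactions) / support(add, transactions)

sup_m2 = support({"M2"}, watchlists)                  # 3 of 5 watchlists
conf_m1_m2 = confidence({"M1"}, {"M2"}, watchlists)   # 2 of the 3 watchlists with M1
lift_m1_m2 = lift({"M1"}, {"M2"}, watchlists)         # (2/3) / (3/5)

print(sup_m2, conf_m1_m2, lift_m1_m2)
```

A lift above 1 means that M2 shows up more often among users who watched M1 than among users in general.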
    
#### Algorithm:

1. Set a minimum support and a minimum confidence
2. Take all the subsets in transactions having higher support than the minimum support
3. Take all the rules of these subsets having higher confidence than the minimum confidence
4. Sort the rules by decreasing lift
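
The four steps can be sketched in plain Python on a toy dataset. This is a brute-force illustration under stated assumptions, not the `apyori` implementation: it enumerates every candidate itemset instead of pruning level by level, it treats a value equal to the threshold as passing, and the function name `apriori_sketch`, the `max_length` default and the basket data are all hypothetical.

```python
from itertools import combinations

def apriori_sketch(transactions, min_support, min_confidence, max_length=2):
    """Return rules as (base, add, support, confidence, lift), sorted by lift."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Step 2: keep the itemsets whose support reaches the minimum support
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, max_length + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo))
            if s >= min_support:
                frequent[frozenset(combo)] = s

    # Step 3: split each frequent itemset into base -> add and keep the
    # rules whose confidence reaches the minimum confidence
    rules = []
    for itemset, s in frequent.items():
        for k in range(1, len(itemset)):
            for base in combinations(sorted(itemset), k):
                base = frozenset(base)
                add = itemset - base
                conf = s / support(base)
                if conf >= min_confidence:
                    rule_lift = conf / support(add)
                    rules.append((sorted(base), sorted(add), s, conf, rule_lift))

    # Step 4: sort the rules by decreasing lift
    return sorted(rules, key=lambda rule: rule[4], reverse=True)

# Toy baskets, invented for illustration only
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
    ["butter", "bread"],
]

# Step 1: choose the thresholds
toy_rules = apriori_sketch(baskets, min_support=0.4, min_confidence=0.7)
for base, add, s, conf, rule_lift in toy_rules:
    print(base, "->", add, round(s, 2), round(conf, 2), round(rule_lift, 2))
```

On this toy data only the bread and butter pair survives both thresholds, giving one rule in each direction.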

In [3]:
# import dataset
dir1 = '/disk1/sousae/Classes/udemy_machineLearning/Machine_Learning_A-Z/Part5_Association_Rule_Learning/'
data = pd.read_csv(dir1+'Market_Basket_Optimisation.csv', header=None)
data.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
1,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
2,chutney,,,,,,,,,,,,,,,,,,,
3,turkey,avocado,,,,,,,,,,,,,,,,,,
4,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,


In [9]:
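# build a list of lists: one inner list of item names per transaction
transations = []

for i in range(len(data)):
    # skip the NaN padding of short rows so that 'nan' is not treated as an item
    transations.append([str(item) for item in data.values[i] if pd.notna(item)])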

In [12]:
# training Apriori on the dataset
help(apriori)

Help on function apriori in module apyori:

apriori(transactions, **kwargs)
    Executes Apriori algorithm and returns a RelationRecord generator.
    
    Arguments:
        transactions -- A transaction iterable object
                        (eg. [['A', 'B'], ['B', 'C']]).
    
    Keyword arguments:
        min_support -- The minimum support of relations (float).
        min_confidence -- The minimum confidence of relations (float).
        min_lift -- The minimum lift of relations (float).
        max_length -- The maximum length of the relation (integer).



In [13]:
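# min_support: keep only items purchased at least 3 times a day over one week,
# i.e. 3 [times/day] * 7 [days] / 7500 [total transactions] ≈ 0.003
# note: min_length is not among the keyword arguments listed by help(apriori) above, so it may have no effect
rules = apriori(transations, min_support=0.003, min_confidence=0.3, min_lift=3, min_length=2)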

In [14]:
# Visualising the results: collect the generated rules into a list so they can be inspected
results = list(rules)

In [15]:
results

[RelationRecord(items=frozenset(['escalope', 'mushroom cream sauce']), support=0.005732568990801226, ordered_statistics=[OrderedStatistic(items_base=frozenset(['mushroom cream sauce']), items_add=frozenset(['escalope']), confidence=0.3006993006993007, lift=3.790832696715049)]),
 RelationRecord(items=frozenset(['pasta', 'escalope']), support=0.005865884548726837, ordered_statistics=[OrderedStatistic(items_base=frozenset(['pasta']), items_add=frozenset(['escalope']), confidence=0.3728813559322034, lift=4.700811850163794)]),
 RelationRecord(items=frozenset(['herb & pepper', 'ground beef']), support=0.015997866951073192, ordered_statistics=[OrderedStatistic(items_base=frozenset(['herb & pepper']), items_add=frozenset(['ground beef']), confidence=0.3234501347708895, lift=3.2919938411349285)]),
 RelationRecord(items=frozenset(['tomato sauce', 'ground beef']), support=0.005332622317024397, ordered_statistics=[OrderedStatistic(items_base=frozenset(['tomato sauce']), items_add=frozenset(['groun
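
The raw `RelationRecord` output is hard to read. A possible way to flatten it into one row per rule is sketched below. The helper name `records_to_rows` is hypothetical; it only relies on the attribute names visible in the output above (`support`, `ordered_statistics`, `items_base`, `items_add`, `confidence`, `lift`). To keep the example runnable without `apyori`, the two demo records are namedtuple stand-ins filled with the first two rules printed above.

```python
from collections import namedtuple

def records_to_rows(records):
    """Flatten RelationRecord-like objects into dicts, sorted by decreasing lift."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append({
                "base": ", ".join(sorted(stat.items_base)),
                "add": ", ".join(sorted(stat.items_add)),
                "support": record.support,
                "confidence": stat.confidence,
                "lift": stat.lift,
            })
    return sorted(rows, key=lambda row: row["lift"], reverse=True)

# Stand-ins that mimic the field names shown in the output above
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")
OrderedStatistic = namedtuple("OrderedStatistic", "items_base items_add confidence lift")

demo_records = [
    RelationRecord(
        frozenset(["escalope", "mushroom cream sauce"]),
        0.005732568990801226,
        [OrderedStatistic(frozenset(["mushroom cream sauce"]), frozenset(["escalope"]),
                          0.3006993006993007, 3.790832696715049)],
    ),
    RelationRecord(
        frozenset(["pasta", "escalope"]),
        0.005865884548726837,
        [OrderedStatistic(frozenset(["pasta"]), frozenset(["escalope"]),
                          0.3728813559322034, 4.700811850163794)],
    ),
]

rows = records_to_rows(demo_records)
for row in rows:
    print(row["base"], "->", row["add"], round(row["confidence"], 3), round(row["lift"], 3))
```

In the notebook, the same helper could presumably be applied to the real list with `records_to_rows(results)`, and the rows passed to `pd.DataFrame(...)` for a tabular view.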

## Eclat

- Eclat uses only support; there is no confidence or lift
- Support is calculated the same way as in the Apriori model, but for a set of items rather than a single item

#### Algorithm

1. Set a minimum support
2. Take all the subsets in transactions having higher support than minimum support
3. Sort these subsets by decreasing support
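
The Eclat steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `eclat_sketch`, the `max_length` cap and the basket data are hypothetical, a support equal to the threshold is treated as passing, and candidates are enumerated by brute force. It does use the vertical layout usually associated with Eclat, where each item maps to the set of transaction ids that contain it and the support of an itemset comes from intersecting those sets.

```python
from itertools import combinations

def eclat_sketch(transactions, min_support, max_length=2):
    """Return (itemset, support) pairs sorted by decreasing support."""
    n = len(transactions)

    # Vertical layout: item -> set of ids of the transactions containing it
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    # Step 2: keep the itemsets whose support reaches the minimum support
    kept = []
    for size in range(1, max_length + 1):
        for combo in combinations(sorted(tidsets), size):
            shared = set.intersection(*(tidsets[item] for item in combo))
            s = len(shared) / n
            if s >= min_support:
                kept.append((combo, s))

    # Step 3: sort the itemsets by decreasing support
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy baskets, invented for illustration only
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["milk"],
    ["butter", "bread"],
]

# Step 1: choose the minimum support
itemsets = eclat_sketch(baskets, min_support=0.4)
for combo, s in itemsets:
    print(combo, s)
```

In practice, single items always rank at or above the pairs that contain them, so it is common to look only at itemsets of length 2 or more.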