# Apriori

## The data

We will use the [Market Basket Optimisation dataset from Kaggle](https://www.kaggle.com/roshansharma/market-basket-optimization).

The file records the items that customers bought at a shopping center. It contains 7501 transactions, each listing the items sold in that transaction.

In [1]:
# Import the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

In [2]:
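# Import the dataset (the file has no header row)
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header=None)
print(dataset.head())

# Build one list of 20 strings per transaction.
# Note: str() turns every empty (NaN) cell into the literal string 'nan'.
transactions = []
for i in range(0, 7501):
    transactions.append([str(dataset.values[i, j]) for j in range(0, 20)])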

              0          1           2                 3             4   \
0         shrimp    almonds     avocado    vegetables mix  green grapes   
1        burgers  meatballs        eggs               NaN           NaN   
2        chutney        NaN         NaN               NaN           NaN   
3         turkey    avocado         NaN               NaN           NaN   
4  mineral water       milk  energy bar  whole wheat rice     green tea   

                 5     6               7             8             9   \
0  whole weat flour  yams  cottage cheese  energy drink  tomato juice   
1               NaN   NaN             NaN           NaN           NaN   
2               NaN   NaN             NaN           NaN           NaN   
3               NaN   NaN             NaN           NaN           NaN   
4               NaN   NaN             NaN           NaN           NaN   

               10         11     12     13             14      15  \
0  low fat yogurt  green tea  honey  sala

In [3]:
transactions[10]

['eggs',
 'pet food',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan',
 'nan']
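The `'nan'` strings above come from padding every transaction to 20 columns, and Apriori will treat `'nan'` as if it were a real item. One way to avoid this, sketched below, is to read each row at its natural length and drop empty cells. This is an illustrative alternative, not the notebook's method: it uses the standard `csv` module, and `sample_csv` is a hypothetical in-memory stand-in for the real file, filled with rows 1 to 4 of the `head()` output.

```python
import csv
import io

# Hypothetical in-memory sample standing in for Market_Basket_Optimisation.csv.
sample_csv = io.StringIO(
    "burgers,meatballs,eggs\n"
    "chutney\n"
    "turkey,avocado\n"
    "mineral water,milk,energy bar,whole wheat rice,green tea\n"
)

# Keep only non-empty cells, so no placeholder item is ever created.
clean_transactions = [
    [item.strip() for item in row if item.strip()]
    for row in csv.reader(sample_csv)
]

# Drop rows that turned out completely empty.
clean_transactions = [t for t in clean_transactions if t]

print(clean_transactions[1])
```

With the real file, `sample_csv` would be replaced by an open file handle; the resulting lists can be passed to `apriori` in place of `transactions`.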

In [4]:
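# Train the Apriori algorithm
from apyori import apriori
# Note: apyori reads min_support, min_confidence, min_lift and max_length;
# min_length is not among those options and may be silently ignored.
rules = apriori(transactions, min_support = 0.003 , min_confidence = 0.3,
                min_lift = 3, min_length = 2)

# rules = apriori(transactions, min_support = 0.004 , min_length = 2)
# Collect the results into a list so they can be inspected
results = list(rules)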
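The three thresholds passed to `apriori` can be checked by hand on a toy basket list. The sketch below is illustrative only: the five transactions and the helper names `support`, `confidence`, and `lift` are invented for this example and are not part of apyori. For scale, `min_support = 0.003` on 7501 transactions means an itemset must appear in at least 23 of them (0.003 × 7501 ≈ 22.5).

```python
# Toy data, invented for illustration.
toy_transactions = [
    ["pasta", "escalope"],
    ["pasta", "escalope", "mineral water"],
    ["mineral water"],
    ["escalope"],
    ["mineral water", "milk"],
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """confidence(antecedent -> consequent) / support(consequent)."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

# Rule: pasta -> escalope
rule_support = support(["pasta", "escalope"], toy_transactions)
rule_confidence = confidence(["pasta"], ["escalope"], toy_transactions)
rule_lift = lift(["pasta"], ["escalope"], toy_transactions)
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 means the consequent appears more often in baskets containing the antecedent than in baskets overall, which is why the notebook filters on `min_lift = 3`.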

In [5]:
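def inspect(results):
    # Each result is (items, support, ordered_statistics); only the first
    # ordered statistic of each result is read here.
    # In apyori, index 0 of an ordered statistic is items_base (the antecedent)
    # and index 1 is items_add (the consequent), so despite the names,
    # `rh` holds the antecedent and `lh` the consequent.
    rh          = [tuple(result[2][0][0]) for result in results]
    lh          = [tuple(result[2][0][1]) for result in results]
    supports    = [result[1] for result in results]
    confidences = [result[2][0][2] for result in results]
    lifts       = [result[2][0][3] for result in results]
    return list(zip(rh, lh, supports, confidences, lifts))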
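`inspect` reads only the first ordered statistic of each record and relies on positional indexes. Since apyori's records are namedtuples, the same data can be read by field name, and every ordered statistic can be kept. The sketch below is an assumption-laden illustration: `flatten_rules` is a hypothetical helper, and the two namedtuples are local stand-ins that mirror the field names of apyori's `RelationRecord` and `OrderedStatistic`, so the block runs without the library. The example record reuses the values of the first row of the notebook's results table.

```python
from collections import namedtuple

# Local stand-ins mirroring apyori's record field names (assumption).
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"]
)
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"]
)

def flatten_rules(records):
    """Return one row per ordered statistic:
    (antecedent, consequent, support, confidence, lift)."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append((
                tuple(sorted(stat.items_base)),  # antecedent
                tuple(sorted(stat.items_add)),   # consequent
                record.support,
                stat.confidence,
                stat.lift,
            ))
    return rows

# Hypothetical record built from the first row of the results table.
example_records = [
    RelationRecord(
        items=frozenset({"mushroom cream sauce", "escalope"}),
        support=0.005733,
        ordered_statistics=[
            OrderedStatistic(
                frozenset({"mushroom cream sauce"}),
                frozenset({"escalope"}),
                0.300699,
                3.790833,
            )
        ],
    )
]

rows = flatten_rules(example_records)
print(rows)
```

Reading fields by name makes it harder to mix up antecedent and consequent than indexing with `[2][0][0]` and `[2][0][1]`.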

In [6]:
# Build a DataFrame to view the results
# Note: as returned by inspect, 'rhs' holds the antecedent and 'lhs' the consequent
resultDataFrame=pd.DataFrame(inspect(results),
                columns=['rhs','lhs','support','confidence','lift'])

In [7]:
# Print the DataFrame with the rules
resultDataFrame

| | rhs | lhs | support | confidence | lift |
|---|---|---|---|---|---|
| 0 | (mushroom cream sauce,) | (escalope,) | 0.005733 | 0.300699 | 3.790833 |
| 1 | (pasta,) | (escalope,) | 0.005866 | 0.372881 | 4.700812 |
| 2 | (herb & pepper,) | (ground beef,) | 0.015998 | 0.323450 | 3.291994 |
| 3 | (tomato sauce,) | (ground beef,) | 0.005333 | 0.377358 | 3.840659 |
| 4 | (pasta,) | (shrimp,) | 0.005066 | 0.322034 | 4.506672 |
| ... | ... | ... | ... | ... | ... |
| 97 | (milk, frozen vegetables, mineral water) | (nan, olive oil) | 0.003333 | 0.301205 | 4.582834 |
| 98 | (frozen vegetables, soup) | (nan, milk, mineral water) | 0.003066 | 0.383333 | 7.987176 |
| 99 | (spaghetti, mineral water, shrimp) | (nan, frozen vegetables) | 0.003333 | 0.390625 | 4.098011 |
| 100 | (frozen vegetables, mineral water, tomatoes) | (nan, spaghetti) | 0.003066 | 0.522727 | 3.002280 |


In [8]:
resultDataFrame.sort_values(by = ['support'],axis=0,ascending=False).head(30)

| | rhs | lhs | support | confidence | lift |
|---|---|---|---|---|---|
| 2 | (herb & pepper,) | (ground beef,) | 0.015998 | 0.32345 | 3.291994 |
| 27 | (herb & pepper,) | (nan, ground beef) | 0.015998 | 0.32345 | 3.291994 |
| 62 | (frozen vegetables, spaghetti) | (nan, ground beef) | 0.008666 | 0.311005 | 3.165328 |
| 18 | (frozen vegetables, spaghetti) | (ground beef,) | 0.008666 | 0.311005 | 3.165328 |
| 67 | (mineral water, shrimp) | (nan, frozen vegetables) | 0.007199 | 0.305085 | 3.200616 |
| 21 | (mineral water, shrimp) | (frozen vegetables,) | 0.007199 | 0.305085 | 3.200616 |
| 71 | (spaghetti, tomatoes) | (nan, frozen vegetables) | 0.006666 | 0.318471 | 3.341054 |
| 26 | (mineral water, herb & pepper) | (ground beef,) | 0.006666 | 0.390625 | 3.975683 |
| 23 | (spaghetti, tomatoes) | (frozen vegetables,) | 0.006666 | 0.318471 | 3.341054 |
| 74 | (mineral water, herb & pepper) | (nan, ground beef) | 0.006666 | 0.390625 | 3.975683 |
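Half of the rows above are copies of a real rule with the placeholder item `'nan'` added, carrying identical support, confidence, and lift. If the transactions are not cleaned beforehand, such rows can be dropped after the fact. The sketch below is a minimal illustration: `without_placeholder` is a hypothetical helper, and `rule_rows` holds four rows copied from the sorted table in the tuple layout that `inspect` returns.

```python
# Four rows copied from the sorted results, in the layout returned by inspect().
rule_rows = [
    (("herb & pepper",), ("ground beef",), 0.015998, 0.32345, 3.291994),
    (("herb & pepper",), ("nan", "ground beef"), 0.015998, 0.32345, 3.291994),
    (("mineral water", "shrimp"), ("frozen vegetables",), 0.007199, 0.305085, 3.200616),
    (("mineral water", "shrimp"), ("nan", "frozen vegetables"), 0.007199, 0.305085, 3.200616),
]

def without_placeholder(rows, placeholder="nan"):
    """Keep only rules where neither side contains the placeholder item."""
    return [
        row for row in rows
        if placeholder not in row[0] and placeholder not in row[1]
    ]

filtered_rows = without_placeholder(rule_rows)
for row in filtered_rows:
    print(row)
```

Filtering afterwards removes the duplicate rows but does not undo the effect of `'nan'` on the mining step itself, so cleaning the transactions first is usually the more thorough option.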
