# Apriori - Example - Grocery

**Context**  
This dataset lists the items each customer bought on a single ticket.

**Content**  
The dataset comes from Kaggle: [Market Basket Optimisation](https://www.kaggle.com/datasets/d4rklucif3r/market-basket-optimisation).  
It has 7501 rows and 20 columns. Each row is one ticket and each column holds one purchased item, so tickets with fewer than 20 items leave the trailing columns empty.

**Problem statement**  
The goal is to find associations between products that are bought together.

In [1]:
# Import libraries
import pandas as pd

from apyori import apriori

## Load Data

In [2]:
# Load the data (the file has no header row)
df = pd.read_csv('Market_Basket_Optimisation.csv', header=None)
df.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19
0,shrimp,almonds,avocado,vegetables mix,green grapes,whole weat flour,yams,cottage cheese,energy drink,tomato juice,low fat yogurt,green tea,honey,salad,mineral water,salmon,antioxydant juice,frozen smoothie,spinach,olive oil
1,burgers,meatballs,eggs,,,,,,,,,,,,,,,,,
2,chutney,,,,,,,,,,,,,,,,,,,
3,turkey,avocado,,,,,,,,,,,,,,,,,,
4,mineral water,milk,energy bar,whole wheat rice,green tea,,,,,,,,,,,,,,,


## EDA

In [3]:
# Inspect the data
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7501 entries, 0 to 7500
Data columns (total 20 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   0       7501 non-null   object
 1   1       5747 non-null   object
 2   2       4389 non-null   object
 3   3       3345 non-null   object
 4   4       2529 non-null   object
 5   5       1864 non-null   object
 6   6       1369 non-null   object
 7   7       981 non-null    object
 8   8       654 non-null    object
 9   9       395 non-null    object
 10  10      256 non-null    object
 11  11      154 non-null    object
 12  12      87 non-null     object
 13  13      47 non-null     object
 14  14      25 non-null     object
 15  15      8 non-null      object
 16  16      4 non-null      object
 17  17      4 non-null      object
 18  18      3 non-null      object
 19  19      1 non-null      object
dtypes: object(20)
memory usage: 1.1+ MB
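
The non-null counts above shrink from column to column because shorter tickets leave the trailing columns empty. A minimal sketch of how that could be turned into a basket-size distribution and an item frequency count, using a small made-up frame (`toy` and its contents are illustrative stand-ins, not the Kaggle data):

```python
import pandas as pd

# Hypothetical stand-in for df: three tickets padded with missing values.
toy = pd.DataFrame([
    ["burgers", "meatballs", "eggs"],
    ["chutney", None, None],
    ["turkey", "avocado", None],
])

# Items per ticket = number of non-missing cells in each row.
basket_size = toy.notna().sum(axis=1)

# Flatten every cell into one Series, drop the missing ones, count each item.
item_counts = pd.Series(toy.values.ravel()).dropna().value_counts()

print(basket_size.tolist())
print(item_counts)
```

Applied to the real `df`, the same two expressions would show how many items a typical ticket holds and which products dominate, which helps when choosing a support threshold.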


## Modelado

In [4]:
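# Convert the DataFrame into the list of lists that apriori expects.
# Note: str() turns every empty cell (NaN) into the literal item 'nan',
# which is why 'nan' appears in some of the rules further down.
transactions = [[str(item) for item in row] for row in df.values]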
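
Because the cell above stringifies every cell, empty positions become the item `'nan'`. A minimal sketch of one way to skip missing values while building the transactions, shown on a made-up frame (`toy` and `clean_transactions` are illustrative names, not part of the original notebook):

```python
import pandas as pd

# Hypothetical stand-in for df: three tickets padded with missing values.
toy = pd.DataFrame([
    ["burgers", "meatballs", "eggs"],
    ["chutney", None, None],
    ["turkey", "avocado", None],
])

# Keep only the cells that hold a value, so no 'nan' item is created.
clean_transactions = [
    [str(item) for item in row if pd.notna(item)]
    for row in toy.values
]

print(clean_transactions)
```

Rerunning the notebook with a list built this way would likely change the rule table; the outputs recorded in this notebook come from the original version that keeps `'nan'`.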

In [5]:
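# Training: apriori returns a generator, so nothing is computed until it is consumed.
# min_length does not appear to be among the keyword arguments apyori reads
# (it reads max_length), so it likely has no effect here.
model = apriori(transactions, min_support=0.003, min_confidence=0.3,
                min_lift=3, min_length=2)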
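
The thresholds passed above filter rules by support, confidence, and lift. As a reminder of what those three numbers mean, here is a small pure-Python sketch that computes them for a single rule on made-up transactions (`toy_transactions`, `support`, and `rule_metrics` are illustrative names; this is not how apyori is implemented):

```python
# Toy transactions, made up for illustration (not the Kaggle data).
toy_transactions = [
    {"pasta", "escalope"},
    {"pasta", "escalope", "milk"},
    {"pasta", "milk"},
    {"escalope"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return both, confidence, lift

rule_support, rule_confidence, rule_lift = rule_metrics(
    {"pasta"}, {"escalope"}, toy_transactions
)
print(f"support={rule_support:.3f} confidence={rule_confidence:.3f} lift={rule_lift:.3f}")
```

A lift above 1 means the consequent shows up more often alongside the antecedent than it does overall, so `min_lift = 3` keeps only rules where that ratio is at least 3.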

## Visualization

In [6]:
# Results: consume the generator and store the rules in a list
results = list(model)

In [7]:
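def inspect(results):
    # Each apyori result is (items, support, ordered_statistics). Only the
    # first ordered statistic of each itemset is read here:
    # (items_base, items_add, confidence, lift).
    antecedent  = [tuple(result[2][0][0]) for result in results]  # items_base
    consequent  = [tuple(result[2][0][1]) for result in results]  # items_add
    support     = [result[1] for result in results]
    confidence  = [result[2][0][2] for result in results]
    lift        = [result[2][0][3] for result in results]
    return list(zip(antecedent, consequent, support, confidence, lift))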
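
The helper above reads only `result[2][0]`, so an itemset that yields several rules contributes a single row. A minimal sketch of a variant that emits one row per rule follows. The two namedtuples are stand-ins that mirror the field layout apyori is assumed to use (`items`, `support`, `ordered_statistics`, and `items_base`, `items_add`, `confidence`, `lift`), so the block runs without the library. `inspect_all` and `toy_results` are hypothetical names.

```python
from collections import namedtuple

# Stand-ins for apyori's result records (assumed field layout); with real
# output from apriori() these two definitions would not be needed.
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"]
)
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"]
)

def inspect_all(results):
    """Return one row per rule rather than one row per itemset."""
    rows = []
    for record in results:
        for stat in record.ordered_statistics:
            rows.append((
                tuple(sorted(stat.items_base)),  # antecedent
                tuple(sorted(stat.items_add)),   # consequent
                record.support,
                stat.confidence,
                stat.lift,
            ))
    return rows

# Made-up record holding two rules for the same itemset.
toy_results = [
    RelationRecord(
        items=frozenset({"pasta", "escalope"}),
        support=0.4,
        ordered_statistics=[
            OrderedStatistic(frozenset({"pasta"}), frozenset({"escalope"}), 0.67, 1.11),
            OrderedStatistic(frozenset({"escalope"}), frozenset({"pasta"}), 0.67, 1.11),
        ],
    )
]

rows = inspect_all(toy_results)
print(rows)
```

Sorting the item tuples keeps the output stable, since the order of items inside a frozenset is not guaranteed.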

In [8]:
# Convert to a DataFrame to view the results.
# The first tuple is the antecedent (lhs) and the second the consequent (rhs).
resultdf = pd.DataFrame(inspect(results),
                        columns=['lhs', 'rhs', 'support', 'confidence', 'lift'])

In [9]:
resultdf

Unnamed: 0,lhs,rhs,support,confidence,lift
0,"(mushroom cream sauce,)","(escalope,)",0.005733,0.300699,3.790833
1,"(pasta,)","(escalope,)",0.005866,0.372881,4.700812
2,"(herb & pepper,)","(ground beef,)",0.015998,0.323450,3.291994
3,"(tomato sauce,)","(ground beef,)",0.005333,0.377358,3.840659
4,"(pasta,)","(shrimp,)",0.005066,0.322034,4.506672
...,...,...,...,...,...
97,"(mineral water, milk, frozen vegetables)","(olive oil, nan)",0.003333,0.301205,4.582834
98,"(soup, frozen vegetables)","(mineral water, milk, nan)",0.003066,0.383333,7.987176
99,"(mineral water, spaghetti, shrimp)","(frozen vegetables, nan)",0.003333,0.390625,4.098011
100,"(mineral water, tomatoes, frozen vegetables)","(spaghetti, nan)",0.003066,0.522727,3.002280


In [10]:
resultdf.sort_values(by='support', ascending=False).head(30)

Unnamed: 0,lhs,rhs,support,confidence,lift
2,"(herb & pepper,)","(ground beef,)",0.015998,0.32345,3.291994
27,"(herb & pepper,)","(nan, ground beef)",0.015998,0.32345,3.291994
62,"(spaghetti, frozen vegetables)","(nan, ground beef)",0.008666,0.311005,3.165328
18,"(spaghetti, frozen vegetables)","(ground beef,)",0.008666,0.311005,3.165328
67,"(mineral water, shrimp)","(frozen vegetables, nan)",0.007199,0.305085,3.200616
21,"(mineral water, shrimp)","(frozen vegetables,)",0.007199,0.305085,3.200616
71,"(spaghetti, tomatoes)","(frozen vegetables, nan)",0.006666,0.318471,3.341054
26,"(mineral water, herb & pepper)","(ground beef,)",0.006666,0.390625,3.975683
23,"(spaghetti, tomatoes)","(frozen vegetables,)",0.006666,0.318471,3.341054
74,"(mineral water, herb & pepper)","(nan, ground beef)",0.006666,0.390625,3.975683
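
Several rows in the sorted table differ only by an extra `nan` item, for example rows 2 and 27. A minimal sketch of a cosmetic cleanup that strips `'nan'` from both item tuples and drops the duplicates this creates is shown below. The three sample rows are copied from the table above; `strip_nan` and `deduped` are hypothetical names. Filtering missing values before mining is likely the more principled route, since stripping afterwards leaves the metrics as they were computed with `'nan'` present.

```python
def strip_nan(items):
    """Drop the placeholder 'nan' item and return a sorted tuple."""
    return tuple(sorted(item for item in items if item != "nan"))

# Three rows taken from the sorted output above.
rules = [
    (("herb & pepper",), ("ground beef",), 0.015998, 0.32345, 3.291994),
    (("herb & pepper",), ("nan", "ground beef"), 0.015998, 0.32345, 3.291994),
    (("mineral water", "shrimp"), ("frozen vegetables", "nan"), 0.007199, 0.305085, 3.200616),
]

seen = set()
deduped = []
for first, second, support, confidence, lift in rules:
    key = (strip_nan(first), strip_nan(second))
    if key in seen:
        continue  # same rule already kept, only the 'nan' differed
    seen.add(key)
    deduped.append((*key, support, confidence, lift))

for row in deduped:
    print(row)
```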
