# Association Rules

**Association rules** are a data mining method used in pattern analysis to discover frequent relationships between variables in large datasets. These rules reveal associations between items in a dataset, which can help identify behavioral patterns, consumer preferences, or relationships between products in sales data, among other uses. They are based on how frequently combinations of items occur together and are used in fields such as e-commerce, marketing, and medicine.


## Association rule mining: market basket analysis with Mlxtend

Association rule learning is a rule-based machine learning method for discovering relationships between variables in databases.
The goal is to identify strong rules in datasets using measures such as confidence or lift.

An association rule is an implication expression of the form X→Y, where X and Y are disjoint itemsets. A more concrete example based on consumer behavior is {Diapers}→{Beer}, which suggests that people who buy diapers are also likely to buy beer. Several metrics have been developed to evaluate how "interesting" such a rule is. The implementation used here relies on the confidence and lift metrics mentioned above.
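For reference, the usual textbook definitions of these metrics can be written as follows. The symbol T (the set of all transactions) is an assumption introduced here; it does not appear in the original text.

```latex
\begin{align}
\operatorname{support}(X) &= \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|} \\
\operatorname{confidence}(X \rightarrow Y) &= \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)} \\
\operatorname{lift}(X \rightarrow Y) &= \frac{\operatorname{confidence}(X \rightarrow Y)}{\operatorname{support}(Y)}
\end{align}
```

A lift of 1 means X and Y occur together about as often as expected if they were independent; values above 1 indicate that they co-occur more often than that.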


If a customer buys bread, there is a 70% chance that they also buy milk.


In the association rule above, **bread is the antecedent and milk is the consequent**. Put simply, it is the kind of rule a retail store can use to target its customers better. If the rule comes from a thorough analysis of real datasets, it can be used not only to improve customer service but also to increase the company's revenue.
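To see where a figure such as 70% could come from, here is a minimal sketch that computes the confidence of bread → milk on a small set of baskets. The baskets and the helper `support` are invented for illustration; they are not part of the dataset used in this tutorial.

```python
# Hypothetical baskets, invented for illustration only
baskets = (
    [{"bread", "milk"}] * 7   # bread and milk together
    + [{"bread"}] * 3         # bread without milk
    + [{"milk"}] * 5          # milk without bread
    + [{"eggs"}] * 5          # neither product
)


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


antecedent, consequent = {"bread"}, {"milk"}

# confidence = support(antecedent and consequent together) / support(antecedent)
confidence = support(antecedent | consequent, baskets) / support(antecedent, baskets)
# lift = confidence / support(consequent)
lift = confidence / support(consequent, baskets)

print(f"confidence = {confidence:.2f}")
print(f"lift = {lift:.2f}")
```

In this toy data, 7 of the 10 baskets that contain bread also contain milk, which is what the 70% in the rule expresses.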

The **Apriori algorithm** is one of the main association rule mining techniques. It finds the frequent itemsets in a dataset, that is, the itemsets whose support reaches a minimum threshold. This tutorial uses it as well.
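The level-wise idea behind Apriori can be sketched in plain Python. This is a simplified illustration, not the Mlxtend implementation: the function name `apriori_sketch` and the toy transactions are assumptions made for this example.

```python
from itertools import combinations


def apriori_sketch(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: frequent single items
    current = keep_frequent({frozenset([item]) for t in transactions for item in t})
    frequent = dict(current)

    k = 2
    while current:
        previous = set(current)
        # Join step: combine frequent (k-1)-itemsets into candidates of size k
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-subsets are frequent
        candidates = {
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Toy transactions, invented for illustration only
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]
result = apriori_sketch(transactions, min_support=0.6)
for itemset, value in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), value)
```

The prune step is what gives the algorithm its name: an itemset can only be frequent if all of its subsets are frequent, so most large candidates are discarded without counting them.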



In [13]:
# Import the required libraries
import pandas as pd
import datetime
from mlxtend.frequent_patterns import apriori, association_rules
import openpyxl  # Excel engine that pd.read_excel uses for .xlsx files

In [14]:
# Load the dataset
df = pd.read_excel('/Users/felip/Documents/GitHub/ICC743/clases/semana-11/association-rules/association-rules/data/online-retail.xlsx')
df




Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
...,...,...,...,...,...,...,...,...
541904,581587,22613,PACK OF 20 SPACEBOY NAPKINS,12,2011-12-09 12:50:00,0.85,12680.0,France
541905,581587,22899,CHILDREN'S APRON DOLLY GIRL,6,2011-12-09 12:50:00,2.10,12680.0,France
541906,581587,23254,CHILDRENS CUTLERY DOLLY GIRL,4,2011-12-09 12:50:00,4.15,12680.0,France
541907,581587,23255,CHILDRENS CUTLERY CIRCUS PARADE,4,2011-12-09 12:50:00,4.15,12680.0,France


In [4]:
# Drop rows that have a missing value in any of the listed columns
df.dropna(subset=['InvoiceNo', 'StockCode', 'Description', 'Quantity', 'InvoiceDate', 'UnitPrice', 'CustomerID', 'Country'], inplace=True)



In [5]:
# Keep only the transactions from the United Kingdom
df = df[df['Country'] == 'United Kingdom']




In [6]:
# Build a transaction table: one row per invoice, one column per product
basket = (df.groupby(['InvoiceNo', 'Description'])['Quantity']
          .sum().unstack().reset_index().fillna(0)
          .set_index('InvoiceNo'))
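To make the reshaping step easier to picture, the following sketch reproduces the same idea in plain Python on a few invoice lines: add up quantities per invoice and product, then lay them out as one row per invoice with 0 where a product is absent. The invoice lines and variable names are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Hypothetical invoice lines (InvoiceNo, Description, Quantity), invented for illustration
lines = [
    ("A1", "TEACUP", 6),
    ("A1", "LANTERN", 2),
    ("A1", "TEACUP", 1),   # same product twice on one invoice: quantities add up
    ("A2", "LANTERN", 4),
]

# Counterpart of groupby(['InvoiceNo', 'Description'])['Quantity'].sum()
totals = defaultdict(int)
for invoice, product, quantity in lines:
    totals[(invoice, product)] += quantity

# Counterpart of unstack().fillna(0): one row per invoice, one column per product
products = sorted({product for _, product, _ in lines})
invoices = sorted({invoice for invoice, _, _ in lines})
basket = {
    invoice: {product: totals.get((invoice, product), 0) for product in products}
    for invoice in invoices
}

# Presence flags: True if the product appears on the invoice at least once
basket_sets = {
    invoice: {product: quantity >= 1 for product, quantity in row.items()}
    for invoice, row in basket.items()
}

for invoice in invoices:
    print(invoice, basket[invoice], basket_sets[invoice])
```

Apriori only needs the presence flags, which is why the quantities are reduced to a yes/no value before mining.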



In [7]:
# Convert quantities to presence flags: True if the invoice contains at least one unit
# (a vectorized comparison replaces DataFrame.applymap, which recent pandas versions deprecate)
basket_sets = basket >= 1



In [8]:
# Generate frequent itemsets with the Apriori algorithm (minimum support of 2%)
frequent_itemsets = apriori(basket_sets, min_support=0.02, use_colnames=True)
frequent_itemsets




Unnamed: 0,support,itemsets
0,0.031626,(6 RIBBONS RUSTIC CHARM)
1,0.021604,(60 CAKE CASES VINTAGE CHRISTMAS)
2,0.029561,(60 TEATIME FAIRY CAKE CASES)
3,0.022360,(72 SWEETHEART FAIRY CAKE CASES)
4,0.034748,(ALARM CLOCK BAKELIKE GREEN)
...,...,...
159,0.020648,"(LUNCH BAG SPACEBOY DESIGN , LUNCH BAG RED RET..."
160,0.020648,"(LUNCH BAG RED RETROSPOT, LUNCH BAG SUKI DESIGN )"
161,0.022007,"(PAPER CHAIN KIT VINTAGE CHRISTMAS, PAPER CHAI..."
162,0.021554,"(WHITE HANGING HEART T-LIGHT HOLDER, RED HANGI..."


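### support = how often an "event" occurs, i.e. the fraction of transactions that contain the itemset
##### Apriori is the algorithm used here to find the frequent itemsets from which the association rules are built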


In [9]:
# Generate association rules from the frequent itemsets
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

# Print the frequent itemsets

print("\nFrequent itemsets:")
frequent_itemsets




Frequent itemsets:


Unnamed: 0,support,itemsets
0,0.031626,(6 RIBBONS RUSTIC CHARM)
1,0.021604,(60 CAKE CASES VINTAGE CHRISTMAS)
2,0.029561,(60 TEATIME FAIRY CAKE CASES)
3,0.022360,(72 SWEETHEART FAIRY CAKE CASES)
4,0.034748,(ALARM CLOCK BAKELIKE GREEN)
...,...,...
159,0.020648,"(LUNCH BAG SPACEBOY DESIGN , LUNCH BAG RED RET..."
160,0.020648,"(LUNCH BAG RED RETROSPOT, LUNCH BAG SUKI DESIGN )"
161,0.022007,"(PAPER CHAIN KIT VINTAGE CHRISTMAS, PAPER CHAI..."
162,0.021554,"(WHITE HANGING HEART T-LIGHT HOLDER, RED HANGI..."


In [10]:
# Print the association rules
print("\nAssociation rules:")
rules


Association rules:


Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction,zhangs_metric
0,(ALARM CLOCK BAKELIKE RED ),(ALARM CLOCK BAKELIKE GREEN),0.038173,0.034748,0.022863,0.598945,17.236584,0.021537,2.406779,0.979369
1,(ALARM CLOCK BAKELIKE GREEN),(ALARM CLOCK BAKELIKE RED ),0.034748,0.038173,0.022863,0.657971,17.236584,0.021537,2.812121,0.975895
2,(GARDENERS KNEELING PAD KEEP CALM ),(GARDENERS KNEELING PAD CUP OF TEA ),0.037367,0.031576,0.023065,0.617251,19.54824,0.021885,2.530179,0.985676
3,(GARDENERS KNEELING PAD CUP OF TEA ),(GARDENERS KNEELING PAD KEEP CALM ),0.031576,0.037367,0.023065,0.730463,19.54824,0.021885,3.571425,0.979782
4,(GREEN REGENCY TEACUP AND SAUCER),(PINK REGENCY TEACUP AND SAUCER),0.03082,0.024828,0.020345,0.660131,26.588673,0.01958,2.869257,0.992994
5,(PINK REGENCY TEACUP AND SAUCER),(GREEN REGENCY TEACUP AND SAUCER),0.024828,0.03082,0.020345,0.819473,26.588673,0.01958,5.368602,0.986892
6,(GREEN REGENCY TEACUP AND SAUCER),(ROSES REGENCY TEACUP AND SAUCER ),0.03082,0.034144,0.023971,0.777778,22.779253,0.022919,4.346351,0.986505
7,(ROSES REGENCY TEACUP AND SAUCER ),(GREEN REGENCY TEACUP AND SAUCER),0.034144,0.03082,0.023971,0.702065,22.779253,0.022919,3.252989,0.9899
8,(HEART OF WICKER LARGE),(HEART OF WICKER SMALL),0.039835,0.046734,0.020043,0.503161,10.766443,0.018182,1.91866,0.944753
9,(HEART OF WICKER SMALL),(HEART OF WICKER LARGE),0.046734,0.039835,0.020043,0.428879,10.766443,0.018182,1.681195,0.951591
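As a sanity check, the metric columns can be recomputed by hand from the three support values of a rule. The sketch below does this for row 0 of the table above, using the standard definitions of confidence, lift, leverage, and conviction. The inputs are the rounded values shown in the table, so the results agree with the table only up to rounding.

```python
# Values copied from row 0 of the rules table above
antecedent_support = 0.038173   # ALARM CLOCK BAKELIKE RED
consequent_support = 0.034748   # ALARM CLOCK BAKELIKE GREEN
support = 0.022863              # both products on the same invoice

# confidence: share of antecedent invoices that also contain the consequent
confidence = support / antecedent_support
# lift: confidence relative to the baseline frequency of the consequent
lift = confidence / consequent_support
# leverage: observed joint support minus the support expected under independence
leverage = support - antecedent_support * consequent_support
# conviction: how much more often the rule would fail if the items were independent
conviction = (1 - consequent_support) / (1 - confidence)

print(f"confidence = {confidence:.4f}")
print(f"lift       = {lift:.3f}")
print(f"leverage   = {leverage:.5f}")
print(f"conviction = {conviction:.3f}")
```

Note that lift and leverage are symmetric (rows 0 and 1 share the same values), while confidence and conviction depend on which product is the antecedent.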


In [13]:
# Evaluate the association rules
print("\nAssociation rule evaluation:")
rules_evaluation = rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]
rules_evaluation


Association rule evaluation:


Unnamed: 0,antecedents,consequents,support,confidence,lift
0,(ALARM CLOCK BAKELIKE RED ),(ALARM CLOCK BAKELIKE GREEN),0.022863,0.598945,17.236584
1,(ALARM CLOCK BAKELIKE GREEN),(ALARM CLOCK BAKELIKE RED ),0.022863,0.657971,17.236584
2,(GARDENERS KNEELING PAD KEEP CALM ),(GARDENERS KNEELING PAD CUP OF TEA ),0.023065,0.617251,19.54824
3,(GARDENERS KNEELING PAD CUP OF TEA ),(GARDENERS KNEELING PAD KEEP CALM ),0.023065,0.730463,19.54824
4,(GREEN REGENCY TEACUP AND SAUCER),(PINK REGENCY TEACUP AND SAUCER),0.020345,0.660131,26.588673
5,(PINK REGENCY TEACUP AND SAUCER),(GREEN REGENCY TEACUP AND SAUCER),0.020345,0.819473,26.588673
6,(GREEN REGENCY TEACUP AND SAUCER),(ROSES REGENCY TEACUP AND SAUCER ),0.023971,0.777778,22.779253
7,(ROSES REGENCY TEACUP AND SAUCER ),(GREEN REGENCY TEACUP AND SAUCER),0.023971,0.702065,22.779253
8,(HEART OF WICKER LARGE),(HEART OF WICKER SMALL),0.020043,0.503161,10.766443
9,(HEART OF WICKER SMALL),(HEART OF WICKER LARGE),0.020043,0.428879,10.766443
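One possible next step is to keep only the more reliable rules and rank them. The sketch below does this in plain Python with the rows of the table above; the 0.7 confidence cut-off and the names `rule_rows` and `MIN_CONFIDENCE` are assumptions chosen for this illustration, not values from the original analysis.

```python
# (antecedent, consequent, confidence, lift) copied from the evaluation table above;
# trailing spaces in product names trimmed for readability
rule_rows = [
    ("ALARM CLOCK BAKELIKE RED", "ALARM CLOCK BAKELIKE GREEN", 0.598945, 17.236584),
    ("ALARM CLOCK BAKELIKE GREEN", "ALARM CLOCK BAKELIKE RED", 0.657971, 17.236584),
    ("GARDENERS KNEELING PAD KEEP CALM", "GARDENERS KNEELING PAD CUP OF TEA", 0.617251, 19.54824),
    ("GARDENERS KNEELING PAD CUP OF TEA", "GARDENERS KNEELING PAD KEEP CALM", 0.730463, 19.54824),
    ("GREEN REGENCY TEACUP AND SAUCER", "PINK REGENCY TEACUP AND SAUCER", 0.660131, 26.588673),
    ("PINK REGENCY TEACUP AND SAUCER", "GREEN REGENCY TEACUP AND SAUCER", 0.819473, 26.588673),
    ("GREEN REGENCY TEACUP AND SAUCER", "ROSES REGENCY TEACUP AND SAUCER", 0.777778, 22.779253),
    ("ROSES REGENCY TEACUP AND SAUCER", "GREEN REGENCY TEACUP AND SAUCER", 0.702065, 22.779253),
    ("HEART OF WICKER LARGE", "HEART OF WICKER SMALL", 0.503161, 10.766443),
    ("HEART OF WICKER SMALL", "HEART OF WICKER LARGE", 0.428879, 10.766443),
]

MIN_CONFIDENCE = 0.7  # hypothetical cut-off chosen for this illustration

# Keep rules whose confidence reaches the cut-off, then rank them by lift (highest first)
strong = [row for row in rule_rows if row[2] >= MIN_CONFIDENCE]
strong.sort(key=lambda row: row[3], reverse=True)

for antecedent, consequent, confidence, lift in strong:
    print(f"{antecedent} -> {consequent}: confidence={confidence:.2f}, lift={lift:.1f}")
```

On the `rules` DataFrame itself, the same selection could be written as `rules[rules['confidence'] >= 0.7].sort_values('lift', ascending=False)`.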
