# Association Rule Mining on Real Data

This notebook walks through association rule mining on the UCI Online Retail dataset using the Apriori implementation in `mlxtend`. The steps are data preprocessing, one-hot encoding, frequent itemset generation, and rule generation.

## Step 1: Import Libraries

In [None]:
import pandas as pd
import numpy as np
from mlxtend.frequent_patterns import apriori, association_rules

## Step 2: Load Dataset

In [None]:
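# Sample dataset: Online Retail (UCI Machine Learning Repository)
# Reading an .xlsx file with pandas requires the openpyxl package
data = pd.read_excel('https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx')
data.head()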

## Step 3: Data Preprocessing

In [None]:
# Drop rows without an invoice number and treat invoice numbers as strings
data.dropna(axis=0, subset=['InvoiceNo'], inplace=True)
data['InvoiceNo'] = data['InvoiceNo'].astype('str')

# Remove canceled orders (their invoice numbers contain 'C')
data = data[~data['InvoiceNo'].str.contains('C')]

# Build the basket for France only: one row per invoice,
# one column per product, values are the total quantity purchased
basket = (data[data['Country'] == "France"]
          .groupby(['InvoiceNo', 'Description'])['Quantity']
          .sum().unstack().reset_index().fillna(0)
          .set_index('InvoiceNo'))
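
To make the shape of `basket` concrete, here is a minimal sketch of the same groupby/unstack pattern on a toy table. The invoice numbers, item names, and quantities are invented for illustration and do not come from the Online Retail data.

```python
import pandas as pd

# Toy transactions (hypothetical values, not from the real dataset)
toy = pd.DataFrame({
    "InvoiceNo": ["1001", "1001", "1002", "1003", "1003"],
    "Description": ["TEA", "MUG", "TEA", "MUG", "MUG"],
    "Quantity": [2, 1, 3, 1, 4],
})

# One row per invoice, one column per item, values = total quantity.
# Items missing from an invoice become NaN after unstack, then 0.
toy_basket = (toy.groupby(["InvoiceNo", "Description"])["Quantity"]
              .sum().unstack().fillna(0))
print(toy_basket)
```

Invoice 1003 lists MUG twice, so its two quantities are summed into a single cell, while invoice 1002 has no MUG line and gets a 0.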

## Step 4: One-Hot Encoding the Data

In [None]:
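def hot_encode(x):
    # 1 if the item appears on the invoice, 0 otherwise
    # (always returns a value, including for quantities between 0 and 1)
    return 1 if x >= 1 else 0

# applymap is deprecated in pandas >= 2.1; use basket.map(hot_encode) there.
# mlxtend works best with boolean columns, hence the astype(bool).
basket_encoded = basket.applymap(hot_encode).astype(bool)
# POSTAGE is a shipping charge, not a product
basket_encoded.drop('POSTAGE', inplace=True, axis=1)
basket_encoded.head()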
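
As an alternative to an element-wise function, the same encoding can be sketched with a single vectorized comparison. The small `quantities` table below is a hypothetical stand-in for `basket`, and the negative value is included only to show how it is handled.

```python
import pandas as pd

# Hypothetical quantity matrix standing in for `basket`
quantities = pd.DataFrame(
    {"MUG": [1.0, 0.0, 5.0], "TEA": [2.0, 3.0, -1.0]},
    index=["1001", "1002", "1003"],
)

# True where the item was bought at least once; zero and
# negative quantities both map to False.
encoded = quantities > 0
print(encoded)
```

The result is already boolean, so no further type conversion is needed before passing it to a frequent itemset miner.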

## Step 5: Frequent Itemset Generation

In [None]:
# Keep itemsets that appear in at least 7% of invoices
frequent_itemsets = apriori(basket_encoded, min_support=0.07, use_colnames=True)
frequent_itemsets.sort_values('support', ascending=False).head()
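
To illustrate what `min_support` means, the following sketch computes support by brute force on four invented transactions. It enumerates every candidate of size 1 and 2 rather than using Apriori's pruning, so it only illustrates the definition and is not how `mlxtend` implements the search.

```python
from itertools import combinations

# Invented transactions for illustration
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

min_support = 0.5
items = sorted(set().union(*transactions))

# Brute-force enumeration of 1- and 2-item candidates
frequent = {}
for size in (1, 2):
    for candidate in combinations(items, size):
        s = support(candidate, transactions)
        if s >= min_support:
            frequent[candidate] = s

for itemset, s in frequent.items():
    print(itemset, s)
```

Here the pair butter and milk appears in only one of four transactions (support 0.25), so it falls below the threshold and is dropped.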

## Step 6: Mining Association Rules

In [None]:
# Generate rules from the frequent itemsets, keeping those with lift >= 1
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head()
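
The metrics reported for each rule can be checked by hand. The sketch below computes support, confidence, and lift for two rules over four invented transactions; the helper names `support` and `rule_metrics` are hypothetical and not part of `mlxtend`.

```python
# Invented transactions for illustration
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return {"support": both, "confidence": confidence, "lift": lift}

bread_butter = rule_metrics({"bread"}, {"butter"}, transactions)
bread_milk = rule_metrics({"bread"}, {"milk"}, transactions)
print(bread_butter)
print(bread_milk)
```

Both rules have the same support and confidence, but only bread -> butter has a lift above 1 (about 1.33). The rule bread -> milk has a lift of about 0.89, so a filter of `metric="lift"` with `min_threshold=1` would discard it.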

## Insights

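Each row of `rules` is a rule of the form antecedents -> consequents. Support is the fraction of invoices that contain both sides, confidence is the fraction of invoices with the antecedents that also contain the consequents, and lift compares that confidence with the overall support of the consequents. A lift above 1 means the items occur together more often than they would if they were independent. Sort or filter `rules` on these columns to find the strongest associations.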
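
One way to narrow the rules down is to filter on lift and confidence and then sort. The sketch below does this on a hand-built table that mimics a few of the columns produced by `association_rules`; the rows and the 0.8 confidence cutoff are placeholders chosen for illustration, not results from the Online Retail data.

```python
import pandas as pd

# Hand-built stand-in for a rules table (placeholder values)
toy_rules = pd.DataFrame({
    "antecedents": [frozenset({"bread"}), frozenset({"butter"}), frozenset({"bread"})],
    "consequents": [frozenset({"butter"}), frozenset({"bread"}), frozenset({"milk"})],
    "support": [0.50, 0.50, 0.50],
    "confidence": [0.67, 1.00, 0.67],
    "lift": [1.33, 1.33, 0.89],
})

# Keep rules that are both positively associated and reliable,
# then list the most confident ones first.
strong = (toy_rules[(toy_rules["lift"] > 1) & (toy_rules["confidence"] >= 0.8)]
          .sort_values("confidence", ascending=False))
print(strong)
```

The same boolean-mask pattern should apply to the real `rules` DataFrame, with thresholds adjusted to the data.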