# Assignment 3 – Association Rule Mining

### Instructions
In this assignment, you will explore association rule mining with the Apriori and FP-Growth algorithms. The tasks cover discovering item relationships, comparing the performance of the two algorithms, and interpreting the results in practical terms.

## Task 1 – Apriori Algorithm
**Objective:** Identify strong association rules among purchased items.

### Steps
1. Load and clean OnlineRetail.csv.
2. Transform to basket format.
3. Generate frequent itemsets (min_support=0.02).
4. Extract rules (min_confidence=0.3).
5. Visualize and interpret.

In [None]:
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
import matplotlib.pyplot as plt

df = pd.read_csv('OnlineRetail.csv', encoding='latin1')
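# Drop rows that lack an item description or an invoice number
df = df.dropna(subset=['Description', 'InvoiceNo'])
# Cancelled invoices have an InvoiceNo that starts with 'C'
df = df[~df['InvoiceNo'].astype(str).str.startswith('C')]
df = df[df['Country'] == 'United Kingdom']

# One row per invoice, one column per item; True means the item was bought
basket = pd.crosstab(df['InvoiceNo'], df['Description'])
basket = basket > 0  # boolean frame; avoids the deprecated DataFrame.applymap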

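# Itemsets that appear in at least 2% of invoices
frequent_itemsets_ap = apriori(basket, min_support=0.02, use_colnames=True)
# Rules with confidence of at least 0.3
rules_ap = association_rules(frequent_itemsets_ap, metric='confidence', min_threshold=0.3)
# Keep only rules whose items co-occur more often than independence would predict
rules_ap = rules_ap[rules_ap['lift'] > 1]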

plt.scatter(rules_ap['support'], rules_ap['confidence'])
plt.xlabel('Support')
plt.ylabel('Confidence')
plt.title('Apriori - Support vs Confidence')
plt.show()
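
To support the interpretation step, here is a minimal sketch of how the `support`, `confidence`, and `lift` values of a single rule are defined, computed by hand. The transactions are hypothetical toy data (not taken from OnlineRetail.csv), and the `support` helper is illustrative only, not part of mlxtend.

```python
# Toy transactions (hypothetical data, for illustration only)
transactions = [
    {"tea", "sugar"},
    {"tea", "sugar", "milk"},
    {"tea", "milk"},
    {"sugar"},
    {"milk", "bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


# Rule under inspection: {tea} -> {sugar}
antecedent, consequent = {"tea"}, {"sugar"}

supp_rule = support(antecedent | consequent, transactions)  # P(tea and sugar)
confidence = supp_rule / support(antecedent, transactions)  # P(sugar | tea)
lift = confidence / support(consequent, transactions)       # confidence / P(sugar)

print(f"support={supp_rule:.2f} confidence={confidence:.2f} lift={lift:.2f}")
# prints: support=0.40 confidence=0.67 lift=1.11
```

A lift above 1, as here, means the two items appear together more often than they would if purchases were independent, which is why the cell above keeps only rules with `lift > 1`.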

## Task 2 – FP-Growth Algorithm
Repeat the same process using **fpgrowth**, then compare its runtime and resulting rules with those from Apriori.

In [None]:
from mlxtend.frequent_patterns import fpgrowth
import time

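# Time FP-Growth (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
frequent_itemsets_fp = fpgrowth(basket, min_support=0.02, use_colnames=True)
t_fp = time.perf_counter() - start

# Time Apriori on the same basket with the same support threshold
start = time.perf_counter()
apriori(basket, min_support=0.02, use_colnames=True)
t_ap = time.perf_counter() - start

print(f'Apriori time: {t_ap:.3f} s')
print(f'FP-Growth time: {t_fp:.3f} s')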

rules_fp = association_rules(frequent_itemsets_fp, metric='confidence', min_threshold=0.3)
rules_fp = rules_fp[rules_fp['lift'] > 1]
rules_fp.head()

## Task 3 – AI Comparison
Use an AI tool (e.g., Perplexity or ChatGPT) to describe how Apriori differs structurally from FP-Growth in terms of candidate generation and tree structure.
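
As a reference point when checking the AI tool's answer, the sketch below contrasts the two structures on hypothetical toy data. It is a simplified illustration, not mlxtend's implementation: `generate_candidates` shows Apriori's join-and-prune step, and `build_fp_tree` builds only the prefix tree of FP-Growth (no header table and no mining step). All names here are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]


# --- Apriori: level-wise candidate generation ---
def generate_candidates(frequent_k, k):
    """Join frequent k-itemsets into (k+1)-candidates and prune every
    candidate that has an infrequent k-subset (the Apriori property)."""
    frequent_k = set(frequent_k)
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent_k for sub in combinations(union, k)
            ):
                candidates.add(union)
    return candidates


# {butter, milk} is not frequent, so the only possible 3-candidate is pruned
frequent_2 = {frozenset(p) for p in [("bread", "milk"), ("bread", "butter")]}
candidates_3 = generate_candidates(frequent_2, 2)


# --- FP-Growth: compress transactions into a prefix tree, no candidates ---
class FPNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Insert each transaction's frequent items, most frequent first, so
    that transactions sharing a prefix share a path in the tree."""
    counts = Counter(item for t in transactions for item in set(t))
    root = FPNode(None)
    for t in transactions:
        items = sorted(
            (i for i in set(t) if counts[i] >= min_count),
            key=lambda i: (-counts[i], i),  # ties broken alphabetically
        )
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root


tree = build_fp_tree(transactions, min_count=2)
print(candidates_3)                 # empty set: the candidate was pruned
print(tree.children["bread"].count)  # number of transactions on the 'bread' path
```

The point of the contrast: Apriori rescans the data once per itemset size to count its candidates, while FP-Growth scans the data twice (once to count items, once to build the tree) and then mines the tree itself.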

## Task 4 – Exploration Challenge
Apply both algorithms to another dataset (e.g., Groceries.csv) and interpret marketing insights from 2-itemsets.
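
One possible way to sanity-check the 2-itemsets you obtain is to count item pairs directly and rank them by lift. The sketch below uses only the standard library and hypothetical baskets; it assumes the new dataset has already been parsed into one list of items per transaction, which may not match the actual layout of Groceries.csv.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: one list of items per transaction
baskets = [
    ["whole milk", "yogurt"],
    ["whole milk", "yogurt", "rolls"],
    ["rolls", "soda"],
    ["whole milk", "rolls"],
    ["yogurt", "whole milk"],
]

n = len(baskets)
# Number of baskets containing each item / each unordered pair of items
item_counts = Counter(item for b in baskets for item in set(b))
pair_counts = Counter(
    pair for b in baskets for pair in combinations(sorted(set(b)), 2)
)

min_support = 0.4  # illustrative threshold for this tiny dataset
rows = []
for (a, b), count in pair_counts.items():
    supp = count / n
    if supp >= min_support:
        # lift = P(a and b) / (P(a) * P(b))
        lift = supp / ((item_counts[a] / n) * (item_counts[b] / n))
        rows.append((a, b, supp, lift))

# Strongest associations first
rows.sort(key=lambda r: r[3], reverse=True)
for a, b, supp, lift in rows:
    print(f"{a} + {b}: support={supp:.2f}, lift={lift:.2f}")
```

Pairs with lift above 1 are candidates for cross-promotion or shelf placement; pairs with lift below 1 are bought together less often than chance would suggest.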