### ***Association Rule Mining (Apriori Algorithm)***

Association rule mining is a technique used to discover interesting relationships, patterns, or associations among a set of items in large datasets. 

It is commonly used in market basket analysis to identify products that are frequently purchased together by customers. The goal is to find rules that can help businesses make informed decisions, such as product placement, cross-selling, and inventory management.

## ⚙️ Key Terms (Detailed)

### 1. **Item / Itemset**
- **Item** → a single product or element.  
  Example: `'milk'`, `'bread'`
- **Itemset** → a collection of one or more items.  
  Example: `{milk, bread}`

---

### 2. **Support**
**Definition:** The fraction of all transactions that contain the itemset.  
\[
Support(A) = \frac{\text{No. of transactions containing A}}{\text{Total transactions}}
\]

**Example:**  
If `{milk, bread}` appears in 3 of 5 transactions →  
`Support = 3/5 = 0.6`

**Meaning:** 60% of all transactions contain both milk and bread.
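
The same calculation can be sketched in plain Python. This is a minimal illustration, not library code: the helper name `support` and the set-based representation are assumptions made here, and the five baskets are the sample transactions from the implementation example.

```python
# Five sample baskets, each stored as a set of items.
transactions = [
    {'milk', 'bread', 'butter'},
    {'beer', 'bread'},
    {'milk', 'bread', 'beer', 'butter'},
    {'bread', 'butter'},
    {'milk', 'bread'},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    # `itemset <= t` is a subset test: True when the basket holds all the items.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

support_milk_bread = support({'milk', 'bread'}, transactions)
print(support_milk_bread)  # 3 of the 5 baskets contain both items
```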

---

### 3. **Confidence**
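**Definition:** How often items in **B** appear in transactions that contain **A**. `A ∪ B` is the itemset holding every item of A and B, so `Support(A ∪ B)` counts transactions that contain both.  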
\[
Confidence(A \rightarrow B) = \frac{Support(A \cup B)}{Support(A)}
\]

**Example:**  
For the rule `{milk, bread} → {butter}`: if `{milk, bread}` appears in 3 of 5 transactions and `{milk, bread, butter}` in 2 →  
`Confidence = (2/5) / (3/5) ≈ 0.67`

**Meaning:** When people buy milk and bread, they also buy butter about 67% of the time.
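
As a rough sketch, confidence can be computed directly from two support values. The helper names `support` and `confidence` are illustrative assumptions, not part of any library, and the baskets are the sample transactions from the implementation example.

```python
transactions = [
    {'milk', 'bread', 'butter'},
    {'beer', 'bread'},
    {'milk', 'bread', 'beer', 'butter'},
    {'bread', 'butter'},
    {'milk', 'bread'},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support(A ∪ B) / Support(A). Assumes the antecedent occurs at least once."""
    a = set(antecedent)
    both = a | set(consequent)  # itemset containing all items of A and B
    return support(both, transactions) / support(a, transactions)

conf = confidence({'milk', 'bread'}, {'butter'}, transactions)
print(round(conf, 2))  # (2/5) / (3/5)
```

Note that confidence is directional: swapping antecedent and consequent generally gives a different value.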

---

### 4. **Lift**
**Definition:** How much more likely B is bought when A is bought, compared with B's overall purchase rate (what independence would predict).  
\[
Lift(A \rightarrow B) = \frac{Confidence(A \rightarrow B)}{Support(B)}
\]

**Interpretation:**
- **> 1:** A and B are positively correlated  
- **= 1:** No correlation (independent)  
- **< 1:** Negative correlation  
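
A small worked example may help here. The sketch below assumes the rule `{milk, bread} → {butter}` and takes its three support values from the sample transactions in the implementation example; the function name `lift` is hypothetical.

```python
# Supports for A = {milk, bread}, B = {butter} in the five sample baskets.
support_a = 3 / 5   # baskets containing milk and bread
support_b = 3 / 5   # baskets containing butter
support_ab = 2 / 5  # baskets containing milk, bread and butter

def lift(support_ab, support_a, support_b):
    """Confidence(A -> B) divided by Support(B)."""
    confidence = support_ab / support_a
    return confidence / support_b

lift_value = lift(support_ab, support_a, support_b)
print(round(lift_value, 3))  # 1.111
```

A value slightly above 1 suggests that butter is bought a little more often alongside milk and bread than its overall frequency would predict.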

---

### 5. **Leverage**
**Definition:** Difference between observed and expected co-occurrence.  
\[
Leverage(A \rightarrow B) = Support(A \cup B) - Support(A) \times Support(B)
\]

**Interpretation:**  
0 → A and B are independent; positive → they co-occur more often than expected; negative → less often than expected.
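
To make the formula concrete, here is a minimal sketch for the rule `{milk, bread} → {butter}`, again using support values read off the sample transactions in the implementation example. The function name `leverage` is an assumption for illustration.

```python
# Supports for A = {milk, bread}, B = {butter} in the five sample baskets.
support_a = 3 / 5
support_b = 3 / 5
support_ab = 2 / 5

def leverage(support_ab, support_a, support_b):
    """Observed co-occurrence minus the co-occurrence expected under independence."""
    return support_ab - support_a * support_b

leverage_value = leverage(support_ab, support_a, support_b)
print(round(leverage_value, 3))  # 0.4 - 0.36 = 0.04
```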

---

### 6. **Conviction**
**Definition:** Measures how strongly A implies B (directional).  
\[
Conviction(A \rightarrow B) = \frac{1 - Support(B)}{1 - Confidence(A \rightarrow B)}
\]

**Interpretation:**  
1 → A and B are independent; > 1 → the rule fails less often than independence would predict; the value is infinite when confidence = 1.
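
The sketch below evaluates the formula for `{milk, bread} → {butter}` with supports taken from the sample transactions in the implementation example. The function name `conviction` and the choice to return `math.inf` when confidence equals 1 (where the denominator is zero) are assumptions made for this illustration.

```python
import math

# Supports for A = {milk, bread}, B = {butter} in the five sample baskets.
support_a = 3 / 5
support_b = 3 / 5
support_ab = 2 / 5

def conviction(support_ab, support_a, support_b):
    """(1 - Support(B)) / (1 - Confidence(A -> B)); infinite for a rule that never fails."""
    confidence = support_ab / support_a
    if confidence == 1.0:
        return math.inf  # denominator would be zero
    return (1 - support_b) / (1 - confidence)

conviction_value = conviction(support_ab, support_a, support_b)
print(round(conviction_value, 2))  # 0.4 / (1/3) = 1.2

# {milk} -> {bread}: every basket with milk also has bread, so confidence is 1.
never_fails = conviction(3 / 5, 3 / 5, 5 / 5)
print(never_fails)  # inf
```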

---

### 7. **Antecedent & Consequent**
| Term | Meaning |
|------|----------|
| **Antecedent (LHS)** | The “if” part (A) |
| **Consequent (RHS)** | The “then” part (B) |

Example: `{milk, bread} → {butter}`  
→ Antecedent = `{milk, bread}`, Consequent = `{butter}`
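
One itemset can produce several rules, one for each way of splitting it into a non-empty antecedent and a non-empty consequent. The following is a minimal sketch of that enumeration; `candidate_rules` is a hypothetical helper name, not a library function.

```python
from itertools import combinations

def candidate_rules(itemset):
    """Yield every (antecedent, consequent) split with both sides non-empty."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            # Everything not in the antecedent forms the consequent.
            consequent = [i for i in items if i not in antecedent]
            yield frozenset(antecedent), frozenset(consequent)

rules = list(candidate_rules({'milk', 'bread', 'butter'}))
for lhs, rhs in rules:
    print(sorted(lhs), '->', sorted(rhs))
```

A 3-item itemset yields 6 candidate rules; each would then be scored with the metrics above and filtered by a threshold.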

---

### 8. **Metric Summary**

| Metric | Meaning | Range | Interpretation |
|---------|----------|--------|--------|
| **Support** | Fraction of transactions containing both A and B | 0–1 | Higher = more common |
| **Confidence** | Probability of B given A | 0–1 | Higher = more reliable rule |
| **Lift** | Confidence relative to B's baseline support | 0–∞ | > 1 = positive association; 1 = independent |
| **Leverage** | Observed minus expected co-occurrence | −0.25 to 0.25 | > 0 = positive association; 0 = independent |
| **Conviction** | Directional strength of A → B | 0–∞ | > 1 = stronger implication; 1 = independent |

---

***🧩 Order of Implementation***

1. Import libraries  
2. Load dataset  
3. Preprocess data (convert to transactional format)  
4. Encode transactions  
5. Apply **Apriori Algorithm** (find frequent itemsets)  
6. Generate association rules  
7. Analyze and filter rules  
8. Visualize results (optional)

## 🧪 Implementation Example (Apriori Algorithm)

```python
# Step 1: Import libraries
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Step 2: Sample dataset (list of transactions, one inner list per basket)
dataset = [
    ['milk', 'bread', 'butter'],
    ['beer', 'bread'],
    ['milk', 'bread', 'beer', 'butter'],
    ['bread', 'butter'],
    ['milk', 'bread']
]

# Step 3: Encode transactions as a one-hot boolean table
# (one row per transaction, one column per item)
te = TransactionEncoder()
te_data = te.fit(dataset).transform(dataset)
df = pd.DataFrame(te_data, columns=te.columns_)

# Step 4: Find frequent itemsets using Apriori
# (keep itemsets that appear in at least 50% of transactions)
frequent_itemsets = apriori(df, min_support=0.5, use_colnames=True)

# Step 5: Generate rules from frequent itemsets
# (keep rules whose lift is at least 1.0)
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

# Step 6: Display top rules
rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']].head()
```
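
To show what the frequent-itemset step does conceptually, here is a minimal pure-Python sketch of the level-wise Apriori idea: find frequent single items, join them into larger candidates, prune any candidate with an infrequent subset, and repeat. It is an illustration under simplifying assumptions, not how `mlxtend` implements `apriori`; the function name `apriori_sketch` is hypothetical.

```python
from itertools import combinations

transactions = [
    {'milk', 'bread', 'butter'},
    {'beer', 'bread'},
    {'milk', 'bread', 'beer', 'butter'},
    {'bread', 'butter'},
    {'milk', 'bread'},
]

def apriori_sketch(transactions, min_support):
    """Return {itemset: support} for every itemset meeting `min_support`."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

frequent = apriori_sketch(transactions, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

With `min_support=0.5`, `{milk, butter}` (2 of 5 baskets) is dropped at level 2, so `{milk, bread, butter}` is pruned without ever being counted. That pruning is what keeps the search manageable on larger datasets.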