<!-- ![Alt Text](https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/header.png) -->

<div style="display: flex; justify-content: flex-start; align-items: center;">
    <a href="https://colab.research.google.com/github/msfasha/307304-Data-Mining/blob/main/20251/Module%205-Associatin%20Rules%20Mining/association_rules_mining.ipynb" target="_blank">
        <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" style="height: 25px; margin-right: 20px;">
    </a>
</div>

## Introduction to Association Rule Mining – Market Basket Analysis (MBA)

Association rule mining is a data mining technique used to uncover **if–then relationships** between items in large datasets, such as retail transaction logs. These relationships, known as **association rules**, capture patterns of items that frequently occur together (called **co-occurrences**).

In the context of **market basket analysis**, association rules help answer questions like: *"Which products are often bought together?"* For example, suppose we find that 75% of customers who buy cereal also buy milk. We can express this as the rule:

$\{\mathrm{cereal}\} \Rightarrow \{\mathrm{milk}\}$

This rule suggests that customers who purchase cereal often purchase milk as well. Such insights can guide **marketing and retail decisions**, including promotional strategies, product bundling, and shelf placement.


<div style="text-align: center;">
    <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/mba.png" alt="Association Rules Mining" width="600"/>
</div>

<div style="text-align: center;">
    <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/mall_bill.png" alt="Mall Bill" width="600"/>
</div>

<div style="display: flex; justify-content: flex-start; align-items: center;">
    <a href="https://www.youtube.com/watch?v=guVvtZ7ZClw" target="_blank">
        <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/youtube.png" alt="KMeans Youtube Video" style="height: 40px;">
    </a>
    <a href="https://uopstdedu-my.sharepoint.com/:f:/g/personal/mohammed_fasha_uop_edu_jo/EkNUft0LJ_JJg5SdPETsj90BXHoevd0DlMBwDY6Or8YfCw?e=L73iKu" target="_blank">
        <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/video_icon.png" alt="Recorded Video Lecture" style="height: 100px;margin-left:20px;">
    </a>
</div>

### Association Rule Use Cases and Domains

Association rule mining can be applied in various domains beyond market basket analysis. Here are some examples:

 1. **Healthcare**
   - **Disease Diagnosis:** Identifying associations between symptoms and diseases. For example, a rule like *{fever, cough} → {flu}* can help predict diseases.
   - **Drug Interactions:** Discovering relationships between medications that are frequently prescribed together or identifying combinations that lead to adverse reactions.

 2. **Web Usage Mining**
   - **Website Optimization:** Analyzing user navigation patterns to determine common paths or clicks, e.g., *{homepage → product page} → {checkout}*.
   - **Recommendation Systems:** Suggesting content or products based on frequently co-accessed items, e.g., *{clicked 'smartphone'} → {clicked 'smartphone accessories'}*.

 3. **Education**
   - **Student Behavior Analysis:** Discovering patterns in course enrollment, such as *{math101, cs101} → {stat101}*.
   - **Learning Paths:** Identifying sequences of topics that students study, helping design better curricula.

 4. **Telecommunications**
   - **Customer Churn Analysis:** Detecting combinations of usage patterns that are associated with customers leaving the service, e.g., *{low data usage, few calls} → {churn}*.
   - **Service Bundling:** Identifying services that are commonly purchased together, like *{broadband, mobile} → {TV subscription}*.

 5. **Banking and Finance**
   - **Fraud Detection:** Uncovering patterns associated with fraudulent transactions, e.g., *{high transaction frequency, odd hours} → {fraud}*.
   - **Loan Approvals:** Identifying attributes of successful loan applications, such as *{high income, good credit score} → {loan approved}*.

 6. **Manufacturing**
   - **Fault Detection:** Identifying combinations of machine conditions that frequently result in faults, e.g., *{high temperature, low pressure} → {equipment failure}*.
   - **Supply Chain Optimization:** Discovering patterns in material usage, e.g., *{material A, material B} → {product C}*.

 7. **Retail Beyond Market Basket**
   - **Shelf Placement:** Finding products that are often bought together to optimize store layout.
   - **Customer Segmentation:** Identifying customer groups with similar purchasing behaviors, e.g., *{frequent discount purchases} → {low brand loyalty}*.

 8. **Social Media Analysis**
   - **Trending Topics:** Discovering associations between hashtags, e.g., *{#climatechange, #sustainability} → {#renewableenergy}*.
   - **User Behavior Patterns:** Understanding engagement behaviors, such as *{likes post, comments on post} → {shares post}*.

 9. **Energy Sector**
   - **Usage Patterns:** Identifying associations in energy usage, like *{high A/C usage, weekend} → {peak energy consumption}*.
   - **Smart Grid Analysis:** Detecting patterns for predictive maintenance, e.g., *{low voltage, high demand} → {power outage}*.

 10. **Transportation and Logistics**
   - **Traffic Analysis:** Discovering patterns in traffic conditions, e.g., *{morning rush hour, bad weather} → {traffic jam}*.
   - **Route Optimization:** Identifying frequently used delivery routes, such as *{route A, route B} → {delivered faster}*.

 11. **E-commerce**
   - **User Preferences:** Identifying patterns in user preferences for personalized recommendations.
   - **Cross-Selling:** Suggesting related products based on purchase history.

 12. **Sports Analytics**
   - **Performance Metrics:** Discovering combinations of player actions that lead to victories, e.g., *{high possession, accurate passes} → {win}*.
   - **Injury Prevention:** Identifying conditions that precede injuries, such as *{high training load, lack of rest} → {injury risk}*.

---

## The Apriori Algorithm

Apriori is a classic algorithm for **finding frequent itemsets** and **deriving association rules** in transactional databases (e.g., shopping baskets, web logs). It is based on the idea that **every subset of a frequent itemset must also be frequent**.

**Key points:**

* Proposed by **Agrawal & Srikant (1994)**.
* Works on **transaction data** (e.g., sets of items bought together).
* First finds **frequent single items**, then **grows** them into larger itemsets as long as they remain frequent.
* Uses a **bottom-up, level-wise** search with **candidate generation** and testing.
* Stops when **no more frequent extensions** can be found.
* The resulting frequent itemsets are used to build **association rules** (e.g., for **market basket analysis**).
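
The level-wise search described in these points can be sketched in a few lines of plain Python. This is a minimal illustration rather than an optimized implementation; the function name `apriori_sketch` and the toy `baskets` are assumptions made for the example and do not come from the text.

```python
from itertools import combinations

def apriori_sketch(transactions, min_support):
    """Return {frozenset: support} for every frequent itemset (level-wise search)."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Test: keep only the candidates that meet the support threshold.
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent

# Hypothetical toy baskets, used only for illustration.
baskets = [
    {"cereal", "milk"},
    {"cereal", "milk", "bread"},
    {"milk", "bread"},
    {"cereal", "milk"},
]
result = apriori_sketch(baskets, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this toy run the candidate `{cereal, milk, bread}` is pruned before its support is counted, because its subset `{cereal, bread}` is not frequent.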


<div style="text-align: center;">
    <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/apriori-algorithm.jpg" alt="Apriori Algorithm" width="600"/>
</div>

---

### Creating Association Rules
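To create association rules, we use three metrics:

1. **Support**: identifies the frequently occurring itemsets from which the initial rules are built.
2. **Confidence**: refines the initial rules by measuring how reliably the consequent appears when the antecedent is present.
3. **Lift**: refines the rules further by accounting for the effect of popular items (e.g. bread in a bakery), which can make a rule look strong simply because one item is in most baskets.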

<div style="text-align: center;">
    <img src="https://raw.githubusercontent.com/msfasha/307304-Data-Mining/main/images/3_mba_metrics.png" alt="The 3 Metrics of MBA Bill" width="600"/>
</div>

#### **1. Support**: 
Support is a measure that represents the frequency or proportion of transactions in the dataset that contain a given item set or pattern. In other words, it measures how often a particular combination of items appears together in the dataset. Support is a **Probability Based** measure that has a value between 0 and 1.

$ \text{Support(X)} = \frac{\text{Number of transactions containing X}}{\text{Total transactions}} $
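
As a small sketch of this formula, the function below counts the transactions that contain every item of an itemset and divides by the total. The name `support` and the sample `baskets` are illustrative assumptions, not data from the text.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# Hypothetical baskets used only for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]

print(support({"bread"}, baskets))            # bread is in 3 of 4 baskets
print(support({"bread", "butter"}, baskets))  # the pair is in 2 of 4 baskets
```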

  **Valid Values for Support:**
  - **Range**: The value of support ranges from **0% to 100%**:
  - **0%**: The itemset does not appear in any transaction.
  - **100%**: The itemset appears in all transactions.

  **Interpreting Support:**
  - **High support**: The itemset is common and may represent a strong association.
  - **Low support**: The itemset is rare and might not provide actionable insights.

**Thresholds in Practice**
  - A minimum support threshold is specified to filter out infrequent itemsets, so that only itemsets appearing often enough are considered for further analysis.
  - The right minimum values for **support**, **confidence**, and **lift** depend on the specific dataset and the business context.
  - Setting a very high support threshold can exclude less frequent but potentially interesting itemsets.
  - Setting a very low support threshold admits rules that represent rare occurrences, which might not be meaningful or actionable.
  - **Typical starting points**:
    - **5-10%**: Frequently used in retail datasets, where any given combination of products appears in only a small fraction of baskets.
    - **20-30%**: For smaller datasets or when focusing on very popular combinations.

#### Example

Let's assume that we have a small dataset with 12 transactions:

| **Transaction ID** | **Item 1** | **Item 2** | **Item 3** | **Item 4** | **Item 5** |
|---------------------|------------|------------|------------|------------|------------|
| 1                   | Milk       | Egg        | Bread      | Butter     |            |
| 2                   | Milk       | Butter     | Egg        | Ketchup    |            |
| 3                   | Bread      | Butter     | Ketchup    |            |            |
| 4                   | Milk       | Bread      | Butter     |            |            |
| 5                   | Bread      | Butter     | Cookies    |            |            |
| 6                   | Milk       | Bread      | Butter     | Cookies    |            |
| 7                   | Milk       | Cookies    |            |            |            |
| 8                   | Milk       | Bread      | Butter     |            |            |
| 9                   | Bread      | Butter     | Egg        | Cookies    |            |
| 10                  | Milk       | Butter     | Bread      |            |            |
| 11                  | Milk       | Bread      | Butter     |            |            |
| 12                  | Milk       | Bread      | Cookies    | Ketchup    |            |

First, we set the **Minimum Support** to 50%, so that we keep only itemsets present in at least half of the transactions. In real-life situations with large datasets, this value is more commonly set to 5% or 10%.

 **Step 1: Dataset Transformation**
Convert the dataset into a binary format:

| Transaction | Milk | Egg | Bread | Butter | Ketchup | Cookies |
|-------------|------|-----|-------|--------|---------|---------|
| 1           | 1    | 1   | 1     | 1      | 0       | 0       |
| 2           | 1    | 1   | 0     | 1      | 1       | 0       |
| 3           | 0    | 0   | 1     | 1      | 1       | 0       |
| 4           | 1    | 0   | 1     | 1      | 0       | 0       |
| 5           | 0    | 0   | 1     | 1      | 0       | 1       |
| 6           | 1    | 0   | 1     | 1      | 0       | 1       |
| 7           | 1    | 0   | 0     | 0      | 0       | 1       |
| 8           | 1    | 0   | 1     | 1      | 0       | 0       |
| 9           | 0    | 1   | 1     | 1      | 0       | 1       |
| 10          | 1    | 0   | 1     | 1      | 0       | 0       |
| 11          | 1    | 0   | 1     | 1      | 0       | 0       |
| 12          | 1    | 0   | 1     | 0      | 1       | 1       |
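
The transformation can be sketched in plain Python as follows. The variable names `transactions`, `items`, and `binary_rows` are illustrative; the data is the 12-transaction example from the table above.

```python
# The 12 example transactions, one set of items per transaction.
transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]

# Column order matches the binary table.
items = ["Milk", "Egg", "Bread", "Butter", "Ketchup", "Cookies"]

# One row per transaction, one 0/1 flag per item.
binary_rows = [[int(item in t) for item in items] for t in transactions]

for tid, row in enumerate(binary_rows, start=1):
    print(tid, row)
```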

**Step 2: Identify Frequent 1-Itemsets**  
The algorithm starts by calculating the support for each single item.

| Item     | Support Count | Support (%) | Frequent? |
|----------|---------------|-------------|-----------|
| Milk     | 9             | 75%         | Yes       |
| Egg      | 3             | 25%         | No        |
| Bread    | 10            | 83%         | Yes       |
| Butter   | 10            | 83%         | Yes       |
| Ketchup  | 3             | 25%         | No        |
| Cookies  | 5             | 42%         | No        |

**Frequent 1-itemsets**: `{Milk}`, `{Bread}`, `{Butter}`.
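
The support counts in Step 2 can be checked with a short sketch using `collections.Counter`. The variable names are illustrative; the data is the same 12 transactions.

```python
from collections import Counter

transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]
min_support = 0.5
n = len(transactions)

# Count how many transactions contain each single item.
counts = Counter(item for t in transactions for item in t)

# Keep the items whose support meets the threshold.
frequent_1 = {item: c / n for item, c in counts.items() if c / n >= min_support}

for item, c in counts.most_common():
    print(f"{item:8s} count={c:2d} support={c / n:.0%} frequent={item in frequent_1}")
```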

**Step 3: Generate 2-Itemsets**  
Now, combine frequent 1-itemsets into 2-itemsets and calculate their support.

| Itemset           | Support Count | Support (%) | Frequent? |
|--------------------|---------------|-------------|-----------|
| {Milk, Bread}     | 7             | 58%         | Yes       |
| {Milk, Butter}    | 7             | 58%         | Yes       |
| {Bread, Butter}   | 9             | 75%         | Yes       |

**Frequent 2-itemsets**: `{Milk, Bread}`, `{Milk, Butter}`, `{Bread, Butter}`.

**Step 4: Generate 3-Itemsets**  
Combine frequent 2-itemsets into 3-itemsets and calculate their support.

| Itemset                   | Support Count | Support (%) | Frequent? |
|----------------------------|---------------|-------------|-----------|
| {Milk, Bread, Butter}     | 6             | 50%         | Yes       |

**Frequent 3-itemset**: `{Milk, Bread, Butter}`.

**Step 5: No Larger Itemsets Can Be Created**  
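Only three items are frequent, so no candidate 4-itemset can be formed: any 4-itemset would have to include an infrequent item, and its support would therefore be below 50%. We stop here.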
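
Steps 3 to 5 can be reproduced with a short sketch. As a simplification, it forms candidates by combining the frequent single items directly, which yields the same candidates as the Apriori join here because only three items are frequent. The helper `count` and the dictionary `levels` are illustrative names.

```python
from itertools import combinations

transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]
min_count = 6  # 50% of 12 transactions
frequent_items = ["Bread", "Butter", "Milk"]  # result of Step 2

def count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

levels = {}
for k in (2, 3, 4):
    # With only three frequent items, there are no candidates of size 4.
    candidates = list(combinations(frequent_items, k))
    levels[k] = {c: count(c) for c in candidates if count(c) >= min_count}
    print(k, levels[k])
```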

 **Step 6: Generate Initial Association Rules**

Now we use the frequent itemsets to generate association rules. We focus on the largest itemset; rules can be created from the smaller itemsets in the same way if needed.

Rules from `{Milk, Bread, Butter}`:

1. **Rule 1**: $ \text{Milk, Bread} \rightarrow \text{Butter} $

   $$ \text{Support} = P(\text{Milk, Bread, Butter}) = 6/12 = 50\% $$

2. **Rule 2**: $ \text{Milk, Butter} \rightarrow \text{Bread} $

   $$ \text{Support} = P(\text{Milk, Bread, Butter}) = 6/12 = 50\% $$

3. **Rule 3**: $ \text{Bread, Butter} \rightarrow \text{Milk} $

   $$ \text{Support} = P(\text{Milk, Bread, Butter}) = 6/12 = 50\% $$
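
As a sketch of how candidate rules are enumerated, the snippet below splits the frequent 3-itemset into every antecedent and consequent pair. It produces six rules in total; the three listed above are the ones with a single-item consequent. All rules share the support of the full itemset. Variable names are illustrative.

```python
from itertools import combinations

itemset = ("Milk", "Bread", "Butter")
rule_support = 6 / 12  # support of the full itemset, shared by every rule below

rules = []
for size in range(1, len(itemset)):
    # Every non-empty proper subset can serve as an antecedent.
    for antecedent in combinations(itemset, size):
        consequent = tuple(i for i in itemset if i not in antecedent)
        rules.append((antecedent, consequent))

for antecedent, consequent in rules:
    print(f"{', '.join(antecedent)} -> {', '.join(consequent)}  support={rule_support:.0%}")
```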

**The Problem with Support**

Support is useful, but it has a limitation: it is **symmetric**.  

$\text{Support}(\text{milk} \cap \text{diapers})$ is exactly the same whether we write the rule as
$(\text{Milk} \rightarrow \text{Diapers})$ or $(\text{Diapers} \rightarrow \text{Milk})$. It only tells us **how often the two items appear together**, not **in which direction the rule should go**. 

Suppose a store has 1,000 transactions, and milk and diapers appear together in 100 of them, so

$$
\text{support}(\text{milk} \cap \text{diapers}) = 100/1000 = 10\%.
$$

That 10% tells us the pair is reasonably common, but it does **not** answer a key business question: *Should we recommend milk to customers who buy diapers, or recommend diapers to customers who buy milk, or both?*

Support alone cannot distinguish between these options because it treats the pair $\{\text{milk}, \text{diapers}\}$ the same in both directions.

This is why we need **confidence**, which is directional and based on **conditional probability**.  

From the raw data: 500 customers bought milk, 150 bought diapers, and 100 bought both. If we look at the rule $\text{Diapers} \rightarrow \text{Milk}$, the confidence is:

$$
\text{conf}(\text{Diapers} \rightarrow \text{Milk}) = P(\text{Milk} \mid \text{Diapers}) = 100/150 \approx 67\%,
$$

meaning about 67% of diaper-buyers also buy milk. For the opposite rule $\text{Milk} \rightarrow \text{Diapers}$, the confidence is:

$$
\text{conf}(\text{Milk} \rightarrow \text{Diapers}) = P(\text{Diapers} \mid \text{Milk}) = 100/500 = 20\%,
$$

so only 20% of milk-buyers also buy diapers.

Now we can clearly see that **Diapers → Milk** is the more reliable rule. In short: support tells us that the pair is common, but because it is **symmetric**, it cannot guide us on *direction*; confidence is needed to decide which way the rule should go.
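
The arithmetic of this example can be checked with a few lines of Python. The counts are the ones given in the text; the variable names are illustrative.

```python
total = 1000
n_milk, n_diapers, n_both = 500, 150, 100

# Support is the same for both directions of the rule.
support_pair = n_both / total

# Confidence depends on which item is the antecedent.
conf_diapers_to_milk = n_both / n_diapers
conf_milk_to_diapers = n_both / n_milk

print(f"support(milk, diapers)  = {support_pair:.0%}")
print(f"conf(Diapers -> Milk)   = {conf_diapers_to_milk:.1%}")
print(f"conf(Milk -> Diapers)   = {conf_milk_to_diapers:.1%}")
```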


**More examples of the asymmetry of association rules:**

| Domain             | Rule A → B (Strong)                                  | Rule B → A (Weak or Misleading)                      | Reason for Asymmetry                                        |
| ------------------ | ---------------------------------------------------- | ---------------------------------------------------- | ----------------------------------------------------------- |
| Retail             | Buys Smartphone → Buys Phone Case                    | Buys Phone Case → Buys Smartphone                    | Cases are often bought later or as replacements             |
| Web Analytics      | Visits Product Page → Adds to Cart                   | Adds to Cart → Visits Product Page                   | Visit is required, but not all visits lead to cart          |
| Healthcare         | Diagnosed with Diabetes → Prescribed Insulin         | Prescribed Insulin → Diagnosed with Diabetes         | Insulin used for other conditions too                       |
| Fraud Detection    | Unusual Login → Account Lock                         | Account Lock → Unusual Login                         | Locks triggered by various issues                           |
| E-Learning         | Watches Lecture Video → Submits Homework             | Submits Homework → Watches Lecture Video             | Some students skip the video                                |
| Streaming Services | Subscribes to Premium Plan → Watches Exclusive Shows | Watches Exclusive Shows → Subscribes to Premium Plan | Content may be shared or watched via someone else's account |
| Finance            | Misses Loan Payment → Credit Score Drops             | Credit Score Drops → Misses Loan Payment             | Scores drop for many other financial behaviors              |
| HR / Workforce     | Attends Training → Improved Job Performance          | Improved Job Performance → Attended Training         | Performance may improve for unrelated reasons               |


#### **2. Confidence**:
To decide which direction of a rule is more reliable, we examine the confidence of the rule. For example, a frequent itemset with two items (Milk, Bread) yields two candidate rules:
$$ \text{Milk} \rightarrow \text{Bread} $$
$$ \text{Bread} \rightarrow \text{Milk} $$

For each candidate, confidence tells us how likely the consequent is to occur when the antecedent occurs.

**Confidence** is the **conditional probability** of the consequent given the antecedent; in other words, it measures the reliability of the rule. Higher confidence means that when the antecedent occurs, the consequent is more likely to occur as well.

$$ \text{Confidence}(X \rightarrow Y) = \frac{\text{Support}(X \cap Y)}{\text{Support}(X)} $$

**Valid Values for Confidence**
- **Range**: Confidence values range from **0 to 1**:
  - **0**: The rule $ X \rightarrow Y $ never holds true.
  - **1**: The rule $ X \rightarrow Y $ always holds true (perfect confidence).

**Interpreting Confidence**
1. **High Confidence ($ \approx 1 $)**:
   - Indicates that $ Y $ almost always occurs when $ X $ occurs.
   - Example: If $ X = \{\text{bread}\} $ and $ Y = \{\text{butter}\} $, a high confidence implies that customers buying bread are very likely to buy butter.

2. **Low Confidence ($ \approx 0 $)**:
   - Indicates that $ Y $ rarely occurs when $ X $ occurs.
   - Example: If $ X = \{\text{bread}\} $ and $ Y = \{\text{cereal}\} $, a low confidence implies that buying bread is not strongly associated with buying cereal.

**Thresholds in Practice**

The minimum values for **confidence** depend on the specific dataset and the business context, but here are some common guidelines and starting points:
- **Typical Threshold**: 
  - **70-80%**: Commonly used as a starting point for reliable rules.
  - Adjust based on your use case:
    - Lower threshold (50-60%) for exploratory insights.
    - Higher threshold (90%+) for high-accuracy recommendations.
- **Reason**: Rules with low confidence may not reliably predict the consequent item, making them less actionable.

**Example**:  
A rule with 30% confidence means that only 30% of the time, the consequent is bought when the antecedent is bought. This may not justify a marketing strategy.


**Step 7: Validate the Rules using Confidence**

First, let us set the **Minimum Confidence** to 70%.

Rules from `{Milk, Bread, Butter}`:

1. **Rule 1**: $ \text{Milk, Bread} \rightarrow \text{Butter} $

   $$ \text{Confidence} = P(\text{Butter} \mid \text{Milk, Bread}) = \frac{P(\text{Milk, Bread, Butter})}{P(\text{Milk, Bread})} = \frac{6}{7} \approx 86\% $$

2. **Rule 2**: $ \text{Milk, Butter} \rightarrow \text{Bread} $

   $$ \text{Confidence} = P(\text{Bread} \mid \text{Milk, Butter}) = \frac{P(\text{Milk, Bread, Butter})}{P(\text{Milk, Butter})} = \frac{6}{7} \approx 86\% $$

3. **Rule 3**: $ \text{Bread, Butter} \rightarrow \text{Milk} $

   $$ \text{Confidence} = P(\text{Milk} \mid \text{Bread, Butter}) = \frac{P(\text{Milk, Bread, Butter})}{P(\text{Bread, Butter})} = \frac{6}{9} \approx 67\% $$

Rules 1 and 2 exceed the 70% threshold; Rule 3 falls below it.
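
The confidence values in Step 7 can be checked with a short sketch over the 12 example transactions. The helpers `count` and `confidence` are illustrative names.

```python
transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]

def count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent):
    """Share of antecedent-transactions that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)

min_confidence = 0.7
rules = [
    ({"Milk", "Bread"}, {"Butter"}),
    ({"Milk", "Butter"}, {"Bread"}),
    ({"Bread", "Butter"}, {"Milk"}),
]

confidences = [confidence(a, c) for a, c in rules]
passed = [conf >= min_confidence for conf in confidences]

for (a, c), conf, ok in zip(rules, confidences, passed):
    print(f"{sorted(a)} -> {sorted(c)}  confidence={conf:.3f}  passes={ok}")
```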

#### **3. Lift**:

**Confidence alone can be misleading**, especially when the consequent $Y$ is very frequent in the dataset. In that case, a rule $X \Rightarrow Y$ may have high confidence simply because $Y$ is common, not because there is a strong dependency between $X$ and $Y$.

**Example (bakery: cake and bread)**  
Suppose we analyze orders in a bakery and find:

* 80% of all orders contain **bread**.
* 20% of all orders contain **cake**.
* 16% of all orders contain **both cake and bread**.

Then the confidence of the rule
$\text{cake} \Rightarrow \text{bread}$ is:  
$\text{conf}(\text{cake} \Rightarrow \text{bread})
  = \frac{P(\text{cake} \cap \text{bread})}{P(\text{cake})}
  = \frac{0.16}{0.20}
  = 0.8.$

A confidence of **80%** sounds strong, but notice that **bread is already in 80% of all orders**. So customers who buy cake are **not more likely** to buy bread than an average customer; bread is just very common in the bakery.

To address this limitation, we use additional measures such as **lift** alongside confidence.  

**Lift** compares $P(Y \mid X)$ to the overall frequency $P(Y)$:  

$
\text{Lift}(X \Rightarrow Y)
= \frac{\text{Confidence}(X \Rightarrow Y)}{\text{Support}(Y)}
= \frac{P(Y \mid X)}{P(Y)}.
$

In other words, lift measures how likely $Y$ is when $X$ happens, compared to how likely $Y$ is in general.

Equivalently, in terms of supports:  

$
\text{Lift}(X \Rightarrow Y)
= \frac{\text{Support}(X \cap Y)}{\text{Support}(X) \times \text{Support}(Y)}.
$

In the example:  

$
\text{lift}(\text{cake} \Rightarrow \text{bread})
= \frac{P(\text{bread} \mid \text{cake})}{P(\text{bread})}
= \frac{0.8}{0.8}
= 1,
$

which indicates **no real association** beyond what we'd expect by chance.
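
The bakery numbers can be verified with a few lines of Python. The probabilities come from the example; the variable names are illustrative, and results are rounded for display because floating-point division is not exact.

```python
p_bread = 0.80  # share of orders containing bread
p_cake = 0.20   # share of orders containing cake
p_both = 0.16   # share of orders containing both

# Confidence of cake => bread is P(bread | cake).
confidence = p_both / p_cake

# Lift compares that conditional probability with the baseline P(bread).
lift = confidence / p_bread

print(f"confidence = {confidence:.2f}")
print(f"lift       = {lift:.2f}")
```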

So lift tells us whether items co-occur more (or less) frequently than chance, making it a useful complement to confidence.


**Interpreting Lift**

Lift measures how strongly two items or itemsets are associated:

* $(\text{lift} > 1)$: items occur together **more often than expected** (positive association).
* $(\text{lift} = 1)$: items occur together **as often as expected** under independence (no association).
* $(\text{lift} < 1)$: items occur together **less often than expected** (negative association).


---

**Step 8: Validate Rules using the Lift**

We treat rules with **Lift > 1** as actionable.

1. **Rule 1**: $ \text{Milk, Bread} \rightarrow \text{Butter} $

   $$ \text{Lift} = \frac{P(\text{Butter} \mid \text{Milk, Bread})}{P(\text{Butter})} = \frac{6/7}{10/12} \approx 1.03 $$

2. **Rule 2**: $ \text{Milk, Butter} \rightarrow \text{Bread} $

   $$ \text{Lift} = \frac{P(\text{Bread} \mid \text{Milk, Butter})}{P(\text{Bread})} = \frac{6/7}{10/12} \approx 1.03 $$

3. **Rule 3**: $ \text{Bread, Butter} \rightarrow \text{Milk} $

   $$ \text{Lift} = \frac{P(\text{Milk} \mid \text{Bread, Butter})}{P(\text{Milk})} = \frac{6/9}{9/12} \approx 0.89 $$

Rules 1 and 2 have a lift slightly above 1. Rule 3 has a lift below 1, so it is not actionable.
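
The lift values in Step 8 can be checked directly from transaction counts, which avoids rounding the intermediate confidence. The helpers `count` and `lift` are illustrative names.

```python
transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]
n = len(transactions)

def count(itemset):
    """Number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def lift(antecedent, consequent):
    """Support(X and Y) / (Support(X) * Support(Y)), computed from counts."""
    both = count(antecedent | consequent)
    return n * both / (count(antecedent) * count(consequent))

rules = [
    ({"Milk", "Bread"}, {"Butter"}),
    ({"Milk", "Butter"}, {"Bread"}),
    ({"Bread", "Butter"}, {"Milk"}),
]
lifts = [lift(a, c) for a, c in rules]

for (a, c), value in zip(rules, lifts):
    print(f"{sorted(a)} -> {sorted(c)}  lift={value:.3f}  actionable={value > 1}")
```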

### **Final Results**

Frequent Itemsets:
- **1-itemsets**: `{Milk}`, `{Bread}`, `{Butter}`
- **2-itemsets**: `{Milk, Bread}`, `{Milk, Butter}`, `{Bread, Butter}`
- **3-itemset**: `{Milk, Bread, Butter}`

Rules:

| Rule                           | Support | Confidence | Lift  | Actionable? |
|--------------------------------|---------|------------|-------|-------------|
| $ \text{Milk, Bread} \rightarrow \text{Butter} $ | 50%     | 86%      | 1.03  | Yes         |
| $ \text{Milk, Butter} \rightarrow \text{Bread} $ | 50%     | 86%      | 1.03  | Yes         |
| $ \text{Bread, Butter} \rightarrow \text{Milk} $ | 50%     | 67%      | 0.89  | No          |

**Interpretation and Actionable Insights**

| Rule                           | Explanation |
|--------------------------------|--------------|
| $ \text{Milk, Bread} \rightarrow \text{Butter} $ | Customers buying Milk and Bread are highly likely (86%) to also buy Butter. Consider bundling these items. |
| $ \text{Milk, Butter} \rightarrow \text{Bread} $ | High confidence (86%); placing these items together could increase sales. The lift is only slightly above 1, so the association is modest. |
| $ \text{Bread, Butter} \rightarrow \text{Milk} $ | Confidence (67%) is below the 70% threshold and lift is below 1, so customers buying Bread and Butter are slightly less likely than average to buy Milk. Not actionable. |




---

### **Python Example**

We will implement the **Apriori algorithm** using the **`mlxtend`** library for frequent itemset mining and rule generation.

**a. Install Required Libraries**
Make sure you have the required libraries installed:
```bash
pip install pandas mlxtend
```

**b. Load the Dataset**
We will represent the dataset as a **binary transaction matrix** where each row is a transaction, and each column is an item (1 = purchased, 0 = not purchased).

In [1]:
import pandas as pd

# Define the dataset
data = {
    "Milk":    [1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "Egg":     [1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    "Bread":   [1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
    "Butter":  [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
    "Ketchup": [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "Cookies": [0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1],
}

# Create a DataFrame
df = pd.DataFrame(data)
print("Transaction Dataset:")
df

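Transaction Dataset:

|    | Milk | Egg | Bread | Butter | Ketchup | Cookies |
|----|------|-----|-------|--------|---------|---------|
| 0  | 1    | 1   | 1     | 1      | 0       | 0       |
| 1  | 1    | 1   | 0     | 1      | 1       | 0       |
| 2  | 0    | 0   | 1     | 1      | 1       | 0       |
| 3  | 1    | 0   | 1     | 1      | 0       | 0       |
| 4  | 0    | 0   | 1     | 1      | 0       | 1       |
| 5  | 1    | 0   | 1     | 1      | 0       | 1       |
| 6  | 1    | 0   | 0     | 0      | 0       | 1       |
| 7  | 1    | 0   | 1     | 1      | 0       | 0       |
| 8  | 0    | 1   | 1     | 1      | 0       | 1       |
| 9  | 1    | 0   | 1     | 1      | 0       | 0       |
| 10 | 1    | 0   | 1     | 1      | 0       | 0       |
| 11 | 1    | 0   | 1     | 0      | 1       | 1       |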


**c. Apply the Apriori Algorithm**
We will use the `mlxtend` library to find frequent itemsets and generate association rules.

In [3]:
! pip install mlxtend

Collecting mlxtend
  Downloading mlxtend-0.23.4-py3-none-any.whl.metadata (7.3 kB)
Downloading mlxtend-0.23.4-py3-none-any.whl (1.4 MB)
Installing collected packages: mlxtend
Successfully installed mlxtend-0.23.4


In [4]:
from mlxtend.frequent_patterns import apriori, association_rules

# Step 1: Generate frequent itemsets with Apriori
frequent_itemsets = apriori(df, min_support=0.5, use_colnames=True)

# Display frequent itemsets
print("\nFrequent Itemsets:")
print(frequent_itemsets)


Frequent Itemsets:
    support               itemsets
0  0.750000                 (Milk)
1  0.833333                (Bread)
2  0.833333               (Butter)
3  0.583333          (Bread, Milk)
4  0.583333         (Milk, Butter)
5  0.750000        (Bread, Butter)
6  0.500000  (Bread, Milk, Butter)




**d. Generate Association Rules**
We generate the rules that meet a minimum **confidence** of 0.7; `association_rules` also reports the **lift** of each rule.

In [None]:
# Step 2: Generate association rules based on confidence
rules = association_rules(frequent_itemsets, num_itemsets=len(df), metric="confidence", min_threshold=0.7)

# Display the rules
print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])


Association Rules:
      antecedents consequents   support  confidence      lift
0         (Bread)      (Milk)  0.583333    0.700000  0.933333
1          (Milk)     (Bread)  0.583333    0.777778  0.933333
2        (Butter)      (Milk)  0.583333    0.700000  0.933333
3          (Milk)    (Butter)  0.583333    0.777778  0.933333
4        (Butter)     (Bread)  0.750000    0.900000  1.080000
5         (Bread)    (Butter)  0.750000    0.900000  1.080000
6   (Milk, Bread)    (Butter)  0.500000    0.857143  1.028571
7  (Butter, Milk)     (Bread)  0.500000    0.857143  1.028571


We can also generate rules based on lift

In [None]:
# Step 2: Generate association rules based on lift
rules = association_rules(frequent_itemsets, num_itemsets=len(df), metric="lift", min_threshold=1.0)

# Display the rules
print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])


Association Rules:
      antecedents     consequents  support  confidence      lift
0        (Butter)         (Bread)     0.75    0.900000  1.080000
1         (Bread)        (Butter)     0.75    0.900000  1.080000
2   (Milk, Bread)        (Butter)     0.50    0.857143  1.028571
3  (Butter, Milk)         (Bread)     0.50    0.857143  1.028571
4         (Bread)  (Butter, Milk)     0.50    0.600000  1.028571
5        (Butter)   (Milk, Bread)     0.50    0.600000  1.028571


**e. Filter and Interpret Rules**
You can filter rules for high lift or specific itemsets.

In [None]:
# Filter rules with Lift > 1
filtered_rules = rules[rules['lift'] > 1]
print("\nFiltered Rules with Lift > 1:")
print(filtered_rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])


Filtered Rules with Lift > 1:
      antecedents     consequents  support  confidence      lift
0        (Butter)         (Bread)     0.75    0.900000  1.080000
1         (Bread)        (Butter)     0.75    0.900000  1.080000
2   (Milk, Bread)        (Butter)     0.50    0.857143  1.028571
3  (Butter, Milk)         (Bread)     0.50    0.857143  1.028571
4         (Bread)  (Butter, Milk)     0.50    0.600000  1.028571
5        (Butter)   (Milk, Bread)     0.50    0.600000  1.028571


**Interpretation**
1. $ \text{Milk, Bread} \rightarrow \text{Butter} $
   - Support = 50%, Confidence = 85.7%, Lift = 1.03.
   - Suggest bundling these items for promotions, keeping in mind that the lift is only slightly above 1.
2. $ \text{Milk, Butter} \rightarrow \text{Bread} $
   - Same metrics as above (Support = 50%, Confidence = 85.7%, Lift = 1.03).
3. $ \text{Bread, Butter} \rightarrow \text{Milk} $
   - Does not appear in the output: its confidence (6/9 ≈ 66.7%) is below the 0.7 threshold and its lift (≈ 0.89) is below 1, so it is not actionable.

---

#### Two-Item Rules

If desired, we can focus on association rules where the antecedent (X) contains just **one item** (i.e., 1-item antecedent, 2 items in the rule overall).

Consider the rule:

$
\text{Milk} \rightarrow \text{Bread}
$

Suppose we have **12 transactions**, and we observe:

* **Milk** appears in **9** transactions.
* **Milk and Bread together** appear in **7** transactions.

Then:

* **Support** of the rule is the fraction of all transactions that contain both Milk and Bread:
  $
  \text{supp}(\text{Milk} \rightarrow \text{Bread}) = \frac{7}{12} \approx 58\%.
  $

* **Confidence** of the rule is the fraction of Milk-transactions that also contain Bread:
  $
  \text{conf}(\text{Milk} \rightarrow \text{Bread}) = \frac{7}{9} \approx 78\%.
  $

So we can write the rule as:

$\text{Milk} \rightarrow \text{Bread}$ {S = 58%, C = 78%}.
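
The same calculation can be repeated for every two-item rule among the frequent items. The sketch below uses the 12 example transactions; the helper `count` and the dictionary `two_item_rules` are illustrative names.

```python
from itertools import permutations

transactions = [
    {"Milk", "Egg", "Bread", "Butter"},
    {"Milk", "Butter", "Egg", "Ketchup"},
    {"Bread", "Butter", "Ketchup"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Cookies"},
    {"Milk", "Bread", "Butter", "Cookies"},
    {"Milk", "Cookies"},
    {"Milk", "Bread", "Butter"},
    {"Bread", "Butter", "Egg", "Cookies"},
    {"Milk", "Butter", "Bread"},
    {"Milk", "Bread", "Butter"},
    {"Milk", "Bread", "Cookies", "Ketchup"},
]
n = len(transactions)

def count(*items):
    """Number of transactions containing all the given items."""
    return sum(1 for t in transactions if set(items) <= t)

two_item_rules = {}
# permutations() yields both directions of each pair, e.g. Milk -> Bread and Bread -> Milk.
for x, y in permutations(["Milk", "Bread", "Butter"], 2):
    support = count(x, y) / n
    confidence = count(x, y) / count(x)
    two_item_rules[(x, y)] = (support, confidence)
    print(f"{x} -> {y}  S = {support:.0%}, C = {confidence:.0%}")
```

Both directions of a pair share the same support but can differ in confidence, as with Milk → Bread and Bread → Milk.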

### 2-Item Rules vs 3-Item Rules

#### 2-Item Rules (e.g., $\text{Milk} \rightarrow \text{Bread}$)

* **Simplicity**: Easy to read, explain, and act on.
* **Higher support**: Fewer items need to co-occur, so a 2-item rule's support is never lower than that of a 3-item rule built from a superset of its items. Confidence is often higher too, but that is not guaranteed.
* **Broad applicability**: Capture general trends such as
  $
  \text{Milk} \rightarrow \text{Bread},
  $
  which can inform store layout (placing items nearby) or simple promotions.

#### 3-Item Rules (e.g., $\text{Milk, Bread} \rightarrow \text{Butter}$)

* **Deeper insights**: Reveal more specific patterns, like
  $
  \text{Milk, Bread} \rightarrow \text{Butter}
  $
  that might not be visible in 2-item rules.
* **Lower support (usually)**: All three items must appear together, so these combinations are naturally less frequent.
* **Targeted application**: Well-suited for niche marketing, personalized recommendations, or special bundle offers.


### Which Is Better?

* Use **2-item rules** when you want:

  * General trends,
  * Simple explanations,
  * Broad business strategies (e.g., product placement, popular combos).

* Use **3-item rules** when you want:

  * More specific patterns,
  * Finer-grained understanding of customer behavior,
  * Targeted campaigns or tailored bundles.

There is no universally "better" rule size; the choice depends on your **data characteristics** (how dense/sparse the transactions are) and your **business objectives** (broad policies vs. targeted actions).