
# 🧪 Class Exercise: Discovering Shopping Patterns with Association Rule Mining

## 🎯 Objective
You will explore a real-world transactional dataset using **association rule mining** and the **Apriori algorithm**. Your goal is to uncover relationships between items frequently bought together in a grocery store setting.

---

## 🧾 Dataset: `Market_Basket_Optimisation.csv`

- Each row = one shopping basket (transaction)  
- Each column = one item slot in the basket (the columns are positional; they are not named after items)  
- There are up to 20 items per transaction  
- Empty cells mean fewer items were purchased in that transaction  

---

## 📘 Key Concepts

Before you begin, make sure you understand these important terms:

| Term        | Definition |
|-------------|------------|
| **Support** | The proportion of transactions that contain a specific itemset. It tells you how frequent an item or item combination is. |
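| **Confidence** | The proportion of transactions containing **A** that also contain **B**: support(A and B) / support(A). It measures how reliable the rule **A → B** is. |
| **Lift** | The ratio of the observed support of **A** and **B** together to the support expected if they were independent: support(A and B) / (support(A) × support(B)). Lift > 1 means the items co-occur more often than chance, lift = 1 means no association, and lift < 1 means they co-occur less often than chance. |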

**Example**:  
If `mineral water → green tea` has:  
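- **Support = 0.01** → 1% of all transactions contain both items  
- **Confidence = 0.3** → 30% of transactions that contain mineral water also contain green tea  
- **Lift = 3** → a customer who buys mineral water is 3 times as likely to buy green tea as the average customer (this implies green tea appears in 10% of all transactions, since lift = confidence / support of green tea)  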
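
To make the three definitions concrete, here is a minimal sketch that computes them by hand. The five baskets are made up for illustration and are not taken from the dataset; the helper name `support` is likewise just a choice for this example.

```python
# Toy baskets, invented for illustration (not rows from the real dataset).
transactions = [
    ["mineral water", "green tea"],
    ["mineral water", "eggs"],
    ["green tea", "eggs"],
    ["mineral water", "green tea", "eggs"],
    ["eggs"],
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


a, b = "mineral water", "green tea"
support_ab = support([a, b], transactions)             # share of baskets with A and B
confidence = support_ab / support([a], transactions)   # share of A-baskets that also hold B
lift = confidence / support([b], transactions)         # confidence relative to B's base rate

print(f"support={support_ab:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
# support=0.40, confidence=0.67, lift=1.11
```

The lift here is only slightly above 1, so in this toy data mineral water barely raises the chance of seeing green tea.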

---

## 🧰 Requirements

Make sure you have these Python packages installed:

```bash
pip install pandas apyori matplotlib
```

---

## ✅ Tasks & Hints

### 🔹 Task 1: Load the Data
**Hint:** Use `pandas` to read the CSV file. Since there is no header row, remember to set `header=None`.
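
The effect of `header=None` can be seen on a tiny in-memory sample. This is only a sketch: the three rows below are made up, and `io.StringIO` stands in for the real file so the snippet runs anywhere.

```python
import io

import pandas as pd

# Three invented rows standing in for the real file. The empty trailing
# fields mimic baskets with fewer items than the widest one.
sample_csv = io.StringIO(
    "shrimp,almonds,avocado\n"
    "burgers,eggs,\n"
    "chutney,,\n"
)

# header=None keeps the first basket as data instead of using it as column names.
df = pd.read_csv(sample_csv, header=None)

print(df.shape)  # (3, 3)
print(df)
```

For the exercise itself, pass the path to `Market_Basket_Optimisation.csv` in place of `sample_csv`. Empty cells show up as `NaN` in the resulting DataFrame.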

---

### 🔹 Task 2: Convert the Dataset into Transactions
**Hint:** You'll need to loop through each row of the DataFrame and extract the non-empty items (filter out missing values).  
Each transaction should be a list of items.
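
One possible way to do this is sketched below on a small stand-in DataFrame (the rows are made up). It assumes missing cells are `NaN` or `None`, which is what `pandas` produces for empty CSV fields.

```python
import pandas as pd

# Small stand-in DataFrame; None marks the empty cells of shorter baskets.
df = pd.DataFrame(
    [
        ["shrimp", "almonds", "avocado"],
        ["burgers", "eggs", None],
        ["chutney", None, None],
    ]
)

# For each row, skip missing cells and keep the remaining items as strings.
transactions = [
    [str(item) for item in row if pd.notna(item)]
    for row in df.to_numpy()
]

print(transactions)
```

Converting every item with `str()` is a defensive choice for this sketch, so that each basket ends up as a list of plain strings.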

---

### 🔹 Task 3: Apply the Apriori Algorithm
**Hint:** Use the `apriori()` function from the `apyori` library. You’ll need to provide:
- The list of transactions
- Minimum support (e.g., 0.003)
- Minimum confidence (e.g., 0.2)
- Minimum lift (e.g., 3)
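- Set `max_length` to 2 so that itemsets of at most two items are considered. Note that `apyori` reads `max_length` but, as far as its source shows, silently ignores `min_length`; with a minimum lift above 1, single-item results are filtered out anyway because their lift is exactly 1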

`apriori()` returns a generator, so wrap the call in `list()` to materialize the results.

---

### 🔹 Task 4: Interpret the Rules
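**Hint:** Each result record contains:
- `.items`: the **items** involved in the rule
- `.support`: the **support** of the itemset
- `.ordered_statistics`: a list of directional rules, each with `items_base` (the "if" side), `items_add` (the "then" side), and the **confidence** and **lift** values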

Extract this information and store it in a structured format like a table or DataFrame.
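
A minimal sketch of the flattening step follows. To keep it runnable without calling `apriori()`, it uses `namedtuple` stand-ins that mirror the field names of `apyori`'s result records; the numbers are made up, and the helper name `rules_to_frame` and the column names are assumptions of this example.

```python
from collections import namedtuple

import pandas as pd

# Stand-ins mirroring the field names of apyori's result records.
OrderedStatistic = namedtuple(
    "OrderedStatistic", "items_base items_add confidence lift"
)
RelationRecord = namedtuple("RelationRecord", "items support ordered_statistics")

records = [
    RelationRecord(
        frozenset({"pasta", "sauce"}),
        0.33,
        [
            OrderedStatistic(frozenset({"pasta"}), frozenset({"sauce"}), 1.0, 3.0),
            OrderedStatistic(frozenset({"sauce"}), frozenset({"pasta"}), 1.0, 3.0),
        ],
    ),
]


def rules_to_frame(records):
    """Flatten each record into one row per directional rule (base -> add)."""
    rows = []
    for record in records:
        for stat in record.ordered_statistics:
            rows.append(
                {
                    "antecedent": ", ".join(sorted(stat.items_base)),
                    "consequent": ", ".join(sorted(stat.items_add)),
                    "support": record.support,
                    "confidence": stat.confidence,
                    "lift": stat.lift,
                }
            )
    return pd.DataFrame(rows)


rules_df = rules_to_frame(records)
print(rules_df)
```

The same function should work on the list returned by `list(apriori(...))`, since it only relies on those attribute names.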

---

### 🔹 Task 5: Analyze and Answer

Answer the following based on your results:

1. Pick a rule and explain what its **support**, **confidence**, and **lift** mean.
2. Filter the rules to find those with **lift > 4**. How many such rules are there?
3. Which rule has the **highest confidence**?
4. As a store manager, how could you use these insights for **product placement** or **promotions**?
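
Questions 2 and 3 come down to a filter and a lookup once the rules are in a DataFrame. The sketch below uses a made-up rule table; the column names are assumptions of this example, so adapt them to whatever you chose in Task 4.

```python
import pandas as pd

# Invented rule table; your real values will differ.
rules_df = pd.DataFrame(
    {
        "antecedent": ["pasta", "bread", "tea", "cereal"],
        "consequent": ["sauce", "butter", "honey", "milk"],
        "support": [0.010, 0.008, 0.004, 0.020],
        "confidence": [0.35, 0.42, 0.25, 0.30],
        "lift": [4.5, 3.2, 4.1, 3.6],
    }
)

# Boolean mask keeps only the rows whose lift exceeds 4.
strong = rules_df[rules_df["lift"] > 4]
print(len(strong))

# idxmax() returns the index label of the row with the largest confidence.
best = rules_df.loc[rules_df["confidence"].idxmax()]
print(best["antecedent"], "->", best["consequent"], best["confidence"])
```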

---

## ⭐ Bonus Challenge (Optional)

Create a **bar chart** showing the lift of the top 10 rules (or of all rules, if you found fewer than 10).  
**Hint:** Use `matplotlib.pyplot` and sort your rules by lift value.
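
One way to build such a chart is sketched below on a made-up rule table. The `rule` label column is an assumption of this example, and a horizontal bar chart is used only because long rule names are easier to read on the y-axis.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented rules; replace with your own table from Task 4.
rules_df = pd.DataFrame(
    {
        "rule": ["pasta -> sauce", "bread -> butter", "tea -> honey", "cereal -> milk"],
        "lift": [4.5, 3.2, 4.1, 3.6],
    }
)

# Take up to 10 rules with the highest lift, then sort ascending so that
# the largest bar ends up at the top of the horizontal chart.
top = rules_df.nlargest(10, "lift").sort_values("lift")

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(top["rule"], top["lift"])
ax.set_xlabel("Lift")
ax.set_title("Top rules by lift")
fig.tight_layout()
# In a script, call plt.show() to display the figure; notebooks render it automatically.
```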

---

## 📝 Deliverables

Submit the following:
- Your Python code (Jupyter notebook or `.py` file)
- A brief explanation (1-2 paragraphs) interpreting your results
- Answers to the written questions
