## What is a Decision Tree?

A **decision tree** is a supervised machine learning model used to make predictions.
It works like a **series of yes/no questions** about the input features (for example, "is x ≤ 2?"). Each question splits the data into smaller and smaller groups, until you reach a group that gives the answer (a prediction).

* For **regression**, the answer is a **number** (like a price, temperature, or score).
* For **classification**, the answer is a **category** (like "yes/no", "apple/orange").

It’s called a “tree” because the splits and branches look like an upside-down tree!
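
To make the "series of yes/no questions" concrete, here is a minimal sketch of a hand-written classification tree. The feature names (`skin_is_bumpy`, `weight_g`) and the 120 g threshold are invented for illustration; a real tree would learn its questions from data.

```python
def classify_fruit(skin_is_bumpy: bool, weight_g: float) -> str:
    """Toy two-question tree; features and threshold are hypothetical."""
    # Question 1: is the skin bumpy?
    if skin_is_bumpy:
        return "orange"
    # Question 2 (only asked for smooth fruit): is it light?
    if weight_g <= 120:  # made-up threshold, not learned from data
        return "apple"
    return "orange"


print(classify_fruit(skin_is_bumpy=False, weight_g=100))  # apple
```

Each `if` is one branch point of the tree, and each `return` is a leaf holding the prediction.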

---

## Why Do We Use Decision Trees?

1. **Easy to Understand**

   * The “if this, then that” logic is simple and visual, so people can read and follow it.
2. **Works for Numbers and Categories**

   * Can handle both regression (numbers) and classification (labels).
3. **No Need for Complex Data Preparation**

   * Needs no feature scaling or normalization. Some implementations also handle missing values and categorical features directly.
4. **Flexible and Powerful**

   * Can model complex, non-linear relationships.

---

## In short:

* **A decision tree splits your data using simple questions until it’s easy to make a prediction.**
* **We use it because it’s simple, clear, and works for many different types of problems.**

---


## Step-by-Step: Decision Tree Regression (With Tiny Data)

Suppose you have:

| x | y |
| - | - |
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |

---

### **List all possible splits**

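A useful split must fall between two neighboring values of x; a common convention is to use the midpoint.
The sorted x values are 1, 3, 5, so the possible split points are **x = 2** and **x = 4**.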
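
As a small sketch of this step, the helper below lists the midpoints between consecutive distinct x values. The function name `candidate_splits` is made up for this example, and the midpoint rule is one common convention rather than the only option.

```python
def candidate_splits(xs):
    """Return the midpoints between consecutive distinct, sorted x values."""
    distinct = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


print(candidate_splits([1, 3, 5]))  # [2.0, 4.0]
```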

---

### **Try the first split: x = 2**

* **Left group:** x ≤ 2 → (1, 2)
* **Right group:** x > 2 → (3, 4), (5, 6)

**Group Averages:**

* Left: (2) / 1 = **2** (only one point)
* Right: (4 + 6) / 2 = **5**

**Prediction for each group:**

* Left: predict 2
* Right: predict 5

**Calculate errors (use squared error):**

| x | y | Prediction | Error (y - prediction) | Error² |
| - | - | ---------- | ---------------------- | ------ |
| 1 | 2 | 2          | 0                      | 0      |
| 3 | 4 | 5          | -1                     | 1      |
| 5 | 6 | 5          | 1                      | 1      |

**Total error for this split:** 0 + 1 + 1 = **2**
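
The same calculation can be sketched in code. The helper `split_sse` (a name chosen for this example) splits the points at a threshold, uses each group's average as its prediction, and sums the squared errors.

```python
def split_sse(points, threshold):
    """Total squared error when each side of the split predicts its own mean."""
    left = [y for x, y in points if x <= threshold]
    right = [y for x, y in points if x > threshold]
    total = 0.0
    for group in (left, right):
        if not group:  # an empty side contributes no error
            continue
        mean = sum(group) / len(group)
        total += sum((y - mean) ** 2 for y in group)
    return total


data = [(1, 2), (3, 4), (5, 6)]
print(split_sse(data, 2))  # 2.0
```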

---

### **Try the second split: x = 4**

* **Left group:** x ≤ 4 → (1, 2), (3, 4)
* **Right group:** x > 4 → (5, 6)

**Group Averages:**

* Left: (2 + 4) / 2 = **3**
* Right: (6) / 1 = **6** (only one point)

**Prediction for each group:**

* Left: predict 3
* Right: predict 6

**Calculate errors (use squared error):**

| x | y | Prediction | Error (y - prediction) | Error² |
| - | - | ---------- | ---------------------- | ------ |
| 1 | 2 | 3          | -1                     | 1      |
| 3 | 4 | 3          | 1                      | 1      |
| 5 | 6 | 6          | 0                      | 0      |

**Total error for this split:** 1 + 1 + 0 = **2**

---

### **Choose the best split**

Both splits (x = 2 and x = 4) have **the same total error (2)**.
Because they tie, you can pick either one, apply a fixed rule (such as always taking the first), or compare how well each does once its groups are split further.
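
One simple tie-breaking rule is "keep the first split with the lowest error". The sketch below applies it to the two totals from the tables above, relying on the fact that Python's built-in `min` returns the first minimal item it encounters; the dictionary `errors` is just a convenient container for this example.

```python
# Total squared error for each candidate split, taken from the worked tables.
errors = {2.0: 2.0, 4.0: 2.0}

# min() keeps the first key with the smallest error, so ties go to x = 2.
best = min(errors, key=errors.get)
print(best)  # 2.0
```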

---

### **Assign predictions for new data**

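Suppose you pick **split at x = 2**. To predict, ask the split question and return the average of the matching group:

* For x = 1: Is 1 ≤ 2? Yes → **predict 2**
* For x = 3: Is 3 ≤ 2? No → **predict 5**
* For x = 5: Is 5 ≤ 2? No → **predict 5**
* For a new value such as x = 4: Is 4 ≤ 2? No → **predict 5**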
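
In code, the fitted one-split tree is a single comparison. This is a minimal sketch that hard-codes the split at x = 2 and the two group averages from above; the function name and default arguments are illustrative.

```python
def predict(x, threshold=2, left_value=2, right_value=5):
    """Answer the split question and return that group's average."""
    return left_value if x <= threshold else right_value


for x in (1, 3, 5, 4):  # 4 is a value not present in the training data
    print(x, "->", predict(x))
```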

---

## **Summary Table**

| Step            | What happens?                                                      |
| --------------- | ------------------------------------------------------------------ |
| List splits     | Try all splits between values (here, x = 2 and x = 4)              |
| Make groups     | For each split, separate data into left/right groups               |
| Group average   | For each group, calculate the average y (prediction)               |
| Calculate error | For each point, squared difference from group prediction           |
| Choose split    | Pick the split with the lowest total error                         |
| Predict         | Use the group average for new x values (follow the split question) |
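
The steps in the table can be put together in one short sketch. It assumes a single feature, midpoint split candidates, and "first best split wins" tie-breaking; `fit_stump` and `predict_stump` are names invented for this example, not a library API.

```python
def fit_stump(points):
    """Fit a one-split regression tree on (x, y) pairs.

    Returns (threshold, left_mean, right_mean).
    """
    xs = sorted({x for x, _ in points})
    best = None
    # List splits: midpoints between consecutive distinct x values.
    for a, b in zip(xs, xs[1:]):
        threshold = (a + b) / 2
        # Make groups.
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        # Group average (the prediction for each side).
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        # Calculate error: total squared difference from the group prediction.
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        # Choose split: strict "<" keeps the first split when errors tie.
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict_stump(stump, x):
    """Predict: follow the split question and return the group average."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


data = [(1, 2), (3, 4), (5, 6)]
stump = fit_stump(data)
print(stump)                    # (2.0, 2.0, 5.0)
print(predict_stump(stump, 4))  # 5.0
```

A full decision tree would repeat `fit_stump` inside each group until some stopping rule is met; that recursion is left out here to match the single-split walkthrough.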

---