
### What Is a Decision Tree?
- Used for **classification** and **regression**.
- Structure:
  - **Internal nodes**: A test on a feature (e.g., `Age ≤ 30?`)
  - **Branches**: Outcome of the test
  - **Leaf nodes**: Final prediction (class label or numeric value)
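
To make the three parts concrete, here is a minimal sketch of how such a tree could be represented and traversed. The `Node` class, its field names, and the `Age`-based labels are hypothetical choices for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    """Internal node if `feature` is set, otherwise a leaf holding `prediction`."""
    feature: Optional[str] = None       # feature tested at this node, e.g. "Age"
    threshold: Optional[float] = None   # the test is `sample[feature] <= threshold`
    left: Optional["Node"] = None       # branch taken when the test is true
    right: Optional["Node"] = None      # branch taken when the test is false
    prediction: Union[str, float, None] = None  # set only on leaf nodes


def predict(node: Node, sample: dict):
    """Follow branches from the root until a leaf is reached."""
    while node.feature is not None:
        node = node.left if sample[node.feature] <= node.threshold else node.right
    return node.prediction


# Hypothetical one-split tree: Age <= 30 -> "Young", otherwise "Adult"
tree = Node(feature="Age", threshold=30,
            left=Node(prediction="Young"), right=Node(prediction="Adult"))
young = predict(tree, {"Age": 25})
adult = predict(tree, {"Age": 42})
print(young, adult)
```

The same `predict` loop works for regression if the leaves store numeric values instead of class labels.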

### Decision Trees: Summary

A **decision tree** is a supervised learning model that makes predictions by recursively splitting the data based on feature values, forming a tree-like structure of interpretable **if–else rules**.

### How It Works
1. Start with all data at the **root**.
2. **Select the best feature** (and split point) to split the data, i.e., the one that gives the largest impurity reduction.
3. **Split** the dataset into subsets.
4. **Repeat recursively** for each subset.
5. Stop when a **stopping condition** is met (e.g., max depth, min samples per leaf).
6. Assign the **majority class** (classification) or **mean value** (regression) at leaf nodes.
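
The steps above can be sketched in plain Python. This is an illustrative sketch under assumptions, not how any particular library implements it: it handles numeric features only, uses Gini impurity, tries midpoints between consecutive distinct values as thresholds, and stores the tree as nested dictionaries. The function names and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(X, y):
    """Step 2: return (feature_index, threshold, weighted_gini), or None if no split exists."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def build(X, y, depth=0, max_depth=3):
    """Steps 1-6: recursively grow a tree stored as nested dicts."""
    majority = Counter(y).most_common(1)[0][0]
    # Step 5: stopping conditions (depth limit reached or node already pure)
    if depth >= max_depth or len(set(y)) == 1:
        return {"leaf": majority}  # Step 6: majority class at the leaf
    split = best_split(X, y)
    if split is None:  # all rows identical, nothing left to split on
        return {"leaf": majority}
    j, t, _ = split
    # Step 3: partition the rows; Step 4: recurse on each subset
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
        "right": build([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


# Hypothetical toy data with a single numeric feature
X_toy = [[1], [2], [8], [9]]
y_toy = ["a", "a", "b", "b"]
toy_tree = build(X_toy, y_toy)
print(toy_tree)
print(predict(toy_tree, [3]), predict(toy_tree, [7]))
```

For a regression tree, the same recursion applies with variance (MSE) in place of `gini` and the mean of `y` in place of the majority class.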

### Split Criteria (Minimizing Impurity)

| Algorithm | Criterion | Formula |
|----------|----------|--------|
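| **CART** | **Gini Index** | $\text{Gini} = 1 - \sum_i p_i^2$ |
| **ID3 / C4.5** | **Entropy** | $\text{Entropy} = -\sum_i p_i \log_2 p_i$ |
| | **Information Gain** | $\text{IG} = \text{Entropy}_{\text{parent}} - \sum_k \frac{n_k}{n} \text{Entropy}_{\text{child}_k}$ |
| **Regression Trees** | **MSE** | $\text{MSE} = \frac{1}{n} \sum_i (y_i - \bar{y})^2$ |

Here $p_i$ is the proportion of samples in the node that belong to class $i$, $n_k$ is the number of samples sent to child $k$ out of $n$ in the parent, and $\bar{y}$ is the mean target value in the node.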
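
The criteria in the table translate directly into a few lines of Python. This is a small sketch with hypothetical helper names; information gain is computed with each child's entropy weighted by its share of the parent's samples, which is the standard definition. The label lists are illustrative.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Proportion p_i of each class present in `labels`."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def gini(labels):
    return 1.0 - sum(p ** 2 for p in class_proportions(labels))


def entropy(labels):
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


def mse(values):
    """Mean squared deviation from the node mean (regression impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


parent = [0, 0, 1, 1, 1]  # illustrative labels: 2 of class 0, 3 of class 1
print(round(gini(parent), 3))                                   # 0.48
print(round(entropy(parent), 3))                                # 0.971
print(round(information_gain(parent, [[0, 0], [1, 1, 1]]), 3))  # 0.971
print(round(mse([1.0, 2.0, 3.0]), 3))                           # 0.667
```

A split that produces pure children, as in the third line, recovers the full parent entropy as gain.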



### Simple Example

**Rule-based view**:
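```python
def decide(weather, humidity):
    # Root test on Weather, then a second test on Humidity for sunny days
    if weather == "Sunny":
        if humidity > 75:
            return "No"
        return "Yes"
    return "Yes"
```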

**Tree visualization**:

            Weather
            /     \
        Sunny    Rainy
          |        |
       Humidity   Yes
        /    \
     > 75   ≤ 75
       No     Yes

### Advantages
* Easy to understand and interpret
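* Handles both numerical and categorical features (some libraries, including scikit-learn, need categorical features encoded as numbers first)
* Requires little data preprocessing (no feature scaling or normalization)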
* Mimics human decision-making



### Disadvantages
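* Prone to overfitting, especially when grown deep without pruning or depth limits
* Unstable: small changes in the training data can produce a very different tree
* Often lower accuracy than ensembles (Random Forest, XGBoost)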

### Numerical Dataset
We predict Pass (1) or Fail (0) from Study Hours and Attendance (%).

| Study Hours | Attendance (%) | Result |
| ----------- | -------------- | ------ |
| 2           | 60         | 0      |
| 4           | 70         | 0      |
| 6           | 75         | 1      |
| 8           | 85         | 1      |
| 10          | 95         | 1      |
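
As a worked check of the Gini criterion on this table, consider the root node and two candidate splits on Study Hours. The thresholds 5 and 7 are midpoints picked here for illustration; a library may evaluate other candidates as well. The root holds 2 fails and 3 passes:

$$
\text{Gini}_{\text{root}} = 1 - \left(\tfrac{2}{5}\right)^2 - \left(\tfrac{3}{5}\right)^2 = 0.48
$$

Splitting at Study Hours ≤ 5 sends the labels $\{0, 0\}$ left and $\{1, 1, 1\}$ right, so both children are pure:

$$
\text{Gini}_{\le 5} = \tfrac{2}{5} \cdot 0 + \tfrac{3}{5} \cdot 0 = 0
$$

Splitting at Study Hours ≤ 7 sends $\{0, 0, 1\}$ left and $\{1, 1\}$ right:

$$
\text{Gini}_{\le 7} = \tfrac{3}{5}\left(1 - \left(\tfrac{2}{3}\right)^2 - \left(\tfrac{1}{3}\right)^2\right) + \tfrac{2}{5} \cdot 0 = \tfrac{3}{5} \cdot \tfrac{4}{9} \approx 0.267
$$

The first split removes all impurity (a reduction of 0.48), so a Gini-based tree would prefer it over the second. On this data, Attendance ≤ 72.5 separates the classes equally well.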


```python
from sklearn.tree import DecisionTreeClassifier

# Features: [Study Hours, Attendance (%)]
X = [
    [2, 60],
    [4, 70],
    [6, 75],
    [8, 85],
    [10, 95],
]

y = [0, 0, 1, 1, 1]  # Labels: 0 = Fail, 1 = Pass

# Fit a classification tree that chooses splits by Gini impurity
model = DecisionTreeClassifier(criterion="gini")
model.fit(X, y)

# Predict for a new student: 5 study hours, 72% attendance
prediction = model.predict([[5, 72]])
print("Prediction (0=Fail, 1=Pass):", prediction[0])
```
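
To see which rules the fitted tree actually learned, scikit-learn's `export_text` can print them as nested if-else text. The sketch below refits the same model so that it runs on its own. Setting `random_state=0` is an addition made here for reproducibility, since both features separate this data perfectly and the tie between them is otherwise broken at random.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Same data as above: [Study Hours, Attendance (%)] -> 0 = Fail, 1 = Pass
X = [[2, 60], [4, 70], [6, 75], [8, 85], [10, 95]]
y = [0, 0, 1, 1, 1]

# random_state is an assumption added for reproducible tie-breaking
model = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

# Render the learned splits as indented text rules
rules = export_text(model, feature_names=["Study Hours", "Attendance"])
print(rules)
print("Tree depth:", model.get_depth())
```

Because one threshold already separates the two classes, the tree needs only a single split, and the student with 5 study hours and 72% attendance falls on the Fail side whichever feature is chosen.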