# **Decision Tree**

A Decision Tree is a supervised learning algorithm used for classification and regression tasks. It creates a tree-like structure where decisions are made based on feature values. 

---

## **1. Core Concepts**

### **a. Root Node**
- The topmost node in the tree.  
- Represents the entire dataset and is split into child nodes based on the best attribute.

### **b. Splitting**
- The process of dividing the dataset into subsets based on a specific feature and its threshold value.  
- Splitting continues recursively to build the tree.

### **c. Decision Node**
- A node that represents a decision to further split the dataset based on a feature value.

### **d. Leaf Node**
- The terminal node of the tree, representing the final output or class label.  
- No further splitting occurs at this point.

### **e. Branch**
- A connection from a parent node to one of its child nodes, representing one outcome of the parent's test. The term is also used loosely for the whole sub-tree below that connection.

### **f. Parent and Child Nodes**
- **Parent Node**: A node that is split into child nodes.  
- **Child Node**: A node derived from splitting a parent node.

---

## **2. Attribute Selection Measures (ASM)**

Attribute selection measures are used to determine the best feature for splitting at each step. Common ASMs include:

### **a. Information Gain (IG)**
- Measures the reduction in entropy after a split.  
- Formula for entropy:  
  $$ 
  H(S) = -\sum_{i=1}^n P_i \log_2 P_i 
  $$  
  Where $P_i$ is the proportion of samples in $S$ that belong to class $i$, and $n$ is the number of classes.  
- Information Gain:  
  $$ 
  IG(S, A) = H(S) - \sum_{v \in A} \frac{|S_v|}{|S|} H(S_v) 
  $$  
  Where $v$ ranges over the values of attribute $A$, and $S_v$ is the subset of $S$ for which $A$ takes the value $v$.

- **When to Use**: Information Gain is the splitting criterion of ID3. C4.5 builds on it with gain ratio, a normalized variant that penalizes attributes with many distinct values.
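
The two formulas above can be checked by hand on a tiny dataset. The following is a minimal sketch, not a reference implementation; the function names `entropy` and `information_gain` and the toy `outlook`/`play` data are assumptions made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(S), in bits, of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(labels, feature_values):
    """IG(S, A): H(S) minus the size-weighted entropy of each subset S_v."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical toy data: one categorical attribute and a binary class.
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "overcast"]
play = ["no", "no", "yes", "yes", "no", "yes"]

print(round(entropy(play), 3))                    # 1.0 (three "yes", three "no")
print(round(information_gain(play, outlook), 3))  # 0.667
```

Here the "sunny" and "overcast" subsets are pure (entropy 0) and only "rain" is mixed (entropy 1), so the weighted remainder is $\frac{2}{6} \cdot 1 = \frac{1}{3}$ and the gain is $1 - \frac{1}{3} = \frac{2}{3}$.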

### **b. Gini Index**
- Measures the impurity of a dataset. Lower values indicate a better split.  
- Formula:  
  $$ 
  Gini(S) = 1 - \sum_{i=1}^n (P_i)^2 
  $$  
  - $P_i$: Probability of class $i$.  
- Gini Index after a split:  
  $$ 
  Gini_{split} = \sum_{v \in A} \frac{|S_v|}{|S|} Gini(S_v) 
  $$

- **When to Use**: Gini Index is slightly cheaper to compute than entropy because it involves no logarithm. It is the impurity measure CART (Classification and Regression Trees) uses for classification.
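
As a quick numeric check of the Gini formulas, here is a small sketch. The helper names `gini` and `gini_split` and the `left`/`right` label lists are hypothetical, chosen only for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_split(subsets):
    """Size-weighted Gini impurity of the subsets produced by a split."""
    total = sum(len(s) for s in subsets)
    return sum(len(s) / total * gini(s) for s in subsets if s)

# Hypothetical split of six samples into two child nodes.
left = ["yes", "yes", "yes", "no"]
right = ["no", "no"]

print(round(gini(left + right), 3))         # 0.5   (parent: 3 "yes", 3 "no")
print(round(gini_split([left, right]), 3))  # 0.25  (4/6 * 0.375 + 2/6 * 0)
```

The split lowers impurity from 0.5 to 0.25, so under this measure it is preferable to not splitting at all.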

---

## **3. CART Algorithm (Classification and Regression Trees)**

CART is one of the most widely used algorithms for building decision trees. It grows a binary tree by greedy, recursive splitting.  
- **Steps**:
  1. Start with the root node containing the entire dataset.
  2. For each feature and candidate split value, calculate the **Gini Index** or another impurity measure (for regression, typically the variance of the target) to determine the best split.
  3. Split the dataset at the feature and value that minimizes impurity.
  4. Repeat the process recursively for each child node until a stopping condition is met (e.g., all samples belong to one class or maximum tree depth is reached).
  5. Assign leaf nodes with the majority class (for classification) or mean value (for regression).
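
The steps above can be sketched in plain Python for the classification case. This is an illustrative sketch under several assumptions, not the canonical CART implementation: features are numeric, every split has the form `feature <= threshold`, the tree is a nested `dict`, and the names `best_split`, `build_tree`, and `predict` are hypothetical. It also stops when no split strictly lowers the Gini Index, which is an extra stopping rule beyond the ones listed.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) with the lowest weighted Gini,
    or None if no split improves on the impurity of the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both children
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively grow the tree; a leaf is stored as its majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [i for i, row in enumerate(rows) if row[f] <= t]
    right = [i for i, row in enumerate(rows) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left],
                           depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right],
                            depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the branches from the root until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Hypothetical toy data: two numeric features, two classes.
rows = [[2.0, 1.0], [3.0, 1.5], [6.0, 0.5], [7.0, 2.0]]
labels = ["a", "a", "b", "b"]
tree = build_tree(rows, labels)
print(tree)                        # {'feature': 0, 'threshold': 3.0, 'left': 'a', 'right': 'b'}
print(predict(tree, [2.5, 9.0]))   # a
```

On this data the first feature separates the classes perfectly at 3.0, so both children are pure and the recursion stops after one split. For regression, the same structure would apply with variance in place of `gini` and the mean in place of the majority class.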

---

## **4. Pruning**

Pruning is the process of reducing the size of a decision tree by removing nodes that contribute little to its predictive accuracy.

### **a. Types of Pruning**:
- **Pre-Pruning**: Stop tree growth early based on a predefined condition (e.g., maximum depth).  
- **Post-Pruning**: Grow the full tree first and then remove nodes that do not contribute significantly to the model.
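
Pre-pruning amounts to an extra stopping check while the tree is grown, such as a depth limit. Post-pruning needs a rule for deciding which nodes to remove. The sketch below uses reduced-error pruning, one common strategy that is not the only option: working bottom-up, it replaces a sub-tree with a leaf whenever that does not lower accuracy on held-out validation data. The nested-`dict` tree layout, the `"majority"` key (the majority training class at each node), and all function names are assumptions made for this example.

```python
def predict(tree, row):
    """Walk from the root to a leaf; a leaf is stored as a bare class label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows the tree classifies correctly."""
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

def prune(tree, rows, labels):
    """Reduced-error pruning using the validation samples that reach this node."""
    if not isinstance(tree, dict) or not rows:
        return tree  # already a leaf, or no validation evidence: leave as is
    f, t = tree["feature"], tree["threshold"]
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    pruned = dict(tree)  # shallow copy so the original tree is not modified
    pruned["left"] = prune(tree["left"], [r for r, _ in left], [y for _, y in left])
    pruned["right"] = prune(tree["right"], [r for r, _ in right], [y for _, y in right])
    leaf = tree["majority"]
    leaf_accuracy = sum(y == leaf for y in labels) / len(labels)
    # Collapse the node if a single leaf does at least as well as the sub-tree.
    return leaf if leaf_accuracy >= accuracy(pruned, rows, labels) else pruned

# Hypothetical overfit tree: the inner node was grown to fit one noisy sample.
tree = {
    "feature": 0, "threshold": 5.0, "majority": "a",
    "left": "a",
    "right": {"feature": 1, "threshold": 1.0, "majority": "b",
              "left": "a", "right": "b"},
}
val_rows = [[2.0, 0.0], [4.0, 3.0], [6.0, 0.5], [7.0, 2.0], [8.0, 0.8]]
val_labels = ["a", "a", "b", "b", "b"]

pruned = prune(tree, val_rows, val_labels)
print(pruned)  # {'feature': 0, 'threshold': 5.0, 'majority': 'a', 'left': 'a', 'right': 'b'}
```

The inner node misclassifies two of the three validation samples that reach it, while a plain "b" leaf gets all three right, so it is collapsed. The root split is kept because replacing it with a leaf would drop validation accuracy from 1.0 to 0.4.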

### **Advantages**:
- Prevents overfitting.  
- Improves generalization.  

---

## **5. Decision Tree Structure and Terms**

### **Summary of Key Components**:
- **Root Node**: Starting point with the full dataset.  
- **Decision Nodes**: Points where the data splits based on an attribute.  
- **Leaf Nodes**: Endpoints with a class label or regression value.  
- **Branches**: Connections between nodes showing decisions made.  
- **Parent and Child**: Hierarchical relationship between nodes.

---

## **6. Applications of Decision Trees**
- **Classification**: Spam detection, sentiment analysis, etc.  
- **Regression**: Predicting house prices, stock market trends, etc.

Decision trees are intuitive and easy to interpret, making them widely used for various machine learning tasks.
