```{contents}
```

# Assumptions

## **1. Random Forest is Non-Parametric**

* **Non-parametric** means it **doesn’t assume a specific form of the relationship** between features and target (no linearity assumption).
* It can model **complex nonlinear relationships** naturally.

**Implication:** You do not need to hand-craft transformations (logs, polynomial terms) to linearize the relationship; each tree approximates it with a piecewise-constant function built from axis-aligned splits.
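
As a rough illustration of the point above, the sketch below fits a linear model and a random forest to the same untransformed feature. The sine-shaped target, the sample size, and the hyperparameters are assumptions chosen for the demo, not a recommended setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): one raw feature, strongly nonlinear target.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.1, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Both models see the same untransformed feature.
linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# score() returns R^2 on the held-out data.
linear_r2 = linear.score(X_test, y_test)
forest_r2 = forest.score(X_test, y_test)
print(f"linear R^2: {linear_r2:.2f}, forest R^2: {forest_r2:.2f}")
```

On data like this the forest should score far higher, because the linear model can only fit a straight line through the sine wave.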

---

## **2. Key Implicit Assumptions**

While RF is flexible, it still makes a few **practical assumptions**:

### **A. Observations are Independent**

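* Random Forest assumes that **training samples are independent**: bootstrap sampling and random train/test splits both treat rows as interchangeable draws.
* Correlated or time-dependent samples (like time series or repeated measurements of the same subject) need special handling, such as lag features and time-ordered or grouped validation.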

**Example:**

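* In stock price prediction, consecutive days are correlated → a standard RF ignores the ordering of rows, and a shuffled train/test split lets the model train on days that come after the ones it is tested on, which inflates the validation score.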
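
The leakage problem in the example above can be sketched by comparing a shuffled split with a time-ordered one. The 12-row array is a stand-in for consecutive days, and `leaks_future` is a hypothetical helper written for this illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

# Stand-in for 12 consecutive trading days (row index = day).
n_days = 12
X = np.arange(n_days).reshape(-1, 1)

shuffled = KFold(n_splits=3, shuffle=True, random_state=0)
ordered = TimeSeriesSplit(n_splits=3)


def leaks_future(train_idx, test_idx):
    """True if any training day comes after the earliest test day."""
    return train_idx.max() > test_idx.min()


shuffled_leaks = [leaks_future(tr, te) for tr, te in shuffled.split(X)]
ordered_leaks = [leaks_future(tr, te) for tr, te in ordered.split(X)]

print("shuffled folds leak future data:", shuffled_leaks)
print("time-ordered folds leak future data:", ordered_leaks)
```

`TimeSeriesSplit` always places the test fold after the training fold, so it is one reasonable way to validate a forest on temporally dependent rows.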

---

### **B. Features Should Have Some Predictive Power**

* Random Forest works best if **at least some features are informative**.
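* A handful of irrelevant features usually does little harm, because uninformative splits rarely reduce impurity and are seldom chosen. With many noisy features, however, the random feature subset considered at each split often contains no informative feature at all, and performance can drop.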
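
One way to check which features a fitted forest actually relies on is to look at its impurity-based importances. The sketch below assumes a synthetic dataset with two informative columns followed by eight pure-noise columns; the sizes and settings are illustrative only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption): columns 0-1 carry the signal, columns 2-9 are noise.
rng = np.random.default_rng(0)
n_samples = 1000
signal = rng.normal(size=(n_samples, 2))
noise = rng.normal(size=(n_samples, 8))
X = np.hstack([signal, noise])
y = (signal[:, 0] + signal[:, 1] > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances are normalized to sum to 1.
importances = forest.feature_importances_
signal_importance = importances[:2].sum()
noise_importance = importances[2:].sum()
print(f"signal columns: {signal_importance:.2f}, noise columns: {noise_importance:.2f}")
```

Noise columns still receive a small share of importance, because deep trees occasionally split on them by chance.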

---

### **C. Data Representativeness**

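* Training data should be **representative of the population** you want to predict. Each tree predicts from the training targets that fall in a leaf, so a forest cannot extrapolate beyond the range of target values it has seen.
* Bagging (bootstrap sampling) treats the training set as a sample from the same underlying distribution as future data, so a shift between training and deployment data degrades predictions.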
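
To make the bootstrap step concrete, the sketch below draws one bootstrap sample with NumPy and measures how much of the original data it contains. The sample size is arbitrary, and this mirrors the idea of bagging rather than any particular library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10_000

# One bootstrap sample: draw n row indices with replacement.
bootstrap_idx = rng.integers(0, n_samples, size=n_samples)

# Fraction of distinct original rows that made it into the sample.
in_bag_fraction = np.unique(bootstrap_idx).size / n_samples
left_out_fraction = 1 - in_bag_fraction
print(f"in bag: {in_bag_fraction:.3f}, left out: {left_out_fraction:.3f}")
```

Each tree therefore sees roughly 63% of the distinct rows (the limit is 1 - 1/e), which only yields a faithful picture of the population if the training set itself is representative.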

---

### **D. Decision Trees Assume Split Criteria are Meaningful**

* RF splits nodes using measures like:

  * **Gini Impurity** or **Entropy** (classification)
  * **Variance reduction / MSE** (regression)
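* This assumes that **at least some candidate splits actually reduce impurity**, i.e. that thresholding a feature separates the classes or narrows the spread of the target.
* If all features are weak or unrelated to the target, the splits only fit noise and RF will do little better than predicting the majority class or the mean.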
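
The split criteria listed above can be written out in a few lines. This is a minimal sketch using only the standard library; the function names (`gini`, `entropy`, `variance`, `impurity_decrease`) are hypothetical helpers for illustration, not a library API.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def variance(values):
    """Population variance, the node impurity used for MSE-based regression splits."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = len(left) / n * impurity(left) + len(right) / n * impurity(right)
    return impurity(parent) - weighted


parent = [0, 0, 0, 1, 1, 1]
# An informative split separates the classes completely.
good_split = impurity_decrease(parent, [0, 0, 0], [1, 1, 1], gini)
# A weak split leaves both children almost as mixed as the parent.
weak_split = impurity_decrease(parent, [0, 1, 0], [1, 0, 1], gini)
print(good_split, weak_split)
```

When every available split looks like `weak_split`, no amount of ensembling recovers a signal that the features do not contain.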

---

## **3. What Random Forest Does NOT Assume**

* **No linearity:** Can capture nonlinear patterns
* **No normality:** Features or target do not need to be normally distributed
* **No homoscedasticity:** Variance of errors can vary
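* **No feature scaling required:** Splits depend only on the ordering of feature values, so trees are unaffected by rescaling or any other strictly increasing transformation of a feature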
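
The scaling point can be checked empirically. The sketch below trains two forests with the same seed, one on raw features and one on rescaled copies, and compares their predictions. The data and the scale factors are assumptions for the demo; powers of two are used so the rescaling is exact in floating point.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (assumption): label depends on the sign of x0 * x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Rescale each column by a different positive factor.
scale = np.array([1024.0, 2.0, 64.0])
X_scaled = X * scale

# Same seed, same rows; only the feature scale differs.
raw = RandomForestClassifier(n_estimators=50, random_state=0)
raw.fit(X[:300], y[:300])
scaled = RandomForestClassifier(n_estimators=50, random_state=0)
scaled.fit(X_scaled[:300], y[:300])

# Share of held-out rows on which the two forests predict the same class.
agreement = np.mean(raw.predict(X[300:]) == scaled.predict(X_scaled[300:]))
print(f"prediction agreement: {agreement:.3f}")
```

The two forests should agree on essentially every held-out row, since rescaling changes the split thresholds but not which rows fall on each side.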

---

## **4. Practical Notes**

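* RF is fairly robust to **outliers** in the features (splits depend on the ordering of values, not their magnitude) and tolerates **correlated features**; **missing values** are handled natively only in some implementations.
* Highly correlated features make the trees more similar to one another, which slightly reduces the variance reduction gained from averaging them.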

---

## ✅ **Summary**

| Aspect                      | Assumption?                   | Notes                                                  |
| --------------------------- | ----------------------------- | ------------------------------------------------------ |
| Feature-target relationship | No (non-parametric)           | Can capture nonlinear patterns                         |
| Observation independence    | Yes                           | Samples should be independent                          |
| Feature informativeness     | Yes (some features must help) | Tolerates a few irrelevant features; many add noise    |
| Data representativeness     | Yes                           | Training data should reflect population                |
| Scaling / normality         | No                            | RF is scale-invariant and distribution-free            |

