As discussed, all of the methods explored so far, such as ANOVA F-tests and SelectKBest based on F-scores, are widely used in **genomic and biological research**. In these fields, the goal is often to **rank features** (e.g., genes) according to their statistical association with a phenotype. The datasets typically have **very high dimensionality and relatively small sample sizes**, which makes univariate filtering methods practical and interpretable.

However, these methods are **less commonly used in general machine learning problems** because they:

- Consider **each feature independently**, ignoring interactions or correlations between features.
- Detect only **linear dependence** (in the case of F-tests), so nonlinear relationships in complex datasets can be missed.
- May not align with **model-specific objectives**, where predictive performance depends on multivariate relationships.
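
The first limitation can be made concrete with a small sketch. It assumes scikit-learn and NumPy are installed and uses a synthetic XOR dataset invented for this illustration: the label depends entirely on the combination of two features, yet each feature on its own carries no signal, so a univariate F-test ranks both as uninformative.

```python
import numpy as np
from sklearn.feature_selection import f_classif
from sklearn.tree import DecisionTreeClassifier

# Synthetic, balanced XOR design (illustrative data, not from a real study):
# every (x1, x2) combination appears 50 times.
grid = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
X = np.tile(grid, (50, 1))
y = np.logical_xor(X[:, 0], X[:, 1]).astype(int)

# Univariate view: each feature has the same mean in both classes,
# so the ANOVA F-score of each feature is zero.
f_scores, p_values = f_classif(X, y)

# Multivariate view: a model that sees both features together can
# separate the classes (in-sample accuracy, for illustration only).
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
train_accuracy = tree.score(X, y)

print("F-scores:", f_scores)
print("p-values:", p_values)
print("tree accuracy using both features:", train_accuracy)
```

A filter such as SelectKBest would have no basis for keeping either feature here, even though the pair determines the label completely.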

# Wrapper Methods for Feature Selection

**Wrapper methods** are a class of feature selection techniques that evaluate subsets of features based on the **performance of a predictive model**. Unlike filter methods, which rely on statistical measures, wrappers **use the model itself as a black box** to decide which features are most relevant.

---

## How Wrapper Methods Work

1. **Select a subset of features**.
2. **Train a model** (e.g., linear regression, decision tree) using only that subset.
3. **Evaluate model performance** using a metric such as accuracy, R², RMSE, or cross-validated score.
4. **Use the performance to guide selection**:
   - Keep features that improve performance.
   - Discard features that do not.
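
The four steps above can be sketched as a single scoring function. This is a minimal illustration, assuming scikit-learn is available; the diabetes dataset, the linear model, and the helper name `score_subset` are choices made for this example, not part of any fixed recipe.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Example data bundled with scikit-learn: 10 numeric features, numeric target.
X, y = load_diabetes(return_X_y=True)


def score_subset(columns):
    """Steps 2-3: train on the chosen columns only and return mean 5-fold CV R²."""
    model = LinearRegression()
    scores = cross_val_score(model, X[:, columns], y, cv=5, scoring="r2")
    return scores.mean()


# Step 1: pick candidate subsets (column 2 is "bmi", column 3 is "bp").
score_bmi = score_subset([2])
score_bmi_bp = score_subset([2, 3])

# Step 4: keep the extra feature only if it improves the score.
keep_bp = bool(score_bmi_bp > score_bmi)

print("bmi only:", round(score_bmi, 3))
print("bmi + bp:", round(score_bmi_bp, 3))
print("keep bp?", keep_bp)
```

The model is treated purely as a black box: swapping `LinearRegression` for any other estimator changes nothing else in the procedure.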

---

## Advantages

- Takes into account **feature interactions** because it evaluates subsets directly.
- Optimized for **model performance**, not just statistical correlation.

---

## Disadvantages

- **Computationally expensive**, especially with large numbers of features, because multiple models must be trained.
- Can **overfit** if the dataset is small, since the search optimizes directly on measured model performance and can end up fitting noise in the evaluation data.
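
To put a rough number on the computational cost: with $p$ candidate features, compare trying every non-empty subset against a greedy search that adds one feature per round until all features are included. The counts below are standard combinatorics rather than figures from a particular library, and each scored subset costs $k$ model fits under $k$-fold cross-validation.

$$
N_{\text{all subsets}} = 2^{p} - 1, \qquad
N_{\text{greedy}} = \sum_{i=0}^{p-1} (p - i) = \frac{p(p+1)}{2}
$$

For $p = 20$, this is 1,048,575 subsets for the full search versus 210 for the greedy one.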

---

## Common Wrapper Techniques

1. **Forward Selection (Stepwise Forward Selection)**

   - Start with no features.
   - Add features **one at a time**, at each step keeping the one that improves model performance the most, until performance stops improving or a target number of features is reached.

2. **Backward Elimination (Stepwise Backward Selection)**

   - Start with all features.
   - Remove features **one at a time**, at each step dropping the one whose removal decreases performance the least, until performance starts to suffer or a target number of features is reached.

3. **Exhaustive Search**
   - Evaluate **all possible subsets** of features.
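   - Guarantees the best-performing subset under the chosen evaluation metric, but the number of subsets doubles with every added feature, making it **computationally infeasible** beyond a small number of features.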
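
The three techniques can be compared side by side in a short sketch. It assumes scikit-learn 0.24 or later, which provides `SequentialFeatureSelector`; the dataset, the restriction to six columns, the target of three features, and the helper names `cv_r2` and `sequential_subset` are all illustrative choices for this example.

```python
from itertools import combinations

from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
X = X[:, :6]  # illustrative choice: 6 columns keep the exhaustive search cheap
n_features = X.shape[1]
model = LinearRegression()


def cv_r2(columns):
    """Mean 5-fold cross-validated R² using only the given column indices."""
    return cross_val_score(model, X[:, list(columns)], y, cv=5, scoring="r2").mean()


def sequential_subset(direction):
    """Greedy search for a 3-feature subset, changing one feature per step."""
    selector = SequentialFeatureSelector(
        model, n_features_to_select=3, direction=direction, scoring="r2", cv=5
    )
    selector.fit(X, y)
    return tuple(int(i) for i in selector.get_support(indices=True))


forward_subset = sequential_subset("forward")    # 1. start empty, add features
backward_subset = sequential_subset("backward")  # 2. start full, remove features

# 3. Exhaustive search: score every non-empty subset (2**6 - 1 = 63 here).
all_subsets = [
    subset
    for size in range(1, n_features + 1)
    for subset in combinations(range(n_features), size)
]
best_subset = max(all_subsets, key=cv_r2)

print("forward:   ", forward_subset, round(cv_r2(forward_subset), 3))
print("backward:  ", backward_subset, round(cv_r2(backward_subset), 3))
print("exhaustive:", best_subset, round(cv_r2(best_subset), 3))
```

Forward and backward selection are greedy, so they need not agree with each other or with the exhaustive result. The exhaustive search can never score worse on the same metric, because the greedy subsets are among the candidates it evaluates.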
