# Feature Selection – Embedded Methods

So far, I have explored **Filter** and **Wrapper** methods for feature selection.  
They worked well, but each has a drawback:

- Filter methods are too simple: they score each feature on its own, so they ignore interactions between features.
- Wrapper methods are too slow: they retrain a model for every candidate feature subset.

**Embedded methods address both problems** by doing feature selection **as part of model training**.  
This makes them **efficient**, because a single training run is enough, and lets them account for **feature interactions**, because the model sees all features together.

In this notebook, I want to go **deep** and really understand:

- What each embedded method does under the hood
- When to use which method
- How to implement them step by step with code
- How performance changes when I actually drop features


## Types of Embedded Methods

- **Lasso Regression (L1)** → Performs automatic feature selection by shrinking some coefficients exactly to zero
- **Ridge Regression (L2)** → Shrinks coefficients to stabilize them when features are multicollinear, but keeps all features (no coefficient becomes exactly zero)
- **Elastic Net (L1 + L2)** → Combines the Lasso and Ridge penalties to handle correlated features while still performing selection
- **Tree-based Feature Importance (Random Forest / Decision Trees)** → Uses the model's built-in feature importances to select relevant features without assuming linearity
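To make the differences between these four methods concrete, here is a minimal sketch on synthetic regression data. The dataset (`make_regression` with 20 features, 5 of them informative) and the `alpha` / `l1_ratio` values are my own assumptions for illustration, not tuned settings.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler

# Assumed toy data: 20 features, only 5 of which actually drive the target.
X, y = make_regression(
    n_samples=300, n_features=20, n_informative=5, noise=10.0, random_state=0
)
# Penalized linear models are scale-sensitive, so standardize first.
X = StandardScaler().fit_transform(X)

# Illustrative hyperparameters (assumptions, not tuned values).
models = {
    "lasso": Lasso(alpha=1.0),
    "ridge": Ridge(alpha=1.0),
    "elastic_net": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    # Count coefficients that were shrunk exactly to zero.
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name}: {zero_counts[name]} of {X.shape[1]} coefficients are exactly zero")

# Tree-based importance: no coefficients, but a score per feature.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
top5 = np.argsort(forest.feature_importances_)[::-1][:5]
print("Top 5 features by random forest importance:", top5)
```

Lasso should zero out some coefficients while Ridge keeps all of them non-zero. How many coefficients Elastic Net zeroes depends on `alpha` and `l1_ratio`.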


## Baseline Model – Breast Cancer Dataset

Before diving into embedded methods, I want to set up a **baseline model** using all features.  
This will help me **compare performance later** when I drop features using Lasso, Ridge, Elastic Net, or tree-based methods.

I am using the **Breast Cancer dataset** from `sklearn` because it has a good number of features (30 numeric features across 569 samples) and is widely used for classification practice.
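A minimal sketch of such a baseline could look like the following. The choice of logistic regression, the standard scaling step, the 80/20 stratified split, and `random_state=42` are all my assumptions here. Any classifier trained on all 30 features would serve the same purpose.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load all features and the binary target (malignant vs. benign).
X, y = load_breast_cancer(return_X_y=True)

# Hold out a stratified test set so class proportions match in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale inside a pipeline so the scaler is fit on training data only.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)

baseline_accuracy = accuracy_score(y_test, baseline.predict(X_test))
print(f"Features used: {X.shape[1]}")
print(f"Baseline test accuracy: {baseline_accuracy:.3f}")
```

Keeping the split and `random_state` fixed means later models trained on fewer features can be compared against `baseline_accuracy` on the same test set.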
