# **Feature Selection vs Feature Extraction**

Both are **Dimensionality Reduction** techniques, but they work in *different ways*: one keeps a subset of the original features, the other builds new features from them.

---

## 🔹 **1. Feature Selection (Choosing the Best Features)**

**Definition**

Feature Selection means **selecting a subset of relevant features** from the original dataset **without modifying** them.

We simply **keep the best features** and **remove the irrelevant ones**.

### **Why use it?**

* Can reduce overfitting
* Removes noise
* Reduces training time
* Makes the model easier to interpret

### **Types of Feature Selection**

**A. Filter Methods** (statistical tests)

Score each feature with a statistical measure, independently of any model, and keep the top-scoring ones.

Examples:

* Correlation
* Chi-square test
* ANOVA test
* Mutual information
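
As an illustration of a filter method, here is a minimal sketch that ranks features by their absolute Pearson correlation with the target and keeps the top `k`. The function name `select_by_correlation` and the synthetic data are assumptions made for this example, and it assumes no feature is constant (a constant column would make the correlation undefined).

```python
import numpy as np

def select_by_correlation(X, y, k):
    """Return the indices of the k features with the largest |Pearson r| to y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data so each correlation is a normalized dot product.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(Xc.T @ yc) / denom
    # Highest-scoring features first.
    return np.argsort(scores)[::-1][:k], scores

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
# Hypothetical target that depends on columns 0 and 2 only.
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=n)

selected, scores = select_by_correlation(X, y, k=2)
print(sorted(selected.tolist()))
```

Each feature is scored on its own, so this approach is fast but can miss features that are only useful in combination. Libraries such as scikit-learn offer ready-made filter selectors (for example `SelectKBest`) if you prefer not to write the scoring yourself.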

**B. Wrapper Methods**

Train a model on different feature subsets and keep the subset that performs best.

Examples:

* RFE (Recursive Feature Elimination)
* Forward/Backward selection
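
Forward selection can be sketched as a greedy loop: start with no features and repeatedly add the one that lowers validation error the most. The sketch below assumes a plain least-squares model and a single train/validation split. The helper names `holdout_mse` and `forward_selection` and the synthetic data are made up for illustration.

```python
import numpy as np

def holdout_mse(X_train, y_train, X_val, y_val, cols):
    """Fit least squares on the chosen columns and return the validation MSE."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_val @ coef - y_val) ** 2))

def forward_selection(X_train, y_train, X_val, y_val, k):
    """Greedily add the feature that lowers validation error the most."""
    selected = []
    remaining = list(range(X_train.shape[1]))
    while len(selected) < k:
        errors = {
            j: holdout_mse(X_train, y_train, X_val, y_val, selected + [j])
            for j in remaining
        }
        best = min(errors, key=errors.get)
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
# Hypothetical target that depends on columns 1 and 3 only.
y = 4.0 * X[:, 1] + 2.0 * X[:, 3] + rng.normal(scale=0.1, size=300)

chosen = forward_selection(X[:200], y[:200], X[200:], y[200:], k=2)
print(chosen)
```

Every candidate subset requires a model fit, which is why wrapper methods tend to cost more than filter methods. Backward selection works the same way in reverse: start with all features and drop the least useful one at each step.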

**C. Embedded Methods**

Feature selection is done during model training.

Examples:

* Lasso Regression (L1 regularization)
* Decision tree feature importances
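
To show why L1 regularization acts as a feature selector, here is a minimal sketch of Lasso fitted by cyclic coordinate descent. It assumes standardized features and a centered target. The function names, the value `alpha=0.5`, and the synthetic data are illustrative choices, not a reference implementation.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/2n)||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution removed.
            partial_residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial_residual / n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature
# Hypothetical target that depends on columns 0 and 4 only.
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + rng.normal(scale=0.1, size=200)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, alpha=0.5)
kept = np.flatnonzero(w).tolist()  # features with non-zero coefficients
print(kept)
```

The soft-threshold step sets weak coefficients to exactly zero, so the features that survive training are the selected ones. A larger `alpha` removes more features, at the cost of shrinking the remaining coefficients further.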

### **Simple Example**

Features:
`[Age, Income, Gender, State, Height, Weight]`

If only **Age, Income, Height** are useful, you select them and drop the rest.

---

## ðŸ”¸ **2. Feature Extraction (Creating New Features)**

**Definition**

Feature Extraction means **transforming existing features into new features**.
The new features are usually **fewer in number** and are built to **capture most of the information** and the **underlying structure** of the originals.

👉 You **create new features** from the old ones.

### **Why use it?**

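* Can remove correlation between features (PCA components, for example, are uncorrelated)
* Reduces complexity
* Retains as much information as possible for the chosen number of dimensions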
* Helps visualization (2D/3D)

### **Common Feature Extraction Methods**

* **PCA (Principal Component Analysis)**
* **LDA (Linear Discriminant Analysis)**
* **t-SNE**
* **UMAP**
* **Autoencoders** (Neural Networks)

### **Simple Example**

Original features:
`[Height, Weight, BMI]`

These features are correlated.

PCA will convert them into:
`[PC1, PC2]`

Here PC1 may represent overall body size (a combination of height and weight).
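
The PCA example above can be sketched with NumPy's SVD. The data here is synthetic and purely hypothetical (heights in metres, weights in kg, BMI computed from the two), and the features are standardized first so that none dominates because of its units.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
# Hypothetical data: height in metres, weight in kg, BMI derived from both.
height = rng.normal(1.70, 0.10, size=n)
weight = 22.0 * height**2 + rng.normal(0.0, 8.0, size=n)
bmi = weight / height**2
X = np.column_stack([height, weight, bmi])

# Standardize so that no feature dominates because of its units.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Rows of Vt are the principal directions, ordered by explained variance.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained_ratio = S**2 / np.sum(S**2)

# Project onto the first two directions: the new features [PC1, PC2].
pcs = Z @ Vt[:2].T
print(pcs.shape)
print(explained_ratio.round(3))
```

Because BMI is determined by height and weight, the first two components carry almost all of the variance, and the two new columns are uncorrelated with each other. The trade-off is interpretability: each component is a weighted mix of all three original features.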

---

### **Key Differences (Easy Table)**

| Feature Selection                    | Feature Extraction                         |
| ------------------------------------ | ------------------------------------------ |
| Keep **subset** of original features | Create **new features** from original ones |
| No transformation                    | Transformation happens                     |
| Simple & interpretable               | Harder to interpret                        |
| Removes irrelevant features          | Combines features to form components       |
| Examples: RFE, Lasso, Correlation    | Examples: PCA, LDA, t-SNE                  |
| Often faster (especially filter methods) | Often more computationally expensive |

---

### **When to Use What?**

✔ Use **Feature Selection** when:

* You want interpretability
* You want to remove irrelevant or noisy features
* Speed is important

✔ Use **Feature Extraction** when:

* Features are highly correlated with each other
* You want to reduce dimensions drastically
* You need visualization
