### Summary: Curse of Dimensionality & PCA

* **Topic Introduction**

  * Starting with **Principal Component Analysis (PCA)**, a widely used **dimensionality reduction** technique.
  * Before PCA, it’s important to understand the **curse of dimensionality**.

* **Curse of Dimensionality**

  * **Dimensionality = number of features** in a dataset.
  * Example: Predicting **house prices** with features like size, bedrooms, bathrooms, location, etc.
  * Adding more **informative features** (e.g., from 3 → 6 → 15) can initially improve model accuracy.
  * As the number of dimensions (features) keeps increasing:

    * The data becomes **sparse** (spread out across the feature space).
    * Distance and similarity measures (like Euclidean distance) become less meaningful, because the nearest and farthest points end up almost equally far away.
    * Models require **exponentially more data** to generalize well.
    * Computation becomes **slower and more complex**.
  * Beyond a point (e.g., 50, 100, 500 features), accuracy often **decreases** because:

    * Many features are irrelevant or redundant.
    * The model **overfits** (it fits noise in the extra features), so performance on new data degrades.
  * Human analogy: if you keep adding too many conditions when estimating a house price, even an expert will get confused.

* **How to Overcome Curse of Dimensionality**

  1. **Feature Selection** – keep only the most important features.
  2. **Feature Extraction (Dimensionality Reduction)** – create new features that summarize the essence of existing ones.

     * PCA is one such method.

* **PCA (Principal Component Analysis)**

  * Transforms the original high-dimensional features into a smaller set of new features (principal components).
  * These new features capture most of the **variance (information)** from the original data.
  * Can help models train faster, reduce the risk of overfitting, and sometimes perform better.
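
As a rough illustration of the PCA idea above, here is a minimal NumPy sketch. It assumes synthetic data (5 features generated from 2 hidden factors plus a little noise) and computes the components via an SVD of the centred data, which is one common way to do it and not necessarily the method these notes use later. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so runs are repeatable

# Hypothetical data: 200 samples, 5 features driven by only 2 hidden factors.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# 1. Centre each feature so the components describe variance around the mean.
X_centered = X - X.mean(axis=0)

# 2. SVD of the centred data; the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# 3. Share of the total variance captured by each component.
explained_ratio = S**2 / np.sum(S**2)

# 4. Keep the first k components and project the data onto them.
k = 2
X_reduced = X_centered @ Vt[:k].T

print("explained variance ratio:", np.round(explained_ratio, 3))
print("reduced shape:", X_reduced.shape)
```

With data built this way, the first two components should account for nearly all of the variance, so dropping the other three loses very little. Real features measured on different scales are usually standardized before PCA.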

---

👉 In short:
Too many features cause the **curse of dimensionality** → models get confused and accuracy drops.
We fix this with **feature selection** or **dimensionality reduction**.
PCA is a key dimensionality reduction technique that we’ll explore in detail next.



### **Dimensionality Reduction**

* **Why do it?**

  1. Prevent the *curse of dimensionality* (too many features hurt model performance).
  2. Improve model training efficiency and accuracy.
  3. Enable visualization (humans can only see up to 3D).
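
The first reason can be made concrete with a small experiment. The sketch below assumes points drawn uniformly from a unit hypercube and uses a hypothetical helper, `relative_contrast`, to measure how much farther the farthest point is than the nearest one from a random query point. As the number of dimensions grows, that contrast shrinks, which is why distance-based similarity loses meaning in high dimensions.

```python
import math
import random


def relative_contrast(n_points, n_dims, rng):
    """Return (farthest - nearest) / nearest for one random query point
    against n_points random points in the unit hypercube."""
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


rng = random.Random(0)  # fixed seed so runs are repeatable
contrasts = {d: relative_contrast(500, d, rng) for d in (2, 10, 100, 1000)}

for d, c in contrasts.items():
    print(f"{d:>5} dims: relative contrast = {c:.2f}")
```

A large contrast means "nearest" and "farthest" are clearly different; a contrast close to zero means every point looks about equally far away.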

---

### **Feature Selection**

* Goal: Select the most important features that strongly impact the target.
* Methods:

  * Use **covariance** and **correlation** (e.g., Pearson correlation) to measure the relationship between each feature and the target.
  * Strong positive/negative correlation → feature is likely important.
  * Near-zero correlation → no linear relationship with the target; the feature is a candidate to drop (Pearson correlation can miss nonlinear relationships).
* Example:

  * **House size** vs. **price** → strong correlation → keep.
  * **Fountain size** vs. **price** → weak correlation → drop.
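
The house-price example can be sketched in a few lines of Python. The data here is synthetic and purely illustrative: price is generated from house size plus noise, while fountain size is unrelated. The `pearson` helper and the `threshold` value of 0.3 are assumptions made for this sketch, not fixed rules.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so runs are repeatable
n = 500

# Hypothetical data: price depends on house size, not on fountain size.
house_size = [rng.uniform(50, 300) for _ in range(n)]
fountain_size = [rng.uniform(0, 5) for _ in range(n)]
price = [2000 * s + rng.gauss(0, 50_000) for s in house_size]

features = {"house_size": house_size, "fountain_size": fountain_size}
scores = {name: pearson(values, price) for name, values in features.items()}

threshold = 0.3  # assumed cut-off; tune it for the problem at hand
selected = [name for name, r in scores.items() if abs(r) >= threshold]

for name, r in scores.items():
    print(f"{name}: r = {r:+.2f}")
print("selected:", selected)
```

Filtering on the absolute value keeps strongly negative correlations too, since a feature that pushes the price down is just as informative as one that pushes it up.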

---

### **Feature Extraction**

* Goal: Create new, informative features from existing ones (instead of dropping).
* Process: Apply transformations to combine or derive features.
* Example:

  * From **room size** and **number of rooms**, derive a new feature: **house size**, which can still predict house price effectively.
* Key point: Some information is lost, but the new feature captures the essence of the originals while reducing dimensions.
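
A tiny sketch of the house-size example follows. The numbers are made up, and the formula (average room size times number of rooms) is an assumption chosen for illustration; the notes do not say how the derived feature is computed.

```python
# Hypothetical houses: (average room size in square metres, number of rooms)
houses = [(20.0, 4), (15.0, 6), (25.0, 3), (18.0, 5)]

# Two original features collapse into one derived feature.
# Assumption: house size is roughly average room size * number of rooms.
house_size = [room_size * n_rooms for room_size, n_rooms in houses]

print(house_size)  # [80.0, 90.0, 75.0, 90.0]
```

The second and fourth houses both come out at 90.0 even though their room layouts differ, which is the information loss mentioned in the key point: the derived feature keeps the overall size but no longer distinguishes how that size is split into rooms.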

---

### **Key Distinction**

* **Feature Selection** → Choose from existing features (drop irrelevant ones).
* **Feature Extraction** → Transform existing features to create new ones.

---

👉 In practice: both approaches are used to reduce dimensionality before modeling or visualization (PCA is an example of feature extraction).

