

# 📘 Principal Component Analysis (PCA) – Theory Notes

## 1. **Definition**

Principal Component Analysis (PCA) is a **dimensionality reduction technique** that transforms a large set of correlated variables into a smaller set of **uncorrelated variables** called **principal components (PCs)**, while retaining most of the variation in the data.

---

## 2. **Objectives of PCA**

* Reduce **dimensionality** while preserving maximum variance.
* Remove **multicollinearity** among features.
* Extract **new features (principal components)** that are orthogonal.
* Improve efficiency in **visualization, storage, and modeling**.

---

## 3. **Step-by-Step PCA Process**

### Step 1: **Standardization**

Rescale each feature to zero mean and unit variance so that features measured on larger scales do not dominate the principal components.

$$
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}
$$

where

* $x_{ij}$ = value of feature $j$ for sample $i$
* $\mu_j$ = mean of feature $j$
* $\sigma_j$ = standard deviation of feature $j$
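As a minimal sketch of this step, the snippet below standardizes a small toy matrix with NumPy. The data values are made up for illustration, and using the sample standard deviation (`ddof=1`) is an assumption chosen to match the $n-1$ factor in the covariance formula of Step 2; the notes themselves do not say which convention is intended.

```python
import numpy as np

# Toy data (illustrative values): 5 samples, 3 features on very different scales.
X = np.array([
    [1.0, 200.0, 0.5],
    [2.0, 180.0, 0.7],
    [3.0, 240.0, 0.2],
    [4.0, 210.0, 0.9],
    [5.0, 260.0, 0.4],
])

mu = X.mean(axis=0)            # per-feature mean, mu_j
sigma = X.std(axis=0, ddof=1)  # per-feature sample standard deviation, sigma_j (assumed ddof=1)
Z = (X - mu) / sigma           # z_ij = (x_ij - mu_j) / sigma_j, applied column-wise

print(Z.mean(axis=0))          # each column mean is numerically zero
print(Z.std(axis=0, ddof=1))   # each column has unit sample standard deviation
```

After this step every column of `Z` contributes on the same scale, regardless of its original units.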

---

### Step 2: **Covariance Matrix Calculation**

The covariance matrix measures how each pair of features varies together.

$$
\mathbf{C} = \frac{1}{n-1} Z^T Z
$$

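where $Z$ is the $n \times p$ standardized data matrix, $n$ is the number of samples, and $p$ is the number of features. Because the columns of $Z$ are standardized, $\mathbf{C}$ is the correlation matrix of the original features (exactly so when $\sigma_j$ is the sample standard deviation).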
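The formula can be sketched in a few lines of NumPy. The random data and the induced correlation between the first two columns are illustrative assumptions, as is the use of `ddof=1` when standardizing.

```python
import numpy as np

# Illustrative random data: 50 samples, 3 features, with columns 0 and 1 correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 1] += 0.8 * X[:, 0]

# Standardize (sample standard deviation assumed, to match the 1/(n-1) factor).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

n = Z.shape[0]
C = Z.T @ Z / (n - 1)  # p x p covariance matrix of the standardized data

print(C.shape)
print(np.allclose(C, np.corrcoef(X, rowvar=False)))  # matches the correlation matrix of X
```

The result is symmetric with ones on the diagonal, which is what one expects for standardized features.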

---

### Step 3: **Eigenvalues and Eigenvectors**

Solve:

$$
\mathbf{C}v = \lambda v
$$

* **Eigenvectors (v)** → directions of principal components.
* **Eigenvalues (λ)** → variance explained by each component.
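A small sketch of solving this eigenproblem, assuming a hand-picked $2 \times 2$ matrix for illustration. It uses `numpy.linalg.eigh`, which is designed for symmetric matrices such as $\mathbf{C}$ and returns eigenvalues in ascending order with eigenvectors as columns.

```python
import numpy as np

# Illustrative 2x2 covariance (correlation) matrix with correlation 0.8.
C = np.array([
    [1.0, 0.8],
    [0.8, 1.0],
])

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)

# Check the defining relation C v = lambda v for every column at once.
residual = C @ eigvecs - eigvecs * eigvals
print(eigvals)
print(np.abs(residual).max())
```

Each column `eigvecs[:, i]` pairs with `eigvals[i]`. The sign of an eigenvector is arbitrary, so different libraries may return a direction flipped.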

---

### Step 4: **Sort Eigenvalues**

* Arrange the eigenvalues in descending order, keeping each eigenvector paired with its eigenvalue.
* Select the top $k$ eigenvectors, i.e. those corresponding to the $k$ largest eigenvalues.
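The sorting and selection can be sketched as below. The eigenvalues are hypothetical and the identity matrix stands in for real eigenvectors, purely so the reordering is easy to follow; `k = 2` is likewise an arbitrary choice.

```python
import numpy as np

# Hypothetical decomposition output, deliberately not in sorted order.
eigvals = np.array([0.3, 2.1, 0.6])
eigvecs = np.eye(3)  # column i stands in for the eigenvector paired with eigvals[i]

order = np.argsort(eigvals)[::-1]   # indices from largest to smallest eigenvalue
eigvals_sorted = eigvals[order]
eigvecs_sorted = eigvecs[:, order]  # reorder columns so the pairs stay aligned

k = 2                               # number of components to keep (arbitrary here)
top_vals = eigvals_sorted[:k]
top_vecs = eigvecs_sorted[:, :k]

print(top_vals)
print(top_vecs)
```

Reordering the columns with the same index array as the eigenvalues is the important detail: sorting the eigenvalues alone would break the pairing.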

---

### Step 5: **Form the Projection Matrix**

$$
\mathbf{W} = [v_1, v_2, \dots, v_k]
$$

where $v_1, \dots, v_k$ are the top $k$ eigenvectors, stored as columns, so $\mathbf{W}$ has shape $p \times k$.
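As a rough sketch, $\mathbf{W}$ can be built by stacking the selected eigenvectors as columns. The $3 \times 3$ matrix below is an illustrative stand-in for $\mathbf{C}$, and `k = 2` is arbitrary. The final check relies on the fact that `numpy.linalg.eigh` returns orthonormal eigenvectors for a symmetric matrix.

```python
import numpy as np

# Illustrative symmetric correlation-like matrix for p = 3 features.
C = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]  # largest eigenvalue first

k = 2
W = eigvecs[:, order[:k]]          # projection matrix, shape (p, k)

gram = W.T @ W                     # should be the k x k identity
print(W.shape)
print(np.round(gram, 6))
```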

---

### Step 6: **Transform the Data**

Project the standardized data onto the new axes:

$$
\mathbf{Z}_{proj} = Z \cdot W
$$

The result is an $n \times k$ dataset that retains as much variance as any $k$-dimensional linear projection of the data can.
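The projection can be sketched end to end as follows. The random data, the induced correlations, and `k = 2` are all illustrative assumptions. The last lines check that the projected columns are uncorrelated and that their variances equal the selected eigenvalues.

```python
import numpy as np

# Illustrative data: 100 samples, 4 features, with some built-in correlation.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))
X[:, 1] += 0.9 * X[:, 0]
X[:, 3] -= 0.5 * X[:, 2]

# Steps 1-2: standardize and form the covariance matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
C = Z.T @ Z / (Z.shape[0] - 1)

# Steps 3-5: eigendecomposition, sort descending, keep the top k eigenvectors.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
k = 2
W = eigvecs[:, order[:k]]

# Step 6: project the standardized data onto the k principal axes.
Z_proj = Z @ W  # shape (n, k)

# Covariance of the projected data: diagonal, with the top-k eigenvalues on it.
C_proj = Z_proj.T @ Z_proj / (Z_proj.shape[0] - 1)
print(Z_proj.shape)
print(np.round(C_proj, 6))
```

The off-diagonal entries of `C_proj` vanish up to floating-point error, which is the numerical counterpart of the components being uncorrelated.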

---

## 4. **Variance Explained**

The proportion of variance explained by the $i^{th}$ principal component:

$$
\text{Explained Variance Ratio}_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}
$$

where $p$ = number of original features.
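A short sketch of this ratio, using hypothetical eigenvalues that are already sorted in descending order. The cumulative sum and the 80% cutoff are illustrative additions, shown as one common way to pick $k$ rather than something these notes prescribe.

```python
import numpy as np

# Hypothetical eigenvalues for p = 4 features, sorted in descending order.
eigvals = np.array([2.0, 1.0, 0.6, 0.4])

ratio = eigvals / eigvals.sum()  # explained variance ratio per component
cumulative = np.cumsum(ratio)    # variance retained by the first 1, 2, ... components

# Illustrative rule: smallest k whose cumulative ratio reaches the threshold.
threshold = 0.8                  # assumed cutoff, not taken from the notes
k = int(np.searchsorted(cumulative, threshold)) + 1

print(ratio)
print(cumulative)
print(k)
```

The ratios always sum to 1, so the cumulative curve shows directly how much variance is lost by dropping the trailing components.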

---

## 5. **Key Points**

* PCs are **linear combinations** of original features.
* PCs are **orthogonal (uncorrelated)**.
* First PC → captures the maximum variance.
* Second PC → captures the next-largest variance, subject to being orthogonal to the first.
* PCA is **unsupervised** (ignores target variable).

---
