# **Principal Component Analysis (PCA)**  
**Principal Component Analysis (PCA)** is a **dimensionality reduction** technique that projects high-dimensional data onto a lower-dimensional space spanned by the directions of greatest variance, so that as much of the data's **variance** as possible is preserved. It is widely used in **machine learning**, **computer vision**, and **data visualization**.

---

## **1. Standardization**  
Since PCA is sensitive to scale, we first **standardize** the dataset to have **zero mean** and **unit variance**:

$$
X' = \frac{X - \mu}{\sigma}
$$

where:  
- **X** = original data  
- **μ** = mean of each feature  
- **σ** = standard deviation of each feature  
- **X'** = standardized data  
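
As a minimal sketch of the formula above, the standardization can be written directly in NumPy. The variable names (`mu`, `sigma`, `X_std`) are illustrative, and the sample values are borrowed from the implementation section of this document. `np.std` defaults to the population standard deviation (`ddof=0`), which is also what scikit-learn's `StandardScaler` uses.

```python
import numpy as np

# Sample data: 5 samples, 2 features
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])

mu = X.mean(axis=0)       # mean of each feature
sigma = X.std(axis=0)     # population standard deviation of each feature (ddof=0)
X_std = (X - mu) / sigma  # X' in the formula above

print("Feature means after standardization:", X_std.mean(axis=0))
print("Feature standard deviations after standardization:", X_std.std(axis=0))
```

After this step every column should have a mean of approximately zero and a standard deviation of one.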

---

## **2. Compute the Covariance Matrix**  
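The **covariance matrix** captures how each pair of features varies together. For the standardized (and therefore centered) data it is computed as:

$$
C = \frac{1}{n} (X')^{T} X'
$$

where:  
- **C** = covariance matrix  
- **n** = number of samples  
- **X'** = standardized data matrix from Step 1  
- **(X')^T** = transpose of the standardized data matrix  

Dividing by **n − 1** instead of **n** gives the unbiased sample covariance. The choice only rescales the eigenvalues; the eigenvectors and the explained variance ratio are unchanged.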
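
The covariance computation can be sketched in a few lines of NumPy, assuming the data has already been standardized with the population standard deviation. The names `X_std` and `C` are illustrative. The comparison with `np.cov` is only a sanity check: `bias=True` makes it divide by the number of samples, and `rowvar=False` tells it that columns are features.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized data

n = X_std.shape[0]       # number of samples
C = X_std.T @ X_std / n  # covariance matrix of the standardized data

# Sanity check against NumPy's built-in covariance (dividing by n)
assert np.allclose(C, np.cov(X_std, rowvar=False, bias=True))

print("Covariance matrix:\n", C)
```

Because the features were standardized, the diagonal entries are 1 and the off-diagonal entry is the correlation between the two features.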

---

## **3. Compute Eigenvalues & Eigenvectors**  
To determine the **principal components**, we solve the eigenvalue problem:

$$
C v = \lambda v
$$

where:  
- **λ** = eigenvalues (representing variance explained by each principal component)  
- **v** = eigenvectors (representing the direction of principal components)  
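
One way to solve this eigenvalue problem numerically is `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix. The sketch below assumes standardized sample data and uses illustrative variable names; it also checks the defining equation for each eigenpair.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
C = X_std.T @ X_std / X_std.shape[0]

# eigh returns eigenvalues in ascending order and the matching
# unit-length eigenvectors as the columns of the second array.
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Verify C v = lambda v for every eigenpair
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(C @ v, lam * v)

print("Eigenvalues:", eigenvalues)
print("Eigenvectors (as columns):\n", eigenvectors)
```

The sign of each eigenvector is arbitrary, so different libraries may return a direction or its negation.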

---

## **4. Select Principal Components**  
- Sort **eigenvalues** (λ₁, λ₂, ..., λ<sub>d</sub>) in **descending order**.  
- Select the **top k** eigenvectors corresponding to the highest eigenvalues.  
- The chosen eigenvectors form the **projection matrix V<sub>k</sub>**.  

The **explained variance ratio** for the first **k** principal components is:

$$
\text{Explained Variance Ratio} = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{d} \lambda_j}
$$

where:  
- **d** = total number of features  
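
The selection step and the explained variance ratio can be sketched as follows. This assumes the eigen-decomposition comes from `np.linalg.eigh` (ascending order), so the eigenpairs are re-sorted first; `k = 1` and the variable names are illustrative choices, not part of the method.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
C = X_std.T @ X_std / X_std.shape[0]
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Sort eigenpairs by eigenvalue in descending order
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

k = 1                      # number of components to keep (illustrative)
V_k = eigenvectors[:, :k]  # projection matrix, one column per component

# Fraction of total variance captured by the first k components
explained_ratio = eigenvalues[:k].sum() / eigenvalues.sum()
print("Explained variance ratio for k = 1:", explained_ratio)
```

For this sample data the ratio comes out at about 0.97, in line with the value scikit-learn reports for the same data.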

---

## **5. Transform Data**  
The standardized data is projected onto the new **lower-dimensional space**:

$$
X_{\text{new}} = X' V_k
$$

where:  
- **X_new** = transformed data in the lower-dimensional space  
- **X'** = standardized data  
- **V_k** = matrix whose columns are the top **k** eigenvectors  
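
The projection itself is a single matrix product. The sketch below repeats the earlier steps so that it runs on its own, with illustrative variable names and `k = 1` as an assumption. It also checks that the variance of the projected data equals the largest eigenvalue, which holds here because both are computed with a divisor of **n**.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
C = X_std.T @ X_std / X_std.shape[0]

eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]  # descending eigenvalue order
eigenvalues = eigenvalues[order]
V_k = eigenvectors[:, order][:, :1]    # top k = 1 eigenvector as a column

X_new = X_std @ V_k                    # project onto the first component

# The variance along the first component matches the largest eigenvalue
assert np.allclose(X_new.var(axis=0), eigenvalues[0])

print("Projected data:\n", X_new)
```

Up to an overall sign flip, the projected values should match what scikit-learn's `PCA` returns for the same standardized data.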

---

## **Python Implementation of PCA**  
### **Step 1: Standardizing the Data**
```python
from sklearn.preprocessing import StandardScaler
import numpy as np

# Sample Data
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])

# Standardizing the Data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

### **Step 2: Compute PCA**
```python
from sklearn.decomposition import PCA

# Applying PCA to reduce to 1 component
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X_scaled)

print("Transformed Data:\n", X_pca)
```

### **Step 3: Explained Variance Ratio**
```python
print("Explained Variance Ratio:", pca.explained_variance_ratio_)
```
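
The explained variance ratio can also guide the choice of **k**. The sketch below is one possible approach, not the only one: it inspects the cumulative ratio and then lets scikit-learn pick the number of components by passing a fraction as `n_components`. The 95% threshold is an arbitrary illustrative value.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_scaled = StandardScaler().fit_transform(X)

# Fit with all components to inspect the cumulative explained variance
pca_full = PCA().fit(X_scaled)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
print("Cumulative explained variance:", cumulative)

# With a float in (0, 1) and svd_solver="full", scikit-learn keeps enough
# components to explain more than that fraction of the variance.
pca_95 = PCA(n_components=0.95, svd_solver="full").fit(X_scaled)
print("Components kept for a 95% threshold:", pca_95.n_components_)
```

On this two-feature sample a single component already exceeds the threshold, so only one component is kept.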

---

# **When to Use PCA?**  
✔️ When working with **high-dimensional** datasets  
✔️ When features are **correlated**, leading to redundancy  
✔️ When a model has many features relative to the number of samples and reducing dimensions may help **limit overfitting**  
✔️ When data needs to be **visualized** in **2D or 3D**  
✔️ When reducing computational cost in machine learning models  

---

# **Limitations of PCA**  
❌ PCA assumes **linearity** in data, which may not always be true.  
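❌ PCA offers little benefit when features are **not correlated**: for standardized data the variance is then spread roughly evenly across components, so none can be dropped without losing information.  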
❌ PCA reduces interpretability, making it harder to understand transformed features.  
❌ PCA is sensitive to **outliers**, which can distort results.  
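
The sensitivity to outliers can be illustrated with a small experiment. In the sketch below, `first_component_ratio` is a hypothetical helper written for this example, and the extra point `[0.5, 8.0]` is a made-up outlier appended to the sample data used elsewhere in this document.

```python
import numpy as np

def first_component_ratio(data):
    """Share of variance captured by the first principal component (hypothetical helper)."""
    standardized = (data - data.mean(axis=0)) / data.std(axis=0)
    cov = standardized.T @ standardized / standardized.shape[0]
    eigenvalues = np.linalg.eigvalsh(cov)  # ascending order
    return eigenvalues[-1] / eigenvalues.sum()

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_outlier = np.vstack([X, [0.5, 8.0]])  # one made-up outlier appended

ratio_clean = first_component_ratio(X)
ratio_outlier = first_component_ratio(X_outlier)

print("First-component ratio without outlier:", ratio_clean)
print("First-component ratio with outlier:", ratio_outlier)
```

A single extreme point is enough to pull the first component's share of the variance well below the value obtained on the clean data.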


---

# **Output of the Python Implementation**  
Running Steps 1 to 3 of the implementation above on the sample data prints:

```
Transformed Data:
 [[ 0.5124457 ]
 [-2.57528445]
 [ 0.69555387]
 [-0.1485184 ]
 [ 1.51580328]]
Explained Variance Ratio: [0.96982031]
```

The first principal component alone retains about 97% of the variance of the standardized data.
