# Principal Component Analysis (PCA) Step-by-Step

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a dataset into a new coordinate system of mutually orthogonal axes called principal components. The first component points along the direction of greatest variance, the second along the direction of greatest remaining variance, and so on.

---

## Step 1: Standardize the Dataset
Since PCA is sensitive to the scale of features, standardize the dataset so that each feature has a mean of 0 and a standard deviation of 1.

The formula for standardization is:  
$z = \frac{x - \mu}{\sigma}$  

Where:  
- $x$ is the original feature value,  
- $\mu$ is the mean of the feature,  
- $\sigma$ is the standard deviation of the feature.
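As a minimal sketch of this step, the NumPy snippet below standardizes a small made-up matrix column by column. The data values and the names `X_raw`, `mu`, and `sigma` are illustrative assumptions, and the use of the sample standard deviation (`ddof=1`) is a choice made here to match the $\frac{1}{n-1}$ factor in Step 2, not something the formula above dictates.

```python
import numpy as np

# Hypothetical toy data: 5 samples (rows), 3 features (columns).
X_raw = np.array([
    [2.5, 240.0, 0.5],
    [0.5, 120.0, 0.7],
    [2.2, 290.0, 0.2],
    [1.9, 220.0, 0.9],
    [3.1, 300.0, 0.4],
])

mu = X_raw.mean(axis=0)            # per-feature mean
sigma = X_raw.std(axis=0, ddof=1)  # per-feature sample standard deviation (assumed ddof=1)
X = (X_raw - mu) / sigma           # z = (x - mu) / sigma, applied to every column

print(X.mean(axis=0))              # each entry is numerically close to 0
print(X.std(axis=0, ddof=1))       # each entry is numerically close to 1
```

A feature with zero variance would make `sigma` zero and the division undefined, so constant columns are usually dropped before this step.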

---

## Step 2: Compute the Covariance Matrix
The covariance matrix measures how features co-vary, i.e., how changes in one feature correspond to changes in another.

The covariance matrix is computed as:  
$\text{Cov}(X) = \frac{1}{n-1} \cdot X^T X$  

Where:  
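- $X$ is the standardized data matrix (rows are samples, columns are features),  
- $X^T$ is the transpose of $X$,  
- $n$ is the number of samples (rows of $X$).

This formula relies on every column of $X$ having mean 0, which Step 1 guarantees. Because the features are also scaled to unit standard deviation, the result is the correlation matrix of the original features.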
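The formula can be checked numerically. The sketch below builds the covariance matrix from $X^T X$ on randomly generated data (an assumption made purely for illustration) and compares it against NumPy's `np.cov`, which uses the same $n-1$ normalization by default.

```python
import numpy as np

# Hypothetical data: 100 samples, 3 features, standardized as in Step 1.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(100, 3))
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)

n = X.shape[0]
C = (X.T @ X) / (n - 1)          # Cov(X) = X^T X / (n - 1), valid because X is centered

C_ref = np.cov(X, rowvar=False)  # rowvar=False: columns are features
print(np.allclose(C, C_ref))
print(np.diag(C))                # diagonal entries are 1 for standardized features
```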

---

## Step 3: Calculate the Eigenvalues and Eigenvectors
- Compute the **eigenvalues** and **eigenvectors** of the covariance matrix.  
- Eigenvalues represent the amount of variance captured by each principal component.  
- Eigenvectors represent the directions (principal axes) in the feature space.
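One possible way to carry out this step is with `np.linalg.eigh`, which is designed for symmetric matrices such as a covariance matrix. The 2×2 matrix `C` below is a made-up example, not derived from any dataset in this guide.

```python
import numpy as np

# Hypothetical covariance matrix of two standardized features with correlation 0.8.
C = np.array([[1.0, 0.8],
              [0.8, 1.0]])

# eigh returns eigenvalues in ascending order; eigenvectors are the COLUMNS of the second output.
eigenvalues, eigenvectors = np.linalg.eigh(C)

print(eigenvalues)   # approximately [0.2, 1.8]
print(eigenvectors)  # column j is the unit-length eigenvector for eigenvalues[j]

# Each pair satisfies C v = lambda v.
print(np.allclose(C @ eigenvectors, eigenvectors * eigenvalues))
```

The sign of each eigenvector is arbitrary, so different libraries or platforms may return a column multiplied by -1. Both describe the same axis.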

---

## Step 4: Sort the Eigenvalues and Eigenvectors
- Sort the eigenvalues in descending order.  
- Rearrange the eigenvectors to correspond to the sorted eigenvalues.  
- The eigenvector corresponding to the largest eigenvalue is the **first principal component**, which explains the maximum variance.
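The sorting step might look like the following sketch. It assumes the eigenvectors are stored as columns, as `np.linalg.eigh` returns them, and uses a made-up 3×3 matrix `C` for illustration.

```python
import numpy as np

# Hypothetical symmetric covariance matrix for three standardized features.
C = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.3],
              [0.1, 0.3, 1.0]])

eigenvalues, eigenvectors = np.linalg.eigh(C)  # ascending order

order = np.argsort(eigenvalues)[::-1]          # indices that sort eigenvalues descending
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]          # reorder COLUMNS to keep pairs aligned

first_pc = eigenvectors[:, 0]                  # direction of maximum variance
print(eigenvalues)
print(first_pc)
```

Reordering the columns with the same index array is what keeps each eigenvector paired with its own eigenvalue.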

---

## Step 5: Select the Top $k$ Principal Components
Decide how many principal components ($k$) to retain. This is based on the **explained variance ratio** of each component $i$:  
$\text{Explained Variance Ratio}_i = \frac{\lambda_i}{\sum_{j} \lambda_j}$  

Here $\lambda_i$ is the $i$-th eigenvalue and the sum runs over all eigenvalues.

Choose $k$ such that the cumulative explained variance meets a desired threshold (e.g., 95%).
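A small sketch of this selection rule follows. The eigenvalues and the 95% threshold are made-up illustrative values, and the eigenvalues are assumed to be already sorted in descending order.

```python
import numpy as np

# Hypothetical eigenvalues, already sorted in descending order.
eigenvalues = np.array([2.8, 0.8, 0.3, 0.1])
threshold = 0.95                                   # assumed target for cumulative variance

ratio = eigenvalues / eigenvalues.sum()            # explained variance ratio per component
cumulative = np.cumsum(ratio)                      # roughly 0.70, 0.90, 0.975, 1.0

# First position where the cumulative ratio reaches the threshold, converted to a count.
k = int(np.searchsorted(cumulative, threshold) + 1)
print(k)  # 3
```

With these numbers the first two components explain about 90% of the variance, which falls short of the threshold, so a third component is needed.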

---

## Step 6: Project the Data onto the New Basis
Transform the standardized dataset into the new coordinate system defined by the top $k$ principal components. The transformation formula is:  
$Z = X \cdot W$  

Where:  
- $Z$ is the transformed dataset in the reduced dimensionality space,  
- $X$ is the standardized data,  
- $W$ is the matrix whose columns are the selected eigenvectors (those corresponding to the top $k$ eigenvalues), so it has one row per original feature and $k$ columns.
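Putting the earlier steps together, the projection could be sketched as below. The random dataset, the artificially correlated second feature, and the choice `k = 2` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical data: 200 samples, 4 features, with feature 1 made to depend on feature 0.
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(200, 4))
X_raw[:, 1] = 2.0 * X_raw[:, 0] + 0.1 * X_raw[:, 1]

# Steps 1-4: standardize, covariance, eigendecomposition, sort descending.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)
C = (X.T @ X) / (X.shape[0] - 1)
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Steps 5-6: keep the top k eigenvectors as columns of W and project.
k = 2                       # assumed value, see Step 5 for how to choose it
W = eigenvectors[:, :k]     # shape (4, 2)
Z = X @ W                   # shape (200, 2)

print(W.shape, Z.shape)
```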

---

## Step 7: Interpretation
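- The transformed dataset $Z$ has $k$ columns, one per retained principal component, instead of one per original feature.  
- Each axis corresponds to a principal component, and the variance of the data along that axis equals the component's eigenvalue.  
- The components are uncorrelated with one another, so each captures a distinct portion of the variance in the original data.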
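To make this interpretation concrete, the sketch below measures the variance of each column of $Z$ and compares it with the corresponding eigenvalue. The data, the mixing matrix `mix`, and `k = 2` are hypothetical choices for illustration only.

```python
import numpy as np

# Hypothetical correlated data: 300 samples, 3 features.
rng = np.random.default_rng(1)
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.5],
                [0.0, 0.0, 0.2]])
X_raw = rng.normal(size=(300, 3)) @ mix

X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 2
Z = X @ eigenvectors[:, :k]

component_variance = Z.var(axis=0, ddof=1)     # variance along each retained axis
share = component_variance / eigenvalues.sum() # portion of total variance per axis

print(Z.shape)              # (300, 2): fewer columns than X
print(component_variance)   # matches eigenvalues[:k]
print(share)
```

For standardized data the eigenvalues sum to the number of features, so each share can be read as a fraction of that total.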

---

## Visualization of PCA Steps

If your data is 2D or 3D, PCA can be visualized as rotating the original axes to align with the directions of maximum variance and then projecting the data onto these new axes.
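The rotation picture can be checked without a plot. In the sketch below, which uses made-up 2D data and keeps both components, the eigenvector matrix `W` is orthogonal, so projecting onto it only turns (and possibly mirrors) the axes: every point keeps its distance from the origin.

```python
import numpy as np

# Hypothetical 2D data with two correlated features.
rng = np.random.default_rng(7)
t = rng.normal(size=500)
X_raw = np.column_stack([t, 0.5 * t + 0.3 * rng.normal(size=500)])

X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
W = eigenvectors[:, order]   # keep both components: a 2x2 orthogonal matrix
Z = X @ W                    # same points expressed in the rotated axes

print(np.allclose(W.T @ W, np.eye(2)))  # orthogonality: new axes are perpendicular unit vectors
print(np.allclose(np.linalg.norm(X, axis=1), np.linalg.norm(Z, axis=1)))  # lengths preserved
```

Dropping the second column of `Z` afterwards is the projection part: each point is flattened onto the first principal axis.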

