## Principal Component Analysis (PCA)

PCA is a dimensionality reduction technique that is widely used in machine learning. It transforms a set of possibly correlated variables into a set of linearly uncorrelated variables, called principal components. PCA is often used to reduce the number of features in a dataset while retaining most of its variance. Fewer features can make downstream machine learning algorithms faster to train and, in some cases, less prone to overfitting.

### How PCA works

PCA works by finding the directions of greatest variance in the data. These directions are the principal components. The first principal component is the direction of greatest variance, the second principal component is the direction of greatest remaining variance that is orthogonal to the first, and so on.
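
To make "direction of greatest variance" concrete, the following minimal sketch sweeps over candidate directions in 2-D and measures the variance of the data projected onto each one. The toy data, the one-degree angle grid, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 2-D points stretched along the 45-degree diagonal
t = rng.normal(size=500)
noise = rng.normal(scale=0.2, size=500)
X = np.column_stack([t + noise, t - noise])
X = X - X.mean(axis=0)  # center the data

# Candidate unit directions, one per degree from 0 to 180
angles = np.linspace(0.0, np.pi, 181)
directions = np.column_stack([np.cos(angles), np.sin(angles)])

# Variance of the data projected onto each candidate direction
projected_variance = (X @ directions.T).var(axis=0, ddof=1)

best_angle_deg = np.degrees(angles[np.argmax(projected_variance)])
worst_angle_deg = np.degrees(angles[np.argmin(projected_variance)])
print(best_angle_deg, worst_angle_deg)
```

For this data the variance should peak close to 45 degrees (the first principal component) and bottom out close to 135 degrees, which is the orthogonal direction and therefore the second principal component.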

#### Eigendecomposition
To calculate the principal components of a dataset, PCA uses a technique called eigendecomposition, which decomposes a square matrix into its eigenvectors and eigenvalues. PCA applies it to the covariance matrix of the data: the eigenvectors of the covariance matrix are the directions of the principal components, and each eigenvalue is the variance of the data along the corresponding eigenvector.
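
As a small illustration of eigendecomposition, the sketch below decomposes a hand-picked symmetric matrix with `np.linalg.eigh` and checks the defining property `C v = λ v` for each eigenpair. The matrix values are hypothetical and chosen only to keep the example small.

```python
import numpy as np

# Hypothetical 2x2 covariance matrix (symmetric), chosen only for illustration
C = np.array([[2.0, 0.8],
              [0.8, 0.6]])

# eigh is NumPy's routine for symmetric matrices;
# it returns the eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(C)

# Each column v of `eigenvectors` satisfies C @ v = lambda * v
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(C @ v, lam * v)

# The matrix can be rebuilt from its eigenvectors and eigenvalues
C_rebuilt = eigenvectors @ np.diag(eigenvalues) @ eigenvectors.T
print(np.allclose(C, C_rebuilt))  # → True
```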

### PCA Formula

The PCA formula can be written as follows:

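PCA(X) = (X - μ) W

where:

1. `PCA(X)` is the transformed data, with shape `(n_samples, n_components)`
2. `X` is the original data, with one sample per row and shape `(n_samples, n_features)`
3. `μ` is the vector of per-feature means of `X`, subtracted from every row to center the data
4. `W` is the projection matrix, whose columns are eigenvectors of the covariance matrix of `X`.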

The projection matrix `W` is calculated by performing eigendecomposition on the covariance matrix of `X` and selecting the eigenvectors with the `n_components` largest eigenvalues, where `n_components` is the number of principal components to keep.
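
The projection can be sketched in a few lines, assuming samples are stored as rows so that the transformed data is the centered data multiplied by `W`. The helper name `project` and the toy mixing matrix are hypothetical. The final check illustrates the claim that principal components are uncorrelated: the covariance matrix of the projected data should be diagonal.

```python
import numpy as np

def project(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)
    C = np.cov(X_centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(C)   # ascending eigenvalues
    order = np.argsort(eigenvalues)[::-1]           # indices, largest first
    W = eigenvectors[:, order[:n_components]]       # (n_features, n_components)
    return X_centered @ W, W

rng = np.random.default_rng(0)
# Toy data with correlated columns (the mixing matrix is arbitrary)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

Z, W = project(X, n_components=2)
print(Z.shape)  # → (200, 2)

# Off-diagonal entries of the covariance of Z should be numerically zero
print(np.round(np.cov(Z, rowvar=False), 6))
```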

### Benefits of using PCA

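1. Dimensionality reduction: fewer features to store and process.
2. Noise reduction: discarding low-variance components can filter out noise.
3. Feature extraction: the components are new, uncorrelated features built from linear combinations of the original ones (PCA does not select a subset of the original features).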
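
The noise reduction benefit can be illustrated with a minimal sketch: project noisy data onto its first principal component and map it back to the original space. The synthetic signal, the noise level, and the choice of a single component are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical signal: 3-D points lying on a line, plus isotropic noise
direction = np.array([1.0, 2.0, 2.0]) / 3.0        # unit vector
clean = np.outer(rng.normal(size=500), direction)
noisy = clean + rng.normal(scale=0.1, size=clean.shape)

mean = noisy.mean(axis=0)
centered = noisy - mean
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(centered, rowvar=False))
W = eigenvectors[:, [-1]]   # eigenvector with the largest eigenvalue, shape (3, 1)

# Project down to 1-D, then map back to 3-D
denoised = mean + (centered @ W) @ W.T

noisy_error = np.mean((noisy - clean) ** 2)
denoised_error = np.mean((denoised - clean) ** 2)
print(noisy_error, denoised_error)
```

Because the signal lives in one dimension while the noise is spread over three, dropping the two low-variance components should remove most of the noise, so `denoised_error` is expected to be noticeably smaller than `noisy_error`.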
   
### Pitfalls of PCA

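1. PCA is sensitive to outliers, because variance is dominated by extreme values.
2. PCA can remove important information from the data: low-variance directions are discarded even when they matter for the task.
3. PCA captures only linear structure and is not invariant to nonlinear transformations of the data.
4. PCA can be computationally expensive: eigendecomposition of the covariance matrix scales cubically with the number of features.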
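
One common way to gauge how much information is discarded is the explained variance ratio, the share of total variance carried by each component. The sketch below computes it on synthetic data; the feature scales and the 95% threshold are arbitrary assumptions, not recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: four independent features with very different spreads
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.1])

C = np.cov(X - X.mean(axis=0), rowvar=False)
eigenvalues = np.linalg.eigvalsh(C)[::-1]   # eigenvalues in descending order

# Fraction of the total variance captured by each component
explained_variance_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_variance_ratio)

# Smallest number of components retaining at least 95% of the variance
# (the 95% threshold is an arbitrary choice for this sketch)
k = int(np.searchsorted(cumulative, 0.95) + 1)
print(explained_variance_ratio, k)
```

Note that variance retained is only a proxy: a component with a small ratio can still carry information that matters for a particular task.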

### Implementation from Scratch in Python

In [2]:
import numpy as np

class PCA:
    def __init__(self, n_components=None):
        self.n_components = n_components
        self.components = None
        self.mean = None
        self.eigenvalues = None

    def fit(self, X):
        # Center the data
        self.mean = np.mean(X, axis=0)
        X_centered = X - self.mean

        # Calculate the covariance matrix (features are columns)
        C = np.cov(X_centered, rowvar=False)

        # Compute the eigenvalues and eigenvectors; eigh is meant for
        # symmetric matrices and always returns real values
        eigenvalues, eigenvectors = np.linalg.eigh(C)

        # Sort the eigenvalues (and matching eigenvectors) in descending order
        sorted_indices = np.argsort(eigenvalues)[::-1]
        eigenvalues = eigenvalues[sorted_indices]
        eigenvectors = eigenvectors[:, sorted_indices]

        # Select the top n_components eigenvectors (all of them if None)
        n = self.n_components if self.n_components is not None else eigenvectors.shape[1]
        self.components = eigenvectors[:, :n]
        self.eigenvalues = eigenvalues[:n]

        # Normalize the eigenvectors to unit length
        self.components = self.components / np.linalg.norm(self.components, axis=0, keepdims=True)

    def transform(self, X):
        # Center the data points
        X_centered = X - self.mean

        # Project the data onto the principal components
        X_transformed = X_centered.dot(self.components)

        return X_transformed
    
X = np.random.randn(100, 5)
pca = PCA(n_components=2)  # Keep the first 2 principal components
pca.fit(X)
X_transformed = pca.transform(X)
X.shape, X_transformed.shape

((100, 5), (100, 2))
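
As a cross-check on the eigendecomposition approach, the same components can be obtained from a singular value decomposition (SVD) of the centered data, which avoids forming the covariance matrix. This is an alternative technique, not the method used by the class above; the function name `pca_svd` is hypothetical, and the comparison is made up to sign because eigenvectors are only defined up to a factor of -1.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA via SVD of the centered data (no covariance matrix is formed)."""
    X_centered = X - X.mean(axis=0)
    # X_centered = U @ diag(S) @ Vt; the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components].T                 # (n_features, n_components)
    # Squared singular values map to covariance eigenvalues
    eigenvalues = S[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components, components, eigenvalues

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z_svd, W_svd, eig_svd = pca_svd(X, n_components=2)

# Reference: eigendecomposition of the covariance matrix
vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(vals)[::-1][:2]
W_eig, eig_eig = vecs[:, order], vals[order]

# Compare eigenvalues directly, and components up to sign
eigenvalues_match = np.allclose(eig_svd, eig_eig)
components_match = np.allclose(np.abs(W_svd.T @ W_eig), np.eye(2))
print(eigenvalues_match, components_match)
```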