# Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is a dimensionality reduction technique that is widely used in machine learning for data preprocessing, visualization, and noise reduction. PCA aims to project high-dimensional data onto a lower-dimensional space while preserving as much of the variance in the data as possible.

## History

PCA was first introduced by Karl Pearson in 1901, who framed it as the problem of finding the lines and planes of closest fit to a set of points in space. In 1933, Harold Hotelling independently developed the method for the analysis of multivariate data, gave it a solid mathematical foundation, and introduced the term "principal components".

## Mathematical Equations

PCA involves the following steps:

1. Calculate the covariance matrix of the dataset.
2. Compute the eigenvalues and eigenvectors of the covariance matrix.
3. Sort the eigenvalues in descending order and select the top k eigenvectors corresponding to the k largest eigenvalues.
4. Project the mean-centered data onto the lower-dimensional space spanned by the top k eigenvectors.

The covariance matrix (Σ) of a dataset with n features is an n x n symmetric matrix, where the element at the ith row and jth column is the covariance between the ith and jth features:

Σ_ij = Cov(X_i, X_j)
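
As a minimal sketch of this definition, the snippet below builds the covariance matrix by hand and compares it with NumPy's `np.cov`. The toy values in `X` are made up for illustration, and the sample covariance (divisor m - 1, where m is the number of samples) is assumed.

```python
import numpy as np

# Toy data: 5 samples (rows) by 3 features (columns); values are made up.
X = np.array([
    [2.0, 1.0, 0.5],
    [3.0, 2.5, 0.1],
    [4.0, 3.0, 0.9],
    [5.0, 4.5, 0.3],
    [6.0, 5.0, 0.7],
])

# Center each feature, then form the sample covariance matrix (divisor m - 1).
X_centered = X - X.mean(axis=0)
m = X.shape[0]
cov_manual = X_centered.T @ X_centered / (m - 1)

# np.cov treats rows as variables by default, so pass rowvar=False.
cov_numpy = np.cov(X, rowvar=False)

print(cov_manual.shape)                       # (3, 3)
print(np.allclose(cov_manual, cov_numpy))     # True
print(np.allclose(cov_manual, cov_manual.T))  # True: the matrix is symmetric
```

The diagonal entries are the variances of the individual features, and the off-diagonal entries are the pairwise covariances.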

The eigenvalues (λ) and eigenvectors (v) of the covariance matrix satisfy the following equation:

Σv = λv
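
The eigenvalue equation can be checked numerically. The sketch below uses a small symmetric matrix `sigma` as a stand-in for a covariance matrix (its values are illustrative) and `np.linalg.eigh`, which is intended for symmetric matrices.

```python
import numpy as np

# A small symmetric matrix standing in for a covariance matrix (illustrative values).
sigma = np.array([
    [2.0, 0.8],
    [0.8, 1.0],
])

# eigh returns real eigenvalues in ascending order and the matching
# eigenvectors as the columns of the second return value.
eigenvalues, eigenvectors = np.linalg.eigh(sigma)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Each pair should satisfy sigma @ v == lam * v up to rounding.
    assert np.allclose(sigma @ v, lam * v)

# The eigenvectors are orthonormal, and the eigenvalues sum to the trace,
# which is the total variance.
print(np.allclose(eigenvectors.T @ eigenvectors, np.eye(2)))  # True
print(np.isclose(eigenvalues.sum(), np.trace(sigma)))         # True
```

Each eigenvalue is the variance of the data along its eigenvector, which is why the largest eigenvalues identify the directions worth keeping.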

## Learning Algorithm

The learning algorithm for PCA consists of the following steps:

1. Standardize the dataset (mean = 0, standard deviation = 1 for each feature) so that features measured on larger scales do not dominate the components.
2. Calculate the covariance matrix of the standardized dataset.
3. Compute the eigenvalues and eigenvectors of the covariance matrix.
4. Sort the eigenvalues in descending order and select the top k eigenvectors.
5. Project the standardized data onto the lower-dimensional space spanned by the top k eigenvectors.
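
The steps above leave open how to choose k. One common heuristic, sketched here under assumptions, is to keep the smallest k whose components retain a chosen share of the total variance. The synthetic data and the 95% threshold are illustrative choices, not part of the algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 200 samples, 4 features,
# where the last two features are nearly copies of the first two.
base = rng.normal(size=(200, 2))
X = np.hstack([base, base + 0.05 * rng.normal(size=(200, 2))])

# Standardize: zero mean and unit standard deviation per feature.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigenvalues of the covariance matrix, reversed into descending order.
eigenvalues = np.linalg.eigvalsh(np.cov(X_std, rowvar=False))[::-1]

# Fraction of total variance carried by each component, and the running total.
explained_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_ratio)

# Smallest k whose components together retain at least 95% of the variance.
k = int(np.searchsorted(cumulative, 0.95) + 1)
print(explained_ratio.round(3))
print(k)
```

Because two of the four features are almost redundant here, nearly all of the variance falls on the first two components.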

## Pros and Cons

**Pros:**
- Reduces the dimensionality of the data, which can help overcome the curse of dimensionality and improve the performance of machine learning algorithms.
- Can help visualize high-dimensional data.
- Produces components that are uncorrelated with one another, which removes multicollinearity between the features passed to downstream models.
- Can be used for noise reduction in the data.
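
To illustrate the noise-reduction point, the sketch below projects data onto the top k components and maps it back to the original space. The synthetic data (points near a line in 3-D with added noise) and the helper `reconstruct` are hypothetical and exist only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (illustrative assumption): points near a line in 3-D plus small noise.
t = rng.normal(size=(100, 1))
direction = np.array([[1.0, 2.0, -1.0]])
X = t @ direction + 0.1 * rng.normal(size=(100, 3))

mean = X.mean(axis=0)
X_centered = X - mean

# Eigenvectors of the covariance matrix, reordered so the largest eigenvalue comes first.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_centered, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order]


def reconstruct(k):
    """Project onto the top k components, then map back to the original space."""
    W = components[:, :k]
    return (X_centered @ W) @ W.T + mean


# Mean squared reconstruction error for k = 1, 2, 3.
errors = [np.mean((X - reconstruct(k)) ** 2) for k in (1, 2, 3)]
print(errors)
```

Keeping one component recovers the underlying line and discards most of the noise. The error shrinks as k grows and vanishes, up to rounding, when all three components are kept.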

**Cons:**
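- Captures only linear structure: each principal component is a linear combination of the original features, so nonlinear relationships can be missed and the components can be harder to interpret than the original features.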
- Loss of information due to the reduction in dimensionality.
- Sensitive to the scaling of the features.
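
The sensitivity to scaling is easy to see on a small example. In this sketch, the synthetic data and the helper `first_component` are assumptions made for illustration: two correlated features are recorded on very different scales, and the first principal component is compared before and after standardization.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data (illustrative assumption): two correlated features, with the
# second one recorded on a scale 1000 times larger than the first.
a = rng.normal(size=500)
b = 0.5 * a + rng.normal(size=500)
X = np.column_stack([a, 1000.0 * b])


def first_component(data):
    """Return the unit eigenvector of the covariance matrix with the largest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(data, rowvar=False))
    return eigenvectors[:, -1]  # eigh sorts eigenvalues in ascending order


pc_raw = first_component(X)

# Standardize each feature to zero mean and unit standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
pc_std = first_component(X_std)

print(np.abs(pc_raw).round(3))  # dominated by the large-scale second feature
print(np.abs(pc_std).round(3))  # both features carry equal weight
```

On the raw data the first component points almost entirely along the large-scale feature. After standardization both features contribute equally.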

## Suitable Tasks and Datasets

PCA can be applied to various tasks, including:

- Data preprocessing
- Visualization of high-dimensional data
- Noise reduction
- Feature extraction

It is suitable for datasets with continuous features and can be particularly helpful when the dataset has a large number of features or when there is multicollinearity between the features.

## References

1. Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11), 559-572.
2. Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6), 417-441.


```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler


class PCA:
    """Minimal PCA via eigendecomposition of the covariance matrix."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit_transform(self, X):
        # Standardize each feature to zero mean and unit variance
        X_standardized = StandardScaler().fit_transform(X)

        # Calculate the covariance matrix (features are the columns)
        covariance_matrix = np.cov(X_standardized, rowvar=False)

        # Compute the eigenvalues and eigenvectors. The covariance matrix is
        # symmetric, so use eigh, which returns real eigenvalues and
        # orthonormal eigenvectors (eig can return complex values).
        eigenvalues, eigenvectors = np.linalg.eigh(covariance_matrix)

        # Sort the eigenvalues and eigenvectors in descending order
        sorted_indices = np.argsort(eigenvalues)[::-1]
        sorted_eigenvalues = eigenvalues[sorted_indices]
        sorted_eigenvectors = eigenvectors[:, sorted_indices]

        # Select the top n_components eigenvectors (one per column)
        top_eigenvectors = sorted_eigenvectors[:, :self.n_components]

        # Project the standardized data onto the lower-dimensional space
        X_reduced = X_standardized @ top_eigenvectors
        return X_reduced


# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Apply PCA to the dataset, reducing the dimensionality to 2
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Visualize the results, colored by species label
plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=y, cmap='viridis')
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA on Iris Dataset')
plt.show()
```
