## Principal Component Analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a dataset into a new coordinate system such that the greatest variance lies along the first axis (the first principal component), the second greatest along the second axis, and so on.

Given a dataset $X \in \mathbb{R}^{m \times n}$ with $m$ samples and $n$ features, PCA performs the following steps:

1. **Center the data** by subtracting the per-feature mean $\bar{X}$ (the mean of each column of $X$) from every sample:
   $$
   X_{\text{centered}} = X - \bar{X}
   $$

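2. **Compute the covariance matrix** (the unbiased sample estimate, which divides by $m - 1$; dividing by $m$ instead rescales the eigenvalues but leaves the eigenvectors unchanged):
   $$
   \Sigma = \frac{1}{m - 1} X_{\text{centered}}^\top X_{\text{centered}}
   $$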

3. **Compute the eigenvalues and eigenvectors** of $\Sigma$:
   $$
   \Sigma \mathbf{v}_i = \lambda_i \mathbf{v}_i
   $$

4. **Select the top $k$ eigenvectors** (principal components) corresponding to the largest eigenvalues and form the projection matrix:
   $$
   W_k = [\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k]
   $$

5. **Project the data** onto the new $k$-dimensional subspace:
   $$
   Z = X_{\text{centered}} W_k
   $$
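
The five steps above can be traced on a small synthetic dataset. The following is a minimal sketch, assuming NumPy and made-up data (the mixing matrix, the seed, and the choice $k = 2$ are illustrative only). It normalizes the covariance by $m - 1$, which is also the default of `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the data is purely illustrative
# 200 samples, 3 correlated features (the mixing matrix is an arbitrary choice).
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.0, 0.5, 0.2]])

# Step 1: center each feature (column).
X_centered = X - X.mean(axis=0)

# Step 2: covariance matrix of shape (n_features, n_features).
m = X.shape[0]
Sigma = X_centered.T @ X_centered / (m - 1)

# Step 3: eigendecomposition. eigh is meant for symmetric matrices
# and returns the eigenvalues in ascending order, so reverse them.
eigenvalues, eigenvectors = np.linalg.eigh(Sigma)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Step 4: keep the k eigenvectors with the largest eigenvalues.
k = 2
W_k = eigenvectors[:, :k]

# Step 5: project the centered data onto the k-dimensional subspace.
Z = X_centered @ W_k

# The projected features are uncorrelated: their covariance is diagonal,
# with the top-k eigenvalues on the diagonal.
cov_Z = np.cov(Z, rowvar=False)
```

The final check reflects the identity $W_k^\top \Sigma W_k = \operatorname{diag}(\lambda_1, \dots, \lambda_k)$: each principal component carries variance equal to its eigenvalue.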

PCA can reduce noise and redundancy in the data when the discarded low-variance directions carry little signal, while preserving the directions of maximum variance. It is widely used for data compression, visualization, and preprocessing before machine learning.
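
One common way to quantify how much variance a given $k$ preserves is the ratio of each eigenvalue to the sum of all eigenvalues. The sketch below assumes NumPy; the helper name `explained_variance_ratio`, the synthetic data, and the 95% threshold are illustrative choices, not part of the method described above.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance along each principal component, largest first."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    return eigenvalues / eigenvalues.sum()

rng = np.random.default_rng(0)
# Illustrative data: three independent features with very different spreads.
X = rng.normal(size=(500, 3)) * np.array([5.0, 1.0, 0.1])

ratios = explained_variance_ratio(X)
cumulative = np.cumsum(ratios)

# Smallest k whose components retain at least 95% of the variance
# (the threshold is an arbitrary, hypothetical choice).
k = int(np.searchsorted(cumulative, 0.95) + 1)
```

Plotting `cumulative` against the number of components is a simple way to pick $k$ by eye when no fixed threshold is given.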


In [None]:
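import numpy as np


class PCA:
    def __init__(self, n_components):
        self.n_components = n_components
        self.components = None  # columns are the top principal directions
        self.mean = None

    def fit(self, X):
        # Center each feature so the covariance is taken about the mean.
        self.mean = np.mean(X, axis=0)
        X_centered = X - self.mean
        # rowvar=False: columns are features. np.cov divides by (m - 1) by default.
        cov = np.cov(X_centered, rowvar=False)
        # eigh is for symmetric matrices and returns eigenvalues in ascending order.
        eigenvalues, eigenvectors = np.linalg.eigh(cov)
        # Reorder so the directions of largest variance come first.
        idx = np.argsort(eigenvalues)[::-1]
        eigenvectors = eigenvectors[:, idx]
        self.components = eigenvectors[:, :self.n_components]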

    def transform(self, X):
        X_centered = X - self.mean
        return np.dot(X_centered, self.components)

    def fit_transform(self, X):
        self.fit(X)
        return self.transform(X)

## Linear Discriminant Analysis

Linear Discriminant Analysis (LDA) is a supervised dimensionality reduction technique that finds a linear combination of features that best separates two or more classes. Unlike PCA, LDA uses class label information to maximize class separability.

Given a dataset with $C$ classes, LDA computes two scatter matrices:

- **Within-class scatter**:
  $$
  S_W = \sum_{c=1}^{C} \sum_{\mathbf{x}_i \in \mathcal{D}_c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^\top
  $$

- **Between-class scatter**:
  $$
  S_B = \sum_{c=1}^{C} n_c (\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^\top
  $$

where $\boldsymbol{\mu}_c$ is the mean of class $c$, $\boldsymbol{\mu}$ is the overall mean, and $n_c$ is the number of samples in class $c$.
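
A useful sanity check on these definitions is that the two matrices add up to the total scatter, $S_T = \sum_i (\mathbf{x}_i - \boldsymbol{\mu})(\mathbf{x}_i - \boldsymbol{\mu})^\top = S_W + S_B$. The following sketch assumes NumPy; the helper name `scatter_matrices` and the synthetic three-class data are hypothetical and only serve the illustration.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return the within-class and between-class scatter matrices (S_W, S_B)."""
    n_features = X.shape[1]
    mu = X.mean(axis=0)  # overall mean
    S_W = np.zeros((n_features, n_features))
    S_B = np.zeros((n_features, n_features))
    for c in np.unique(y):
        X_c = X[y == c]
        mu_c = X_c.mean(axis=0)  # class mean
        D = X_c - mu_c
        S_W += D.T @ D  # sum of outer products within class c
        d = (mu_c - mu)[:, None]  # column vector
        S_B += X_c.shape[0] * (d @ d.T)  # weighted by class size n_c
    return S_W, S_B

rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)  # 3 classes, 30 samples each
X = rng.normal(size=(90, 4)) + y[:, None]  # shift every feature by the class label

S_W, S_B = scatter_matrices(X, y)

# Total scatter about the overall mean.
X_centered = X - X.mean(axis=0)
S_T = X_centered.T @ X_centered
```

Here `S_W + S_B` matches `S_T` up to floating-point error, and `S_B` has rank at most $C - 1 = 2$ because the weighted class-mean deviations sum to zero.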

LDA finds the projection matrix $W$ that maximizes the ratio:

$$
W = \arg\max_W \frac{|W^\top S_B W|}{|W^\top S_W W|}
$$

This leads to the generalized eigenvalue problem $S_B \mathbf{w} = \lambda S_W \mathbf{w}$, which, when $S_W$ is invertible, is equivalent to:

$$
S_W^{-1} S_B \mathbf{w} = \lambda \mathbf{w}
$$

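The top $k$ eigenvectors (those with the largest eigenvalues) form the transformation matrix $W$, and the data is projected onto this subspace for classification or visualization. Because $S_B$ has rank at most $C - 1$, at most $C - 1$ eigenvalues are nonzero, so only $k \le C - 1$ discriminant directions are informative.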
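
The eigenvalue problem can be checked numerically on toy data. This is a minimal sketch, assuming NumPy, made-up class centers, and an invertible $S_W$; it uses `np.linalg.solve` rather than an explicit inverse, which is a numerical preference and not something the derivation requires.

```python
import numpy as np

rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 40)  # C = 3 classes, 40 samples each
# Hypothetical class centers in 4 dimensions, chosen not to lie on one line.
centers = np.array([[0.0, 0.0, 0.0, 0.0],
                    [3.0, 0.0, 1.0, 0.0],
                    [0.0, 3.0, 0.0, 1.0]])
X = centers[y] + rng.normal(size=(120, 4))

# Scatter matrices, following the definitions of S_W and S_B.
mu = X.mean(axis=0)
S_W = np.zeros((4, 4))
S_B = np.zeros((4, 4))
for c in np.unique(y):
    X_c = X[y == c]
    mu_c = X_c.mean(axis=0)
    S_W += (X_c - mu_c).T @ (X_c - mu_c)
    d = (mu_c - mu)[:, None]
    S_B += X_c.shape[0] * (d @ d.T)

# A = S_W^{-1} S_B, obtained by solving S_W A = S_B.
A = np.linalg.solve(S_W, S_B)
eigenvalues, eigenvectors = np.linalg.eig(A)
# A is not symmetric, so eig may return a complex dtype; the eigenvalues
# are real in exact arithmetic, so keep only the real parts.
eigenvalues, eigenvectors = eigenvalues.real, eigenvectors.real

order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]

# Count eigenvalues that are not numerically zero; at most C - 1 can be.
n_nonzero = int(np.sum(eigenvalues > 1e-8 * eigenvalues[0]))

# Keep k = C - 1 = 2 discriminant directions and project.
W = eigenvectors[:, order[:2]]
Z = X @ W
```

Each kept column $\mathbf{w}$ of `W` satisfies $S_B \mathbf{w} = \lambda S_W \mathbf{w}$ up to floating-point error, which is the generalized form of the problem.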


In [1]:
class LDA:
    def __init__(self, n_components):
        self.n_components = n_components
        self.linear_discriminants = None

    def fit(self, X, y):
        n_features = X.shape[1]
        class_labels = np.unique(y)
        mean_overall = np.mean(X, axis=0)
        S_W = np.zeros((n_features, n_features))
        S_B = np.zeros((n_features, n_features))

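        for c in class_labels:
            X_c = X[y == c]
            mean_c = np.mean(X_c, axis=0)
            # Within-class scatter: sum of outer products of deviations from the class mean.
            S_W += np.dot((X_c - mean_c).T, (X_c - mean_c))
            # Between-class scatter: outer product of the class-mean deviation,
            # weighted by the class size n_c.
            n_c = X_c.shape[0]
            mean_diff = (mean_c - mean_overall).reshape(n_features, 1)
            S_B += n_c * np.dot(mean_diff, mean_diff.T)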

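        # pinv equals the inverse when S_W is invertible and still returns a
        # result when S_W is singular (e.g. more features than samples).
        A = np.linalg.pinv(S_W).dot(S_B)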
        eigenvalues, eigenvectors = np.linalg.eig(A)
        idx = np.argsort(np.abs(eigenvalues))[::-1]
        self.linear_discriminants = eigenvectors[:, idx[:self.n_components]].real

    def transform(self, X):
        return np.dot(X, self.linear_discriminants)