# PCA

## Index
1. [**Eigenvalues and Eigenvectors**](#eigen)
2. [**PCA in Python**](#PCA)
3. [**Importance and limitations**](#limit)

## Eigenvalues and Eigenvectors <a class="anchor" id="eigen"></a>

Eigenvalues and eigenvectors are fundamental concepts in linear algebra with significant applications in machine learning and data science.

Given a square matrix **A**, an eigenvector **v** and an eigenvalue **λ** satisfy the equation:

$$ A \mathbf{v} = \lambda \mathbf{v} $$

- **Eigenvector**: A non-zero vector **v** that only changes by a scalar factor when the matrix **A** is applied to it.
- **Eigenvalue**: The scalar **λ** that represents how much the eigenvector is scaled during the transformation.
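
The defining equation can be checked numerically. The snippet below is a minimal sketch using `numpy.linalg.eigh`; the matrix `A` is a hypothetical example chosen only for illustration, not taken from any dataset in this document.

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix, chosen only for illustration
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is meant for symmetric matrices: it returns the eigenvalues in
# ascending order and the matching unit-length eigenvectors as columns
eigenvalues, eigenvectors = np.linalg.eigh(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]
    # Applying A to v should only rescale v by lam
    print(f"lambda = {lam:.1f}")
    print("  A @ v      =", A @ v)
    print("  lambda * v =", lam * v)
    print("  equal:", np.allclose(A @ v, lam * v))
```

For this matrix the eigenvalues are 1 and 3, so one eigenvector is left unchanged by `A` and the other is stretched by a factor of 3.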

### Applications of Eigenvectors and Eigenvalues in Data Science

1. **Principal Component Analysis (PCA)**:
   - PCA is a dimensionality reduction technique that transforms data into a set of orthogonal (uncorrelated) components.
   - It uses eigenvalues and eigenvectors of the covariance matrix of the data to identify the principal components.
   - The eigenvectors represent the directions of maximum variance, and the eigenvalues indicate the magnitude of variance in these directions.

2. **Feature Reduction**:
   - By selecting the top eigenvectors (principal components) with the largest eigenvalues, we can reduce the number of features while retaining most of the data's variability.
   - This helps in reducing computational cost and avoiding overfitting.

3. **Stability and Dynamics**:
   - In dynamical systems, eigenvalues can indicate stability. For instance, in Markov chains, the eigenvalues of the transition matrix help describe the long-term behavior of the system.

4. **Graph Theory**:
   - Eigenvalues and eigenvectors of adjacency matrices or Laplacian matrices of graphs are used in spectral clustering, which is a technique for identifying communities within a graph.

5. **Data Transformation**:
   - Eigenvectors can be used to transform data into a new coordinate system, simplifying the problem and making patterns more apparent.
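
The PCA-related items above (covariance eigen-decomposition, keeping the top eigenvectors, and changing coordinates) can be sketched by hand with `numpy`. The data below is synthetic and hypothetical, and the choice of `k = 1` is only an illustration; this is one straightforward way to write the steps, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 features. The second feature is almost
# a multiple of the first; the third is independent low-variance noise.
x = rng.normal(size=200)
X = np.column_stack([x,
                     2.0 * x + rng.normal(scale=0.1, size=200),
                     rng.normal(scale=0.1, size=200)])

# 1. Center the data so the covariance describes spread around the mean
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the features (3 x 3)
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen-decomposition (eigh returns eigenvalues in ascending order)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort from largest to smallest eigenvalue
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Keep the top k eigenvectors and project the data onto them
k = 1
X_reduced = X_centered @ eigenvectors[:, :k]

retained = eigenvalues[:k].sum() / eigenvalues.sum()
print("Reduced shape:", X_reduced.shape)
print(f"Fraction of variance retained: {retained:.3f}")
```

Because two of the three features are nearly redundant, a single component keeps almost all of the variance here. Real data rarely compresses this cleanly.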

## PCA in Python <a class="anchor" id="PCA"></a>

Here's a simple example of performing PCA using Python's `numpy` and `scikit-learn` libraries:

```python
import numpy as np
from sklearn.decomposition import PCA

# Sample data
data = np.array([[2.5, 2.4],
                 [0.5, 0.7],
                 [2.2, 2.9],
                 [1.9, 2.2],
                 [3.1, 3.0],
                 [2.3, 2.7],
                 [2, 1.6],
                 [1, 1.1],
                 [1.5, 1.6],
                 [1.1, 0.9]])

# Perform PCA
pca = PCA(n_components=2)
principal_components = pca.fit_transform(data)

print("Principal Components:\n", principal_components)
print("Eigenvalues:\n", pca.explained_variance_)
print("Eigenvectors:\n", pca.components_)
```

```
Principal Components:
 [[ 0.82797019  0.17511531]
 [-1.77758033 -0.14285723]
 [ 0.99219749 -0.38437499]
 [ 0.27421042 -0.13041721]
 [ 1.67580142  0.20949846]
 [ 0.9129491  -0.17528244]
 [-0.09910944  0.3498247 ]
 [-1.14457216 -0.04641726]
 [-0.43804614 -0.01776463]
 [-1.22382056  0.16267529]]
Eigenvalues:
 [1.28402771 0.0490834 ]
Eigenvectors:
 [[ 0.6778734   0.73517866]
 [ 0.73517866 -0.6778734 ]]
```

In this example:
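- `principal_components` holds the transformed data points: the centered data expressed in the coordinate system of the principal components.
- `pca.explained_variance_` gives the eigenvalues of the sample covariance matrix, sorted from largest to smallest.
- `pca.components_` provides the eigenvectors, one per row, in the same order as the eigenvalues.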
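
To tie these attributes back to the eigenvalue equation, the sketch below recomputes them with `numpy` on the same sample data and compares the results. It assumes the default `PCA` settings (no whitening); eigenvectors are only defined up to sign, so the comparison uses absolute values.

```python
import numpy as np
from sklearn.decomposition import PCA

# Same sample data as in the example above
data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

pca = PCA(n_components=2)
scores = pca.fit_transform(data)

# Eigen-decomposition of the sample covariance matrix (np.cov divides by n - 1)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(data, rowvar=False))
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Eigenvalues match explained_variance_
print(np.allclose(eigenvalues, pca.explained_variance_))

# Rows of components_ match the eigenvectors up to sign
print(np.allclose(np.abs(eigenvectors.T), np.abs(pca.components_)))

# The transformed points are the centered data projected onto the components
print(np.allclose((data - pca.mean_) @ pca.components_.T, scores))
```

Each check should print `True`. The sign of an individual component may differ between the two computations, or between library versions, without changing the subspace it describes.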

## Importance and limitations of PCA <a class="anchor" id="limit"></a>

As previously stated, PCA helps simplify complex datasets while retaining important information. It is useful for the following reasons:

1. **Dimensionality Reduction**: Many datasets have a large number of features, which can be inefficient to process and can lead to overfitting. PCA reduces the number of variables while preserving key patterns, making the data easier to analyze and visualize.
2. **Redundancy and Correlation Removal**: Real-world data often contains multicollinear features. PCA transforms correlated features into a set of uncorrelated `principal components`, which removes redundancy and improves the stability of downstream models.
3. **Handling Noise**: By keeping the `principal components` that capture the most variance and discarding the low-variance ones, PCA can filter out noise. This improves robustness when the discarded directions are mostly uninformative.
4. **Data Visualization Enhancement**: PCA allows projection into 2D or 3D space, making it possible to visualize patterns, clusters, and trends in the data.
5. **Feature Extraction and Engineering**: PCA is used in feature extraction by generating new features (principal components) from existing ones. These new features often capture more meaningful relationships, which can enhance the predictive power of machine learning models.
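
As an illustration of the dimensionality reduction point, the sketch below picks the smallest number of components that reaches a chosen share of the total variance and then measures what is lost when mapping back. The synthetic data, the variable names, and the 95% threshold are assumptions made for this example, not recommendations.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Hypothetical data: 300 samples of 10 features, driven by 2 hidden
# factors plus a small amount of noise
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + rng.normal(scale=0.05, size=(300, 10))

# Fit with all components to inspect how the variance is distributed
pca_full = PCA().fit(X)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

# Smallest number of components whose cumulative share reaches the threshold
threshold = 0.95  # illustrative choice, not a universal rule
k = int(np.searchsorted(cumulative, threshold) + 1)
print("Components needed:", k)

# Reduce, then map back to the original space to see what was lost
pca_k = PCA(n_components=k)
X_reduced = pca_k.fit_transform(X)
X_restored = pca_k.inverse_transform(X_reduced)
print("Reduced shape:", X_reduced.shape)
print("Mean squared reconstruction error:", np.mean((X - X_restored) ** 2))
```

Since the data is generated from two hidden factors, at most two components should be needed here. On real data the cumulative curve usually rises more gradually, and the threshold becomes a judgment call.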

For these reasons, PCA is widely used in domains such as `Image Processing`, `Finance`, `Genomics` and `Recommendation Systems`.

Nevertheless, it has its limitations:

- It assumes linear relationships between variables.
- The transformed components can be difficult to interpret.
- It may perform poorly when the directions of highest variance do not capture the important patterns in the data.
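
The last limitation can be seen in a small synthetic experiment. In the sketch below, which uses hypothetical data constructed for this purpose, the feature that separates two classes has low variance, so the first principal component follows the high-variance feature and the class structure is lost after projection.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical two-class data: feature 0 has large variance but is unrelated
# to the class; feature 1 has small variance but separates the classes
labels = rng.integers(0, 2, size=n)
feature_0 = rng.normal(scale=10.0, size=n)
feature_1 = np.where(labels == 1, 1.0, -1.0) + rng.normal(scale=0.1, size=n)
X = np.column_stack([feature_0, feature_1])

# First principal component = eigenvector with the largest eigenvalue
X_centered = X - X.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_centered, rowvar=False))
pc1 = eigenvectors[:, np.argmax(eigenvalues)]
print("First principal component:", pc1)

# Project onto the first component and compare the two classes
projected = X_centered @ pc1
gap_original = abs(X[labels == 1, 1].mean() - X[labels == 0, 1].mean())
gap_projected = abs(projected[labels == 1].mean() - projected[labels == 0].mean())
spread = projected.std()
print(f"Class gap along feature 1: {gap_original:.2f}")
print(f"Class gap after projection, relative to spread: {gap_projected / spread:.2f}")
```

The first component points almost entirely along feature 0, so the projected classes overlap even though they are cleanly separated along feature 1. When labels are available, a supervised method may be a better fit for this kind of data.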

## Resources

- **Gilbert Strang - Linear Algebra and Its Applications**
- [MIT OpenCourseWare - Lecture 21: Eigenvalues and Eigenvectors](https://ocw.mit.edu/courses/18-06-linear-algebra-spring-2010/resources/lecture-21-eigenvalues-and-eigenvectors/)
- [Benyamin Ghojogh, Mark Crowley - Unsupervised and Supervised Principal Component Analysis: Tutorial](https://arxiv.org/abs/1906.03148)