# Day 7: Eigenvalues and Eigenvectors

Welcome to Day 7! Today, we explore one of the most important concepts in linear algebra for machine learning: **eigenvalues** and **eigenvectors**. These concepts are at the heart of dimensionality reduction techniques like Principal Component Analysis (PCA) and help us understand the underlying structure of data.

## Objectives for Today:
- Understand the definition of eigenvalues and eigenvectors.
- Grasp their geometric interpretation (vectors that only scale under a transformation).
- Learn how to calculate eigenvalues and eigenvectors using NumPy.
- Verify the fundamental eigenvector equation: `Av = λv`.
- Connect these concepts to their application in PCA.

In [1]:
# Import necessary libraries
import numpy as np

## 1. What are Eigenvalues and Eigenvectors?

For a given square matrix `A` (which represents a linear transformation), an **eigenvector** is a special non-zero vector `v` whose product with `A` is simply a scaled version of `v` itself.

The scalar used to scale the eigenvector is the **eigenvalue**, denoted by `λ` (lambda).

This relationship is captured by the fundamental equation:

### **`Av = λv`**

Where:
-   `A` is an `n x n` square matrix (the transformation).
-   `v` is an `n x 1` non-zero column vector (the **eigenvector**).
-   `λ` is a scalar (the **eigenvalue**).

### Geometric Interpretation
Imagine a transformation `A` that rotates and stretches space. Most vectors will change their direction after the transformation. However, **eigenvectors** are special because they **stay on the same line through the origin**: they keep their direction, or point in the exact opposite direction. They are only stretched or shrunk by a factor of their corresponding eigenvalue `λ`.

- If `λ > 1`, the eigenvector is stretched.
- If `0 < λ < 1`, the eigenvector is shrunk.
- If `λ < 0`, the eigenvector is flipped to point in the opposite direction and scaled by `|λ|`.

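To make these cases concrete, here is a minimal sketch using a diagonal matrix `M`. Both `M` and the helper `same_line` are illustrative assumptions for this example and are not used elsewhere in the notebook. The sketch applies `M` to two eigenvectors and to one ordinary vector, then checks whether each output still lies on the same line as its input.

```python
import numpy as np

# Illustrative matrix (an assumption for this sketch): scales x by 3 and y by 0.5
M = np.array([[3.0, 0.0],
              [0.0, 0.5]])

def same_line(u, w):
    """Return True if the 2D vectors u and w lie on the same line through the origin."""
    # The 2D cross product is zero exactly when the two vectors are parallel
    return bool(np.isclose(u[0] * w[1] - u[1] * w[0], 0.0))

e1 = np.array([1.0, 0.0])   # eigenvector with λ = 3 (stretched)
e2 = np.array([0.0, 1.0])   # eigenvector with λ = 0.5 (shrunk)
u = np.array([1.0, 1.0])    # not an eigenvector of M

for name, vec in [("e1", e1), ("e2", e2), ("u", u)]:
    out = M @ vec
    print(name, "->", out, "| same line as input?", same_line(vec, out))
```

Only `u` is knocked off its original line, because its two components are scaled by different factors.
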
## 2. Eigen-decomposition with NumPy

Calculating eigenvalues and eigenvectors by hand involves solving the characteristic equation `det(A - λI) = 0`, which quickly becomes tedious for anything larger than a 2x2 matrix. Fortunately, NumPy provides a convenient function: `np.linalg.eig()`.

This function takes a square matrix `A` and returns a tuple containing:
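1.  A 1D array of eigenvalues (not necessarily sorted).
2.  A 2D array where each **column** is an eigenvector normalized to unit length. Column `i` (`eigenvectors[:, i]`) corresponds to `eigenvalues[i]`.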

In [2]:
# Define a 2x2 matrix
A = np.array([[1, 4],
              [2, 3]])

print("Matrix A:\n", A)

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("\n---\n")
print("Eigenvalues:\n", eigenvalues)
print("\nEigenvectors (each column is an eigenvector):\n", eigenvectors)

Matrix A:
 [[1 4]
 [2 3]]

---

Eigenvalues:
 [-1.  5.]

Eigenvectors (each column is an eigenvector):
 [[-0.89442719 -0.70710678]
 [ 0.4472136  -0.70710678]]

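As a cross-check, the same eigenvalues can be recovered from the characteristic equation `det(A - λI) = 0`. The sketch below relies on the fact that, for a 2x2 matrix, this equation expands to `λ² - trace(A)·λ + det(A) = 0`. The variable names are chosen for this example only.

```python
import numpy as np

A = np.array([[1, 4],
              [2, 3]])

# For a 2x2 matrix, det(A - λI) = λ² - trace(A)·λ + det(A)
trace_A = np.trace(A)        # 1 + 3 = 4
det_A = np.linalg.det(A)     # 1*3 - 4*2 = -5 (up to floating point error)

# Solve λ² - 4λ - 5 = 0, which factors as (λ - 5)(λ + 1) = 0
roots = np.roots([1, -trace_A, det_A])
print("Roots of the characteristic polynomial:", np.sort(roots))
print("Eigenvalues from np.linalg.eigvals:    ", np.sort(np.linalg.eigvals(A)))
```

This shortcut only works for the 2x2 case. For larger matrices the polynomial has higher degree, which is one reason to rely on `np.linalg.eig()` in practice.
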

### **Exercise 1: Find Eigenvalues and Eigenvectors**

1.  Create the following matrix `B`:
    ```
    [[ 2, -1],
     [-2,  3]]
    ```
2.  Use `np.linalg.eig()` to find its eigenvalues and eigenvectors.
3.  Print both the eigenvalues and the eigenvectors.

In [3]:
# Your code for Exercise 1 here

In [4]:
# Solution
B = np.array([[2, -1],
              [-2, 3]])

eig_vals_B, eig_vecs_B = np.linalg.eig(B)

print("Matrix B:\n", B)
print("\nEigenvalues of B:\n", eig_vals_B)
print("\nEigenvectors of B:\n", eig_vecs_B)

Matrix B:
 [[ 2 -1]
 [-2  3]]

Eigenvalues of B:
 [1. 4.]

Eigenvectors of B:
 [[-0.70710678  0.4472136 ]
 [-0.70710678 -0.89442719]]


## 3. Verifying the Eigenvector Equation

Let's confirm the `Av = λv` relationship using the results from our first example.
We will check for the first eigenvalue and its corresponding eigenvector.

In [5]:
# Get the first eigenvalue and eigenvector
lambda1 = eigenvalues[0]
v1 = eigenvectors[:, 0] # First column

print("First Eigenvalue (λ1):", np.round(lambda1, 2))
print("First Eigenvector (v1):\n", v1)

# Calculate A * v1
Av1 = A @ v1

print("\n---")
print("Check 1: A @ v1\n", Av1)

# Calculate λ1 * v1
lambda1_v1 = lambda1 * v1
print("\nCheck 2: λ1 * v1\n", lambda1_v1)

# Use np.allclose() to check for equality with a tolerance for floating point errors
print("\nAre they close?", np.allclose(Av1, lambda1_v1))

First Eigenvalue (λ1): -1.0
First Eigenvector (v1):
 [-0.89442719  0.4472136 ]

---
Check 1: A @ v1
 [ 0.89442719 -0.4472136 ]

Check 2: λ1 * v1
 [ 0.89442719 -0.4472136 ]

Are they close? True

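The same check can be run for every eigenpair at once. The following sketch (variable names `all_pairs_ok` and `A_rebuilt` are chosen for this example) uses broadcasting to compare `A @ v` with `λ * v` column by column, and then rebuilds `A` from its eigenvalues and eigenvectors, which is what the term "eigen-decomposition" refers to.

```python
import numpy as np

A = np.array([[1, 4],
              [2, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Broadcasting multiplies column i of `eigenvectors` by eigenvalues[i],
# so this compares A @ v_i with λ_i * v_i for every pair at once.
all_pairs_ok = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)
print("Av = λv holds for every column?", all_pairs_ok)

# NumPy returns eigenvectors normalized to unit length
print("Column norms:", np.linalg.norm(eigenvectors, axis=0))

# The two eigenvectors here are linearly independent, so A = V Λ V⁻¹
A_rebuilt = eigenvectors @ np.diag(eigenvalues) @ np.linalg.inv(eigenvectors)
print("A recovered from its eigen-decomposition?", np.allclose(A, A_rebuilt))
```

Note that the reconstruction step assumes the eigenvector matrix is invertible, which holds for this `A` but not for every square matrix.
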

### **Exercise 2: Verify the Relationship**

1.  Take the **second** eigenvalue (`eig_vals_B[1]`) and eigenvector (`eig_vecs_B[:, 1]`) you calculated for matrix `B` in Exercise 1.
2.  Verify that `B @ v` is approximately equal to `λ * v`.
3.  Print the results of both calculations and use `np.allclose()` to confirm.

In [6]:
# Your code for Exercise 2 here

In [7]:
# Solution

lambda2_B = eig_vals_B[1]
v2_B = eig_vecs_B[:, 1]

print(f"Second Eigenvalue (λ2): {np.round(lambda2_B, 2)}")
print(f"Second Eigenvector (v2): {v2_B}\n")

print("---")
print(f"B @ v2 = {B @ v2_B}")
print(f"λ2 * v2 = {lambda2_B * v2_B}\n")
print(f"Are they close? {np.allclose(B @ v2_B, lambda2_B * v2_B)}")

Second Eigenvalue (λ2): 4.0
Second Eigenvector (v2): [ 0.4472136  -0.89442719]

---
B @ v2 = [ 1.78885438 -3.57770876]
λ2 * v2 = [ 1.78885438 -3.57770876]

Are they close? True


## 4. Visualization and ML Connection (PCA)

In Machine Learning, one of the most significant applications of eigenvectors is **Principal Component Analysis (PCA)**.

PCA is a dimensionality reduction technique. It works by finding the directions of **maximum variance** in the data. These directions turn out to be the eigenvectors of the data's covariance matrix.

-   The **eigenvector with the largest eigenvalue** is the **first principal component**. It is the direction in which the data varies the most.
-   The eigenvector with the second-largest eigenvalue is the second principal component, and so on.

By projecting the data onto a smaller number of principal components (the top eigenvectors), we can reduce the number of features while retaining the most important information (variance).

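The steps described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not a full PCA implementation: the dataset, the random seed, and all variable names are assumptions made for this example. It uses `np.linalg.eigh()`, NumPy's routine for symmetric matrices, because a covariance matrix is always symmetric.

```python
import numpy as np

# Synthetic 2D data (hypothetical): points stretched along the y = x diagonal
rng = np.random.default_rng(seed=0)
t = rng.normal(scale=3.0, size=200)        # large spread along the diagonal
noise = rng.normal(scale=0.5, size=200)    # small spread across it
X = np.column_stack([t + noise, t - noise])

# 1. Center the data and compute the covariance matrix of the features
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# 2. Eigen-decompose the symmetric covariance matrix
#    (eigh returns eigenvalues in ascending order)
eig_vals, eig_vecs = np.linalg.eigh(cov)

# 3. Sort from largest to smallest eigenvalue
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# 4. Project onto the first principal component (2D -> 1D)
pc1 = eig_vecs[:, 0]
X_reduced = X_centered @ pc1

print("Eigenvalues (variance along each component):", eig_vals)
print("First principal component:", pc1)
print("Reduced data shape:", X_reduced.shape)
```

The first principal component should point roughly along the diagonal (up to sign), and the variance of the projected data equals the largest eigenvalue.
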
### **Exercise 3: Conceptual Discussion**

Consider a dataset with 10 features. You compute the covariance matrix and find its 10 eigenvalues. The first three eigenvalues are `[50.2, 25.1, 0.5]` and the remaining seven are all less than `0.1`.

1.  What does the large magnitude of the first two eigenvalues tell you about the data?
2.  If you were to use PCA to reduce the dimensionality of this dataset, how many principal components (eigenvectors) would you likely keep? Why?

*Write your answer in the markdown cell below.*

*(Your answer for Exercise 3 here)*

**Solution:**
1. The large magnitude of the first two eigenvalues (50.2 and 25.1) indicates that most of the variance (spread) in the data lies along the directions of their corresponding eigenvectors (the first two principal components). This means the features are strongly correlated: rather than forming a roughly spherical cloud, the points lie close to a two-dimensional plane within the 10-dimensional space.

2. You would likely keep just **two** principal components. The first two eigenvalues sum to 50.2 + 25.1 = 75.3, while the other eight sum to less than 0.5 + 7 × 0.1 = 1.2, so the first two components account for more than 98% of the total variance. The remaining eigenvectors represent directions with very little information (variance). By projecting the 10-dimensional data onto the first two principal components, you can reduce the dataset to 2 dimensions while preserving most of its important structure.

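The reasoning in the solution can be checked numerically by computing the share of total variance carried by each eigenvalue. In the sketch below, the first three eigenvalues come from the exercise, while the seven values of `0.05` are hypothetical stand-ins for "less than `0.1`". The 95% threshold is likewise an illustrative choice, not a universal rule.

```python
import numpy as np

# First three eigenvalues are from the exercise; the seven 0.05 values are
# hypothetical placeholders for eigenvalues described only as "less than 0.1"
eig_vals = np.array([50.2, 25.1, 0.5] + [0.05] * 7)

# Fraction of the total variance explained by each component
explained_ratio = eig_vals / eig_vals.sum()
cumulative = np.cumsum(explained_ratio)

print("Explained variance ratio:", np.round(explained_ratio, 4))
print("Cumulative ratio:        ", np.round(cumulative, 4))

# Smallest number of components whose cumulative share reaches the threshold
threshold = 0.95   # illustrative cutoff
k = int(np.searchsorted(cumulative, threshold) + 1)
print(f"Components needed for {threshold:.0%} of the variance: {k}")
```

With these values, two components are enough to pass the threshold, which matches the answer above.
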
## Day 7 Summary and Key Takeaways

Great job today! Eigenvalues and eigenvectors are a deep topic, but understanding them from a practical standpoint is a huge step forward.

Here's what we covered:
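-   **Eigenvectors** are non-zero vectors that stay on the same line through the origin under a linear transformation: they may be stretched, shrunk, or flipped, but never turned off that line.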
-   **Eigenvalues (`λ`)** are the scalars by which eigenvectors are scaled during the transformation.
-   The core relationship is **`Av = λv`**.
-   NumPy's **`np.linalg.eig()`** computes both the eigenvalues and the eigenvectors of a square matrix, returning the eigenvectors as columns.
-   In Machine Learning, eigenvalues and eigenvectors are the foundation of **PCA**, where they help find the directions of maximum variance in the data for effective dimensionality reduction.

Tomorrow, we'll look at another powerful matrix decomposition technique that is even more general: Singular Value Decomposition (SVD).