# The Gram-Schmidt Process

### From Linearly Independent Vectors to an Orthonormal Basis

This notebook explains the Gram-Schmidt process, a fundamental algorithm in linear algebra for converting a set of linearly independent vectors into an **orthonormal** set that spans the same subspace. It covers both the theoretical steps and a practical Python implementation.

## Why Do We Need Orthonormal Bases? 🤔

As the notes point out, life in linear algebra is much easier if we can work with an **orthonormal basis**.

A set of vectors is **orthonormal** if every vector in the set is:
1.  **Orthogonal** to every other vector in the set (their dot product is 0).
2.  **Normalized** (a unit vector), meaning its length (magnitude) is 1.

Starting with a set of linearly independent vectors, $V = \{v_1, v_2, ..., v_m\}$, which are not necessarily orthogonal or of unit length, the Gram-Schmidt process provides a systematic way to construct a corresponding orthonormal basis $E = \{e_1, e_2, ..., e_m\}$.

**Key benefits of orthonormal bases:**
* **Simpler Computations:** Calculating coordinates and projections becomes much simpler. The coordinates of a vector in an orthonormal basis are just the dot products of the vector with the basis vectors.
* **Numerical Stability:** They are preferred in numerical algorithms because transformations built from orthonormal vectors preserve lengths and therefore do not amplify rounding errors.
* **Geometric Intuition:** They align with our standard Cartesian coordinate system ($ \hat{i}, \hat{j}, \hat{k} $), making geometric interpretations more intuitive.
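
To make the first benefit concrete, here is a minimal sketch. The basis vectors `b1`, `b2` and the vector `x` are illustrative values chosen for this example, not taken from the notes.

```python
import numpy as np

# Illustrative orthonormal basis of R^2 (the standard basis rotated by 45 degrees)
b1 = np.array([1.0, 1.0]) / np.sqrt(2)
b2 = np.array([-1.0, 1.0]) / np.sqrt(2)

# An arbitrary vector to express in that basis
x = np.array([3.0, 1.0])

# With an orthonormal basis, each coordinate is a single dot product
c1 = np.dot(x, b1)
c2 = np.dot(x, b2)

# Rebuilding x from its coordinates recovers the original vector
x_rebuilt = c1 * b1 + c2 * b2
print("coordinates:", c1, c2)
print("x recovered:", np.allclose(x_rebuilt, x))
```

With a basis that is not orthonormal, finding the same coordinates would generally require solving a linear system instead of taking two dot products.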

## A Quick Refresher: Normalizing Vectors

Before diving into the process, let's clarify what it means to **normalize** a vector.

Normalization is the process of scaling a vector so that its length (magnitude or L2-norm) becomes 1, while its direction remains unchanged.

For a non-zero vector $w$, its normalized version, $e$, is calculated as:

$$ e = \frac{w}{||w||} $$

Where $||w||$ is the magnitude (norm) of $w$.

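Normalization also simplifies projections. The general projection of a vector $v$ onto a non-zero vector $w$ is $\frac{v \cdot w}{w \cdot w} w$. When $w$ is a unit vector the denominator is 1, and the formula reduces to $(v \cdot w) w$. Applying this shortcut to a vector that is *not* normalized scales the result by $||w||^2$, so the projection comes out with the wrong length. This is why Gram-Schmidt normalizes each basis vector before projecting onto it.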
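
The effect can be seen in a small sketch. The vectors `v` and `w` below are made-up values for illustration; the comparison is between the general projection formula and the unit-vector shortcut $(v \cdot w) w$.

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([2.0, 0.0])   # deliberately not a unit vector (length 2)

# General projection formula, valid for any non-zero w
proj_general = (np.dot(v, w) / np.dot(w, w)) * w

# Shortcut formula applied to the unnormalized w: off by a factor of ||w||^2
proj_shortcut_raw = np.dot(v, w) * w

# Shortcut formula applied after normalizing w: agrees with the general formula
e = w / np.linalg.norm(w)
proj_shortcut_unit = np.dot(v, e) * e

print("general formula:        ", proj_general)
print("shortcut, w not unit:   ", proj_shortcut_raw)
print("shortcut, w normalized: ", proj_shortcut_unit)
```

Here the correct projection is the vector (3, 0), while the shortcut on the unnormalized `w` gives (12, 0), which is too long by a factor of $||w||^2 = 4$.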

In [2]:
import numpy as np

# Define a vector
w = np.array([2, 3])

# Calculate its magnitude (L2 norm)
w_norm = np.linalg.norm(w)
print(f"Vector w: {w}")
print(f"Magnitude of w: {w_norm:.4f}")

# Normalize the vector
e = w / w_norm
print(f"Normalized vector e: {e}")

# Verify the magnitude of the normalized vector is 1
e_norm = np.linalg.norm(e)
print(f"Magnitude of e: {e_norm:.4f}")

Vector w: [2 3]
Magnitude of w: 3.6056
Normalized vector e: [0.5547002  0.83205029]
Magnitude of e: 1.0000


## The Gram-Schmidt Algorithm: A Step-by-Step Guide

Let's start with a set of linearly independent vectors $V = \{v_1, v_2, v_3, ...\}$. Our goal is to produce an orthonormal set $E = \{e_1, e_2, e_3, ...\}$.

---

### Step 1: The First Basis Vector ($e_1$)

The first step is the simplest. We take the first vector from our original set, $v_1$, and just normalize it. This becomes our first orthonormal basis vector, $e_1$.

$$ e_1 = \frac{v_1}{||v_1||} $$

This vector $e_1$ sets the initial direction for our new basis.

### Step 2: The Second Basis Vector ($e_2$)

Now, we take the second vector, $v_2$. We need to create a new vector that is orthogonal to our first basis vector, $e_1$. The key idea is to decompose $v_2$ into two components:
1.  A component **parallel** to $e_1$.
2.  A component **perpendicular** to $e_1$.

The parallel component is simply the vector projection of $v_2$ onto $e_1$. Since $e_1$ is already a unit vector, the projection formula is:

$$ \text{proj}_{e_1}(v_2) = (v_2 \cdot e_1) e_1 $$



To get the perpendicular component (which we'll call $w_2$), we subtract the parallel component from the original vector $v_2$:

$$ w_2 = v_2 - \text{proj}_{e_1}(v_2) = v_2 - (v_2 \cdot e_1) e_1 $$

This new vector $w_2$ is, by construction, orthogonal to $e_1$. The final step is to normalize $w_2$ to get our second orthonormal basis vector, $e_2$.

$$ e_2 = \frac{w_2}{||w_2||} $$
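
As a rough numerical illustration of Step 2, the sketch below uses two made-up 2D vectors (they are not the vectors used in the later implementation example) and checks that the parallel and perpendicular components behave as described.

```python
import numpy as np

# Illustrative inputs, chosen so the numbers come out cleanly
v1 = np.array([3.0, 4.0])
v2 = np.array([1.0, 0.0])

# Step 1: normalize v1
e1 = v1 / np.linalg.norm(v1)

# Step 2: split v2 into components parallel and perpendicular to e1
parallel = np.dot(v2, e1) * e1
w2 = v2 - parallel
e2 = w2 / np.linalg.norm(w2)

print("parallel + w2 equals v2:", np.allclose(parallel + w2, v2))
print("w2 orthogonal to e1:    ", np.isclose(np.dot(w2, e1), 0.0))
print("e2 =", e2)
```

The two components add back up to $v_2$, and the perpendicular part has a dot product of (numerically) zero with $e_1$.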

### Step 3: The Third Basis Vector ($e_3$) and Generalization

The process continues in the same way. For the third vector, $v_3$, we want to find a component that is perpendicular to the entire plane spanned by $e_1$ and $e_2$.

We do this by subtracting the projections of $v_3$ onto *both* $e_1$ and $e_2$:

$$ w_3 = v_3 - \text{proj}_{e_1}(v_3) - \text{proj}_{e_2}(v_3) $$
$$ w_3 = v_3 - (v_3 \cdot e_1) e_1 - (v_3 \cdot e_2) e_2 $$

The resulting vector $w_3$ is orthogonal to both $e_1$ and $e_2$. We then normalize it to get $e_3$:

$$ e_3 = \frac{w_3}{||w_3||} $$

#### The General Formula

For any k-th vector $v_k$, we construct the orthogonal vector $w_k$ by subtracting the projections of $v_k$ onto all the previously computed orthonormal basis vectors ($e_1, e_2, ..., e_{k-1}$):

$$ w_k = v_k - \sum_{i=1}^{k-1} (v_k \cdot e_i) e_i $$

Because the input vectors are linearly independent, $w_k$ is never the zero vector, so we can normalize it to find $e_k$:

$$ e_k = \frac{w_k}{||w_k||} $$
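
A short derivation, using only the formulas above, shows why $w_k$ is orthogonal to every earlier basis vector. Take any $j < k$ and assume $e_1, ..., e_{k-1}$ are already orthonormal, so that $e_i \cdot e_j$ is 1 when $i = j$ and 0 otherwise. Then only the $i = j$ term of the sum survives:

$$ w_k \cdot e_j = v_k \cdot e_j - \sum_{i=1}^{k-1} (v_k \cdot e_i)(e_i \cdot e_j) = v_k \cdot e_j - v_k \cdot e_j = 0 $$

Since $e_k$ is just $w_k$ rescaled, it is orthogonal to each $e_j$ as well, and repeating the argument for $k = 2, 3, ...$ shows that the whole set is orthonormal.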

## Python Implementation Example

Let's apply this process to a set of 3 linearly independent vectors in $\mathbb{R}^3$.

We will start with the set $V = \{v_1, v_2, v_3\}$ where:
* $v_1 = [1, 1, 0]$
* $v_2 = [1, 2, 0]$
* $v_3 = [2, 1, 2]$

In [3]:
# Our initial set of linearly independent vectors
v1 = np.array([1, 1, 0], dtype=float)
v2 = np.array([1, 2, 0], dtype=float)
v3 = np.array([2, 1, 2], dtype=float)

# --- Step 1: Find e1 ---
e1 = v1 / np.linalg.norm(v1)
print(f"e1 = {e1}\n")

# --- Step 2: Find e2 ---
# Project v2 onto e1
proj_e1_v2 = np.dot(v2, e1) * e1
# Subtract the projection to get the orthogonal vector w2
w2 = v2 - proj_e1_v2
# Normalize w2 to get e2
e2 = w2 / np.linalg.norm(w2)
print(f"w2 = {w2}")
print(f"e2 = {e2}\n")

# --- Step 3: Find e3 ---
# Project v3 onto e1
proj_e1_v3 = np.dot(v3, e1) * e1
# Project v3 onto e2
proj_e2_v3 = np.dot(v3, e2) * e2
# Subtract both projections to get the orthogonal vector w3
w3 = v3 - proj_e1_v3 - proj_e2_v3
# Normalize w3 to get e3
e3 = w3 / np.linalg.norm(w3)
print(f"w3 = {w3}")
print(f"e3 = {e3}")

e1 = [0.70710678 0.70710678 0.        ]

w2 = [-0.5  0.5  0. ]
e2 = [-0.70710678  0.70710678  0.        ]

w3 = [ 1.27675648e-15 -3.33066907e-16  2.00000000e+00]
e3 = [ 6.38378239e-16 -1.66533454e-16  1.00000000e+00]


## Verification

Now, let's verify that our new set of vectors $\{e_1, e_2, e_3\}$ is indeed orthonormal.

1.  **Orthogonality:** The dot product of any two distinct vectors should be 0.
2.  **Normality:** The magnitude (norm) of each vector should be 1.

In [4]:
# Create a matrix from the resulting basis vectors (as rows)
E = np.array([e1, e2, e3])
print("Our orthonormal basis E:")
print(E)

# 1. Check for orthogonality
print("\n--- Orthogonality Check (dot products should be ~0) ---")
print(f"e1 . e2 = {np.dot(e1, e2):.10f}")
print(f"e1 . e3 = {np.dot(e1, e3):.10f}")
print(f"e2 . e3 = {np.dot(e2, e3):.10f}")

# A more elegant way is to compute E @ E.T, which should be the identity matrix
identity_matrix = E @ E.T
print("\nE @ E.T (should be the Identity Matrix):")
print(np.round(identity_matrix, 10)) # Round to handle floating point inaccuracies


# 2. Check for normality
print("\n--- Normality Check (magnitudes should be 1) ---")
print(f"||e1|| = {np.linalg.norm(e1):.4f}")
print(f"||e2|| = {np.linalg.norm(e2):.4f}")
print(f"||e3|| = {np.linalg.norm(e3):.4f}")

Our orthonormal basis E:
[[ 7.07106781e-01  7.07106781e-01  0.00000000e+00]
 [-7.07106781e-01  7.07106781e-01  0.00000000e+00]
 [ 6.38378239e-16 -1.66533454e-16  1.00000000e+00]]

--- Orthogonality Check (dot products should be ~0) ---
e1 . e2 = 0.0000000000
e1 . e3 = 0.0000000000
e2 . e3 = -0.0000000000

E @ E.T (should be the Identity Matrix):
[[ 1.  0.  0.]
 [ 0.  1. -0.]
 [ 0. -0.  1.]]

--- Normality Check (magnitudes should be 1) ---
||e1|| = 1.0000
||e2|| = 1.0000
||e3|| = 1.0000
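
As an optional cross-check, the result can be compared with NumPy's QR factorization, whose `Q` factor also has orthonormal columns. This is only a sketch: `np.linalg.qr` uses a different algorithm internally, so for a full-rank input its columns are expected to match the Gram-Schmidt vectors only up to sign. The names `A`, `E_rows`, and `overlap` are introduced here for illustration.

```python
import numpy as np

# The same three input vectors as above, stacked as the COLUMNS of a matrix
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [2.0, 1.0, 2.0]]).T

# Classical Gram-Schmidt on the columns of A
basis = []
for k in range(A.shape[1]):
    v = A[:, k]
    w = v - sum(np.dot(v, e) * e for e in basis)
    basis.append(w / np.linalg.norm(w))
E_rows = np.array(basis)   # rows are e1, e2, e3

# Reduced QR factorization: the columns of Q are orthonormal
Q, R = np.linalg.qr(A)

# Entry (i, j) is q_i . e_j: expect +1 or -1 on the diagonal, ~0 elsewhere
overlap = Q.T @ E_rows.T
print(np.round(overlap, 10))
print("Match up to sign:", np.allclose(np.abs(overlap), np.eye(3)))
```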


## A General Gram-Schmidt Function

We can encapsulate the logic into a single Python function that can handle any number of input vectors.

In [5]:
def gram_schmidt(vectors):
    """
    Performs the (classical) Gram-Schmidt process on a list of vectors.

    Args:
        vectors (list of np.array): A list of vectors, expected to be
            linearly independent.

    Returns:
        np.array: A matrix where each row is an orthonormal basis vector.
            Vectors that are (numerically) linearly dependent on earlier
            ones are skipped, so fewer rows than inputs may be returned.
    """
    basis = []
    for v in vectors:
        # Subtract the projections of v onto the existing basis vectors
        w = v - sum(np.dot(v, e) * e for e in basis)

        # Check for linear dependence: if w is (numerically) the zero vector,
        # v lies in the span of the earlier vectors and adds no new direction.
        norm_w = np.linalg.norm(w)
        if norm_w > 1e-10:  # Use a small tolerance for floating point
            basis.append(w / norm_w)
    return np.array(basis)

# Our original vectors as a list
V = [v1, v2, v3]

# Apply the function
E_func = gram_schmidt(V)

print("Orthonormal basis from the function:")
print(E_func)

# Verify the result
print("\nVerification (E @ E.T):")
print(np.round(E_func @ E_func.T, 10))

Orthonormal basis from the function:
[[ 7.07106781e-01  7.07106781e-01  0.00000000e+00]
 [-7.07106781e-01  7.07106781e-01  0.00000000e+00]
 [ 6.66133815e-16 -2.22044605e-16  1.00000000e+00]]

Verification (E @ E.T):
[[ 1.  0.  0.]
 [ 0.  1. -0.]
 [ 0. -0.  1.]]
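
The tolerance check matters when the input is not actually linearly independent. The sketch below restates the function compactly so it runs on its own; the `tol` parameter and the vectors `a`, `b`, `c` are assumptions added for this illustration (the function above hard-codes the tolerance `1e-10`).

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-10):
    """Compact restatement of the function above; `tol` is a hypothetical parameter."""
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, e) * e for e in basis)
        norm_w = np.linalg.norm(w)
        if norm_w > tol:   # skip vectors that add no new direction
            basis.append(w / norm_w)
    return np.array(basis)

a = np.array([1.0, 1.0, 0.0])
b = 2.0 * a                      # linearly dependent on a
c = np.array([1.0, 2.0, 0.0])

E_dep = gram_schmidt([a, b, c])
print("Number of basis vectors:", E_dep.shape[0])   # three inputs go in
print(np.round(E_dep @ E_dep.T, 10))
```

Three vectors go in, but only two orthonormal rows come out, because `b` contributes nothing beyond the direction already covered by `a`.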


## Conclusion

The Gram-Schmidt process is an intuitive algorithm for constructing an orthonormal basis from any set of linearly independent vectors. By iteratively subtracting projections and normalizing, it "straightens out" the original vectors into a mutually perpendicular, unit-length framework. In floating-point arithmetic the result is orthonormal only up to rounding error, as the tiny non-zero entries in the outputs above show. The resulting basis simplifies coordinate calculations and projections throughout linear algebra and numerical analysis.