# Gram-Schmidt Process: From Linearly Independent to Orthonormal

## 🎯 Learning Objectives
By the end of this exercise, you will be able to:
- Implement the Gram-Schmidt orthogonalization process step by step
- Transform any set of linearly independent vectors into an orthonormal basis
- Calculate the dimension of a vector space spanned by a given set of vectors
- Understand the geometric intuition behind orthogonalization

## 📖 Mathematical Background

The **Gram-Schmidt process** is a fundamental algorithm in linear algebra that converts a set of linearly independent vectors into an **orthonormal basis** for the same vector space. This process is crucial in many applications including:

- **QR decomposition** of matrices
- **Principal Component Analysis (PCA)**
- **Least squares** regression
- **Signal processing** and **data compression**

### Key Concepts

**Orthogonal vectors**: Two vectors $\mathbf{u}$ and $\mathbf{v}$ are orthogonal if their dot product is zero: $\mathbf{u} \cdot \mathbf{v} = 0$

**Orthonormal vectors**: Vectors that are both orthogonal to each other and have unit length (norm = 1)

**Vector projection**: The projection of vector $\mathbf{a}$ onto vector $\mathbf{b}$ is:
$$\text{proj}_{\mathbf{b}}\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}}\mathbf{b}$$
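
To make the formula concrete, here is a minimal NumPy sketch. The vectors `a` and `b` are arbitrary values chosen for illustration and are not part of the assignment.

```python
import numpy as np

a = np.array([2.0, 3.0])
b = np.array([4.0, 0.0])

# proj_b(a) = (a . b) / (b . b) * b
proj = (a @ b) / (b @ b) * b
residual = a - proj  # the part of a that is orthogonal to b

print(proj)          # [2. 0.]
print(residual)      # [0. 3.]
print(residual @ b)  # 0.0
```

Subtracting the projection leaves a vector whose dot product with `b` is zero, which is the step Gram-Schmidt repeats for every previous basis vector.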

### The Algorithm
Given vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, the Gram-Schmidt process produces orthonormal vectors $\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n$:

1. **First vector**: $\mathbf{u}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}$

2. **Subsequent vectors**: For $k = 2, 3, \ldots, n$:
   - Remove projections: $\mathbf{w}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \text{proj}_{\mathbf{u}_j}\mathbf{v}_k$
   - Normalize: $\mathbf{u}_k = \frac{\mathbf{w}_k}{\|\mathbf{w}_k\|}$ (if $\mathbf{w}_k \neq \mathbf{0}$)
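
As a small worked instance of these two steps, take $\mathbf{v}_1 = (3, 1)$ and $\mathbf{v}_2 = (1, 2)$, the same pair used in the 2D visualization cell of this notebook. The arithmetic is written out here purely as an illustration:

$$\mathbf{u}_1 = \frac{1}{\sqrt{10}}\begin{pmatrix}3\\1\end{pmatrix}, \qquad \mathbf{v}_2 \cdot \mathbf{u}_1 = \frac{5}{\sqrt{10}}$$

$$\mathbf{w}_2 = \mathbf{v}_2 - (\mathbf{v}_2 \cdot \mathbf{u}_1)\mathbf{u}_1 = \begin{pmatrix}1\\2\end{pmatrix} - \frac{1}{2}\begin{pmatrix}3\\1\end{pmatrix} = \begin{pmatrix}-1/2\\3/2\end{pmatrix}$$

$$\mathbf{u}_2 = \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|} = \frac{1}{\sqrt{10}}\begin{pmatrix}-1\\3\end{pmatrix}, \qquad \mathbf{u}_1 \cdot \mathbf{u}_2 = \frac{-3 + 3}{10} = 0$$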

## 🎯 Your Mission
In this assignment you will write a function to perform the Gram-Schmidt procedure, which takes a list of vectors and forms an orthonormal basis from this set.
As a corollary, the procedure allows us to determine the dimension of the space spanned by the input vectors, which is less than or equal to the dimension of the space in which the vectors sit.
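
The following sketch illustrates that last point with NumPy's built-in `np.linalg.matrix_rank`, used here only as an independent check. The matrix is a made-up example whose third column is the sum of the first two.

```python
import numpy as np

# Columns are the vectors; the third column is the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 3.0, 5.0]])

ambient_dimension = A.shape[0]            # the vectors live in 3D
span_dimension = np.linalg.matrix_rank(A) # but they only span a plane

print(ambient_dimension)  # 3
print(span_dimension)     # 2
```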

You'll start by completing a function for 4 basis vectors, before generalising to when an arbitrary number of vectors are given.

Again, a framework for the function has already been written.
Look through the code, and you'll be instructed where to make changes.
We'll do the first two columns, and you can use this as a guide to do the last two.

### Matrices in Python
Remember the structure for matrices in *numpy* is:
```python
A[0, 0]  A[0, 1]  A[0, 2]  A[0, 3]
A[1, 0]  A[1, 1]  A[1, 2]  A[1, 3]
A[2, 0]  A[2, 1]  A[2, 2]  A[2, 3]
A[3, 0]  A[3, 1]  A[3, 2]  A[3, 3]
```

**Key Operations You'll Need:**
- Access individual elements: `A[n, m]`
- Access whole rows: `A[n]` or `A[n, :]`
- **Access whole columns**: `A[:, m]` ← *This selects the m'th column*
- **Dot product**: Use `u @ v` to compute $\mathbf{u} \cdot \mathbf{v}$
- **Vector norm**: Use `la.norm(v)` to compute $\|\mathbf{v}\|$
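
The operations listed above can be tried out on a small array. This is only a sketch: the matrix `A` and the vectors `u` and `v` are arbitrary illustration values.

```python
import numpy as np
import numpy.linalg as la

A = np.arange(16, dtype=np.float64).reshape(4, 4)

element = A[1, 2]    # 6.0 (row 1, column 2)
row = A[2]           # row 2: 8, 9, 10, 11
column = A[:, 1]     # column 1: 1, 5, 9, 13

u = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])
dot = u @ v          # 3*1 + 4*2 = 11
length = la.norm(u)  # sqrt(9 + 16) = 5

print(element, row, column, dot, length)
```

Note that assigning to `A[:, m]` writes into the original array, which is why the assignment functions copy `A` into `B` before changing anything.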

### 💡 Implementation Tips
1. **Work column by column**: Each column represents a vector to orthogonalize
2. **Subtract projections**: Remove components parallel to previous orthonormal vectors
3. **Check for linear dependence**: If a vector becomes very small after projection removal, it's linearly dependent
4. **Normalize carefully**: Only normalize non-zero vectors
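
Tips 2 to 4 can be seen in isolation in the sketch below. The helper `residual_norm` and the example vectors are hypothetical and exist only for this illustration. The threshold has the same value as `verySmallNumber` in the assignment cell.

```python
import numpy as np
import numpy.linalg as la

very_small_number = 1e-14  # same value as verySmallNumber in the assignment

u0 = np.array([1.0, 2.0, 2.0])
u0 = u0 / la.norm(u0)  # unit vector along (1, 2, 2)

dependent = np.array([2.0, 4.0, 4.0])    # a multiple of (1, 2, 2)
independent = np.array([1.0, 0.0, 0.0])  # not a multiple of (1, 2, 2)

def residual_norm(v, u):
    """Length of what is left of v after removing its component along the unit vector u."""
    return la.norm(v - (v @ u) * u)

dep_left = residual_norm(dependent, u0)
ind_left = residual_norm(independent, u0)

print(dep_left > very_small_number)  # False: nothing meaningful is left, so zero it out
print(ind_left > very_small_number)  # True: safe to normalise
```

In floating point the dependent residual may be a tiny non-zero number rather than exactly zero, which is why the comparison uses a threshold instead of `== 0`.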

All the code you should complete will be at the same level of indentation as the instruction comment.

### How to submit
Edit the code in the cell below to complete the assignment.
Once you are finished and happy with it, press the *Submit Assignment* button at the top of this notebook.

Please don't change any of the function names, as these will be checked by the grading script.

If you have further questions about submissions or programming assignments, here is a [list](https://www.coursera.org/learn/linear-algebra-machine-learning/discussions/weeks/1/threads/jB4klkn5EeibtBIQyzFmQg) of Q&A. You can also raise an issue on the discussion forum. Good luck!

In [3]:
# GRADED FUNCTION
import numpy as np
import numpy.linalg as la

verySmallNumber = 1e-14 # That's 1×10⁻¹⁴ = 0.00000000000001

# Our first function will perform the Gram-Schmidt procedure for 4 basis vectors.
# We'll take this list of vectors as the columns of a matrix, A.
# We'll then go through the vectors one at a time and set each to be orthogonal
# to all the vectors that came before it, and then normalise it.
# Follow the instructions inside the function at each comment.
# You will be told where to add code to complete the function.
def gsBasis4(A) :
    """
    Perform Gram-Schmidt orthogonalization on exactly 4 vectors.
    
    Parameters:
    A : numpy array of shape (n, 4)
        Matrix where each column is a vector to be orthogonalized
        
    Returns:
    B : numpy array of shape (n, 4)
        Matrix where each column is an orthonormal vector
    """
    B = np.array(A, dtype=np.float64) # Make B a copy of A, since we're going to alter its values. (np.float_ was removed in NumPy 2.0.)
    
    # STEP 1: Handle the first vector (column 0)
    # The zeroth column is easy, since it has no other vectors to make it normal to.
    # All that needs to be done is to normalise it. I.e. divide by its modulus, or norm.
    B[:, 0] = B[:, 0] / la.norm(B[:, 0])
    
    # STEP 2: Handle the second vector (column 1)
    # For the first column, we need to subtract any overlap with our new zeroth vector.
    # This removes the component of v₁ that's parallel to u₀
    B[:, 1] = B[:, 1] - B[:, 1] @ B[:, 0] * B[:, 0]
    # If there's anything left after that subtraction, then B[:, 1] is linearly independent of B[:, 0]
    # If this is the case, we can normalise it. Otherwise we'll set that vector to zero.
    if la.norm(B[:, 1]) > verySmallNumber :
        B[:, 1] = B[:, 1] / la.norm(B[:, 1])
    else :
        B[:, 1] = np.zeros_like(B[:, 1])
    
    # STEP 3: Handle the third vector (column 2)
    # Now we need to repeat the process for column 2.
    # Insert two lines of code, the first to subtract the overlap with the zeroth vector,
    # and the second to subtract the overlap with the first.
    # HINT: B[:, 2] = B[:, 2] - (B[:, 2] @ B[:, ?]) * B[:, ?]
    
    
    # Again we'll need to normalise our new vector.
    # Copy and adapt the normalisation fragment from above to column 2.
    # HINT: Check if la.norm(B[:, 2]) > verySmallNumber, then normalize or set to zero

    
    # STEP 4: Handle the fourth vector (column 3)
    # Finally, column three:
    # Insert code to subtract the overlap with the first three vectors.
    # HINT: You need three lines, one for each previous orthonormal vector

    
    # Now normalise if possible
    # HINT: Same pattern as above - check norm, then normalize or zero out
   
    
    # Finally, we return the result:
    return B

# The second part of this exercise will generalise the procedure.
# Previously, we could only have four vectors, and there was a lot of repeating in the code.
# We'll use a for-loop here to iterate the process for each vector.
def gsBasis(A) :
    """
    Perform Gram-Schmidt orthogonalization on any number of vectors.
    
    Parameters:
    A : numpy array of shape (n, m)
        Matrix where each column is a vector to be orthogonalized
        
    Returns:
    B : numpy array of shape (n, m)
        Matrix where each column is an orthonormal vector
    """
    B = np.array(A, dtype=np.float64) # Make B a copy of A, since we're going to alter its values. (np.float_ was removed in NumPy 2.0.)
    
    # Loop over all vectors, starting with zero, label them with i
    for i in range(B.shape[1]) :
        # Inside that loop, loop over all previous vectors, j, to subtract.
        for j in range(i) :
            # Complete the code to subtract the overlap with previous vectors.
            # you'll need the current vector B[:, i] and a previous vector B[:, j]
            # HINT: B[:, i] = B[:, i] - ...what goes here?...
            pass  # Placeholder so the empty loop body parses; replace it with your code.

        # Next insert code to do the normalisation test for B[:, i]
        # HINT: Same pattern as in gsBasis4 - check if norm > verySmallNumber

            
        
            
    # Finally, we return the result:
    return B

# This function uses the Gram-Schmidt process to calculate the dimension
# spanned by a list of vectors.
# Since each vector is normalised to one, or is zero,
# the sum of all the norms will be the dimension.
def dimensions(A) :
    """
    Calculate the dimension of the space spanned by the columns of A.
    
    Parameters:
    A : numpy array
        Matrix where each column is a vector
        
    Returns:
    float : The dimension of the span of the columns (a whole number, returned as a float)
    """
    return np.sum(la.norm(gsBasis(A), axis=0))

## 🧪 Test Your Implementation

Before submitting, it's crucial to test your code thoroughly! Run the cell above (select it and press Shift+Enter) to define your functions.

The test cases below will help you verify that your implementation works correctly:

1. **Basic functionality test**: Does your algorithm produce orthonormal vectors?
2. **Idempotency test**: Running Gram-Schmidt on already orthonormal vectors should give the same result
3. **Dimension detection**: Can your algorithm detect linear dependence?
4. **Edge cases**: How does it handle non-square matrices and degenerate cases?

### 🎯 What to Look For:
- **Orthogonality**: Dot products between different output vectors should be ≈ 0
- **Normalization**: Each output vector should have length ≈ 1 (unless it's the zero vector)
- **Span preservation**: The output vectors should span the same space as the input vectors
- **Linear dependence handling**: Linearly dependent vectors should become zero vectors

Try out your code on the test cases below, and feel free to create your own tricky examples!
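
If you want a reusable check for the first two criteria above, one possible sketch follows. `check_orthonormal` and its `tol` parameter are hypothetical names introduced here, not part of the graded code. The demonstration uses `Q` from NumPy's own `np.linalg.qr` as stand-in data, so it runs before your functions are finished.

```python
import numpy as np

def check_orthonormal(B, tol=1e-10):
    """Return True if the non-zero columns of B are mutually orthogonal unit vectors.

    Hypothetical helper for this notebook; `tol` is an assumed tolerance.
    """
    norms = np.linalg.norm(B, axis=0)
    kept = B[:, norms > tol]  # ignore columns that were zeroed out
    gram = kept.T @ kept      # entry (i, j) is the dot product of columns i and j
    return bool(np.allclose(gram, np.eye(kept.shape[1]), atol=tol))

# Stand-in data: Q from NumPy's QR has orthonormal columns, so it should pass.
M = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, _ = np.linalg.qr(M)

print(check_orthonormal(Q))  # True
print(check_orthonormal(M))  # False: columns are neither unit length nor orthogonal
```

Once your own functions work, calling `check_orthonormal(gsBasis(V))` would be a natural next step.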

In [2]:
# Let's visualize the Gram-Schmidt process to build geometric intuition
import numpy as np
import numpy.linalg as la
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def visualize_gram_schmidt_2d():
    """Visualize Gram-Schmidt process in 2D for intuition"""
    # Two linearly independent vectors in 2D
    v1 = np.array([3, 1])
    v2 = np.array([1, 2])
    
    # Apply Gram-Schmidt manually for visualization
    u1 = v1 / la.norm(v1)  # Normalize first vector
    
    # Project v2 onto u1 and subtract to get orthogonal component
    proj_v2_u1 = (v2 @ u1) * u1
    w2 = v2 - proj_v2_u1
    u2 = w2 / la.norm(w2)  # Normalize
    
    # Create the plot
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    
    # Plot 1: Original vectors
    ax1.quiver(0, 0, v1[0], v1[1], angles='xy', scale_units='xy', scale=1, color='red', width=0.005, label='$v_1$')
    ax1.quiver(0, 0, v2[0], v2[1], angles='xy', scale_units='xy', scale=1, color='blue', width=0.005, label='$v_2$')
    ax1.set_xlim(-0.5, 4)
    ax1.set_ylim(-0.5, 3)
    ax1.grid(True, alpha=0.3)
    ax1.set_aspect('equal')
    ax1.legend()
    ax1.set_title('Original Vectors')
    ax1.set_xlabel('x')
    ax1.set_ylabel('y')
    
    # Plot 2: Gram-Schmidt process visualization
    ax2.quiver(0, 0, u1[0], u1[1], angles='xy', scale_units='xy', scale=1, color='red', width=0.005, label='$u_1$ (normalized $v_1$)')
    ax2.quiver(0, 0, proj_v2_u1[0], proj_v2_u1[1], angles='xy', scale_units='xy', scale=1, color='orange', width=0.005, 
              linestyle='--', alpha=0.7, label='proj$_{u_1}v_2$')
    ax2.quiver(0, 0, v2[0], v2[1], angles='xy', scale_units='xy', scale=1, color='blue', width=0.003, 
              alpha=0.5, label='$v_2$ (original)')
    ax2.quiver(0, 0, u2[0], u2[1], angles='xy', scale_units='xy', scale=1, color='green', width=0.005, label='$u_2$ (orthogonalized)')
    
    # Show the orthogonal component
    ax2.quiver(proj_v2_u1[0], proj_v2_u1[1], w2[0], w2[1], angles='xy', scale_units='xy', scale=1, 
              color='purple', width=0.003, alpha=0.8, label='$v_2 - $ proj$_{u_1}v_2$')
    
    ax2.set_xlim(-0.5, 4)
    ax2.set_ylim(-0.5, 3)
    ax2.grid(True, alpha=0.3)
    ax2.set_aspect('equal')
    ax2.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    ax2.set_title('Gram-Schmidt Process')
    ax2.set_xlabel('x')
    ax2.set_ylabel('y')
    
    plt.tight_layout()
    plt.show()
    
    # Verify orthogonality
    print(f"Original vectors v1 and v2:")
    print(f"v1 = {v1}")
    print(f"v2 = {v2}")
    print(f"v1 · v2 = {v1 @ v2:.3f} (not orthogonal)")
    print(f"|v1| = {la.norm(v1):.3f}, |v2| = {la.norm(v2):.3f} (not unit length)")
    print()
    print(f"After Gram-Schmidt u1 and u2:")
    print(f"u1 = {u1}")
    print(f"u2 = {u2}")
    print(f"u1 · u2 = {u1 @ u2:.10f} (orthogonal!)")
    print(f"|u1| = {la.norm(u1):.3f}, |u2| = {la.norm(u2):.3f} (unit length!)")

# Run the visualization
visualize_gram_schmidt_2d()

In [None]:
# Test Case 1: Four linearly independent vectors in 4D space
# These vectors are chosen to be clearly non-orthogonal initially
print("🔍 Test Case 1: Four linearly independent vectors")
print("=" * 50)

V = np.array([[1, 0, 3, 5],
              [0, 2, 1, 4], 
              [4, 1, 2, 3],
              [2, -1, 4, 1]], dtype=np.float64)

print("Original matrix V:")
print(V)
print(f"Shape: {V.shape}")

# Test your gsBasis4 function
result = gsBasis4(V)
print("\nAfter Gram-Schmidt (gsBasis4):")
print(result)

# Let's verify orthogonality and normalization
print("\n📊 Verification:")
print("Column norms:", [f"{la.norm(result[:, i]):.6f}" for i in range(4)])

# Check orthogonality by computing all pairwise dot products
print("\nDot product matrix (should be close to identity):")
dot_matrix = result.T @ result
print(np.round(dot_matrix, 6))

# Check if it's close to identity matrix
is_orthonormal = np.allclose(dot_matrix, np.eye(4), atol=1e-10)
print(f"\nIs result orthonormal? {is_orthonormal}")
print(f"Maximum off-diagonal element: {np.max(np.abs(dot_matrix - np.eye(4))):.2e}")

In [None]:
# Test Case 2: Idempotency Test
# Once you've done Gram-Schmidt once, doing it again should give you the same result
print("🔄 Test Case 2: Idempotency (Running Gram-Schmidt twice)")
print("=" * 60)

U = gsBasis4(V)
U_again = gsBasis4(U)

print("Original orthonormal matrix U:")
print(np.round(U, 6))

print("\nAfter applying Gram-Schmidt again:")
print(np.round(U_again, 6))

# Check if they're the same
difference = np.max(np.abs(U - U_again))
print(f"\nMaximum difference between U and U_again: {difference:.2e}")

is_same = np.allclose(U, U_again, atol=1e-12)
print(f"Are they the same (within tolerance)? {is_same}")

if is_same:
    print("✅ Great! Gram-Schmidt is idempotent on orthonormal matrices.")
else:
    print("❌ There might be an issue with your implementation.")

In [None]:
# Test Case 3: Compare gsBasis4 with general gsBasis function
print("⚖️ Test Case 3: Comparing specific vs general implementation")
print("=" * 65)

result_specific = gsBasis4(V)
result_general = gsBasis(V)

print("Result from gsBasis4:")
print(np.round(result_specific, 6))

print("\nResult from gsBasis (general):")
print(np.round(result_general, 6))

# Check if they produce the same result
max_diff = np.max(np.abs(result_specific - result_general))
print(f"\nMaximum difference between the two methods: {max_diff:.2e}")

are_same = np.allclose(result_specific, result_general, atol=1e-12)
print(f"Do both methods give the same result? {are_same}")

if are_same:
    print("✅ Excellent! Your general implementation matches the specific one.")
else:
    print("❌ The implementations differ. Check your general gsBasis function.")

In [None]:
# Test Case 4: Non-square matrices (fewer vectors than dimensions)
print("📐 Test Case 4: Non-square matrix (4×3 - more rows than columns)")
print("=" * 70)

# 4 rows, 3 columns - vectors in 4D space but only 3 vectors
A = np.array([[2, 1, 4],
              [1, 3, -2],
              [3, 2, 1],
              [1, 4, 5]], dtype=np.float64)

print("Original matrix A (4×3):")
print(A)
print(f"Shape: {A.shape}")
print("Each column is a vector in 4D space")

result_A = gsBasis(A)
print("\nAfter Gram-Schmidt:")
print(np.round(result_A, 6))

# Verify orthonormality
print("\n📊 Analysis:")
dot_matrix_A = result_A.T @ result_A
print("Gram matrix of the result (result_A^T @ result_A):")
print(np.round(dot_matrix_A, 6))

print(f"\nColumn norms: {[f'{la.norm(result_A[:, i]):.6f}' for i in range(A.shape[1])]}")
print(f"Is orthonormal? {np.allclose(dot_matrix_A, np.eye(A.shape[1]), atol=1e-10)}")

# Check that the dimension of the span is unchanged (equal rank is necessary, though not sufficient, for spanning the same space)
original_rank = np.linalg.matrix_rank(A)
result_rank = np.linalg.matrix_rank(result_A)
print(f"\nOriginal matrix rank: {original_rank}")
print(f"Result matrix rank: {result_rank}")
print(f"Rank preserved? {original_rank == result_rank}")

In [None]:
# Test Case 5: Dimension calculation
print("🔢 Test Case 5: Calculating dimension of vector space")
print("=" * 55)

dim_A = dimensions(A)
expected_dim = np.linalg.matrix_rank(A)

print(f"Matrix A (4×3):")
print(f"Calculated dimension using our function: {dim_A}")
print(f"Expected dimension (matrix rank): {expected_dim}")
print(f"Match? {abs(dim_A - expected_dim) < 1e-10}")

# Let's also check the original 4×4 matrix
dim_V = dimensions(V)
expected_dim_V = np.linalg.matrix_rank(V)

print(f"\nMatrix V (4×4):")
print(f"Calculated dimension using our function: {dim_V}")
print(f"Expected dimension (matrix rank): {expected_dim_V}")
print(f"Match? {abs(dim_V - expected_dim_V) < 1e-10}")

print(f"\n💡 Insight: The dimension equals the number of linearly independent vectors.")

3.0

In [None]:
# Test Case 6: Wide matrix (More columns than rows)
print("📏 Test Case 6: Wide matrix (3×5 - more columns than rows)")
print("=" * 65)

# 3 rows, 5 columns - 5 vectors in 3D space (some must be linearly dependent!)
B = np.array([[1, 3, 2, 4, 7],
              [2, 1, 4, 1, 9],
              [3, 2, 1, 3, 8]], dtype=np.float64)

print("Original matrix B (3×5):")
print(B)
print(f"Shape: {B.shape}")
print("We have 5 vectors in 3D space - some must be linearly dependent!")

result_B = gsBasis(B)
print("\nAfter Gram-Schmidt:")
print(np.round(result_B, 6))

# Count non-zero columns
non_zero_cols = np.sum(np.linalg.norm(result_B, axis=0) > 1e-10)
print(f"\n📊 Analysis:")
print(f"Number of non-zero columns: {non_zero_cols}")
print(f"Expected (max rank in 3D): 3")
print(f"Column norms: {[f'{la.norm(result_B[:, i]):.6f}' for i in range(B.shape[1])]}")

# Check which columns became zero (indicating linear dependence)
zero_cols = [i for i in range(B.shape[1]) if la.norm(result_B[:, i]) < 1e-10]
if zero_cols:
    print(f"Zero columns (linearly dependent): {zero_cols}")
else:
    print("No zero columns found.")

In [None]:
# Test Case 7: Dimension of wide matrix
print("🔢 Test Case 7: Dimension of wide matrix")
print("=" * 45)

dim_B = dimensions(B)
expected_dim_B = np.linalg.matrix_rank(B)

print(f"Matrix B (3×5):")
print(f"Calculated dimension: {dim_B}")
print(f"Expected dimension (matrix rank): {expected_dim_B}")
print(f"Match? {abs(dim_B - expected_dim_B) < 1e-10}")

print(f"\n💡 Key insight: Even with 5 vectors in 3D space, the dimension is at most 3!")
print(f"   The extra vectors are linear combinations of the first {int(dim_B)} vectors.")

3.0

In [None]:
# Test Case 8: Explicit linear dependence
print("🔗 Test Case 8: Vectors with obvious linear dependence")
print("=" * 60)

# Create a matrix where the third column is exactly the first column
# This tests how well your algorithm detects linear dependence
C = np.array([[1, 0, 1],    # Third column = first column
              [0, 1, 0], 
              [2, 3, 2]], dtype=np.float64)   # Third column = first column

print("Original matrix C:")
print(C)
print("Notice: Column 3 = Column 1 (exact linear dependence)")

result_C = gsBasis(C)
print("\nAfter Gram-Schmidt:")
print(np.round(result_C, 10))

print(f"\n📊 Analysis:")
print(f"Column norms: {[f'{la.norm(result_C[:, i]):.10f}' for i in range(C.shape[1])]}")

# Check which column became zero
zero_threshold = 1e-10
zero_cols = [i for i in range(C.shape[1]) if la.norm(result_C[:, i]) < zero_threshold]
print(f"Columns that became zero: {zero_cols}")

if 2 in zero_cols:
    print("✅ Correct! The linearly dependent column (column 3) became zero.")
else:
    print("❌ Expected column 3 to become zero since it's linearly dependent.")

# Verify the remaining vectors are orthonormal
non_zero_result = result_C[:, [i for i in range(C.shape[1]) if i not in zero_cols]]
if non_zero_result.shape[1] > 0:
    gram_matrix = non_zero_result.T @ non_zero_result
    print(f"\nGram matrix of non-zero vectors:")
    print(np.round(gram_matrix, 6))
    is_orthonormal = np.allclose(gram_matrix, np.eye(non_zero_result.shape[1]), atol=1e-10)
    print(f"Are remaining vectors orthonormal? {is_orthonormal}")

In [None]:
# Test Case 9: Dimension with linear dependence
print("🔢 Test Case 9: Final dimension check")
print("=" * 40)

dim_C = dimensions(C)
expected_dim_C = np.linalg.matrix_rank(C)

print(f"Matrix C (with linear dependence):")
print(f"Calculated dimension: {dim_C}")
print(f"Expected dimension (matrix rank): {expected_dim_C}")
print(f"Match? {abs(dim_C - expected_dim_C) < 1e-10}")

print(f"\n🎯 Summary of all dimension tests:")
print(f"Matrix V (4×4): dimension = {dimensions(V)} (rank = {np.linalg.matrix_rank(V)})")
print(f"Matrix A (4×3): dimension = {dimensions(A)} (rank = {np.linalg.matrix_rank(A)})")
print(f"Matrix B (3×5): dimension = {dimensions(B)} (rank = {np.linalg.matrix_rank(B)})")
print(f"Matrix C (3×3): dimension = {dimensions(C)} (rank = {np.linalg.matrix_rank(C)})")

print(f"\n✨ If all tests pass, your Gram-Schmidt implementation is working correctly!")

2.0

---

## 🔍 Solutions and Explanations

<details>
<summary><strong>Click here to reveal the complete solutions</strong> (Try implementing it yourself first!)</summary>

### Solution for `gsBasis4(A)`

The key insight is to follow the Gram-Schmidt algorithm step by step:

```python
def gsBasis4(A):
    B = np.array(A, dtype=np.float64)
    
    # Step 1: Normalize first vector
    B[:, 0] = B[:, 0] / la.norm(B[:, 0])
    
    # Step 2: Orthogonalize and normalize second vector
    B[:, 1] = B[:, 1] - B[:, 1] @ B[:, 0] * B[:, 0]
    if la.norm(B[:, 1]) > verySmallNumber:
        B[:, 1] = B[:, 1] / la.norm(B[:, 1])
    else:
        B[:, 1] = np.zeros_like(B[:, 1])
    
    # Step 3: Orthogonalize third vector against first two
    B[:, 2] = B[:, 2] - B[:, 2] @ B[:, 0] * B[:, 0]  # Remove component parallel to u₀
    B[:, 2] = B[:, 2] - B[:, 2] @ B[:, 1] * B[:, 1]  # Remove component parallel to u₁
    
    # Normalize third vector
    if la.norm(B[:, 2]) > verySmallNumber:
        B[:, 2] = B[:, 2] / la.norm(B[:, 2])
    else:
        B[:, 2] = np.zeros_like(B[:, 2])
    
    # Step 4: Orthogonalize fourth vector against first three
    B[:, 3] = B[:, 3] - B[:, 3] @ B[:, 0] * B[:, 0]  # Remove component parallel to u₀
    B[:, 3] = B[:, 3] - B[:, 3] @ B[:, 1] * B[:, 1]  # Remove component parallel to u₁
    B[:, 3] = B[:, 3] - B[:, 3] @ B[:, 2] * B[:, 2]  # Remove component parallel to u₂
    
    # Normalize fourth vector
    if la.norm(B[:, 3]) > verySmallNumber:
        B[:, 3] = B[:, 3] / la.norm(B[:, 3])
    else:
        B[:, 3] = np.zeros_like(B[:, 3])
    
    return B
```

### Solution for `gsBasis(A)` (General Version)

The general version uses loops to avoid repetition:

```python
def gsBasis(A):
    B = np.array(A, dtype=np.float64)
    
    for i in range(B.shape[1]):
        # Remove projections onto all previous orthonormal vectors
        for j in range(i):
            B[:, i] = B[:, i] - (B[:, i] @ B[:, j]) * B[:, j]
        
        # Normalize the vector (or set to zero if linearly dependent)
        if la.norm(B[:, i]) > verySmallNumber:
            B[:, i] = B[:, i] / la.norm(B[:, i])
        else:
            B[:, i] = np.zeros_like(B[:, i])
    
    return B
```

### 🧠 Key Concepts Explained

**Vector Projection Formula**: 
The projection of vector $\mathbf{a}$ onto unit vector $\mathbf{u}$ is:
$$\text{proj}_{\mathbf{u}}\mathbf{a} = (\mathbf{a} \cdot \mathbf{u})\mathbf{u}$$

**Why This Works**:
1. **Dot product** $\mathbf{a} \cdot \mathbf{u}$ gives the "amount" of $\mathbf{a}$ in the direction of $\mathbf{u}$
2. **Multiplying by $\mathbf{u}$** converts this scalar back to a vector in the direction of $\mathbf{u}$
3. **Subtracting the projection** leaves only the component perpendicular to $\mathbf{u}$

**Linear Dependence Detection**:
If a vector becomes very small (near zero) after removing all projections, it means the vector was a linear combination of the previous vectors, so we set it to exactly zero.

**Numerical Stability**:
We use `verySmallNumber = 1e-14` as a threshold to distinguish between "truly zero" and "numerically small due to floating-point errors."
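
One caveat is that an absolute threshold implicitly assumes the input vectors have lengths of roughly order one. The sketch below shows what happens with very small inputs and one possible refinement, a threshold relative to the original vector's length. `looks_independent` and its `relative` flag are hypothetical and not part of the graded solution, which should keep the absolute test.

```python
import numpy as np
import numpy.linalg as la

very_small_number = 1e-14  # same value as verySmallNumber in the assignment

def looks_independent(residual, original, relative=False):
    """Threshold test on what is left of `original` after the projections are removed.

    With relative=True the residual is compared against the length of the
    original vector instead of against a fixed absolute number.
    """
    scale = la.norm(original) if relative else 1.0
    return bool(la.norm(residual) > very_small_number * scale)

# A single tiny vector: nothing has been subtracted yet, so residual == original.
v = np.array([3e-16, 4e-16])

print(looks_independent(v, v))                 # False: its length is below 1e-14
print(looks_independent(v, v, relative=True))  # True: the vector kept its full length
```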

</details>

---

## 🚀 Extensions and Advanced Exercises

Once you've completed the basic implementation, try these challenges to deepen your understanding:

### 🎯 Challenge Problems

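1. **Classical vs. Modified Gram-Schmidt**: The in-place loop in the solution computes each projection from the partially reduced vector, which is the update order known as "Modified Gram-Schmidt". Implement the classical variant, in which every coefficient is computed from the original $\mathbf{v}_k$ exactly as in the summation formula, and compare how well the two preserve orthogonality on nearly dependent vectors.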

2. **QR Decomposition**: Use your Gram-Schmidt function to implement QR decomposition of a matrix, where $A = QR$ with $Q$ having orthonormal columns and $R$ upper triangular.

3. **Projection Matrix**: Given an orthonormal basis, write a function to compute the projection matrix onto the subspace spanned by those vectors.

4. **Gram-Schmidt with Complex Numbers**: Extend your implementation to work with complex-valued vectors (hint: use `np.conj()` for the complex conjugate).
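
For challenges 2 and 3, it can help to know what a correct answer should satisfy before writing your own. The sketch below uses NumPy's built-in `np.linalg.qr` (which relies on LAPACK routines rather than Gram-Schmidt) purely as a reference. The matrix `A` is an arbitrary example.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Reference factorisation from NumPy, used here only to show the target properties.
Q, R = np.linalg.qr(A)

print(np.allclose(Q @ R, A))            # True: A = QR
print(np.allclose(Q.T @ Q, np.eye(2)))  # True: the columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))       # True: R is upper triangular

# Projection matrix onto the column space of A, built from the orthonormal basis.
P = Q @ Q.T
print(np.allclose(P @ P, P))            # True: projecting twice changes nothing
print(np.allclose(P @ A, A))            # True: vectors already in the subspace are unchanged
```

A Gram-Schmidt based `Q` may differ from NumPy's by the sign of some columns, so compare these properties rather than the raw entries.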

### 🧪 Create Your Own Test Cases

Try creating matrices that test edge cases:
- Nearly linearly dependent vectors (what happens?)
- Very large or very small numbers (numerical stability)
- Matrices with more extreme aspect ratios
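
As one possible starting point for the first bullet, the sketch below builds three nearly dependent columns and measures how far the output is from orthonormal under two update orders. The function names, the `modified` flag and the test matrix are assumptions made for this illustration. Unlike the assignment code, it assumes independent columns and skips the zero check.

```python
import numpy as np
import numpy.linalg as la

def gram_schmidt(A, modified):
    """Orthonormalise the columns of A (assumed linearly independent).

    modified=False: classical update, every coefficient comes from the original column.
    modified=True:  modified update, each coefficient comes from the partially reduced column.
    """
    Q = np.array(A, dtype=np.float64)
    for i in range(Q.shape[1]):
        original = Q[:, i].copy()
        for j in range(i):
            source = Q[:, i] if modified else original
            Q[:, i] = Q[:, i] - (source @ Q[:, j]) * Q[:, j]
        Q[:, i] = Q[:, i] / la.norm(Q[:, i])
    return Q

def orthogonality_error(Q):
    """Largest deviation of Q^T Q from the identity matrix."""
    return np.max(np.abs(Q.T @ Q - np.eye(Q.shape[1])))

eps = 1e-8  # small enough that 1 + eps**2 rounds to 1 in double precision
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

cgs_error = orthogonality_error(gram_schmidt(A, modified=False))
mgs_error = orthogonality_error(gram_schmidt(A, modified=True))
print(f"classical update: {cgs_error:.1e}")
print(f"modified update:  {mgs_error:.1e}")
```

In double precision the classical update should lose orthogonality badly on this matrix (an error of order one), while the modified update should stay close to the size of `eps`.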

### 📚 Applications to Explore

- **Principal Component Analysis (PCA)**: How does Gram-Schmidt relate to finding principal components?
- **Least Squares Regression**: How is orthogonalization used in solving linear regression problems?
- **Signal Processing**: How does orthogonalization help in signal separation and noise reduction?

### 💡 Reflection Questions

1. Why is it important to check for linear dependence during the process?
2. What would happen if we didn't normalize the vectors?
3. How does the order of input vectors affect the final orthonormal basis?
4. In what situations might the classical Gram-Schmidt process be numerically unstable?

In [None]:
# 🎮 Playground: Experiment with your own test cases here!
# Try creating interesting matrices and testing your implementation

# Example: Create your own test matrix
# my_matrix = np.array([[...], [...], ...], dtype=np.float64)
# result = gsBasis(my_matrix)
# print("My test result:")
# print(result)
