# ENGR 240 - Worksheet 4.2: LU Factorization Demo

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/WCC-Engineering/ENGR240/blob/main/Class%20Demos%20and%20Activities/Week%204/LU_Factorization_Demo.ipynb)

This notebook demonstrates the computational efficiency gained by using LU factorization when solving multiple linear systems with the same coefficient matrix but different right-hand sides. This is based on Task 3 from Worksheet 4.2.

## Problem Statement

Given the linear system:

$$\begin{bmatrix} 2 & -1 & 5 \\ 3 & 2 & 1 \\ 1 & -4 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} k \\ 5 \\ -4 \end{bmatrix}$$

Solve for the values of the $x$ vector that correspond to $k$ values ranging from -5 to 5 in increments of 0.0001. 

We'll compare three approaches:
1. A naive implementation using Gaussian elimination with no LU factorization
2. Explicitly using LU factorization to decompose the matrix once, then solving for each right-hand side
3. Using NumPy's `linalg.solve`, which uses LU factorization under the hood

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import time
import scipy.linalg

## Define the Problem

First, we'll define our coefficient matrix A and create an array of k values.

In [None]:
# Define the coefficient matrix A (same as in the worksheet)
A = np.array([
    [2, -1, 5],
    [3, 2, 1],
    [1, -4, 2]
], dtype=float)

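# Create k values from -5 to 5 in steps of 0.0001 (100,001 points).
# np.linspace includes both endpoints exactly; np.arange with a float step
# can gain or lose the last point to rounding.
k = np.linspace(-5, 5, 100001)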
print(f"Matrix A:\n{A}")
print(f"Number of k values to process: {len(k)}")

## Method 1: Naive Implementation (Gaussian Elimination)

First, let's implement a naive Gaussian elimination algorithm without explicitly using LU factorization. This will show the baseline performance without any matrix decomposition.

In [None]:
def gaussian_elimination(A, b):
    """Solve Ax = b using Gaussian elimination without pivoting.

    Assumes every pivot is nonzero; a zero pivot causes a division by zero.
    """
    n = len(b)
    # Make copies to avoid modifying the original arrays
    A_work = A.copy()
    b_work = b.copy()
    
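    # Forward elimination (the loop variable is named col so it is not
    # confused with the global array k of right-hand-side values)
    for col in range(n-1):
        for i in range(col+1, n):
            factor = A_work[i, col] / A_work[col, col]
            b_work[i] -= factor * b_work[col]
            for j in range(col, n):
                A_work[i, j] -= factor * A_work[col, j]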
    
    # Back substitution
    x = np.zeros(n)
    for i in range(n-1, -1, -1):
        sum_ax = 0
        for j in range(i+1, n):
            sum_ax += A_work[i, j] * x[j]
        x[i] = (b_work[i] - sum_ax) / A_work[i, i]
    
    return x

# Test the implementation works
test_b = np.array([1, 5, -4], dtype=float)
test_x_naive = gaussian_elimination(A, test_b)
test_x_numpy = np.linalg.solve(A, test_b)
print(f"Test solution using naive implementation: {test_x_naive}")
print(f"Test solution using numpy.linalg.solve: {test_x_numpy}")
print(f"Difference: {np.abs(test_x_naive - test_x_numpy).max()}")
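
The function above divides by each pivot without checking it, so it fails whenever a zero lands on the diagonal. Here is a minimal sketch of the same algorithm with partial pivoting (swapping in the row with the largest entry in the current column). The names `gaussian_elimination_pivot`, `A_zero`, and `b_zero` are illustrative and not part of the worksheet.

```python
import numpy as np

def gaussian_elimination_pivot(A, b):
    """Solve Ax = b using Gaussian elimination with partial pivoting (sketch)."""
    A_work = np.array(A, dtype=float)  # np.array copies, so inputs are untouched
    b_work = np.array(b, dtype=float)
    n = len(b_work)

    for col in range(n - 1):
        # Find the row (at or below the diagonal) with the largest entry in this column
        p = col + np.argmax(np.abs(A_work[col:, col]))
        if A_work[p, col] == 0:
            raise np.linalg.LinAlgError("Matrix is singular")
        if p != col:
            # Swap rows in both the matrix and the right-hand side
            A_work[[col, p]] = A_work[[p, col]]
            b_work[[col, p]] = b_work[[p, col]]
        for i in range(col + 1, n):
            factor = A_work[i, col] / A_work[col, col]
            A_work[i, col:] -= factor * A_work[col, col:]
            b_work[i] -= factor * b_work[col]

    # Back substitution
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b_work[i] - A_work[i, i + 1:] @ x[i + 1:]) / A_work[i, i]
    return x

# A system whose first pivot is zero: the unpivoted version would divide by zero
A_zero = np.array([[0, 1], [1, 1]], dtype=float)
b_zero = np.array([1, 2], dtype=float)
x_zero = gaussian_elimination_pivot(A_zero, b_zero)
print(f"Solution with pivoting: {x_zero}")
```

The worksheet matrix has no zero pivots, so the unpivoted function is fine for this demo. Pivoting matters for general matrices.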

In [None]:
# Solve using naive Gaussian elimination
start_time = time.perf_counter()  # monotonic, high-resolution clock for benchmarking
xk_naive = np.empty((len(k), 3))  # preallocate space for solutions (every row is overwritten)

for ndx in range(len(k)):
    # Build RHS vector using the current value of k
    b = np.array([k[ndx], 5, -4], dtype=float)

    # Solve using our Gaussian elimination function
    x = gaussian_elimination(A, b)

    # Store the solution
    xk_naive[ndx, :] = x

naive_solve_time = time.perf_counter() - start_time
print(f"Time taken with naive Gaussian elimination: {naive_solve_time:.4f} seconds")

## Method 2: Explicit LU Factorization

Now, let's use explicit LU factorization where we compute the decomposition once and reuse it for all right-hand sides.

In [None]:
# With explicit LU factorization
start_time = time.perf_counter()  # monotonic, high-resolution clock for benchmarking
xk_LU = np.empty((len(k), 3))  # preallocate space for solutions (every row is overwritten)

# Compute LU factorization of A once - this is the key step
lu, piv = scipy.linalg.lu_factor(A)

for ndx in range(len(k)):
    # Build RHS vector using the current value of k
    b = np.array([k[ndx], 5, -4], dtype=float)

    # Solve the system using the factorized matrix (substitution only)
    x = scipy.linalg.lu_solve((lu, piv), b)

    # Store the solution
    xk_LU[ndx, :] = x

lu_solve_time = time.perf_counter() - start_time
print(f"Time taken with explicit LU factorization: {lu_solve_time:.4f} seconds")

## Method 3: NumPy's linalg.solve (LU Under the Hood)

Finally, let's use NumPy's built-in solver, which uses LU factorization internally but recalculates it for each system.

In [None]:
# Using NumPy's linalg.solve
start_time = time.perf_counter()  # monotonic, high-resolution clock for benchmarking
xk_numpy = np.empty((len(k), 3))  # preallocate space for solutions (every row is overwritten)

for ndx in range(len(k)):
    # Build RHS vector using the current value of k
    b = np.array([k[ndx], 5, -4], dtype=float)

    # Solve the system using numpy's solver (refactors A on every call)
    x = np.linalg.solve(A, b)

    # Store the solution
    xk_numpy[ndx, :] = x

numpy_solve_time = time.perf_counter() - start_time
print(f"Time taken with numpy.linalg.solve: {numpy_solve_time:.4f} seconds")

## Comparison and Performance Analysis

In [None]:
print(f"Method 1: Naive Gaussian elimination time:  {naive_solve_time:.4f} seconds")
print(f"Method 2: Explicit LU factorization time:   {lu_solve_time:.4f} seconds")
print(f"Method 3: NumPy's linalg.solve time:        {numpy_solve_time:.4f} seconds")

print(f"\nSpeed improvement with explicit LU vs. naive: {naive_solve_time/lu_solve_time:.2f}x faster")
print(f"Speed improvement with NumPy vs. naive:        {naive_solve_time/numpy_solve_time:.2f}x faster")
print(f"Speed ratio of explicit LU vs. NumPy:          {lu_solve_time/numpy_solve_time:.2f}x")

# Verify that solutions match
print(f"\nMaximum difference between naive and LU solutions: {np.max(np.abs(xk_naive - xk_LU)):.2e}")
print(f"Maximum difference between naive and NumPy solutions: {np.max(np.abs(xk_naive - xk_numpy)):.2e}")
print(f"Maximum difference between LU and NumPy solutions: {np.max(np.abs(xk_LU - xk_numpy)):.2e}")

## Explanation of Results

You'll likely notice that Method 2 (explicit LU) is significantly faster than Method 1 (naive Gaussian elimination), but Method 3 (NumPy's solver) might be close to Method 2's performance or possibly even faster. This happens because:

1. **Method 1**: Our naive implementation performs the full O(n³) Gaussian elimination process for each system, with pure Python loops (which are slow).

2. **Method 2**: We explicitly perform LU factorization once (an O(n³) operation), and then use it to solve each system with only O(n²) operations. The implementation is optimized C/Fortran code in SciPy.

3. **Method 3**: NumPy's `linalg.solve` uses optimized LAPACK routines that perform LU decomposition internally for each system. While it's recalculating the decomposition each time (O(n³)), the underlying implementation is highly optimized in compiled code, making it much faster than our Python implementation.

For small matrices (like our 3×3 example), the arithmetic is so cheap that the fixed overhead of each Python function call dominates the runtime of Methods 2 and 3. Their timings therefore say more about call overhead than about the O(n³) versus O(n²) difference. With larger matrices and many right-hand sides, the advantage of factoring once (Method 2) becomes more apparent.
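
To see the effect of matrix size, here is a rough sketch that repeats the Method 2 versus Method 3 comparison on a larger random system. The sizes `n = 300` and `m = 200`, the random test matrix, and all variable names are assumptions chosen for illustration. The timings will vary from machine to machine.

```python
import time
import numpy as np
import scipy.linalg

rng = np.random.default_rng(0)
n, m = 300, 200  # hypothetical matrix size and number of right-hand sides

# Random matrix with a shifted diagonal so it stays well conditioned
A_big = rng.standard_normal((n, n)) + n * np.eye(n)
B_big = rng.standard_normal((m, n))  # each row is one right-hand side

# Refactor on every call (like Method 3)
start = time.perf_counter()
X_resolve = np.array([np.linalg.solve(A_big, b) for b in B_big])
t_resolve = time.perf_counter() - start

# Factor once, then reuse the factorization (like Method 2)
start = time.perf_counter()
lu_piv = scipy.linalg.lu_factor(A_big)
X_reuse = np.array([scipy.linalg.lu_solve(lu_piv, b) for b in B_big])
t_reuse = time.perf_counter() - start

print(f"Refactor every call: {t_resolve:.4f} seconds")
print(f"Factor once, reuse:  {t_reuse:.4f} seconds")
print(f"Max difference:      {np.max(np.abs(X_resolve - X_reuse)):.2e}")
```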

## Visualizing the Solutions

Let's visualize how the solution components ($x_1$, $x_2$, $x_3$) change with the parameter $k$:

In [None]:
plt.figure(figsize=(12, 6))
plt.plot(k, xk_LU[:, 0], label='$x_1$')
plt.plot(k, xk_LU[:, 1], label='$x_2$')
plt.plot(k, xk_LU[:, 2], label='$x_3$')
plt.grid(True)
plt.xlabel('k value')
plt.ylabel('Solution components')
plt.title('Solution components vs. parameter k')
plt.legend()
plt.xlim(-5, 5)
plt.show()
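
Each curve in this plot is a straight line. Because $k$ enters only the first component of the right-hand side, the solution is $x(k) = x_{\text{base}} + k\,d$, where $A x_{\text{base}} = [0, 5, -4]^T$ and $A d = [1, 0, 0]^T$. The sketch below checks this on a few $k$ values. The names `x_base`, `d`, and `k_vals` are illustrative.

```python
import numpy as np

A = np.array([[2, -1, 5], [3, 2, 1], [1, -4, 2]], dtype=float)

# Solution for k = 0, and the change in the solution per unit of k
x_base = np.linalg.solve(A, np.array([0.0, 5.0, -4.0]))
d = np.linalg.solve(A, np.array([1.0, 0.0, 0.0]))

# Build every solution from just those two solves
k_vals = np.linspace(-5, 5, 11)
X_linear = x_base + np.outer(k_vals, d)

# Compare against solving each system directly
X_direct = np.array([np.linalg.solve(A, np.array([kv, 5.0, -4.0])) for kv in k_vals])
print(f"Max difference: {np.max(np.abs(X_linear - X_direct)):.2e}")
```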

## Discussion

### Computational Efficiency

The results demonstrate the efficiency of using LU factorization when solving multiple linear systems with the same coefficient matrix but different right-hand sides. Here's why this happens:

1. **Naive Gaussian Elimination**: For each new right-hand side, we perform the full O(n³) elimination process.

2. **LU Factorization Method**: We perform LU factorization once (an O(n³) operation), but for each new right-hand side, we only need to perform forward and backward substitution (each an O(n²) operation).

3. **NumPy's linalg.solve**: While it uses optimized LAPACK routines under the hood (which use LU decomposition), it recalculates the decomposition for each right-hand side. Its speed comes from being implemented in highly optimized compiled code.

For solving multiple systems with the same coefficient matrix, explicitly factoring the matrix once and reusing it (Method 2) is theoretically more efficient. This advantage becomes more pronounced with larger matrices and more systems to solve.
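
As a rough operation count (leading-order estimates, which are only approximate for a matrix as small as 3×3), factoring an $n \times n$ matrix costs about $\tfrac{2}{3}n^3$ floating-point operations, and each forward-plus-backward substitution costs about $2n^2$. For $m$ right-hand sides:

$$\text{refactor every time} \approx m\left(\tfrac{2}{3}n^3 + 2n^2\right), \qquad \text{factor once} \approx \tfrac{2}{3}n^3 + 2mn^2$$

$$\frac{m\left(\tfrac{2}{3}n^3 + 2n^2\right)}{\tfrac{2}{3}n^3 + 2mn^2} \;\longrightarrow\; 1 + \frac{n}{3} \quad \text{as } m \to \infty$$

For $n = 3$ this ratio is only about 2, so any much larger speedup over Method 1 comes mostly from compiled code replacing Python loops. For $n = 300$ the ratio is about 101.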

### NumPy Implementation Details

NumPy's `linalg.solve` calls LAPACK routines like DGESV (for double precision) which perform:
1. LU factorization with partial pivoting
2. Forward and backward substitution

However, `linalg.solve` has no mechanism to reuse a factorization across separate function calls, so every call repeats the O(n³) factorization. This is why, when many right-hand sides are solved one call at a time, explicitly using SciPy's `lu_factor` and `lu_solve` can be more efficient.
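
One caveat: within a single call, both `np.linalg.solve` and `scipy.linalg.lu_solve` accept a matrix whose columns are separate right-hand sides, so the matrix is factored only once for all of them. Here is a minimal sketch for this worksheet's problem, assuming all right-hand sides are known up front. The names `k_vals`, `B`, `X_numpy`, and `X_scipy` are illustrative.

```python
import numpy as np
import scipy.linalg

A = np.array([[2, -1, 5], [3, 2, 1], [1, -4, 2]], dtype=float)
k_vals = np.linspace(-5, 5, 100001)

# Stack the right-hand sides as columns: B has shape (3, number of k values)
B = np.vstack([
    k_vals,
    5.0 * np.ones_like(k_vals),
    -4.0 * np.ones_like(k_vals),
])

# One call solves every column; transpose so each row is one solution
X_numpy = np.linalg.solve(A, B).T

# The same idea with an explicit factorization
X_scipy = scipy.linalg.lu_solve(scipy.linalg.lu_factor(A), B).T

print(f"Solution array shape: {X_numpy.shape}")
print(f"Max difference: {np.max(np.abs(X_numpy - X_scipy)):.2e}")
```

This removes the Python loop entirely, which for a 3×3 matrix is likely where most of the time goes.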

### Math Behind LU Factorization

LU factorization decomposes a matrix A into a product of a lower triangular matrix L and an upper triangular matrix U:

$$A = LU$$

When partial pivoting is used (which is typical for numerical stability), we have:

$$PA = LU$$

where P is a permutation matrix.

To solve the system $Ax = b$, we:

1. Compute the LU factorization $PA = LU$
2. Solve $Ly = Pb$ using forward substitution
3. Solve $Ux = y$ using backward substitution
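
The three steps above can be carried out explicitly with SciPy, as a sketch. Note that `scipy.linalg.lu` returns a permutation matrix `P_s` with $A = P_s L U$, so the $P$ in $PA = LU$ is its transpose. The variable names here are illustrative.

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

A = np.array([[2, -1, 5], [3, 2, 1], [1, -4, 2]], dtype=float)
b = np.array([1, 5, -4], dtype=float)

# Step 1: factor. SciPy gives A = P_s @ L @ U, so P = P_s.T satisfies P @ A = L @ U
P_s, L, U = lu(A)
P = P_s.T

# Step 2: forward substitution, solving L y = P b
y = solve_triangular(L, P @ b, lower=True)

# Step 3: backward substitution, solving U x = y
x = solve_triangular(U, y, lower=False)

print(f"P @ A equals L @ U: {np.allclose(P @ A, L @ U)}")
print(f"Solution x: {x}")
print(f"Residual max |Ax - b|: {np.max(np.abs(A @ x - b)):.2e}")
```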

### Applications

This technique is valuable in many engineering applications, such as:

- Finite element analysis with multiple load cases
- Circuit analysis with multiple sources
- Parameter sensitivity studies (as demonstrated in this worksheet)
- Control system analysis with varying inputs