# Numerical Linear Algebra [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ua-2025q3-astr501-513/ua-2025q3-astr501-513.github.io/blob/main/513/02/notes.ipynb)

[![Matrix transform](fig/matrix_transform.png)](https://xkcd.com/184/)

```{note} TAP Computation and Data Initiative Meeting

Date: Every Thursday  
Time: 2-3pm  
Room: SO N305  
Zoom: [one-click](https://arizona.zoom.us/j/88694275321?pwd=XiFa1kbUVl90MYtoAa47W6FCcuRowU.1), id: 886 9427 5321, password: tapcdi  
Schedule: [Google Sheet](https://docs.google.com/spreadsheets/d/1VQkQGZYwSEJ_N6UIHJQ-Tjvn02k9rClImgCCYo4ucrg/edit?usp=sharing)

Upcoming topic: "Book keeping of your simulations (or large data sets)"
```

```{note} HPC Workshop

UA HPC is offering HPC workshops this Fall:

| Date | Time | Session |
| --- | --- | --- |
| Friday Sep 12th | 10am-3pm | Introduction to HPC |
| Friday Sep 19th | 10am-3pm | Software on HPC |
| Friday Sep 26th | 10am-3pm | Machine Learning and GPUs |

Register with this
[Google Form](https://docs.google.com/forms/d/e/1FAIpQLSfjRhn1xF7wcd6G_wyVKtdYqosxxPaM_2V-nfTJZa8BXEe5lA/viewform).
```

```{admonition} Homework Set #1

Use this GitHub Classroom Link:
https://classroom.github.com/a/r-eqz-mO
to accept it.

Please make sure you merge from the upstream repository so all the
autograding and template are in place.
```

Linear algebra is a fundamental part of modern mathematics.
It supports fields from calculus and statistics to geometry, control
theory, and functional analysis.
Most linear systems are well understood.
Even nonlinear problems are often studied through linear
approximations.

Numerical linear algebra extends these ideas to computation, enabling
solutions of PDEs, optimization tasks, eigenvalue problems, and
more.
It addresses some of the hardest problems in physics and engineering.

### Motivations from Physics

* Normal Modes:
  Vibrations near equilibrium reduce to generalized eigenvalue
  problems.
  Linear algebra therefore reveals resonance in materials, acoustics,
  and plasma waves.

* Quantum Mechanics:
  Described by the Schrödinger equation, quantum systems are
  inherently linear.

* Discretized PDEs:
  Discretizing PDEs yields large sparse linear systems.
  They can be solved numerically by methods such as conjugate gradient.

* Nonlinear Problems:
  Nonlinear physics problems, including turbulence, are sometimes
  intractable.
  Linearizing them with perturbation theory reduces them to sequences
  of linear systems.

### Motivations from Computation

* Large-Scale Data:
  Modern sensors and simulations produce massive datasets.
  Matrix decompositions (e.g., SVD, PCA) provide compression, noise
  reduction, and feature extraction.

* Neural Networks:
  The core operations in training, e.g., backpropagation, are
  dominated by large matrix multiplications.
  Efficient linear algebra routines are therefore critical for scaling
  deep learning.

* Hardware Accelerators:
  GPUs and TPUs are optimized for matrix operations, making vectorized
  linear algebra essential for both neural networks and scientific
  computing.

```{note}

> If everything were linear, we wouldn't need computers.
> <div style="text-align: right">- Arrogant mathematicians, including CK many years ago ...</div>

The above quote suggests that in a purely linear world, everything
would be easy to solve analytically.
However, this is an oversimplification.

Even perfectly linear problems can pose significant computational
challenges due to two key factors.
* High dimensionality makes solving linear systems computationally
  intensive.
  For example, systems with millions of unknowns are common in
  numerical PDEs or massive machine learning models.
  Processing such large-scale data requires significant computational
  power, regardless of linearity.
* Real-world computations face constraints from finite precision.
  Hardware limitations, such as floating-point arithmetic, introduce
  numerical stability and conditioning challenges, even in linear
  systems.
  Addressing these issues requires robust algorithms to ensure
  accurate and efficient solutions.

Some of the most exciting recent developments in numerical analysis
apply randomized algorithms to large-scale linear algebra problems.
See [this reference](https://arxiv.org/pdf/2402.17873) for an
introductory course.
```
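
To make the conditioning point concrete, here is a minimal sketch using the Hilbert matrix, a standard textbook example of an ill-conditioned matrix that is not part of the discussion above.
The size `n = 10` and the all-ones reference solution are assumptions chosen for illustration.
The system is perfectly linear and tiny, yet the computed solution loses many digits compared to machine precision.

```python
import numpy as np

n = 10                                    # assumed size, small on purpose
i = np.arange(1, n + 1)
H = 1.0 / (i[:, None] + i[None, :] - 1)   # Hilbert matrix: H[i, j] = 1 / (i + j - 1)

x_true = np.ones(n)                       # reference solution chosen by hand
b = H @ x_true                            # right-hand side consistent with x_true
x = np.linalg.solve(H, b)                 # solve in double precision

cond = np.linalg.cond(H)                  # 2-norm condition number
err = np.max(np.abs(x - x_true))          # worst-case error in any component

print("condition number:", cond)
print("max error       :", err)
```

The condition number is roughly $10^{13}$, so rounding errors at the level of machine epsilon can be amplified by many orders of magnitude.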

## Direct Solvers

Direct methods are often the first approach taught for solving linear
systems $A\mathbf{x} = \mathbf{b}$.
They rely on algebraic factorizations that finish in a fixed number of
operations, roughly $\mathcal{O}(n^3)$ for an $n \times n$ matrix.
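
As a rough sketch of where the cubic scaling comes from, consider the forward elimination stage of Gaussian elimination, described in the next section.
Step $k$ updates the $n-k$ rows below the pivot, and each row update touches about $n-k$ entries with one multiplication and one subtraction each.
Counting only these leading-order operations (an approximation that ignores the right-hand side and the divisions) gives

\begin{align}
  \sum_{k=1}^{n-1} 2(n-k)^2 = \frac{(n-1)\,n\,(2n-1)}{3} \approx \frac{2}{3} n^3.
\end{align}

Back-substitution adds only $\mathcal{O}(n^2)$ operations, so the elimination dominates the cost.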

### Gaussian Elimination

Gaussian Elimination transforms the system $A \mathbf{x} = \mathbf{b}$
into an equivalent upper-triangular form $U \mathbf{x} = \mathbf{c}$
by systematically applying row operations.  Once in an
upper-triangular form, one can perform back-substitution to solve for
$\mathbf{x}$.

1. Row Operations
   * Subtract a multiple of one row from another to eliminate entries
     below the main diagonal.
   * Aim to create zeros in column $j$ below row $j$.

2. Partial Pivoting
   * When a pivot (diagonal) element is small (or zero), swap the
     current row with the row below it whose entry in the same column
     has the largest absolute value.
   * This step mitigates numerical instability by reducing the chance
     that small pivots lead to large rounding errors in subsequent
     operations.

3. Result
   * After eliminating all sub-diagonal entries, the matrix is in
     upper-triangular form $U$.
   * Solve $U\mathbf{x} = \mathbf{c}$ via back-substitution.
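
The effect of a tiny pivot can be seen in a $2 \times 2$ system.
The sketch below is only an illustration: the helper `eliminate_2x2` and the value `eps = 1e-20` are assumptions introduced here.
The exact solution is very close to $(1, 1)$, but eliminating with the tiny pivot loses the first component entirely, while swapping the two rows first recovers it.

```python
import numpy as np

def eliminate_2x2(A, b):
    """Solve a 2x2 system with one elimination step and back-substitution (no pivoting)."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    factor = A[1, 0] / A[0, 0]            # multiplier built from the pivot A[0, 0]
    A[1, :] -= factor * A[0, :]
    b[1]    -= factor * b[0]
    x1 = b[1] / A[1, 1]
    x0 = (b[0] - A[0, 1] * x1) / A[0, 0]
    return np.array([x0, x1])

eps = 1e-20                               # assumed tiny pivot
A = [[eps, 1.0],
     [1.0, 1.0]]
b = [1.0, 2.0]

x_small_pivot = eliminate_2x2(A, b)              # uses eps as the pivot
x_swapped     = eliminate_2x2(A[::-1], b[::-1])  # rows swapped: pivot is 1.0

print(x_small_pivot, "tiny pivot")
print(x_swapped,     "rows swapped")
```

With the tiny pivot the multiplier is $10^{20}$, and adding it to numbers of order one wipes out their information in double precision.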


Here is an
[example](https://en.wikipedia.org/wiki/Gaussian_elimination):

```{list-table}
:header-rows: 1
* + System of equations
  + Row operations
  + Augmented matrix

* + \begin{alignat}{4}
       2x &{}+{}& y &{}-{}&  z &{}={}&   8 & \\
      -3x &{}-{}& y &{}+{}& 2z &{}={}& -11 & \\
      -2x &{}+{}& y &{}+{}& 2z &{}={}&  -3 &
    \end{alignat}
  +
  + \begin{align}
    \left[\begin{array}{rrr|r}
       2 &  1 & -1 &   8 \\
      -3 & -1 &  2 & -11 \\
      -2 &  1 &  2 &  -3
    \end{array}\right]
    \nonumber
    \end{align}

* + \begin{alignat}{4}
      2x &{}+{}&          y &{}-{}&          z &{}={}& 8 & \\
         &     & \tfrac12 y &{}+{}& \tfrac12 z &{}={}& 1 & \\
         &     &         2y &{}+{}&          z &{}={}& 5 &
    \end{alignat}
  + \begin{align}
      L_2 + \tfrac32 L_1 &\to L_2 \\
      L_3 +          L_1 &\to L_3
    \end{align}
  + \begin{align}
    \left[\begin{array}{rrr|r}
      2 &      1  &     -1  & 8 \\
      0 & \frac12 & \frac12 & 1 \\
      0 &      2  &      1  & 5
    \end{array}\right]
    \end{align}

* + \begin{alignat}{4}
      2x &{}+{}&          y &{}-{}&          z &{}={}& 8 & \\
         &     & \tfrac12 y &{}+{}& \tfrac12 z &{}={}& 1 & \\
         &     &            &     &         -z &{}={}& 1 &
    \end{alignat}
  + \begin{align}
      L_3 - 4 L_2 \to L_3
    \end{align}
  + \begin{align}
    \left[\begin{array}{rrr|r}
      2 &      1  &     -1  & 8 \\
      0 & \frac12 & \frac12 & 1 \\
      0 &      0  &     -1  & 1
    \end{array}\right]
    \end{align}
```

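The matrix is now in row echelon form (also called triangular form).
Further row operations carry out the back-substitution and bring it to
reduced row echelon form, from which the solution can be read off
directly: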

```{list-table}
:header-rows: 1
* + System of equations
  + Row operations
  + Augmented matrix

* + \begin{alignat}{4}
      2x &{}+{}&          y &     &   &{}={}&       7  & \\
         &     & \tfrac12 y &     &   &{}={}& \tfrac32 & \\
         &     &            &{}-{}& z &{}={}&       1  &
    \end{alignat}
  + \begin{align}
      L_1 -          L_3 &\to L_1\\
      L_2 + \tfrac12 L_3 &\to L_2
    \end{align}
  + \begin{align}
    \left[\begin{array}{rrr|r}
      2 &      1  &  0 &      7  \\
      0 & \frac12 &  0 & \frac32 \\
      0 &      0  & -1 &      1
    \end{array}\right]
    \end{align}

* + \begin{alignat}{4}
      2x &{}+{}& y &\quad&   &{}={}&  7 & \\
         &     & y &\quad&   &{}={}&  3 & \\
         &     &   &\quad& z &{}={}& -1 &
    \end{alignat}
  + \begin{align}
       2 L_2 &\to L_2 \\
      -L_3 &\to L_3
    \end{align}
  + \begin{align}
    \left[\begin{array}{rrr|r}
      2 & 1 & 0 &  7 \\
      0 & 1 & 0 &  3 \\
      0 & 0 & 1 & -1
    \end{array}\right]
    \end{align}

* + \begin{alignat}{4}
      x &\quad&   &\quad&   &{}={}&  2 & \\
        &\quad& y &\quad&   &{}={}&  3 & \\
        &\quad&   &\quad& z &{}={}& -1 &
    \end{alignat}
  + \begin{align}
               L_1 - L_2 &\to L_1 \\
      \tfrac12 L_1       &\to L_1
    \end{align}
  + \begin{align}
    \left[\begin{array}{rrr|r}
      1 & 0 & 0 &  2 \\
      0 & 1 & 0 &  3 \\
      0 & 0 & 1 & -1
    \end{array}\right]
    \end{align}
```
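
As a quick check of the worked example, the same row operations can be replayed on the augmented matrix with NumPy.
This is only a sketch; the array name `M` is an assumption introduced here.

```python
import numpy as np

# Augmented matrix [A | b] of the example system
M = np.array([[ 2.0,  1.0, -1.0,   8.0],
              [-3.0, -1.0,  2.0, -11.0],
              [-2.0,  1.0,  2.0,  -3.0]])

# Forward elimination, same row operations as in the first table
M[1] += 1.5 * M[0]   # L2 + (3/2) L1 -> L2
M[2] += M[0]         # L3 + L1 -> L3
M[2] -= 4.0 * M[1]   # L3 - 4 L2 -> L3

# Back-substitution written as row operations, as in the second table
M[0] -= M[2]         # L1 - L3 -> L1
M[1] += 0.5 * M[2]   # L2 + (1/2) L3 -> L2
M[1] *= 2.0          # 2 L2 -> L2
M[2] *= -1.0         # -L3 -> L3
M[0] -= M[1]         # L1 - L2 -> L1
M[0] *= 0.5          # (1/2) L1 -> L1

print(M)             # left block is the identity, last column is (x, y, z)
```

The last column should agree with the solution $x = 2$, $y = 3$, $z = -1$ found above.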

Below is a simple implementation of naive (no pivoting) Gaussian
elimination in Python.
Although we normally avoid explicit `for` loops in Python for
performance reasons, we use them here so that the code directly
mirrors the algorithm we just described.

In [None]:
import numpy as np

def solve_Gaussian(A, b):
    """
    Perform naive (no pivoting) Gaussian elimination to solve the
    matrix equation A x = b.
    Returns the solution vector x.
    """
    assert A.ndim == 2 and A.shape[0] == A.shape[1]  # must be square matrix
    assert b.ndim == 1 and b.shape[0] == A.shape[1]  # must be a vector
    
    A = A.astype(float)  # ensure floating-point, create copy by default
    b = b.astype(float)  # ensure floating-point, create copy by default
    n = b.shape[0]

    # Forward elimination
    for k in range(n-1):
        for i in range(k+1, n):
            if A[k, k] == 0:
                raise ValueError("Zero pivot encountered (no pivoting).")
            factor    = A[i, k] / A[k, k]
            A[i, k:] -= factor * A[k, k:]
            b[i]     -= factor * b[k]

    # Back-substitution
    x = np.zeros(n)
    for i in reversed(range(n)):
        s = b[i]
        for j in range(i+1, n):
            s -= A[i, j] * x[j]
        x[i] = s / A[i, i]

    return x

In [None]:
A = np.random.random((3, 3))
b = np.random.random(3)

x_naive = solve_Gaussian(A, b)
print(x_naive)
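
The remark about avoiding `for` loops can be illustrated on the back-substitution step.
In the sketch below, the inner loop over `j` is replaced by a dot product of array slices; the helper name `back_substitute` is an assumption introduced here, and the triangular system is the one from the worked example.

```python
import numpy as np

def back_substitute(U, c):
    """Solve U x = c for an upper-triangular U with nonzero diagonal."""
    n = c.shape[0]
    x = np.zeros(n)
    for i in reversed(range(n)):
        # The dot product of slices replaces the explicit inner loop over j
        x[i] = (c[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# Upper-triangular system obtained in the worked example
U = np.array([[2.0, 1.0, -1.0],
              [0.0, 0.5,  0.5],
              [0.0, 0.0, -1.0]])
c = np.array([8.0, 1.0, 1.0])

x_back = back_substitute(U, c)
print(x_back)
```

The outer loop remains because each `x[i]` depends on the entries computed before it.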

Let's now improve the above naive (no pivoting) Gaussian elimination by adding pivoting.

In [None]:
# HANDSON: improve the above naive (no pivoting) Gaussian elimination
#          by adding pivoting.

def solve_Gaussian_pivot(A, b):
    """
    Perform Gaussian elimination with partial pivoting to solve
    the matrix equation A x = b.
    Returns the solution vector x.
    """
    assert A.ndim == 2 and A.shape[0] == A.shape[1]  # must be square matrix
    assert b.ndim == 1 and b.shape[0] == A.shape[1]  # must be a vector
    
    A = A.astype(float)  # ensure floating-point, create copy by default
    b = b.astype(float)  # ensure floating-point, create copy by default
    n = b.shape[0]

    # Forward elimination
    for k in range(n-1):
        # TODO: pivoting: find max pivot in column k

        # TODO: swap rows if needed
        
        for i in range(k+1, n):
            ### Once pivoting is implemented, a zero pivot can only
            ### occur if A is singular
            # if A[k, k] == 0:
            #     raise ValueError("Zero pivot encountered (no pivoting).")
            factor    = A[i, k] / A[k, k]
            A[i, k:] -= factor * A[k, k:]
            b[i]     -= factor * b[k]

    # Back-substitution
    x = np.zeros(n)
    for i in reversed(range(n)):
        s = b[i]
        for j in range(i+1, n):
            s -= A[i, j] * x[j]
        x[i] = s / A[i, i]

    return x

In [None]:
x_pivot = solve_Gaussian_pivot(A, b)
print(x_pivot)
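
One possible way to fill in the two TODO steps is sketched below; try the exercise yourself first.
The function name `solve_Gaussian_pivot_sketch` is hypothetical, chosen so that it does not overwrite the hands-on version, and the singularity checks are an addition that the exercise does not ask for.

```python
import numpy as np

def solve_Gaussian_pivot_sketch(A, b):
    """
    One possible completion of the hands-on exercise (hypothetical name):
    Gaussian elimination with partial pivoting for A x = b.
    """
    A = np.array(A, dtype=float)  # work on floating-point copies
    b = np.array(b, dtype=float)
    n = b.shape[0]

    # Forward elimination
    for k in range(n-1):
        # Pivoting: row at or below k with the largest |entry| in column k
        p = k + np.argmax(np.abs(A[k:, k]))
        if A[p, k] == 0:
            raise ValueError("Matrix is singular.")
        # Swap rows k and p of both A and b if needed
        if p != k:
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
        for i in range(k+1, n):
            factor    = A[i, k] / A[k, k]
            A[i, k:] -= factor * A[k, k:]
            b[i]     -= factor * b[k]

    if A[n-1, n-1] == 0:
        raise ValueError("Matrix is singular.")

    # Back-substitution
    x = np.zeros(n)
    for i in reversed(range(n)):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

# A system whose first pivot is exactly zero: the naive version would fail here
A0 = np.array([[0.0, 1.0],
               [1.0, 1.0]])
b0 = np.array([1.0, 2.0])
x0 = solve_Gaussian_pivot_sketch(A0, b0)
print(x0)
```

The test system has a zero in the top-left corner, so it can only be solved after the row swap.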

Let's also compare with NumPy's solver, `np.linalg.solve`.

In [None]:
x_naive = solve_Gaussian(A, b)
x_pivot = solve_Gaussian_pivot(A, b)
x_numpy = np.linalg.solve(A, b)

print(x_naive, "naive")
print(x_pivot, "pivot")
print(x_numpy, "numpy")
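
Printed vectors are hard to compare by eye, so it can help to look at the residual $A\mathbf{x} - \mathbf{b}$ instead.
The sketch below is self-contained and uses only `np.linalg.solve`; the fixed seed and the particular normalization of the residual are assumptions made for illustration, and the same check can be applied to any of the solutions above.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, assumed here for reproducibility
A = rng.random((3, 3))
b = rng.random(3)

x = np.linalg.solve(A, b)

# Residual of the computed solution, scaled by the size of the problem data
residual = np.linalg.norm(A @ x - b)
relative = residual / (np.linalg.norm(A) * np.linalg.norm(x) + np.linalg.norm(b))

print("relative residual:", relative)
print("A x close to b?  :", np.allclose(A @ x, b))
```

A relative residual near machine precision indicates that the computed `x` solves a system very close to the one that was posed.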