# Systems of Linear Equations

## 02.01 Systems of Linear Equations

#### Matrix-Vector Product

Think of the product $Ax$ as a linear combination of the columns of $A$, with the entries of $x$ as the weights.

$
\begin{align}
Ax &=
\begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots &  & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n \\
\end{bmatrix} \\
&= 
x_1
\begin{bmatrix}
a_{1,1} \\
\vdots \\
a_{m,1} \\
\end{bmatrix}
+
x_2
\begin{bmatrix}
a_{1,2} \\
\vdots \\
a_{m,2} \\
\end{bmatrix}
+
\cdots
+
x_n
\begin{bmatrix}
a_{1,n} \\
\vdots \\
a_{m,n} \\
\end{bmatrix}
\end{align}
$
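As a quick numerical check of the column view above, the following sketch compares `A @ x` against an explicit weighted sum of columns. The matrix and vector values are arbitrary illustrations, not taken from the notes.

```python
import numpy as np

# Arbitrary 3x2 example (m = 3, n = 2).
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
x = np.array([10.0, -1.0])

# Weighted sum of the columns of A, using the entries of x as weights.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# The matrix-vector product gives the same vector.
np.testing.assert_allclose(A @ x, combo)
```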

The following terms are equivalent:
* span
* column space
* range

**Def** For $A \in \mathbb{R}^{m \times n}$, $\text{span}(A) = \{ Ax : x \in \mathbb{R}^n \}$.
* The span of a matrix is the set of all possible linear combinations of its columns.

Solving a linear system $Ax =b$ is really asking the question: Is $b \in \text{span}(A)$?
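One way to test membership in the span numerically is a rank comparison: $b \in \text{span}(A)$ exactly when appending $b$ as an extra column does not raise the rank. The helper name `in_span` below is hypothetical, and the sketch relies on the default tolerance of `np.linalg.matrix_rank`.

```python
import numpy as np

def in_span(A, b):
    """Return True when b lies in the column space of A (rank test)."""
    augmented = np.column_stack([A, b])
    return bool(np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A))

A = np.array([[2.0, 3.0], [4.0, 6.0]])  # Singular: row 2 = 2 * row 1.
b_in = np.array([5.0, 10.0])            # Equals A @ [1, 1], so it is in the span.
b_out = np.array([5.0, 11.0])           # Breaks the row relation, so it is not.

print(in_span(A, b_in), in_span(A, b_out))
```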

#### Singularity

An $n \times n$ matrix $A$ is **nonsingular** when any one of the following equivalent conditions holds:
1. There exists $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$.
2. $\text{det}(A) \neq 0$.
3. $\text{rank}(A) = n$.
4. For any vector $z \neq 0$, $Az \neq 0$.
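The four conditions can be checked side by side for a small example. This is only a sketch: the tolerance `1e-12` is an assumed cutoff, and condition 4 is tested through the smallest singular value (a nonzero value means no nonzero $z$ maps to zero).

```python
import numpy as np

A = np.array([[2.0, 3.0], [5.0, 4.0]])
n = A.shape[0]

# 1. An inverse exists and works from both sides.
Ainv = np.linalg.inv(A)
has_inverse = np.allclose(A @ Ainv, np.eye(n)) and np.allclose(Ainv @ A, np.eye(n))

# 2. The determinant is nonzero (here it is -7).
nonzero_det = not np.isclose(np.linalg.det(A), 0.0)

# 3. The rank is full.
full_rank = bool(np.linalg.matrix_rank(A) == n)

# 4. Az != 0 for every z != 0, i.e. the smallest singular value is positive.
trivial_null_space = bool(np.linalg.svd(A, compute_uv=False).min() > 1e-12)

print(has_inverse, nonzero_det, full_rank, trivial_null_space)
```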

#### Uniqueness

When $A$ is **nonsingular**, the solution $x$ of $Ax = b$ exists and is unique for every $b$.
* Geometric interpretation: each equation (row of $A$) defines a hyperplane, and $x$ is the single point where all of these hyperplanes intersect.

| $A$ | $b$ | # solutions |
| :-: | :-: | :-----------: |
| nonsingular | arbitrary | 1 |
| singular | system consistent, i.e. $b \in \text{span}(A)$ | $\infty$ |
| singular | system not consistent, i.e. $b \notin \text{span}(A)$ | 0 |

Demonstrate that when $A$ is nonsingular, a unique solution exists for an arbitrary $b$.

In [45]:
import numpy as np

A = np.array([[2, 3],[5, 4]])
assert(np.linalg.det(A) != 0.)  # A is nonsingular.

# Since A is nonsingular, solution exists for arbitrary b.
b = np.random.random(2)
x = np.linalg.solve(A, b)
np.testing.assert_almost_equal(np.matmul(A, x), b)  # Definition.

Demonstrate that when $A$ is singular, no unique solution exists for an arbitrary $b$.

In [46]:
import numpy as np

A = np.array([[2, 3],[4, 6]])
assert np.isclose(np.linalg.det(A), 0.)  # A is singular.

# Since A is singular, there is no **unique** solution.
# NOTE: numpy raises LinAlgError for any singular A, whether the
# system has infinitely many solutions or none.
b = np.random.random(2)
try:
    x = np.linalg.solve(A, b)
except np.linalg.LinAlgError as ex:
    print("expected: ", ex)

expected:  Singular matrix


## 02.02 Norms and Condition Number

#### Vector Norms

The notion of **magnitude** generalizes to **norm** for vectors.
$$
||x||_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}
$$

In general, for any vector $x \in \mathbb{R}^n$, $||x||_1 \geq ||x||_2 \geq ||x||_{\infty}$.

Properties
* Triangle inequality: $||x + y|| \leq ||x|| + ||y||$.
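The triangle inequality can be spot-checked for several p-norms. The sketch below uses random vectors from a seeded generator (an arbitrary choice) and a small slack of `1e-12` for floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)  # Seeded so the check is repeatable.
x = rng.standard_normal(5)
y = rng.standard_normal(5)

holds = []
for p in (1, 2, np.inf):
    lhs = np.linalg.norm(x + y, ord=p)
    rhs = np.linalg.norm(x, ord=p) + np.linalg.norm(y, ord=p)
    holds.append(bool(lhs <= rhs + 1e-12))

print(holds)
```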

Demonstrate p-norms with numpy.

In [26]:
import numpy as np

# Demonstrate computation of p-norm for same vector.
x = np.array([-1.6, 1.2])
print("1-norm:   ", np.linalg.norm(x, ord=1))
print("2-norm:   ", np.linalg.norm(x, ord=2))
print("inf-norm: ", np.linalg.norm(x, ord=np.inf))

1-norm:    2.8
2-norm:    2.0
inf-norm:  1.6


#### Matrix Norms

The notion of norms extends to matrices.

$$
||A|| = \text{max}_{x \neq 0} \frac{||Ax||}{||x||}
$$

* Note that $Ax$ is a vector, thus $||Ax||$ will be a vector norm.
* The ratio $\frac{||Ax||}{||x||}$ measures how much the matrix $A$ stretches the vector $x$; the matrix norm is the largest such stretch over all nonzero $x$.

Properties
* $||AB|| \leq ||A|| \cdot ||B||$.
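The definition and the product property can be explored numerically. The sketch below relies on two standard facts that the notes do not state: the matrix 1-norm equals the maximum absolute column sum, and the matrix $\infty$-norm equals the maximum absolute row sum. The matrix `B` and the random sampling are illustrative assumptions.

```python
import numpy as np

A = np.array([[2.0, -1.0, 1.0], [1.0, 0.0, 1.0], [3.0, -1.0, 4.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, -1.0], [2.0, 0.0, 1.0]])

# 1-norm: maximum absolute column sum. inf-norm: maximum absolute row sum.
norm1 = np.abs(A).sum(axis=0).max()
norminf = np.abs(A).sum(axis=1).max()
np.testing.assert_allclose(norm1, np.linalg.norm(A, ord=1))
np.testing.assert_allclose(norminf, np.linalg.norm(A, ord=np.inf))

# The stretch ratio ||Ax|| / ||x|| never exceeds the matrix 2-norm.
norm2 = np.linalg.norm(A, ord=2)
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 3))
max_ratio = max(np.linalg.norm(A @ x) / np.linalg.norm(x) for x in samples)
assert max_ratio <= norm2 + 1e-12

# Product property: ||AB|| <= ||A|| * ||B||.
submultiplicative = bool(
    np.linalg.norm(A @ B, ord=2) <= norm2 * np.linalg.norm(B, ord=2) + 1e-12
)
```

Random sampling only approaches the maximum from below, so `max_ratio` is a lower estimate of the 2-norm rather than a way to compute it.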

#### Condition Number

Matrix norms are used to define condition number.

$$
\text{cond}(A) = ||A|| \cdot ||A^{-1}||
$$

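* Larger values of $\text{cond}(A)$ indicate that $A$ is closer to singular.
* Conceptually, $\text{cond}(A)$ is the ratio of the maximum stretching to the maximum shrinking that $A$ applies to nonzero vectors. The more uneven the stretching, the larger the condition number and the closer $A$ is to singular.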

Properties
* For any matrix $A$, then $\text{cond}(A) \geq 1$.
* For identity matrix $I$, then $\text{cond}(I) = 1$.
* For any matrix $A$ and nonzero scalar $\gamma$, then $\text{cond}(\gamma A) = \text{cond}(A)$.
* For any nonsingular diagonal matrix $D$, then $\text{cond}(D) = \frac{\text{max}|d_i|}{\text{min}|d_i|}$.
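The first three properties can be spot-checked with `np.linalg.cond`, which uses the 2-norm by default. The matrix and the scalar `gamma = 3.5` below are arbitrary illustrations.

```python
import numpy as np

A = np.array([[2.0, -1.0, 1.0], [1.0, 0.0, 1.0], [3.0, -1.0, 4.0]])
gamma = 3.5  # Any nonzero scalar.

cond_A = np.linalg.cond(A)
cond_scaled = np.linalg.cond(gamma * A)
cond_I = np.linalg.cond(np.eye(3))

# cond(A) >= 1, cond(I) = 1, and scaling A leaves cond(A) unchanged.
assert cond_A >= 1.0
np.testing.assert_allclose(cond_I, 1.0)
np.testing.assert_allclose(cond_scaled, cond_A, rtol=1e-9)
```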

Demonstrate condition number with numpy.

In [41]:
import numpy as np

p = 2  # Use 2-norm as example.

A = np.array([[2, -1, 1],[1, 0, 1],[3, -1, 4]])
Ainv = np.linalg.inv(A)

# Compute the norm of each matrix.
normA = np.linalg.norm(A, ord=p)
normAinv = np.linalg.norm(Ainv, ord=p)

# Compute the condition number from norm and compare.
condA = normA * normAinv
np.testing.assert_almost_equal(condA, np.linalg.cond(A, p=p))

# Compute the condition number of diagonal matrix and compare.
D = np.diag(np.arange(1, 10))
condD = np.max(np.diag(D)) / np.min(np.diag(D))
np.testing.assert_almost_equal(condD, np.linalg.cond(D, p=p))

## 02.03 Assessing Accuracy

#### Error Bounds

Let $x$ be the solution to $Ax = b$ and $\hat{x}$ be the solution to $A\hat{x} = b + \Delta{b}$.

With $\Delta{x} = \hat{x} - x$, the relative change in $x$ is bounded by the condition number times the relative change in $b$.

$$
\frac{||\Delta{x}||}{||x||} \leq \text{cond}(A)\frac{||\Delta{b}||}{||b||}
$$
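The bound can be verified on a small system. In this sketch the matrix, right-hand side, and perturbation `db` are arbitrary illustrative values, and the 2-norm is used throughout.

```python
import numpy as np

A = np.array([[2.0, 3.0], [5.0, 4.0]])
b = np.array([1.0, 2.0])
db = np.array([1e-3, -2e-3])  # Small perturbation of the right-hand side.

x = np.linalg.solve(A, b)
xhat = np.linalg.solve(A, b + db)

# Relative change in x versus cond(A) times relative change in b.
rel_dx = np.linalg.norm(xhat - x) / np.linalg.norm(x)
bound = np.linalg.cond(A) * np.linalg.norm(db) / np.linalg.norm(b)

print("relative change in x:", rel_dx)
print("bound:               ", bound)
assert rel_dx <= bound
```

The bound is a worst case over all perturbations, so the observed relative change is often well below it.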

#### Residual

The residual vector $r$ of an approximate solution $\hat{x}$ to $Ax=b$ is

$$
r = b - A \hat{x}
$$

* Useful as a measure of error when $\text{cond}(A)$ is small.
* When $A$ is ill-conditioned, a small residual does not necessarily imply that the solution is accurate.

Demonstrate an example where, for an ill-conditioned matrix, a small residual does not imply an accurate solution.

In [61]:
import numpy as np

# Consider an example with an ill-conditioned matrix A.
A = np.array([[0.913, 0.659],[0.457, 0.330]])
b = np.array([0.254, 0.127]).reshape(2, 1)

print("condA: ", np.linalg.cond(A))

# Consider two approximate solutions.
xhat1 = np.array([-0.0827, 0.5]).reshape(2, 1)
xhat2 = np.array([0.999, -1.001]).reshape(2, 1)

# Compute the residual for each.
resid1 = b - np.matmul(A, xhat1)
resid2 = b - np.matmul(A, xhat2)

# Since norm(resid1) < norm(resid2), 
# then we might think that xhat1 is better than xhat2.
print("norm1: ", np.linalg.norm(resid1, ord=2))
print("norm2: ", np.linalg.norm(resid2, ord=2))

# True solution is [1, -1]^T and xhat2 is a better approximation.
x = np.linalg.solve(A, b)
print("deltax: ", np.linalg.norm(xhat1 - x, ord=2))
print("deltax: ", np.linalg.norm(xhat2 - x, ord=2))

condA:  12485.031415973846
norm1:  0.000206163090780105
norm2:  0.0017579968714420229
deltax:  1.8499295364961637
deltax:  0.0014142135623337656


## 02.04 Solving Linear Systems

**General Approach** Replace a difficult problem by an easier one having the same or closely related solution.

#### Linear Systems
$$
\begin{aligned}
Ax &= b \\
MAx &= Mb \\
x &= (MA)^{-1}Mb \\
x &= A^{-1} M^{-1} Mb \\
x &= A^{-1} Ib \\
x &= A^{-1}b
\end{aligned}
$$
* Premultiplying both sides of $Ax = b$ by any nonsingular matrix $M$ leaves the solution $x$ unchanged, so $M$ can be chosen to make the system easier to solve.
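The derivation above can be checked numerically: solving the transformed system $MAx = Mb$ should return the same $x$ as the original system. The matrices below are arbitrary nonsingular examples.

```python
import numpy as np

A = np.array([[2.0, 3.0], [5.0, 4.0]])
b = np.array([1.0, 2.0])
M = np.array([[1.0, 2.0], [3.0, 4.0]])  # Nonsingular: det(M) = -2.

x = np.linalg.solve(A, b)
x_transformed = np.linalg.solve(M @ A, M @ b)

# Both systems share the same solution.
np.testing.assert_allclose(x_transformed, x)
```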

#### Permutations
Let $P$ be a permutation matrix.
* $P$ is an identity matrix with its rows or columns permuted, so each row and each column has exactly one entry equal to `1` and all other entries equal to `0`.
* $PAx = Pb$ reorders the rows (equations), but the solution $x$ is unchanged.
* $P^T = P^{-1}$ reverses the permutation.
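These properties can be illustrated with a small permutation. The row order `[2, 0, 1]` and the system below are arbitrary choices for this sketch.

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0], [4.0, 4.0, 2.0], [4.0, 6.0, 4.0]])
b = np.array([3.0, 6.0, 10.0])

# Build P by reordering the rows of the identity matrix.
order = [2, 0, 1]
P = np.eye(3)[order]

# P @ A reorders the rows of A in the same way.
np.testing.assert_allclose(P @ A, A[order])

# The transpose undoes the permutation.
np.testing.assert_allclose(P.T @ P, np.eye(3))

# Reordering the equations does not change the solution.
x = np.linalg.solve(A, b)
x_permuted = np.linalg.solve(P @ A, P @ b)
np.testing.assert_allclose(x_permuted, x)
```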

#### Diagonal Scaling
Let $D$ be a diagonal matrix with nonzero diagonal entries.
* $DAx = Db$ multiplies each row by the corresponding diagonal entry of $D$, but the solution $x$ is unchanged.
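A short sketch of row scaling, assuming arbitrary nonzero scale factors in `d`:

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0], [4.0, 4.0, 2.0], [4.0, 6.0, 4.0]])
b = np.array([3.0, 6.0, 10.0])

d = np.array([2.0, 0.5, -3.0])  # Nonzero scale factor for each row.
D = np.diag(d)

# D @ A multiplies row i of A by d[i].
np.testing.assert_allclose(D @ A, d[:, None] * A)

# Scaling the equations does not change the solution.
x = np.linalg.solve(A, b)
x_scaled = np.linalg.solve(D @ A, D @ b)
np.testing.assert_allclose(x_scaled, x)
```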

#### Triangular 
Let $U$ be an upper triangular matrix and $L$ be a lower triangular matrix, both with nonzero diagonal entries.
* $Ux = b$ can be solved by back substitution.
$$
x_i = \frac{\left( b_i - \sum_{j=i+1}^{n} U_{ij}x_j \right)}{U_{ii}} \qquad i=n,\cdots,1
$$
* $Lx = b$ can be solved by forward substitution.
$$
x_i = \frac{\left( b_i - \sum_{j=1}^{i-1} L_{ij}x_j \right)}{L_{ii}} \qquad i=1,\cdots,n
$$
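* Any triangular system whose rows are out of order can be brought into $L$ or $U$ form by a suitable permutation $P$. Permutation and diagonal scaling alone do not triangularize a general matrix.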


Demonstrate back substitution.

In [75]:
import numpy as np

def backsubstitution(U, b):
    """
    Perform backsubstitution to solve [U|b] for x.
    """
    x = np.zeros((U.shape[1], b.shape[1]))

    for i in range(U.shape[0]-1, -1, -1):
        x[i] = (b[i] - np.sum(np.dot(U[i,i+1:], x[i+1:]))) / U[i,i]
    
    return x

# Solve Ux = b for x.
U = np.array([[2,4,-2],[0,1,1],[0,0,4]])
b = np.array([2,4,8]).reshape(3,1)
x = backsubstitution(U, b)
np.testing.assert_almost_equal(x, np.array([-1,2,2]).reshape(3,1))

Demonstrate forward substitution.

In [83]:
import numpy as np

def forwardsubstitution(L, b):
    """
    Perform forward substitution to solve [L|b] for x.
    """
    x = np.zeros((L.shape[1], b.shape[1]))
    
    for i in range(L.shape[0]):  # Iterate over the rows of L, top to bottom.
        x[i] = (b[i] - np.sum(np.dot(L[i,:i], x[:i]))) / L[i,i]

    return x

# Solve Lx = b for x.
L = np.array([[2,0,0],[-1,2,0],[1,-1,1]])
b = np.array([28,-40,33]).reshape(3, 1)
x = forwardsubstitution(L, b)
np.testing.assert_almost_equal(x, np.array([14,-13,6]).reshape(3,1))

## 02.05 Elementary Elimination Matrices

Devise a nonsingular **linear transformation** that transforms linear system $Ax = b$ to a triangular system that we solve via substitution.

In general, the entries below the **pivot** $a_k$ of a column vector $a$ can be eliminated by adding to each row $i$ the multiple $m_i = \frac{-a_i}{a_k}$ of row $k$, for $i=k+1,\cdots,n$. This requires $a_k \neq 0$.
$$
\begin{bmatrix}
1 & 0 \\
\frac{-a_2}{a_1} & 1 \\
\end{bmatrix}
\begin{bmatrix}
a_1 \\
a_2 \\
\end{bmatrix}
=
\begin{bmatrix}
a_1 \\
0 \\
\end{bmatrix}
$$
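The same construction extends to longer vectors. The sketch below builds an elimination matrix for a chosen pivot position; the function name `elimination_matrix` and the 0-based index `k` are assumptions made for this example.

```python
import numpy as np

def elimination_matrix(a, k):
    """
    Return M such that (M @ a)[i] == 0 for all i > k (0-based),
    assuming the pivot a[k] is nonzero.
    """
    n = a.shape[0]
    M = np.eye(n)
    M[k+1:, k] = -a[k+1:] / a[k]  # Multipliers m_i = -a_i / a_k.
    return M

a = np.array([2.0, 4.0, -2.0])
M = elimination_matrix(a, 0)

# Entries below the pivot are annihilated; the pivot itself is unchanged.
np.testing.assert_allclose(M @ a, np.array([2.0, 0.0, 0.0]))

# The inverse has the same form with the signs of the multipliers flipped.
np.testing.assert_allclose(np.linalg.inv(M), 2.0 * np.eye(3) - M)
```

Because `M` is unit lower triangular, its determinant is 1, so it is always nonsingular.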

## 02.06 LU Factorization by Gaussian Elimination

#### Gaussian Elimination
Gaussian elimination transforms the system of equations described by $Ax = b$ into an equivalent system described by $Ux = c$, where $U$ is an upper triangular matrix.

Solve.
* Solve $Ux = c$ for $x$ by back substitution.

#### LU Factorization
LU factorization transforms the system of equations described by $Ax = b$ into an equivalent system described by $LUx = b$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.

Solve.
* Solve $Ly = b$ for $y$ by forward substitution.
* Solve $Ux = y$ for $x$ by back substitution.


Demonstrate Gaussian elimination.

In [95]:
import numpy as np

def gaussian_elimination(A, b):
    """
    Use gaussian elimination to transform [A|b] to [U|c].
    """
    for j in range(A.shape[1]):  # Columns.
        for i in range(j+1, A.shape[0]):  # Rows.
            m_i = -1.0 * A[i,j] / A[j,j]
            A[i,:] += m_i * A[j,:]
            b[i,:] += m_i * b[j,:]

    return A, b

# Transform Ab to Uc.
A = np.array([[1,2,2],[4,4,2],[4,6,4]], dtype='d')
b = np.array([3,6,10], dtype='d').reshape(3,1)
A, b = gaussian_elimination(A, b)
np.testing.assert_almost_equal(A, np.array([[1,2,2],[0,-4,-6],[0,0,-1]]))
np.testing.assert_almost_equal(b, np.array([3,-6,1]).reshape(3,1))

Demonstrate LU factorization.

In [99]:
import numpy as np

def LU(A):
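    """
    Use LU factorization to overwrite A in place with its factors:
    U on and above the diagonal, and the multipliers of the unit
    lower triangular L below it. No pivoting is performed.
    """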
    for j in range(A.shape[1]):  # Columns.
        for i in range(j+1, A.shape[0]):  # Rows.
            A[i,j] = A[i,j] / A[j,j]
            A[i,j+1:] += -1.0 * A[i,j] * A[j,j+1:]
    return A

# Transform A to LU.
A = np.array([[1,2,2],[4,4,2],[4,6,4]], dtype='d')
A = LU(A)
np.testing.assert_almost_equal(A, np.array([[1,2,2],[4,-4,-6],[4,0.5,-1]]))
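The compact result above can be unpacked into separate factors and used for the two-stage solve $Ly = b$, then $Ux = y$. This sketch starts from the compact matrix asserted above and uses `np.linalg.solve` as a stand-in for the forward and back substitution routines; the right-hand side is the one from the Gaussian elimination demonstration.

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0], [4.0, 4.0, 2.0], [4.0, 6.0, 4.0]])
b = np.array([3.0, 6.0, 10.0])

# Compact factorization: U on and above the diagonal, multipliers below.
compact = np.array([[1.0, 2.0, 2.0], [4.0, -4.0, -6.0], [4.0, 0.5, -1.0]])

# Unpack into a unit lower triangular L and an upper triangular U.
L = np.tril(compact, k=-1) + np.eye(3)
U = np.triu(compact)
np.testing.assert_allclose(L @ U, A)

# Two-stage solve (np.linalg.solve stands in for the substitution routines).
y = np.linalg.solve(L, b)  # Forward substitution step: Ly = b.
x = np.linalg.solve(U, y)  # Back substitution step: Ux = y.

np.testing.assert_allclose(A @ x, b)
```

The intermediate vector `y` matches the transformed right-hand side $c$ that Gaussian elimination produces for the same system.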

## 02.07 Pivoting

## 02.08 Residual

## 02.09 Implementing Gaussian Elimination

## 02.10 Updating Solutions

## 02.11 Improving Accuracy

## 02.12 Special Types of Linear Systems

## 02.13 Software Linear Systems