# Linear Algebra - Matrices (Pt 4)


## 1. System of linear equations

When a system of equations is linear (i.e. it has no terms like $x^2, xy, \sqrt{x}, \sin{(x)}$ etc), it can be written as a single matrix-vector multiplication.

$$
\begin{array}{c}
\underbrace{
\begin{matrix}
    \vphantom{} \\
    \vphantom{} \\
    x \quad y \quad z
\end{matrix}
}_{\text{Unknown variables}}
\quad\quad
\underbrace{
\begin{matrix}
    2x+5y+3z = -3 \\
    4x+0y+8z = 0 \\
    1x+3y+0z = 2
\end{matrix}
}_{\text{Equations}}
\quad
\rightarrow
\quad
\overbrace{
\underbrace{
\begin{bmatrix}
    2 & 5 & 3 \\
    4 & 0 & 8 \\
    1 & 3 & 0
\end{bmatrix}
}_{\text{Coefficients}}
}^{\textbf{A}}
\overbrace{
\underbrace{
\begin{bmatrix}
    x \\
    y \\
    z
\end{bmatrix}
}_{\text{Variables}}
}^{\textbf{x}}
=
\overbrace{
\underbrace{
\begin{bmatrix}
    -3 \\
    0 \\
    2
\end{bmatrix}
}_{\text{Constants}}
}^{\textbf{b}}
\end{array}
$$

- <mark>Geometric intuition</mark>: In this example, matrix $\textbf{A}$ linearly transforms the vector space $\mathbb{R}^{n}$ (here $n = 3$) in such a way that the vector $\textbf{x}$ gets transformed into $\textbf{b}$.
- To solve $\textbf{Ax} = \textbf{b}$ means to find the vector $\textbf{x}$. One way to do this is to find the inverse of $\textbf{A}$ (this is not always possible).
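
As a small check of the claim that the equations and the matrix-vector product are the same thing, the sketch below evaluates the left-hand sides both ways. The `trial` vector is a hypothetical set of values for $(x, y, z)$ chosen only for illustration; it is not the solution.

```python
import numpy as np

# Coefficient matrix and constants from the system above
A = np.array([[2, 5, 3],
              [4, 0, 8],
              [1, 3, 0]])
b = np.array([-3, 0, 2])

# Hypothetical trial values for (x, y, z), used only for illustration
trial = np.array([1, 2, 3])

# Left-hand sides evaluated equation by equation
lhs_by_hand = np.array([
    2 * trial[0] + 5 * trial[1] + 3 * trial[2],
    4 * trial[0] + 0 * trial[1] + 8 * trial[2],
    1 * trial[0] + 3 * trial[1] + 0 * trial[2],
])

# The same left-hand sides as a single matrix-vector product
lhs_matrix = A @ trial

print("By hand:  ", lhs_by_hand)
print("A @ trial:", lhs_matrix)
print("trial solves the system:", np.array_equal(lhs_matrix, b))
```

Both computations give the same three numbers, and since they differ from $\textbf{b}$, this particular trial vector is not a solution.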
  

## 2. Matrix inverse

### 2.1. Is it invertible?

- Non-square matrices are **not invertible** (the determinant is only defined for square matrices). Some square matrices are also **not invertible**:
- **Singular matrices** are those which have **no inverse** (like how 0 has no reciprocal)
    - $\text{det}(\textbf{A}) = 0$: No inverse, because the matrix collapses the vector space (dimensionality is lost)
- <mark>**Non-singular matrices** are those which **have an inverse**</mark>
    - $\text{det}(\textbf{A}) \ne 0$: Inverse exists, because the matrix preserves the vector space (dimensionality is maintained)
    - The inverse of a matrix is unique: An invertible matrix **only has one inverse**.
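
The singular versus non-singular distinction can be sketched in NumPy as follows. `try_inverse` is a hypothetical helper written for this note, and `P` is just one example of a singular matrix (its second row is all zeros). Note that testing a floating-point determinant against exactly 0 is fragile in general, so this is an illustration rather than a robust invertibility test.

```python
import numpy as np
from numpy.linalg import LinAlgError, det, inv

# Non-singular example: the coefficient matrix from section 1
A = np.array([[2.0, 5.0, 3.0], [4.0, 0.0, 8.0], [1.0, 3.0, 0.0]])
# Singular example: the second row is all zeros, so a dimension is lost
P = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])


def try_inverse(M):
    """Return inv(M), or None if NumPy reports that M is singular."""
    try:
        return inv(M)
    except LinAlgError:
        return None


print("det(A):", round(det(A), 6), "-> inverse found:", try_inverse(A) is not None)
print("det(P):", det(P), "-> inverse found:", try_inverse(P) is not None)
```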

### 2.2. Finding the inverse


- Only square, non-singular matrices ($\mathbb{R}^{n\times n}$) are invertible. 
- $\textbf{A}$ is invertible if there exists a matrix $\textbf{A}^{-1}$ such that $\textbf{A} \textbf{A}^{-1} = \textbf{A}^{-1} \textbf{A} = \textbf{I}_n$.
    - akin to how $3 \times \frac{1}{3} = 1$
- <mark>Geometric intuition</mark>: If you apply $\textbf{A}$ and then apply its inverse $\textbf{A}^{-1}$, you get back to the original vector. In other words, you did "nothing" to the vector, hence the identity matrix $\textbf{I}_n$.
- For a $2 \times 2$ matrix, the inverse is:
$$
\begin{split}
\textbf{A}^{-1}
=
\begin{bmatrix}
a & b \\
c & d\\
\end{bmatrix}^{-1} 
= 
\frac{1}{|\textbf{A}|} \times
\text{adj(}\textbf{A}\text{)}
=
\frac{1}{ad-bc}\begin{bmatrix}
d & -b \\
-c & a\\
\end{bmatrix}
\end{split}
$$

Where $\text{adj(}\textbf{A}\text{)}$ is the **adjugate matrix** of a matrix $\textbf{A}$ (i.e. the transpose of the **cofactor matrix** of $\textbf{A}$). See link for computation details: [Adjugate matrix](https://en.wikipedia.org/wiki/Adjugate_matrix)
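
The $2 \times 2$ formula above can be written out directly. `inverse_2x2` is a hypothetical helper for illustration only; in practice `numpy.linalg.inv` would be used instead.

```python
import numpy as np


def inverse_2x2(M):
    """Invert a 2x2 matrix using the adjugate formula (illustrative helper)."""
    (a, b), (c, d) = M
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("matrix is singular, no inverse exists")
    # For a 2x2 matrix: swap the diagonal, negate the off-diagonal
    adjugate = np.array([[d, -b], [-c, a]])
    return adjugate / determinant


M = np.array([[1.0, 1.0], [2.0, 3.0]])
M_inv = inverse_2x2(M)

print("Inverse via formula:\n", M_inv)
print("M @ M_inv is the identity:", np.allclose(M @ M_inv, np.eye(2)))
```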


### 2.3. Using the inverse to solve a system of linear equations

Using the earlier example:

$$
\begin{align*}
\overbrace{
\underbrace{
\begin{bmatrix}
    2 & 5 & 3 \\
    4 & 0 & 8 \\
    1 & 3 & 0
\end{bmatrix}
}_{\text{Coefficients}}
}^{\textbf{A}}
\overbrace{
\underbrace{
\begin{bmatrix}
    x \\
    y \\
    z
\end{bmatrix}
}_{\text{Variables}}
}^{\textbf{x}}
=
\overbrace{
\underbrace{
\begin{bmatrix}
    -3 \\
    0 \\
    2
\end{bmatrix}
}_{\text{Constants}}
}^{\textbf{b}}
\end{align*}
$$

We can solve for $\textbf{x}$ by multiplying both sides by the inverse of $\textbf{A}$:

$$
\begin{align*}
\textbf{A}\textbf{x} &= \textbf{b} \\
\textbf{A}^{-1}\textbf{A}\textbf{x} &= \textbf{A}^{-1}\textbf{b} \\
\textbf{x} &= \textbf{A}^{-1}\textbf{b}
\end{align*}
$$

Finding the inverse of $\textbf{A}$:
$$
\begin{align*}
\textbf{A}^{-1}
&=
\frac{1}{|\textbf{A}|} \times
\text{adj(}\textbf{A}\text{)} \\
\end{align*}
$$

In [7]:
import numpy as np
from numpy.linalg import det, inv

A = np.array([[2, 5, 3], [4, 0, 8], [1, 3, 0]])
print("A:\n", A)

# A has a non-zero determinant, so it is invertible
print(f"Determinant: {det(A):.1f}")
I = np.eye(3)
print("\nI:\n", I)
# Multiplying by the identity matrix leaves A unchanged
print("\nA*I:\n", np.dot(A, I))

# print("Inv A:\n", inv(A))

# P = np.array([[0, 1, 0], [0, 0, 0], [1, 0, 1]])
# print("det(P):", det(P))
# print("Inv P:\n", inv(P))  # <-- LinAlgError thrown because P is singular (not invertible)


A:
 [[2 5 3]
 [4 0 8]
 [1 3 0]]
Determinant: 28.0

I:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

A*I:
 [[2. 5. 3.]
 [4. 0. 8.]
 [1. 3. 0.]]
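
The cell above confirms that the determinant is non-zero but stops short of computing $\textbf{x}$. A minimal sketch of the remaining step, $\textbf{x} = \textbf{A}^{-1}\textbf{b}$, is given below. The comparison with `numpy.linalg.solve` is an addition for illustration; it solves the system without forming the inverse explicitly.

```python
import numpy as np
from numpy.linalg import inv, solve

A = np.array([[2, 5, 3], [4, 0, 8], [1, 3, 0]])
b = np.array([-3, 0, 2])

# x = A^{-1} b, exactly as in the derivation above
x_via_inverse = inv(A) @ b

# Alternative that avoids forming the inverse explicitly
x_via_solve = solve(A, b)

print("x via inverse:", x_via_inverse)
print("x via solve:  ", x_via_solve)

# Check: substituting x back into A x should reproduce b
print("A @ x equals b:", np.allclose(A @ x_via_inverse, b))
```

Working the system by hand gives $x = \frac{38}{7}$, $y = -\frac{8}{7}$, $z = -\frac{19}{7}$, which both approaches should match up to round-off.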


### 2.4. Ill-Conditioned Matrices

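- An ill-conditioned matrix is one which is **close to being singular**
    - Its determinant is often close to 0 (problematic in the same way dividing by a tiny number is), though the determinant also depends on how the matrix is scaled, so the condition number is the more reliable indicator
    - Small errors in the input or in the arithmetic (overflow, underflow, round-off) can be greatly amplified in the result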
- **Condition number**: ***Higher number*** means the matrix is ***more*** ill-conditioned (i.e. closer to being singular)
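
To make the condition number less of a black box: with its default arguments, `numpy.linalg.cond` uses the 2-norm, which equals the ratio of the largest to the smallest singular value. The sketch below checks this on a small $2 \times 2$ example; it is an illustration of that one definition, not of every norm `cond` supports.

```python
import numpy as np
from numpy.linalg import cond, svd

A_1 = np.array([[1, 1], [2, 3]])

# Singular values only, returned in descending order
singular_values = svd(A_1, compute_uv=False)

# 2-norm condition number: largest singular value over smallest
ratio = singular_values[0] / singular_values[-1]

print("largest / smallest singular value:", ratio)
print("cond(A_1):                        ", cond(A_1))
```

As the smallest singular value approaches 0 (the matrix approaches singularity), this ratio grows without bound.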


In [3]:
# Compute the condition numbers, determinants, and matrix inverses for the following 3 matrices A_1, A_2, A_3

from numpy.linalg import cond

A_1 = np.array([[1, 1], [2, 3]])
A_2 = np.array([[4.1, 2.8], [9.7, 6.6]])
A_3 = np.array([[4.1, 2.8], [9.6760, 6.6080]])

print("\nMatrix A_1:")
print("Condition number(A_1):", cond(A_1))
print("det(A_1):", det(A_1))
print("Inv A_1:\n", inv(A_1))

print("\nMatrix A_2:")
print("Condition number(A_2):", cond(A_2))
print("det(A_2):", det(A_2))
print("Inv A_2:\n", inv(A_2))

print("\nMatrix A_3 -- This matrix is close to singular or badly scaled. It is ill-conditioned:")
print("Condition number(A_3):", cond(A_3))  # Note that the condition number is very high
print("det(A_3):", det(A_3))  # Note that the determinant is very close to zero
print("Inv A_3:\n", inv(A_3))  # Note that the inverse is very large



Matrix A_1:
Condition number(A_1): 14.933034373659225
det(A_1): 1.0
Inv A_1:
 [[ 3. -1.]
 [-2.  1.]]

Matrix A_2:
Condition number(A_2): 1622.9993838565622
det(A_2): -0.0999999999999999
Inv A_2:
 [[-66.  28.]
 [ 97. -41.]]

Matrix A_3 -- This matrix is close to singular or badly scaled. It is ill-conditioned:
Condition number(A_3): 1.1149221731402912e+16
det(A_3): -4.461167435465537e-15
Inv A_3:
 [[-1.53781451e+15  6.51616316e+14]
 [ 2.25179981e+15 -9.54152463e+14]]
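
Why a high condition number matters in practice can be sketched by solving the same system twice with a slightly different right-hand side. The vectors `b` and `b_perturbed` are hypothetical values chosen for illustration; `A_2` is the matrix from the cell above.

```python
import numpy as np
from numpy.linalg import solve

A_2 = np.array([[4.1, 2.8], [9.7, 6.6]])

b = np.array([4.1, 9.7])             # equals the first column of A_2
b_perturbed = np.array([4.11, 9.7])  # first entry nudged by 0.01

x = solve(A_2, b)
x_perturbed = solve(A_2, b_perturbed)

print("x:          ", x)
print("x perturbed:", x_perturbed)
```

A change of 0.01 in one entry of the right-hand side moves the solution from roughly $(1, 0)$ to roughly $(0.34, 0.97)$. `A_3` is a more extreme case: its second row is 2.36 times its first, so in exact arithmetic it is singular, and the huge "inverse" printed above is a product of round-off.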


### 2.5. Trace

- The **trace** of $\textbf{A} : \textbf{A} \in \mathbb{R}^{n\times n}$ is the sum of the elements on the main diagonal (from top-left to bottom-right):

$$ \text{tr}(\textbf{A}) = \sum_{i=1}^{n} a_{ii} $$


In [4]:
# Compute the trace of the following matrix A

from numpy import trace

A = np.array([[4.1, 2.8], [9.7, 6.6]])
print("Trace(A)", trace(A))


Trace(A) 10.7


## 3. Back to non-square matrices

### 3.1. Rank

- The **rank** of $\textbf{A} : \textbf{A} \in \mathbb{R}^{m\times n}$ is the **maximum number of linearly independent columns (or, equivalently, rows)** in $\textbf{A}$
    - NB: The number of linearly independent columns in a matrix always equals the number of linearly independent rows in that matrix

#### 3.1.1. "Full Rank" Matrix

- $\textbf{A}$ is **full rank** if $\text{rank}(\textbf{A}) = \min(m,n)$
- When $m \ge n$, this is the same as saying **all its columns are linearly independent**; when $m \le n$, it means all its rows are linearly independent
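
A quick sketch of these definitions on a non-square matrix follows. `B` is a hypothetical $3 \times 4$ example constructed so that its third row is the sum of the first two.

```python
import numpy as np
from numpy.linalg import matrix_rank

# Hypothetical 3x4 example: row 3 = row 1 + row 2, so only two rows are independent
B = np.array([[1, 2, 0, 1],
              [0, 1, 1, 0],
              [1, 3, 1, 1]])

rank_B = matrix_rank(B)
print("rank(B):     ", rank_B)
print("rank(B.T):   ", matrix_rank(B.T))       # row rank equals column rank
print("B full rank: ", rank_B == min(B.shape))  # compares against min(m, n)

# Dropping the dependent row leaves a 2x4 matrix that is full rank
C = B[:2]
print("C full rank: ", matrix_rank(C) == min(C.shape))
```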

#### 3.1.2. Augmented Matrix

- If vector $\textbf{y}$ is concatenated to matrix $\textbf{A}$ as an extra column, we say "$\textbf{A}$ augmented with $\textbf{y}$", denoted $(\textbf{A}\vert \textbf{y})$.
    - if $\text{rank}((\textbf{A}\vert \textbf{y})) = \text{rank}(\textbf{A})+1$, then vector $\textbf{y}$ is **"new" information** (it is not a linear combination of the columns of $\textbf{A}$)
    - otherwise, if $\text{rank}((\textbf{A}\vert \textbf{y})) = \text{rank}(\textbf{A})$, it means $\textbf{y}$ can be created as a linear combination of the columns of $\textbf{A}$

In [5]:
# Compute the condition number and rank for matrix A = [[1,1,0],[0,1,0],[1,0,1]]
# If y = [[1],[2],[1]], get the augmented matrix [A,y]

from numpy import trace
from numpy.linalg import cond, matrix_rank

A = np.array([[1, 1, 0], [0, 1, 0], [1, 0, 1]])
y = np.array([[1], [2], [1]])
print("A matrix's shape:", A.shape, "\ny vector's shape", y.shape)

print("\nCondition number(A):", cond(A))
print("Rank(A):", matrix_rank(A))
print("Trace(A):", trace(A))

A_y = np.concatenate((A, y), axis=1)
print("\nOriginal A matrix:\n", A, "\ny vector:\n", y, "\n\nAugmented (A|y) matrix:\n", A_y)
print("Rank(A_y):", matrix_rank(A_y))

print("\nNote that the rank of A and A_y are both 3, so y is a linear combination of the columns of A.")
print("Therefore, there is no new information in y that is not already in A.")


A matrix's shape: (3, 3) 
y vector's shape (3, 1)

Condition number(A): 4.048917339522305
Rank(A): 3
Trace(A): 3

Original A matrix:
 [[1 1 0]
 [0 1 0]
 [1 0 1]] 
y vector:
 [[1]
 [2]
 [1]] 

Augmented (A|y) matrix:
 [[1 1 0 1]
 [0 1 0 2]
 [1 0 1 1]]
Rank(A_y): 3

Note that the rank of A and A_y are both 3, so y is a linear combination of the columns of A.
Therefore, there is no new information in y that is not already in A.
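
The cell above only shows the case where the rank does not change. Because that $\textbf{A}$ is $3 \times 3$ and full rank, every $\textbf{y} \in \mathbb{R}^3$ is a combination of its columns. The sketch below uses a hypothetical $3 \times 2$ matrix to show both cases side by side; the names `y_inside` and `y_outside` are made up for illustration.

```python
import numpy as np
from numpy.linalg import matrix_rank

# Hypothetical 3x2 matrix whose columns span only the x-y plane
A = np.array([[1, 0],
              [0, 1],
              [0, 0]])

y_inside = np.array([[2], [3], [0]])   # lies in the plane: 2*col1 + 3*col2
y_outside = np.array([[0], [0], [1]])  # points out of the plane

for name, y in [("y_inside", y_inside), ("y_outside", y_outside)]:
    A_y = np.concatenate((A, y), axis=1)
    print(f"{name}: rank(A) = {matrix_rank(A)}, rank(A|y) = {matrix_rank(A_y)}")
```

When the rank is unchanged, $\textbf{Ax} = \textbf{y}$ has a solution; when it increases by one, no $\textbf{x}$ can produce $\textbf{y}$.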
