# Linear Algebra - Matrices

## Matrix Addition

Matrices can be added only if they have the same dimensions.

If $ \textbf{A} $, $ \textbf{B} $ and $ \textbf{C} $ are matrices of the same dimension and $ k $ is a scalar, then

1. $ \textbf{A + B = B + A} \qquad\qquad\qquad\qquad \ $  (Commutative)
   
2. $ \textbf{(A + B) + C = A + (B + C)} \qquad $ (Associative)
   
3. $ k(\textbf{A} + \textbf{B}) = k \textbf{A} + k \textbf{B} $
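The three properties can be spot-checked numerically. This is a minimal sketch, assuming NumPy is available; the matrices `A`, `B`, `C` and the scalar `k` are arbitrary values chosen only for illustration.

```python
import numpy as np

# Hypothetical 2x2 matrices and scalar, chosen only for illustration
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, -1], [5, 2]])
C = np.array([[7, 1], [-2, 3]])
k = 3

commutative = np.array_equal(A + B, B + A)
associative = np.array_equal((A + B) + C, A + (B + C))
distributive = np.array_equal(k * (A + B), k * A + k * B)

print(commutative, associative, distributive)
```

A check on one set of matrices illustrates the rules but does not prove them.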

## Matrix Multiplication

If $ \textbf{A} $, $ \textbf{B} $ and $ \textbf{C} $ are matrices whose sizes make the products and sums below defined, and $ k $ is a scalar, then

1. $ \textbf{AB} \ne \textbf{BA} \qquad\qquad\qquad $  in general (Not Commutative)
   
2. $ \textbf{(AB)C} = \textbf{A(BC)} \qquad \ \ $ (Associative)

3. $ \textbf{(kA)B = k(AB) = A(kB)} $ 
   
4. $ \textbf{A(B + C) = AB + AC} $ and $ \textbf{(A + B)C = AC + BC} $
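The multiplication rules can be illustrated the same way. A minimal sketch, assuming NumPy and using the `@` operator for the matrix product (the `*` operator on arrays is element-wise); the matrices are hypothetical examples.

```python
import numpy as np

# Hypothetical 2x2 matrices and scalar, chosen only for illustration
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 3]])
k = 5

not_commutative = not np.array_equal(A @ B, B @ A)
associative = np.array_equal((A @ B) @ C, A @ (B @ C))
scalar_rule = (np.array_equal((k * A) @ B, k * (A @ B))
               and np.array_equal(k * (A @ B), A @ (k * B)))
distributive = (np.array_equal(A @ (B + C), A @ B + A @ C)
                and np.array_equal((A + B) @ C, A @ C + B @ C))

print(not_commutative, associative, scalar_rule, distributive)
```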

## Trace of a Matrix

The trace of an $ n \times n $ matrix $ \textbf{A} $ is defined as the sum of the main diagonal elements

$ 
Tr(\textbf{A}) = \sum \limits_{i=1}^{n} A_{ii}
$

With rules

1. $ Tr(k\textbf{A}) = k Tr(\textbf{A}) $

2. $ Tr(\textbf{A} + \textbf{B}) = Tr(\textbf{A}) + Tr(\textbf{B}) $
   
3. $ Tr(\textbf{A}^T) = Tr(\textbf{A}) \qquad\qquad\qquad $ [because the diagonal is the same for both]

4. $ Tr(\textbf{AB}) = Tr(\textbf{BA}) $
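A short numerical illustration of the four trace rules, assuming NumPy and its `np.trace` function. The integer matrices are reused from the notebook cells in this document, and `k` is an arbitrary scalar.

```python
import numpy as np

A = np.array([[4, 1, 7], [6, 5, 9], [2, 6, 7]])
B = np.array([[6, 4, 9], [1, 4, 7], [3, 0, 6]])
k = 5  # arbitrary scalar

rule_1 = np.trace(k * A) == k * np.trace(A)
rule_2 = np.trace(A + B) == np.trace(A) + np.trace(B)
rule_3 = np.trace(A.T) == np.trace(A)
rule_4 = np.trace(A @ B) == np.trace(B @ A)

print(rule_1, rule_2, rule_3, rule_4)
```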

## Row Echelon Form

A matrix is in row echelon form if the conditions below hold. The row echelon form of a matrix is not unique; a matrix can have many row echelon forms.

- If a row does not consist entirely of zeros, then the first nonzero number in the row is 1 (a leading 1).

- All the entries beneath a leading 1 are 0.

- All zero rows are at the bottom of the matrix.

- In any two successive rows that do not consist entirely of zeros, the leading 1 in the lower row occurs farther to the right than the leading 1 in the higher row.

  $
  \begin{bmatrix}
  \color{orange}1 & 4 \\
  \color{red}0 & \color{orange}1 
  \end{bmatrix}
  \ \ \ \ \
  $
  $
  \begin{bmatrix}
  \color{orange}1 & 5 & 2\\
  \color{red}0 & \color{orange}1 & 4\\
  \color{red}0 & \color{red}0 & \color{red}0
  \end{bmatrix}
  \ \ \ \ \
  $
  $
  \begin{bmatrix}
  \color{orange}1 & 3 & 2\\
  \color{red}0 & \color{orange}1 & 7\\
  \color{red}0 & \color{red}0 & \color{orange}1
  \end{bmatrix}
  \ \ \ \ \
  $
  $
  \begin{bmatrix}
  \color{orange}1 & 3 & 2 & 4 & 7\\
  \color{red}0 & \color{orange}1 & 7 & 5 & 2\\
  \color{red}0 & \color{red}0 & \color{orange}1 & 7 & 5\\
  \color{red}0 & \color{red}0 & \color{red}0 & \color{red}0 & \color{red}0
  \end{bmatrix}
  $

### Gaussian Elimination

Applying row operations to convert a matrix into row echelon form is called Gaussian elimination. 

The three types of elementary row operations are:

- **Multiply a row by a nonzero number**

- **Interchange two rows**

- **Add a multiple of a row to another row**

Process:

1- Locate the leftmost column that does not consist entirely of zeros.

2- Interchange the top row with another, if necessary, to bring a nonzero entry to the top of the column found in step 1.

3- If the entry that is now at the top of the column found in step 1 is $ a $, multiply the first row by $ 1/a $ in order to introduce a leading 1.

4- Add suitable multiples of the top row to the rows below so that all entries below the leading 1 become zeros.

5- Now cover the top row in the matrix and begin again with step 1 applied to the submatrix that remains. Continue in this way until the entire matrix is in row echelon form.

### Example
Use Gaussian elimination to convert this matrix to REF

$
\begin{bmatrix}
2 & 4 & 6 \\
2 & 4 & 2 \\
1 & 3 & 1
\end{bmatrix}
$

To obtain a leading 1 on $ R_1 $, divide $ R_1 $ by 2

$
\begin{bmatrix}
1 & 2 & 3 \\
2 & 4 & 2 \\
1 & 3 & 1
\end{bmatrix}
$

Set the leading element in $ R_2 $ to 1 by dividing it by 2

$
\begin{bmatrix}
1 & 2 & 3 \\
1 & 2 & 1 \\
1 & 3 & 1
\end{bmatrix}
$

Now we can create a leading 0 on $ R_2 $ by subtracting $ R_1 $ from $ R_2 $

$
\begin{bmatrix}
1 & 2 & 3 \\
0 & 0 & -2 \\
1 & 3 & 1
\end{bmatrix}
$

A row with two leading 0's belongs in the third row, so swap $ R_2 $ with $ R_3 $

$
\begin{bmatrix}
1 & 2 & 3 \\
1 & 3 & 1 \\
0 & 0 & -2
\end{bmatrix}
$

Divide $ R_3 $ by -2

$
\begin{bmatrix}
1 & 2 & 3 \\
1 & 3 & 1 \\
0 & 0 & 1
\end{bmatrix}
$

Finally, to get a leading 0 on $ R_2 $, subtract $ R_1 $ from it

$
\begin{bmatrix}
1 & 2 & 3 \\
0 & 1 & -2 \\
0 & 0 & 1
\end{bmatrix}
$
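The five-step process can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not a library routine: the function name `row_echelon` is hypothetical, and `fractions.Fraction` is used so the arithmetic stays exact. It always takes the first available nonzero entry as the pivot, so its intermediate steps can differ from the hand calculation above even when the final result matches.

```python
from fractions import Fraction


def row_echelon(rows):
    """Return a row echelon form (with leading 1's) of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    top = 0  # first row of the submatrix that is still uncovered
    for col in range(n_cols):
        if top == n_rows:
            break
        # Steps 1-2: look for a nonzero entry in this column, at or below `top`
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # column is all zeros in the submatrix, move right
        m[top], m[pivot] = m[pivot], m[top]
        # Step 3: scale the row to introduce a leading 1
        a = m[top][col]
        m[top] = [x / a for x in m[top]]
        # Step 4: clear the entries below the leading 1
        for r in range(top + 1, n_rows):
            factor = m[r][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[top])]
        # Step 5: cover the top row and continue with the remaining submatrix
        top += 1
    return m


ref = row_echelon([[2, 4, 6], [2, 4, 2], [1, 3, 1]])
print([[str(x) for x in row] for row in ref])
```

Applied to the matrix from the worked example, the sketch arrives at the same row echelon form as the hand calculation.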

### REF using Sympy

In [2]:
import numpy as np
from sympy import Matrix

A = np.array([[2, -3, -2], [1, -4, 2], [-3, 5, -1]])

# Convert to sympy matrix
A_Mat = Matrix(A)

# Get REF (Row Echelon Form)
# Note: sympy does not scale the rows here, so the leading entries in the output are not 1
REF = A_Mat.echelon_form()

print(np.array(REF), '\n')

[[2 -3 -2]
 [0 -5 6]
 [0 0 34]] 



## Reduced Row Echelon Form

- Similar to row echelon form, with the extra condition that each column with a leading 1 must have 0's everywhere else, both above and below it.

- The reduced row echelon form of a matrix is unique: every matrix has exactly one.

- Every matrix can be transformed into reduced row echelon form by a process called Gauss-Jordan elimination.

- If a square matrix $ \textbf{A} $ has linearly independent columns, then the reduced row echelon form of the matrix is the identity matrix $ \textbf{I} $

  $
  \begin{bmatrix}
  \color{orange}1 & \color{red}0 \\
  \color{red}0 & \color{orange}1 
  \end{bmatrix}
  \ \ \ \ \ \
  $
  $
  \begin{bmatrix}
  \color{orange}1 & \color{red}0 & 2 \\
  \color{red}0 & \color{orange}1 & 4 \\
  \color{red}0 & \color{red}0 & \color{red}0
  \end{bmatrix}
  \ \ \ \ \ \
  $
  $
  \begin{bmatrix}
  \color{orange}1 & \color{red}0 & 0 \\
  \color{red}0 & \color{orange}1 & \color{red}0 \\
  0 & \color{red}0 & \color{orange}1
  \end{bmatrix}
  \ \ \ \ \ \
  $
  $
  \begin{bmatrix}
  0 & \color{orange}1 & 5 & \color{red}0 & 0 & 0 & 3 & 9 & 0 & 7 \\
  0 & \color{red}0 & 0 & \color{orange}1 & \color{red}0 & 0 & 5 & 7 & 0 & 5 \\
  0 & 0 & 0 & \color{red}0 & \color{orange}1 & \color{red}0 & 7 & 3 & 0 & 3 \\
  0 & 0 & 0 & 0 & \color{red}0 & \color{orange}1 & 2 & 6 & \color{red}0 & 9 \\
  0 & 0 & 0 & 0 & 0 & \color{red}0 & 0 & 0 & \color{orange}1 & 2
  \end{bmatrix}
  $

### Gauss-Jordan Elimination

Gauss-Jordan Elimination is used to convert any matrix to reduced row echelon form.

It uses the same three elementary operations as Gaussian elimination.

It follows the same recipe as Gaussian elimination, with the extra step 6 added below:

  1- Locate the leftmost column that does not consist entirely of zeros.

  2- Interchange the top row with another, if necessary, to bring a nonzero entry to the top of the column found in step 1.

  3- If the entry that is now at the top of the column found in step 1 is $ a $, multiply the first row by $ 1/a $ in order to introduce a leading 1.

  4- Add suitable multiples of the top row to the rows below so that all entries below the leading 1 become zeros.

  5- Now cover the top row in the matrix and begin again with step 1 applied to the submatrix that remains. Continue in this way until the entire matrix is in row echelon form.

  6- Beginning with the last nonzero row and working upward, add suitable multiples of each row to the rows above to introduce zeros above the leading 1's.

### Example

Rewrite the following matrix in RREF

$
\begin{bmatrix}
2 & -2 & 4 \\
6 & -2 & 3 \\
3 & -2 & 4
\end{bmatrix}
$

We need a leading 1 on $ R_1 $, divide $ R_1 $ by 2

$
\begin{bmatrix}
1 & -1 & 2 \\
6 & -2 & 3 \\
3 & -2 & 4
\end{bmatrix}
$

Create a leading 0 on $ R_3 $ by subtracting $ 3 \times R_1 $ from $ R_3 $

$
\begin{bmatrix}
1 & -1 & 2 \\
6 & -2 & 3 \\
0 & 1 & -2
\end{bmatrix}
$

Create a leading 0 on $ R_2 $ by subtracting $ 6 \times R_1 $ from $ R_2 $

$
\begin{bmatrix}
1 & -1 & 2 \\
0 & 4 & -9 \\
0 & 1 & -2
\end{bmatrix}
$

We need a 0 above the leading number on $ R_2 $, add $ R_3 $ to $ R_1 $

$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 4 & -9 \\
0 & 1 & -2
\end{bmatrix}
$

We need a second leading 0 on $ R_3 $, so replace $ R_3 $ with $ -4 \times R_3 + R_2 $ 

$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 4 & -9 \\
0 & 0 & -1
\end{bmatrix}
$

We need a leading 1 on $ R_3 $, so multiply it by -1

$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 4 & -9 \\
0 & 0 & 1
\end{bmatrix}
$

We need a 0 above the leading 1 of $ R_3 $, therefore add $ 9 \times R_3 $ to $ R_2 $

$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 4 & 0 \\
0 & 0 & 1
\end{bmatrix}
$

All that remains is to divide $ R_2 $  by 4

$
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
$
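The six steps can be sketched in plain Python as a forward phase (steps 1 to 5) followed by a backward phase (step 6). This is an illustrative implementation under stated assumptions: the function name `rref` is hypothetical and `fractions.Fraction` keeps the arithmetic exact. The order of operations differs from the hand calculation above, but because the reduced row echelon form is unique the result is the same.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivots = []  # (row, column) positions of the leading 1's
    top = 0
    # Forward phase: steps 1-5 (Gaussian elimination)
    for col in range(n_cols):
        if top == n_rows:
            break
        pivot = next((r for r in range(top, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[top], m[pivot] = m[pivot], m[top]
        a = m[top][col]
        m[top] = [x / a for x in m[top]]
        for r in range(top + 1, n_rows):
            factor = m[r][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[top])]
        pivots.append((top, col))
        top += 1
    # Backward phase: step 6, starting from the last nonzero row and working upward
    for row, col in reversed(pivots):
        for r in range(row):
            factor = m[r][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[row])]
    return m


result = rref([[2, -2, 4], [6, -2, 3], [3, -2, 4]])
print([[str(x) for x in row] for row in result])
```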

### RREF with Sympy

In [3]:
import numpy as np
from sympy import Matrix

A = np.array([[2, -3, -2], [1, -4, 2], [-3, 5, -1]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 0 0]
 [0 1 0]
 [0 0 1]]


## Pivot

- The positions of the leading entries (the first nonzero entry of each nonzero row) in a row echelon or reduced row echelon form matrix are the pivot positions. When the rows have been scaled as in the definitions above, these are the leading 1's.

- A nonzero entry in a pivot position is a pivot.

  $
  \begin{bmatrix}
  \color{orange}1 & -3 & 4 & 7\\
  0 & \color{orange}1 & 2 & 2\\
  0 & 0 & \color{orange}8 & 5
  \end{bmatrix}
  $

  1, 1 and 8 on the main diagonal are the pivots of this matrix. The third row has not been scaled, so its pivot is 8 rather than 1.

- The columns containing the pivots are the pivot columns of the matrix.
   
- The rows containing the pivots are the pivot rows.
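The pivot columns can also be read off with SymPy. A minimal sketch: `Matrix.rref()` returns a pair, the reduced matrix and a tuple of zero-based pivot column indices. The matrix is the one shown above.

```python
from sympy import Matrix

A = Matrix([[1, -3, 4, 7],
            [0, 1, 2, 2],
            [0, 0, 8, 5]])

# rref() returns (reduced matrix, tuple of pivot column indices)
reduced, pivot_columns = A.rref()

print(pivot_columns)  # indices are zero-based
```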

## Transpose Rules

Let $ \textbf{A} $ and $ \textbf{B} $ be matrices of appropriate sizes and $ k $ a scalar. The following hold:

1. $ (\textbf{A}^{T})^{T} = \textbf{A} $
   
2. $ \textbf{(A + B)}^T = \textbf{A}^T + \textbf{B}^T $
   
3. $ (k\textbf{A})^T = k\textbf{A}^T $
   
4. $ (\textbf{AB})^{T} = \textbf{B}^{T}\textbf{A}^T $

In [16]:
# Numerical check of the four rules on example matrices (an illustration, not a proof)

A = np.array([[4, 1, 7], [6, 5, 9], [2, 6, 7]])

B = np.array([[6, 4, 9], [1, 4, 7], [3, 0, 6]])

# 1.
print(((A.T).T == A).all())

# 2.
print((((A + B).T == (A.T + B.T))).all())

# 3. 
print((((5 * A).T) == (5 * A.T)).all())

# 4. Use @ for the matrix product (* is element-wise and would not test this rule)
print( (((A @ B).T) == (B.T @ A.T)).all())

True
True
True
True


## Inverse Matrix

- The inverse of a square matrix $ \textbf{A} $ is a matrix $ \textbf{A}^{-1} $ such that $ \textbf{A}^{-1}\textbf{A} = \textbf{A}\textbf{A}^{-1} = \textbf{I} $

- The inverse of a matrix is unique for that matrix.
  
- A square matrix $ \textbf{A} $ is invertible if and only if $ |\textbf{A}| \neq 0 $

- A square matrix $ \textbf{A} $ is invertible if the rows of $ \textbf{A} $ are independent

- A square matrix $ \textbf{A} $ is invertible if the columns of $ \textbf{A} $ are independent

- A square matrix $ \textbf{A} $ is invertible if and only if $ \lambda = 0 $ is not an eigenvalue of $ \textbf{A} $.

- Not every matrix is invertible.

- The inverse matrix can be calculated using 

  $ \textbf{A}^{-1} = \dfrac{\large{1}}{\large{|\textbf{A}|}} Adj(\textbf{A}) $ 

  where $ Adj(\textbf{A}) $ is the adjugate (classical adjoint) matrix, defined as the transpose of the matrix of cofactors, $ Adj(\textbf{A}) = (cofactor(\textbf{A}))^{T} $  

- Let $ \textbf{A} $ and $ \textbf{B} $ be invertible matrices of the same size and $ k $ a nonzero scalar. The following identities hold:

  - $ (\textbf{A}^{-1})^{-1} = \textbf{A} $

  - $ (\textbf{A}^{T})^{-1} = (\textbf{A}^{-1})^{T} $

  - $ (\textbf{AB})^{-1} = \textbf{B}^{-1} \textbf{A}^{-1} $

  - $ (k\textbf{A})^{-1} = \dfrac{1}{k} \textbf{A}^{-1} $
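The adjugate formula can be tried out with SymPy, which works in exact arithmetic. A minimal sketch, assuming SymPy's `Matrix.cofactor_matrix()` and `Matrix.adjugate()` methods; the matrix is the `B` used in the SciPy cell of this document, renamed `A` here.

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 3], [0, 1, 4], [5, 6, 0]])

det_A = A.det()
cofactors = A.cofactor_matrix()   # matrix of cofactors C_ij
adj_A = cofactors.T               # adjugate = transpose of the cofactor matrix
A_inv = adj_A / det_A             # A^-1 = Adj(A) / |A|

print(A_inv)
```

For larger matrices this formula is much more expensive than elimination-based methods, so it is mainly of theoretical interest.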

### Inverse Matrix using SciPy

In [17]:
# setup
import numpy as np
import scipy as sp
import scipy.linalg

A = np.array([[0, -3, -2], [1, -4, -2], [-3, 4, 1]])

B = np.array([[1, 2, 3], [0, 1, 4], [5, 6, 0]])

A_inv = sp.linalg.inv(A)

B_inv = sp.linalg.inv(B)


# 1. (A^-1)^-1 = A
print(np.allclose(sp.linalg.inv(A_inv), A))

# 2. (AB)^-1 = B^-1 A^-1
print(np.allclose(sp.linalg.inv(A @ B), B_inv @ A_inv))

# 3. (A^T)^-1 = (A^-1)^T
print(np.allclose(sp.linalg.inv(A.T), A_inv.T))

True
True
True


### Inverse Matrix using RREF

Also known as the 'inversion algorithm'. Let $ \textbf{A} $ be an $ n \times n $ square matrix. Then $ \textbf{A} $ is invertible if and only if its reduced row echelon form is the identity matrix $ \textbf{I} $.

Sketch of why the algorithm works:

By definition $ \textbf{A}^{-1}\textbf{A} = \textbf{A}\textbf{A}^{-1} = \textbf{I} $

Let $ E_1, \ E_2, \ E_3, \ ..., \ E_k $ be the elementary matrices of the Gauss-Jordan row operations that convert $ \textbf{A} $ to the identity matrix $ \textbf{I} $, applied in that order, so that $ E_1 $ acts first and appears rightmost in the product

$ \implies E_k \ ... \ E_3 \ E_2 \ E_1 \ \textbf{A} = \textbf{I} $

multiply both sides on the right by $ \textbf{A}^{-1} $

$ \implies E_k \ ... \ E_3 \ E_2 \ E_1 \ \textbf{A} \textbf{A}^{-1} = \textbf{I} \textbf{A}^{-1} $

$ \implies E_k \ ... \ E_3 \ E_2 \ E_1 \ \textbf{I} = \textbf{A}^{-1} $

$ \therefore \textbf{A}^{-1} = E_k \ ... \ E_3 \ E_2 \ E_1 $

So applying the same row operations to $ \textbf{I} $ produces $ \textbf{A}^{-1} $, which is why the method row-reduces the augmented matrix $ [\textbf{A} \ | \ \textbf{I}] $.

### Example
Find the inverse of the following matrix by transforming it to reduced row echelon form

$
\begin{bmatrix}
1 & 2 & 3 \\
2 & 5 & 3 \\
1 & 0 & 8
\end{bmatrix}
$

Start by augmenting the matrix with the identity matrix


$
\begin{bmatrix}
1 & 2 & 3 & | & 1 & 0 & 0 \\
2 & 5 & 3 & | & 0 & 1 & 0 \\
1 & 0 & 8 & | & 0 & 0 & 1 
\end{bmatrix}
$

Apply Gauss-Jordan transformations such that the left hand of the matrix becomes the identity matrix

- $ R_2 : R_2 - 2 R_1 $
- $ R_3 : R_3 - R_1 $

  $
  \begin{bmatrix}
  1 & 2 & 3 & | & 1 & 0 & 0 \\
  0 & 1 & -3 & | & -2 & 1 & 0 \\
  0 & -2 & 5 & | & -1 & 0 & 1 
  \end{bmatrix}
  $

- $ R_1 : R_1 - 2 R_2 $
- $ R_3 : R_3 + 2 R_2 $

  $
  \begin{bmatrix}
  1 & 0 & 9 & | & 5 & -2 & 0 \\
  0 & 1 & -3 & | & -2 & 1 & 0 \\
  0 & 0 & -1 & | & -5 & 2 & 1 
  \end{bmatrix}
  $

- $ R_1 : R_1 + 9 R_3 $
- $ R_2 : R_2 - 3 R_3 $
- $ R_3 : - 1 R_3 $

  $
  \begin{bmatrix}
  1 & 0 & 0 & | & -40 & 16 & 9 \\
  0 & 1 & 0 & | & 13 & -5 & -3 \\
  0 & 0 & 1 & | & 5 & -2 & -1 
  \end{bmatrix}
  $

$ \therefore $ the inverse matrix is 

  $
  \begin{bmatrix}
  -40 & 16 & 9 \\
  13 & -5 & -3 \\
  5 & -2 & -1 
  \end{bmatrix}
  $
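The same augment-and-reduce procedure can be reproduced with SymPy. A minimal sketch, assuming SymPy's `row_join`, `eye` and `rref`: the left block of the reduced augmented matrix should be the identity and the right block the inverse.

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 3], [2, 5, 3], [1, 0, 8]])

# Build the augmented matrix [A | I] and reduce it to RREF
augmented = A.row_join(eye(3))
reduced, pivots = augmented.rref()

left = reduced[:, :3]    # should be the identity matrix
right = reduced[:, 3:]   # should be the inverse of A

print(right)
```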

In [18]:
# let's check the inverse using SciPy
A = np.array([[1, 2, 3], [2, 5, 3], [1, 0, 8]])

# Convert to sympy matrix
A_Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = A_Mat.rref()[0]

print(np.array(RREF), '\n')

print(sp.linalg.inv(A))

[[1 0 0]
 [0 1 0]
 [0 0 1]] 

[[-40.  16.   9.]
 [ 13.  -5.  -3.]
 [  5.  -2.  -1.]]


## Diagonally Dominant Matrix

- A matrix is (strictly) diagonally dominant if each $ |a_{ii}| $ on the diagonal is larger than the sum of the magnitudes of the other entries in row $ i $.
  
  $ |a_{ii}| \gt \sum \limits_{j \ne i} |a_{ij}| $

- A strictly diagonally dominant matrix is always invertible. A matrix that is not diagonally dominant may or may not be invertible.

### Example

- The following matrix is diagonally dominant, since each diagonal element is greater in magnitude than the sum of the magnitudes of the other elements in its row. Therefore this matrix has an inverse.

  $ 
  \begin{bmatrix}
  4 & 1 & 2 \\
  1 & 3 & 1 \\
  0 & 1 & 2 
  \end{bmatrix}
  $

  - 4 > 1 + 2
  
  - 3 > 1 + 1
  
  - 2 > 0 + 1

- The following matrix is not diagonally dominant, but it is still invertible

  $
  \begin{bmatrix}
  1 & -4 & 2 \\
  -2 & 1 & 3 \\
  2 & 6 & 8 
  \end{bmatrix}
  $

  - 1 < 4 + 2
  
  - 1 < 2 + 3
  
  - 8 = 2 + 6

- The following matrix is not diagonally dominant, since the first diagonal element is less than the sum of the other elements on its row, and it is also non-invertible

  $
  \begin{bmatrix}
  1 & 6 & 4 \\
  2 & 4 & -1 \\
  -1 & 2 & 5 
  \end{bmatrix}
  $

  - 1 < 6 + 4
  
  - 4 > 2 + 1
  
  - 5 > 2 + 1
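The row-by-row comparisons in these examples can be automated. A minimal sketch in plain Python; the function name `is_strictly_diagonally_dominant` is hypothetical, and the three matrices are the ones from the examples above.

```python
def is_strictly_diagonally_dominant(matrix):
    """Check |a_ii| > sum of |a_ij| over j != i for every row i."""
    for i, row in enumerate(matrix):
        off_diagonal = sum(abs(x) for j, x in enumerate(row) if j != i)
        if abs(row[i]) <= off_diagonal:
            return False
    return True


first = [[4, 1, 2], [1, 3, 1], [0, 1, 2]]
second = [[1, -4, 2], [-2, 1, 3], [2, 6, 8]]
third = [[1, 6, 4], [2, 4, -1], [-1, 2, 5]]

for m in (first, second, third):
    print(is_strictly_diagonally_dominant(m))
```

A `False` result says nothing about invertibility, as the second and third examples show.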

## Determinant of a Matrix

Given an $ n \times n $ matrix $ \textbf{A} $, the determinant $ \textbf{det(A)} $ or $ |\textbf{A}| $ is a scalar quantity. For a $ 2 \times 2 $ matrix it is defined as follows, and the definition can be extended to higher dimensions 

$ 
|\textbf{A}| =
\begin{vmatrix}
a & b \\
c & d 
\end{vmatrix}
    =
ad - bc
$

### Rules

- Any square matrix $ \textbf{A} $ that has one or more rows or columns filled only with 0's will have a determinant of 0

  $ 
  \begin{bmatrix}
  4 & 1 & 2 \\
  \color{red}0 & \color{red}0 & \color{red}0 \\
  0 & 1 & 2 
  \end{bmatrix}
   \ \ or \ \
  \begin{bmatrix}
  4 & 1 & \color{red}0 \\
  1 & 3 & \color{red}0 \\
  0 & 1 & \color{red}0 
  \end{bmatrix}
  \ \ \ \ \
  \implies
  |\textbf{A}| = 0
  $

- If $ \textbf{A} $ is a square matrix with two rows (or two columns) that are equal then its determinant will be 0

  $ 
  \begin{bmatrix}
  4 & 1 & 2 \\
  \color{red}5 & \color{red}8 & \color{red}2 \\
  \color{red}5 & \color{red}8 & \color{red}2 
  \end{bmatrix}
  \ \ or \ \
  \begin{bmatrix}
  4 & \color{red}1 & \color{red}1 \\
  1 & \color{red}3 & \color{red}3 \\
  0 & \color{red}1 & \color{red}1 
  \end{bmatrix}
  \ \ \ \ \
  \implies
  |\textbf{A}| = 0
  $

- The determinant of the transpose of a matrix is equal to the determinant of the matrix

    $ |\textbf{A}^T| = |\textbf{A}| $
  
- If matrix $ A $ is invertible, then

    $ |\textbf{A}^{-1}| = \Large{\frac{1}{|\textbf{A}|}} $

- The determinant of the matrix $ \textbf{A} $ raised to the power $ m $ is related to the determinant of the matrix as follows

    $ |\textbf{A}^m| = |\textbf{A}|^m $

- The determinant of the adjugate (classical adjoint) of an $ n \times n $ matrix is related to the determinant of the matrix as follows

    $ |Adj(\textbf{A})| = |\textbf{A}|^{n-1} $
  
- Let $ \textbf{A} $ and $ \textbf{B} $ be square matrices, then
    
    $ \textbf{|AB|} = \textbf{|A||B|} $

- Suppose that $ \textbf{A} $ is a square matrix and let $ \textbf{B} $ be the matrix obtained by interchanging two rows of $ \textbf{A} $, then 

    $ \textbf{|B|} = -\textbf{|A|} $

    $ 
    \textbf{A} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    a_{21} & a_{22} & a_{23} \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    , \ \ \ \ \
    \textbf{B} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    \color{red}a_{31} & \color{red}a_{32} & \color{red}a_{33} \\
    \color{red}a_{21} & \color{red}a_{22} & \color{red}a_{23}
    \end{bmatrix}
    \ \ \ \ \
    \implies
    \textbf{|B|} = -\textbf{|A|} 
    $  

- Let $ \textbf{A} $ be a square matrix, and let $ \textbf{B} $ be the matrix obtained by multiplying a row of $ \textbf{A} $ by $ \beta $, then

    $ \textbf{|B|} = \beta|\textbf{A}| $

    $ 
    \textbf{A} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    a_{21} & a_{22} & a_{23} \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    , \ \ \ \ \
    \textbf{B} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    \color{red}\beta \ a_{21} & \color{red}\beta \ a_{22} & \color{red}\beta \ a_{23} \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    \ \ \ \ \
    \implies
    \textbf{|B|} = \beta|\textbf{A}|
    $ 

- Let $ \textbf{A} $ be a square matrix, and let $ \textbf{B} $ be the matrix obtained from $ \textbf{A} $ by adding $ \beta $ times the kth row to the jth row, then 

    $ \textbf{|B|} = \textbf{|A|} $

    $ 
    \textbf{A} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    a_{21} & a_{22} & a_{23} \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    , \ \ \ \ \
    \textbf{B} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    a_{21} & a_{22} & a_{23} \\
    \color{red}a_{31}+\beta a_{21} & \color{red}a_{32}+\beta \ a_{22} & \color{red}a_{33} + \beta \ a_{23}
    \end{bmatrix}
    \ \ \ \ \
    \implies
    \textbf{|B|} = \textbf{|A|}
    $ 

- Let $ \textbf{A} $ be an $ n \times n $ square matrix and $ \beta $ a scalar, then 

    $ |\beta \textbf{A}| = \beta^{n} |\textbf{A}| $

    $ 
    \textbf{A} = 
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    a_{21} & a_{22} & a_{23} \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    , \ \ \ \ \
    \beta \textbf{A} = 
    \begin{bmatrix}
    \color{red} \beta \ a_{11} & \color{red} \beta \ a_{12} & \color{red} \beta \ a_{13} \\
    \color{red} \beta \ a_{21} & \color{red} \beta \ a_{22} & \color{red} \beta \ a_{23} \\
    \color{red} \beta \ a_{31} & \color{red} \beta \ a_{32} & \color{red} \beta \ a_{33} 
    \end{bmatrix}
    \ \ \ \ \
    \implies
    |\beta \textbf{A}| = \beta^{n} |\textbf{A}|
    $

- If $ \textbf{A} $ is an upper or lower triangular matrix, then the determinant is the product of its main-diagonal entries

  $ |\textbf{A}| = a_{11} \ a_{22} \ \cdots \ a_{nn} = \prod \limits_{i=1}^{n} a_{ii} $

    $ 
    \begin{bmatrix}
    a_{11} & 0 & 0 \\
    a_{21} & a_{22} & 0 \\
    a_{31} & a_{32} & a_{33} 
    \end{bmatrix}
    , \ \ \ \ \
    \begin{bmatrix}
    a_{11} & a_{12} & a_{13} \\
    0 & a_{22} & a_{23} \\
    0 & 0 & a_{33} 
    \end{bmatrix}
    \ \ \ \ \
    \implies
    |\textbf{A}| = \prod \limits_{i=1}^{n} a_{ii}
    $  
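Several of the rules above can be spot-checked in exact arithmetic with SymPy. A minimal sketch under stated assumptions: the matrices are reused from the notebook cells in this document, `beta` is an arbitrary scalar, and the dictionary keys are informal labels rather than standard names.

```python
from sympy import Matrix

A = Matrix([[4, 1, 7], [6, 5, 9], [2, 6, 7]])
B = Matrix([[6, 4, 9], [1, 4, 7], [3, 0, 6]])
n, beta = A.rows, 2   # beta is an arbitrary scalar
det_A = A.det()

swapped = A.copy()
swapped.row_swap(0, 1)                    # interchange two rows

scaled = A.copy()
scaled[1, :] = beta * A[1, :]             # multiply one row by beta

added = A.copy()
added[2, :] = A[2, :] + beta * A[1, :]    # add beta times the 2nd row to the 3rd row

checks = {
    "transpose": A.T.det() == det_A,
    "inverse": A.inv().det() == 1 / det_A,
    "power": (A**3).det() == det_A**3,
    "adjugate": A.adjugate().det() == det_A**(n - 1),
    "product": (A * B).det() == det_A * B.det(),
    "row swap": swapped.det() == -det_A,
    "row scaling": scaled.det() == beta * det_A,
    "row addition": added.det() == det_A,
    "scalar multiple": (beta * A).det() == beta**n * det_A,
}

for rule, holds in checks.items():
    print(rule, holds)
```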
  

### Determinant with SciPy

In [19]:
# Determinant using SciPy
A = np.array([[4, 1, 7], [6, 5, 9], [2, 6, 7]])

print(sp.linalg.det(A))

82.00000000000001


### Minors and Cofactors

- If $ \textbf{A} $ is a square matrix, then the minor of entry $ \textbf{a}_{ij} $ is denoted by $ \textbf{M}_{ij} $ or $ \textbf{A}(i|j) $ and is defined to be the determinant of the sub-matrix that remains after the $ i^{th} $ row and $ j^{th} $ column are deleted from A.
  
  <br/>
  
    $ 
    \textbf{A} = 
  \begin{bmatrix}
  a_{11} & a_{12} & a_{13} \\
  a_{21} & a_{22} & a_{23} \\
  a_{31} & a_{32} & a_{33} \\ 
  \end{bmatrix}
  \implies 
  M_{11} = A(1|1) = 
    \begin{vmatrix}
  a_{22} & a_{23} \\
  a_{32} & a_{33}
  \end{vmatrix}
  $

<br/>

- The number $ \textbf{C}_{ij} = (-1)^{i+j}\textbf{M}_{ij} $ is called the cofactor of entry $ \textbf{a}_{ij} $. The matrix formed from the elements $ \textbf{C}_{ij} $ is called the matrix of cofactors

  <br/>

    $ 
    \textbf{C} = 
  \begin{bmatrix}
  M_{11}  & -M_{12} & M_{13} \\
  -M_{21}  & M_{22} & -M_{23} \\
  M_{31}  & -M_{32} & M_{33} \\ 
  \end{bmatrix}
  $

<br/>

- The determinant of any square matrix $ \textbf{A} $ can be written as a cofactor expansion along any fixed column $ j $

  $ |\textbf{A}| = \sum \limits_{i=1}^{n} a_{ij}C_{ij} $
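The cofactor expansion translates directly into a short recursive function. This is an illustrative sketch in plain Python, not an efficient algorithm; the names `submatrix` and `determinant` are hypothetical, and the expansion is taken along the first column ($ j = 1 $, index 0 in the code).

```python
def submatrix(matrix, i, j):
    """Delete row i and column j (zero-based); its determinant is the minor M_ij."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def determinant(matrix):
    """Cofactor expansion along the first column."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for i in range(len(matrix)):
        # Cofactor C_i1 = (-1)^(i+1) * M_i1; with zero-based i the sign is (-1)^i
        cofactor = (-1) ** i * determinant(submatrix(matrix, i, 0))
        total += matrix[i][0] * cofactor
    return total


A = [[4, 1, 7], [6, 5, 9], [2, 6, 7]]
print(determinant(A))
```

The recursion does on the order of $ n! $ work, so library routines based on elimination are preferable beyond small matrices.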


### Diagonal Matrices

If $ \textbf{A} $ is an $ n \times n $ triangular matrix (upper, lower or diagonal), then $ \textbf{|A|} $ is the product of the entries on the main diagonal of the matrix, $ \textbf{|A|} = a_{11} \ a_{22} \ a_{33} \ ... \ a_{nn} $

<br/>
For an upper triangular matrix:

  $ 
  \begin{vmatrix}
  a_{11} & a_{12} & a_{13} & a_{14} \\
  0 & a_{22} & a_{23} & a_{24} \\
  0 & 0 & a_{33} & a_{34} \\
  0 & 0 & 0 & a_{44}
  \end{vmatrix}
  = a_{44}
  \begin{vmatrix}
  a_{11} & a_{12} & a_{13} \\
  0 & a_{22} & a_{23} \\
  0 & 0 & a_{33} 
  \end{vmatrix}
  = a_{44} \ a_{33}
  \begin{vmatrix}
  a_{11} & a_{12} \\
  0 & a_{22} 
  \end{vmatrix}
  = a_{44} \ a_{33} \ a_{22}
  \begin{vmatrix}
  a_{11}
  \end{vmatrix}
  = a_{44} \ a_{33} \ a_{22} \ a_{11}
  $

<br/>
For a lower triangular matrix:

  $ 
  \begin{vmatrix}
  a_{11} & 0 & 0 & 0 \\
  a_{21} & a_{22} & 0 & 0 \\
  a_{31} & a_{32} & a_{33} & 0 \\
  a_{41} & a_{42} & a_{43} & a_{44}
  \end{vmatrix}
  = a_{11}
  \begin{vmatrix}
  a_{22} & 0 & 0 \\
  a_{32} & a_{33} & 0 \\
  a_{42} & a_{43} & a_{44} 
  \end{vmatrix}
  = a_{11} \ a_{22}
  \begin{vmatrix}
  a_{33} & 0 \\
  a_{43} & a_{44} 
  \end{vmatrix}
  = a_{11} \ a_{22} \ a_{33}
  \begin{vmatrix}
  a_{44}
  \end{vmatrix}
  = a_{11} \ a_{22} \ a_{33} \ a_{44}
  $
  
  <br/>
  For a plain diagonal matrix:

  $ 
  \begin{vmatrix}
  a_{11} & 0 & 0 & 0 \\
  0 & a_{22} & 0 & 0 \\
  0 & 0 & a_{33} & 0 \\
  0 & 0 & 0 & a_{44}
  \end{vmatrix}
  = a_{11} \ a_{22} \ a_{33} \ a_{44}
  $
  


## Matrix Row & Column Vectors

- For any matrix, the rows and columns can be considered as vectors.

- An $ m \times n $ matrix $ \textbf{A} $ has $ m $ row vectors and $ n $ column vectors.

- The subspace spanned by the row vectors of $ \textbf{A} $ is called the row space of $ \textbf{A} $ and is denoted $ \textbf{row(A)} $

- The subspace spanned by the column vectors of $ \textbf{A} $ is called the column space of $ \textbf{A} $ and is denoted $ \textbf{col(A)} $

### Example

$ 
\textbf{A} = 
\begin{bmatrix}
  0 & 1 \\
  2 & 3 \\
  -1 & 2 
\end{bmatrix}
$

has row vectors (spanning $ \textbf{row(A)} $)

$ 
\textbf{r}_1=
\begin{bmatrix}
  0 & 1
\end{bmatrix}
, \ \ \ \
\textbf{r}_2=
\begin{bmatrix}
  2 & 3
\end{bmatrix}
, \ \ \ \
\textbf{r}_3=
\begin{bmatrix}
  -1 & 2
\end{bmatrix}
$

and column vectors (spanning $ \textbf{col(A)} $)

$
\textbf{c}_1=
\begin{bmatrix}
  0 \\
  2 \\
  -1
\end{bmatrix}
, \ \ \ \
\textbf{c}_2=
\begin{bmatrix}
  1 \\
  3 \\
  2 
\end{bmatrix}
$

#### Column Space

- If $ \textbf{A} $ is an $ m \times n $ matrix, then the subspace of $ \mathbb{R}^m $ spanned by the column vectors of $ \textbf{A} $ is denoted $ \textbf{col({A})} $ and called the column space of $ \textbf{A} $. 

  $ 
  \textbf{A} = 
    \begin{bmatrix}
    a_{11} & a_{12} & \cdots & a_{1n} \\
    a_{21} & a_{22} & \cdots & a_{2n} \\
    \vdots & \vdots &   & \vdots \\
    a_{m1} & a_{m2} & \cdots & a_{mn}
    \end{bmatrix}
  $

  with column vectors

  $ 
  \textbf{c}_1=
  \begin{bmatrix}
    a_{11} \\
    a_{21} \\
    \vdots \\
    a_{m1}
  \end{bmatrix}
  , \ \ \ \
  \textbf{c}_2=
  \begin{bmatrix}
    a_{12} \\
    a_{22} \\
    \vdots \\
    a_{m2}
  \end{bmatrix}
  , \ \ \ \
  \cdots
  , \ \ \ \
  \textbf{c}_n=
  \begin{bmatrix}
    a_{1n} \\
    a_{2n} \\
    \vdots \\
    a_{mn}
  \end{bmatrix}
  $

  A vector that belongs to the space $ \textbf{col(A)} $ is a linear combination of the column vectors $ \{ \textbf{c}_1, \ \textbf{c}_2, \ ..., \ \textbf{c}_n \} $.

  $ \therefore \textbf{x} = x_1 \ \textbf{c}_1 + x_2 \ \textbf{c}_2 + ... + x_n \ \textbf{c}_n$

  We therefore have:

  - If $ \textbf{x}, \textbf{y} \in \textbf{col(A)}$, then  $ \textbf{x} + \textbf{y} \in \textbf{col(A)} $
  - If $ \textbf{x} \in \textbf{col(A)}$ and $ k $ is a scalar, then  $ k \textbf{x} \in \textbf{col(A)} $

- If a matrix $ \textbf{R} $ is in row echelon form, then the column vectors with the leading 1's of the row vectors form a basis for $ col(\textbf{R}) $

## Basis Vectors for the Row Space

Basis vectors are a set of linearly independent vectors that span a space. A matrix has basis vectors for its row space as well as basis vectors for its column space.

To evaluate the basis vectors for the matrix row space, $ \textbf{row(A)} $, we need to transform the matrix to row echelon form. The nonzero rows of the matrix will form a basis for $ \textbf{row(A)} $.

#### Example

Evaluate the basis vectors for the row space of the following matrix

$ 
\textbf{A} = 
  \begin{bmatrix}
  1 & -3 & 4 & -2 & 5 & 4 \\
  2 & -6 & 9 & -1 & 8 & 2 \\
  2 & -6 & 9 & -1 & 9 & 7 \\
  -1 & 3 & -4 & 2 & -5 & -4 \\
  \end{bmatrix}
$

Start by evaluating the row echelon form of this matrix

In [20]:
A = np.array([[1, -3, 4, -2, 5, 4], 
              [2, -6, 9, -1, 8, 2], 
              [2, -6, 9, -1, 9, 7], 
              [-1, 3, -4, 2, -5, -4]])

# Convert to sympy matrix
A_Mat = Matrix(A)

# Get REF (Row Echelon Form)
REF = A_Mat.echelon_form()

print(np.array(REF))

[[1 -3 4 -2 5 4]
 [0 0 1 3 -2 -6]
 [0 0 0 0 1 5]
 [0 0 0 0 0 0]]


Therefore the basis vectors for $ \textbf{row(A)} $ are

$ 
\textbf{r}_{1}=
\begin{bmatrix}
  1 & -3 & 4 & -2 & 5 & 4
\end{bmatrix}
$

$
\textbf{r}_{2}=
\begin{bmatrix}
  0 & 0 & 1 & 3 & -2 & -6
\end{bmatrix}
$

$
\textbf{r}_{3}=
\begin{bmatrix}
  0 & 0 & 0 & 0 & 1 & 5
\end{bmatrix}
$

These are three linearly independent vectors that span the row space of the matrix.

## Basis Vectors for the Column Space

We can find the basis vectors for the column space of a matrix, $ \textbf{col(A)} $, as follows

1. Reduce the the matrix to row echelon form
   
2. Identify the columns containing the leading 1's
   
3. The corresponding columns of the original matrix form a basis for $ \textbf{col(A)} $

#### Example:

Evaluate the basis vectors for the column space of the following matrix

$ 
\textbf{A} = 
  \begin{bmatrix}
  1 & -3 & 4 & -2 & 5 & 4 \\
  2 & -6 & 9 & -1 & 8 & 2 \\
  2 & -6 & 9 & -1 & 9 & 7 \\
  -1 & 3 & -4 & 2 & -5 & -4 \\
  \end{bmatrix}
$

Find the row echelon form of the matrix

In [21]:
A = np.array([[1, -3, 4, -2, 5, 4], 
              [2, -6, 9, -1, 8, 2], 
              [2, -6, 9, -1, 9, 7], 
              [-1, 3, -4, 2, -5, -4]])

# Convert to sympy matrix
A_Mat = Matrix(A)

# Get REF (Row Echelon Form)
REF = A_Mat.echelon_form()

print(np.array(REF))

[[1 -3 4 -2 5 4]
 [0 0 1 3 -2 -6]
 [0 0 0 0 1 5]
 [0 0 0 0 0 0]]


Find the columns that have a leading 1. Here they are columns 1, 3 and 5

$ 
\textbf{c}_1=
\begin{bmatrix}
  1 \\
  0 \\
  0 \\
  0
\end{bmatrix}
, \ \ \ \
\textbf{c}_3=
\begin{bmatrix}
  4 \\
  1 \\
  0 \\
  0
\end{bmatrix}
, \ \ \ \
\textbf{c}_5=
\begin{bmatrix}
  5 \\
  -2 \\
  1 \\
  0
\end{bmatrix}
$

The basis vectors of the matrix columns space will be the columns 1, 3 and 5 in the original matrix

Therefore the column space, $ \textbf{col(A)} $, basis vectors will be

$ 
\textbf{c}_1=
\begin{bmatrix}
  1 \\
  2 \\
  2 \\
  -1
\end{bmatrix}
, \ \ \ \
\textbf{c}_3=
\begin{bmatrix}
  4 \\
  9 \\
  9 \\
  -4
\end{bmatrix}
, \ \ \ \
\textbf{c}_5=
\begin{bmatrix}
  5 \\
  8 \\
  9 \\
  -5
\end{bmatrix}
$

These vectors are linearly independent and form a basis for $ \textbf{col(A)} $
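SymPy can produce the same basis directly. A minimal sketch, assuming SymPy's `Matrix.columnspace()` method, which returns the pivot columns of the original matrix as a list of column vectors.

```python
from sympy import Matrix

A = Matrix([[1, -3, 4, -2, 5, 4],
            [2, -6, 9, -1, 8, 2],
            [2, -6, 9, -1, 9, 7],
            [-1, 3, -4, 2, -5, -4]])

# Pivot columns, reported with zero-based indices
_, pivot_columns = A.rref()

# Basis for col(A): the pivot columns taken from the original matrix
basis = A.columnspace()

print(pivot_columns)
for vector in basis:
    print(vector.T)
```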

## Null Space and Nullity of a Matrix

All vectors $ \textbf{x} $ satisfying the equation $ \textbf{Ax = 0} $ make up the null space of $ \textbf{A} $. To find a basis for it:

- Write the equation $ \textbf{Ax = 0} $ as an augmented matrix by combining $ \textbf{A} $ with the $ \textbf{0} $ vector

- Reduce the augmented matrix to RREF (the zero column is unchanged by row operations, so it is enough to reduce $ \textbf{A} $)

- Solve for the leading variables in terms of the free variables

- Write the general solution as a linear combination of vectors, one per free variable. These vectors are the basis vectors for the null space: a linearly independent set that spans it.

- The number of null space basis vectors is the nullity of the matrix $ \textbf{A} $, written as $ \textbf{nullity(A)} $
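The procedure above is what SymPy's `Matrix.nullspace()` carries out. A minimal sketch, reusing the $ 4 \times 6 $ matrix from the row space and column space examples; each returned vector should satisfy $ \textbf{Ax = 0} $, and the number of vectors is the nullity.

```python
from sympy import Matrix, zeros

A = Matrix([[1, -3, 4, -2, 5, 4],
            [2, -6, 9, -1, 8, 2],
            [2, -6, 9, -1, 9, 7],
            [-1, 3, -4, 2, -5, -4]])

null_basis = A.nullspace()   # list of column vectors spanning the null space
nullity = len(null_basis)

# Every basis vector x must satisfy A x = 0
all_in_null_space = all(A * x == zeros(A.rows, 1) for x in null_basis)

print(nullity, all_in_null_space)
```

Here the nullity equals the number of columns minus the rank, which matches the count of free variables in the steps above.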

## Rank of Matrix

- The number of pivots in RREF(A) is called the rank of A. It is the number of linearly independent rows, or equivalently, the number of linearly independent columns.

- Only a zero matrix has a rank of zero.

- The rank of a matrix can be found using three methods. The easiest of these is converting the matrix into echelon form.

  - Using echelon form
  - Using normal form
  - Minor method

### Rank of Matrix Using Echelon Form
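One way to read the rank off an echelon form is to count its nonzero rows, since each nonzero row contains exactly one pivot. The following is a minimal sketch under that assumption, using SymPy's `echelon_form()` on the matrix from the row space example and comparing against SymPy's own `rank()`.

```python
from sympy import Matrix

A = Matrix([[1, -3, 4, -2, 5, 4],
            [2, -6, 9, -1, 8, 2],
            [2, -6, 9, -1, 9, 7],
            [-1, 3, -4, 2, -5, -4]])

echelon = A.echelon_form()

# Count the rows of the echelon form that contain at least one nonzero entry
rank = sum(1 for i in range(echelon.rows)
           if any(x != 0 for x in echelon.row(i)))

print(rank)
```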


### Rank of Matrix Using Normal Form


### Rank of Matrix Using Minor Method

## Eigenvectors and Eigenvalues

- Let $ \textbf{A} $ be an $ n \times n $ matrix and let $ \textbf{v} $ be a non-zero vector. If the equation $ \textbf{Av} = \lambda \textbf{v} $ is true for some scalar $ \lambda $, then we call the vector $ \textbf{v} $ an eigenvector of $ \textbf{A} $ and we call the scalar $ \lambda $ an eigenvalue of $ \textbf{A} $.

- Let $ \textbf{A} $ be an $ n \times n $ matrix. The polynomial $ p(\lambda) = |\textbf{A} - \lambda \textbf{I}| $ is called the characteristic polynomial and is of degree $ n $. Its roots are the eigenvalues of $ \textbf{A} $.

- If $ \textbf{A} $ is an $ n \times n $ matrix that has $ n $ distinct real eigenvalues $ \{\lambda_1, \lambda_2, \cdots , \lambda_n \} $ with corresponding eigenvectors $ \{\textbf{v}_1, \textbf{v}_2, \cdots, \textbf{v}_n\} $, then the eigenvectors form a basis for $ \mathbb{R}^n $.

- The square matrix $ \textbf{A} $ is invertible if and only if $ \lambda = 0 $ is not an eigenvalue of $ \textbf{A} $.

- Let $ \textbf{A} $ be a triangular matrix (either upper or lower). Then the eigenvalues of $ \textbf{A} $ are its diagonal entries

- If $ \textbf{A} $ is a symmetric matrix then all of its eigenvalues are real numbers

- Let $ \textbf{A} $ be a symmetric matrix. If $ \textbf{v}_1 $ and $ \textbf{v}_2 $ are eigenvectors of $ \textbf{A} $ corresponding to distinct eigenvalues then $ \textbf{v}_1 $ and $ \textbf{v}_2 $ are orthogonal, that is, $ \textbf{v}_1 \cdot \textbf{v}_2 = 0 $

In [22]:
A = np.array([[2, 2, 4], [1, 3, 5], [2, 3, 4]])

w, v= sp.linalg.eig(A)

print('List of Eigenvalues', w)
print('\nList of Eigenvectors\n', v)

List of Eigenvalues [ 8.80916362+0.j  0.92620912+0.j -0.73537273+0.j]

List of Eigenvectors
 [[-0.52799324 -0.77557092 -0.36272811]
 [-0.604391    0.62277013 -0.7103262 ]
 [-0.59660259 -0.10318482  0.60321224]]


In [23]:
# Check that A v = lambda v holds for the first eigenpair (up to rounding)

eval_1 = 8.80916362
eval_2 = 0.92620912
eval_3 = -0.73537273

evec_1 = np.array([-0.52799324, -0.604391, -0.59660259])

print(A @ evec_1)
print(eval_1 * evec_1)

[-4.65117884 -5.32417919 -5.25556984]
[-4.65117884 -5.32417921 -5.25556983]
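The two statements about symmetric matrices can be illustrated numerically as well. A minimal sketch, assuming NumPy's `np.linalg.eigh`, which is intended for symmetric (or Hermitian) input and returns real eigenvalues in ascending order; the matrix `S` is a hypothetical example.

```python
import numpy as np

# Hypothetical symmetric matrix, chosen only for illustration
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eigh(S)

# Eigenvectors are stored as the columns of the second return value
v1, v2 = eigenvectors[:, 0], eigenvectors[:, 1]

print(eigenvalues)       # real eigenvalues
print(np.dot(v1, v2))    # close to 0 for distinct eigenvalues
```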


## Diagonalization

1. A matrix $ \textbf{D} $ whose off-diagonal entries are all zero is called a diagonal matrix.

2. A matrix $ \textbf{A} $ is called diagonalizable if it is similar to a diagonal matrix $ \textbf{D} $, that is, if there exists an invertible matrix $ \textbf{P} $ such that
   
   $ \textbf{A} = \textbf{PDP}^{-1} $

3. An $ n \times n $ matrix $ \textbf{A} $ is diagonalizable if and only if there is a basis $ \{\textbf{v}_1, \textbf{v}_2, \cdots , \textbf{v}_n\} $ of $ \mathbb{R}^n $ consisting of eigenvectors of $ \textbf{A} $.

4. Suppose that $ \textbf{A} $ is an $ n \times n $ matrix with $ n $ distinct eigenvalues $ \lambda_1, \lambda_2, \ldots, \lambda_n $. Then $ \textbf{A} $ is diagonalizable.

5. A matrix $ \textbf{A} $ is diagonalizable if and only if the algebraic and geometric multiplicities of each eigenvalue are equal.
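As a minimal sketch of the factorization $ \textbf{A} = \textbf{PDP}^{-1} $, the example below builds $ \textbf{P} $ from the eigenvectors and $ \textbf{D} $ from the eigenvalues of a small matrix. The matrix is a hypothetical example (its eigenvalues are 2 and 5, so they are distinct), and the power computation at the end is just one common use of the factorization.

```python
import numpy as np

# Hypothetical 2x2 matrix with distinct eigenvalues 2 and 5
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of P are eigenvectors, D holds the eigenvalues on its diagonal
evals, P = np.linalg.eig(A)
D = np.diag(evals)
P_inv = np.linalg.inv(P)

# Rebuild A from its diagonalization
A_rebuilt = P @ D @ P_inv
print(np.allclose(A, A_rebuilt))

# Powers become cheap: A^3 = P D^3 P^{-1}, and D^3 is an elementwise power
A_cubed = P @ np.diag(evals ** 3) @ P_inv
print(np.allclose(A_cubed, np.linalg.matrix_power(A, 3)))
```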

## Positive Definite / Semi-Definite / Negative-Definite

- A symmetric matrix $ \textbf{A} $ with real entries is positive-definite if the real scalar $ \textbf{z}^{T}\textbf{Az} $ is positive for every nonzero real column vector $ \textbf{z} $.

- Positive semi-definite matrices are defined similarly, except that the scalar $ \textbf{z}^{T}\textbf{Az} $ (or $ \textbf{z}^{*}\textbf{Az} $ in the complex case) is only required to be nonnegative, that is, positive or zero.
  
- Negative-definite and negative semi-definite matrices are defined analogously. A matrix that is not positive semi-definite and not negative semi-definite is sometimes called indefinite.
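One practical way to test these definitions is through eigenvalues: a real symmetric matrix is positive-definite exactly when all of its eigenvalues are positive, and similarly for the other cases. This eigenvalue criterion is a standard equivalent test rather than something stated above, and the helper `classify_definiteness`, its tolerance `tol`, and the test matrices are all hypothetical names and values chosen for this sketch.

```python
import numpy as np

def classify_definiteness(M, tol=1e-12):
    """Classify a real symmetric matrix by the signs of its eigenvalues."""
    evals = np.linalg.eigvalsh(M)  # real eigenvalues of a symmetric matrix
    if np.all(evals > tol):
        return "positive-definite"
    if np.all(evals < -tol):
        return "negative-definite"
    if np.all(evals >= -tol):
        return "positive semi-definite"
    if np.all(evals <= tol):
        return "negative semi-definite"
    return "indefinite"

M_pd = np.array([[2.0, -1.0], [-1.0, 2.0]])    # eigenvalues 1 and 3
M_psd = np.array([[1.0, 1.0], [1.0, 1.0]])     # eigenvalues 0 and 2
M_ind = np.array([[1.0, 0.0], [0.0, -1.0]])    # eigenvalues 1 and -1

for M in (M_pd, M_psd, M_ind):
    print(classify_definiteness(M))

# Direct check of the definition: z^T M z for many random nonzero vectors z
rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 2))
quad_forms = np.einsum('ij,jk,ik->i', z, M_pd, z)
print(np.all(quad_forms > 0))
```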

## Hermitian Matrix

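A matrix $ \textbf{A} $ is Hermitian if it equals its own conjugate transpose, $ \textbf{A} = \textbf{A}^{*} $. A Hermitian matrix is positive-definite if the real number $ \textbf{z}^{*}\textbf{Az} $ is positive for every nonzero complex column vector $ \textbf{z} $, where $ \textbf{z}^* $ denotes the conjugate transpose of $ \textbf{z} $.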

## LU Decomposition

## PLU Decomposition

Let $ \textbf{A} $ be an $ m \times n $ matrix. Then there exist matrices $ \textbf{P} $, $ \textbf{L} $ and $ \textbf{U} $ such that $ \textbf{A = PLU} $, where $ \textbf{P} $ is an $ m \times m $ permutation matrix (a permutation of the rows of the identity matrix), $ \textbf{L} $ is an $ m \times m $ lower triangular matrix with 1s on the diagonal, and $ \textbf{U} $ is an $ m \times n $ upper triangular matrix.



In [24]:
A = np.array([[2, 5, 8, 7], [5, 2, 2, 8], [7, 5, 6, 6], [5, 4, 4, 8]])

p, l, u = sp.linalg.lu(A)

print(A, '\n')

print(u, '\n')

print(p, '\n')

print(p @ l @ u)

[[2 5 8 7]
 [5 2 2 8]
 [7 5 6 6]
 [5 4 4 8]] 

[[ 7.          5.          6.          6.        ]
 [ 0.          3.57142857  6.28571429  5.28571429]
 [ 0.          0.         -1.04        3.08      ]
 [ 0.          0.          0.          7.46153846]] 

[[0. 1. 0. 0.]
 [0. 0. 0. 1.]
 [1. 0. 0. 0.]
 [0. 0. 1. 0.]] 

[[2. 5. 8. 7.]
 [5. 2. 2. 8.]
 [7. 5. 6. 6.]
 [5. 4. 4. 8.]]
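The cell above prints $ \textbf{U} $ and $ \textbf{P} $ but not $ \textbf{L} $. The sketch below checks that $ \textbf{L} $ is lower triangular with 1s on the diagonal, and then uses the factors to solve $ \textbf{Ax = b} $ by two triangular solves. The right-hand side `b` is a hypothetical vector chosen only for illustration.

```python
import numpy as np
import scipy.linalg

A = np.array([[2, 5, 8, 7], [5, 2, 2, 8], [7, 5, 6, 6], [5, 4, 4, 8]], dtype=float)
b = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical right-hand side

p, l, u = scipy.linalg.lu(A)

# L is lower triangular with a unit diagonal
print(np.allclose(l, np.tril(l)), np.allclose(np.diag(l), 1.0))

# A x = b  <=>  P L U x = b, so solve L y = P^T b and then U x = y
y = scipy.linalg.solve_triangular(l, p.T @ b, lower=True)
x = scipy.linalg.solve_triangular(u, y, lower=False)

print(np.allclose(A @ x, b))
```

Because a permutation matrix is orthogonal, $ \textbf{P}^{-1} = \textbf{P}^{T} $, which is why `p.T @ b` appears instead of an explicit inverse.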


## Cholesky Decomposition

Cholesky decomposition is a special case of $ \textbf{LU} $ decomposition applicable to Hermitian positive-definite matrices.

When $ \textbf{A} = \textbf{A}^H $ and $ \textbf{x}^H \textbf{Ax} > 0 $ for all nonzero $ \textbf{x} $, decompositions of $ \textbf{A} $ can be found so that

$ \textbf{A} = \textbf{U}^{H}\textbf{U} $

$ \textbf{A} = \textbf{LL}^{H} $

where $ \textbf{L} $ is lower triangular, $ \textbf{U} $ is upper triangular, and $ \textbf{L} = \textbf{U}^{H} $

The Cholesky decomposition is often used as a fast way of solving $ \textbf{Ax = b} $ (when $ \textbf{A} $ is both Hermitian/symmetric and positive-definite).

In [25]:
A = np.array([[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]])

L = sp.linalg.cholesky(A, lower = True)
U = sp.linalg.cholesky(A, lower = False)

print(A, '\n')

print(L, '\n')

print(U, '\n')

print(L @ U)

[[ 6  3  4  8]
 [ 3  6  5  1]
 [ 4  5 10  7]
 [ 8  1  7 25]] 

[[ 2.44948974  0.          0.          0.        ]
 [ 1.22474487  2.12132034  0.          0.        ]
 [ 1.63299316  1.41421356  2.30940108  0.        ]
 [ 3.26598632 -1.41421356  1.58771324  3.13249102]] 

[[ 2.44948974  1.22474487  1.63299316  3.26598632]
 [ 0.          2.12132034  1.41421356 -1.41421356]
 [ 0.          0.          2.30940108  1.58771324]
 [ 0.          0.          0.          3.13249102]] 

[[ 6.  3.  4.  8.]
 [ 3.  6.  5.  1.]
 [ 4.  5. 10.  7.]
 [ 8.  1.  7. 25.]]
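To illustrate the use of the Cholesky factor for solving $ \textbf{Ax = b} $, here is a minimal sketch using SciPy's `cho_factor` and `cho_solve` on the same matrix. The right-hand side `b` is a hypothetical vector that does not come from the text.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

A = np.array([[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]], dtype=float)
b = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical right-hand side

# Factor once; the flag records whether the lower or upper factor was stored
c, low = cho_factor(A)

# Reuse the factorization to solve A x = b with two triangular solves
x = cho_solve((c, low), b)

print(np.allclose(A @ x, b))
```

The factorization can be reused for several right-hand sides, which is where most of the speed advantage comes from.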


## QR Decomposition

The QR decomposition of a matrix $ \textbf{A} $ is a factorization $ \textbf{A = QR} $ into an orthogonal matrix $ \textbf{Q} $ (its columns are orthonormal, so $ \textbf{Q}^{T}\textbf{Q} = \textbf{I} $) and an upper triangular matrix $ \textbf{R} $.

In [26]:
A = np.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]])
Q, R = sp.linalg.qr(A)

print(A, '\n')

print(Q, '\n')

print(R, '\n')

print(Q @ R)

[[ 12 -51   4]
 [  6 167 -68]
 [ -4  24 -41]] 

[[-0.85714286  0.39428571  0.33142857]
 [-0.42857143 -0.90285714 -0.03428571]
 [ 0.28571429 -0.17142857  0.94285714]] 

[[ -14.  -21.   14.]
 [   0. -175.   70.]
 [   0.    0.  -35.]] 

[[ 12. -51.   4.]
 [  6. 167. -68.]
 [ -4.  24. -41.]]
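The defining properties of the two factors can be verified directly. The sketch below checks that $ \textbf{Q}^{T}\textbf{Q} = \textbf{I} $ and that $ \textbf{R} $ is upper triangular, then solves $ \textbf{Ax = b} $ through $ \textbf{Rx} = \textbf{Q}^{T}\textbf{b} $. The vector `b` is a hypothetical right-hand side added for illustration.

```python
import numpy as np
import scipy.linalg

A = np.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]], dtype=float)
Q, R = scipy.linalg.qr(A)

# Q has orthonormal columns, R is upper triangular
print(np.allclose(Q.T @ Q, np.eye(3)))
print(np.allclose(R, np.triu(R)))

# A x = b  <=>  Q R x = b  <=>  R x = Q^T b (a single triangular solve)
b = np.array([1.0, 2.0, 3.0])  # hypothetical right-hand side
x = scipy.linalg.solve_triangular(R, Q.T @ b, lower=False)
print(np.allclose(A @ x, b))
```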


## Linear Systems

Linear systems are represented using linear equations, where $ x_i $ are the unknowns we need to evaluate, $ a_{ij} $ and $ b_i $ are known

$ a_{11} \ x_{1} + a_{12} \ x_{2} + \cdot\cdot\cdot + a_{1n} \ x_{n} = b_1 $

$ a_{21} \ x_{1} + a_{22} \ x_{2} + \cdot\cdot\cdot + a_{2n} \ x_{n} = b_2 $

$ \ \ \ \ \vdots \ \ \ \ \ \ \ \ \ \ \ \ \vdots $ 

$ a_{m1} \ x_{1} + a_{m2} \ x_{2} + \cdot\cdot\cdot + a_{mn} \ x_{n} = b_m $

These equations can be represented in matrix form as $ \textbf{Ax=b} $

$ 
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
x_{1} \\
x_{2} \\
\vdots \\
x_{n}
\end{bmatrix}
    =
\begin{bmatrix}
b_{1} \\
b_{2} \\
\vdots \\
b_{m}
\end{bmatrix}
$

To solve these equations, convert the matrix equation to an augmented matrix as follows

$ 
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & | & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & | & b_2 \\
\vdots & \vdots & \ddots & \vdots & | & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & | & b_m
\end{bmatrix}
$

Then use row operations to transform the augmented matrix to reduced row echelon form (RREF).

## Solution Rules 

Solutions to linear systems can have several forms:

- One solution (consistent and independent)
  
  The coefficient part of the RREF (to the left of the augmentation line) is the identity matrix.
  
  $ 
  \begin{bmatrix}
  1 & 0 & 0 & | & b_1 \\
  0 & 1 & 0 & | & b_2 \\
  0 & 0 & 1 & | & b_3
  \end{bmatrix}
  $
  
- Many solutions (consistent and dependent)
  
  RREF has fewer non-zero rows than there are variables.

  $ 
  \begin{bmatrix}
  1 & 0 & 0 & | & b_1 \\
  0 & 1 & 0 & | & b_2 \\
  0 & 0 & 0 & | & 0
  \end{bmatrix}
  $
  
  
- No solutions (inconsistent)

  RREF has a row that is all zeros to the left of the augmentation line but has a non-zero value to its right (here $ b_3 \neq 0 $), which corresponds to the impossible equation $ 0 = b_3 $.

  $ 
  \begin{bmatrix}
  1 & 0 & 0 & | & b_1 \\
  0 & 1 & 0 & | & b_2 \\
  0 & 0 & 0 & | & b_3
  \end{bmatrix}
  $
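The three cases can also be told apart by comparing the rank of the coefficient matrix with the rank of the augmented matrix. This rank test is a standard equivalent of the RREF rules rather than something stated above, and the helper name `classify_system` is hypothetical. The three small systems used to exercise it are the two-equation examples worked through in this section.

```python
import numpy as np

def classify_system(A, b):
    """Classify A x = b by comparing rank(A) with rank([A | b])."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.hstack([A, b]))
    n_unknowns = A.shape[1]
    if rank_A < rank_Ab:
        return "no solution"            # b adds a new independent direction
    if rank_A == n_unknowns:
        return "one solution"           # every unknown is pinned down
    return "infinitely many solutions"  # at least one free variable

print(classify_system([[3, 1], [4, -1]], [17, 18]))
print(classify_system([[2, 4], [1, 2]], [8, 4]))
print(classify_system([[3, 2], [6, 4]], [5, 8]))
```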

### One Solution

- For two equations with two unknowns, each representing a line, the lines cross at one point. Therefore the system is consistent and independent.
  
  The example below shows two lines that cross at the point (5, 2)

  $ 3x + y = 17 $

  $ 4x - y = 18 $

  <img src="./images/two_equations_one_solution.png" width="350px">
  
  Solve this using RREF

  $ 
  \begin{bmatrix}
  3 & 1 & | & 17 \\
  4 & -1 & | & 18 
  \end{bmatrix}
  $

In [1]:
import numpy as np
from sympy import Matrix

A = np.array([[3, 1, 17], [4, -1, 18]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 0 5]
 [0 1 2]]


$ 
\begin{bmatrix}
1 & 0 & | & 5 \\
0 & 1 & | & 2 
\end{bmatrix}
$

This is a consistent RREF

$ \implies x=5, \ \ \ y=2 $

- For three equations with three unknowns, the three planes meet at a single point and produce only one solution. Therefore the system is consistent and independent.
  
  <img src="./images/three_equations_one_solution1.png" width="350px">

  The example below shows three planes intersecting at only a single point (3, −2, 1) 

  $ x + y + z = 2 $

  $ 6x - 4y + 5z = 31 $
  
  $ 5x + 2y + 2z = 13 $

  <img src="./images/three_equations_one_solution2.png" width="350px">

  Use RREF to evaluate

  $ 
  \begin{bmatrix}
  1 & 1 & 1 & | & 2 \\
  6 & -4 & 5 & | & 31 \\
  5 & 2 & 2 & | & 13 
  \end{bmatrix}
  $

In [2]:
import numpy as np
from sympy import Matrix

A = np.array([[1, 1, 1, 2], [6, -4, 5, 31], [5, 2, 2, 13]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 0 0 3]
 [0 1 0 -2]
 [0 0 1 1]]


$ 
\begin{bmatrix}
1 & 0 & 0 & | & 3 \\
0 & 1 & 0 & | & -2 \\
0 & 0 & 1 & | & 1 \\
\end{bmatrix}
$

This is a consistent RREF

$ \implies x=3, \ \ \ y=-2, \ \ \ z=1 $
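When the coefficient matrix is square and invertible, the same answer can be cross-checked numerically without forming the RREF. The sketch below uses `np.linalg.solve` on the three-plane example; it is only a check on the result above, not a replacement for the RREF method.

```python
import numpy as np

# Coefficient matrix and right-hand side of the three-plane example
A = np.array([[1, 1, 1], [6, -4, 5], [5, 2, 2]], dtype=float)
b = np.array([2, 31, 13], dtype=float)

# A non-zero determinant confirms that there is exactly one solution
print(not np.isclose(np.linalg.det(A), 0.0))

x = np.linalg.solve(A, b)
print(np.allclose(x, [3, -2, 1]))
```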

### Multiple Solutions

- For two equations with two unknowns, this occurs when one of the equations is a multiple of the other equation. This means we only have one equation with two unknowns.
  
  Graphically, the two lines go through the same points

  $ 2x+4y = 8 $

  $ x+2y = 4 $

  Every point on the two lines is a solution, so we have an infinite number of solutions
  
  <img src="./images/two_equations_many_solutions.png" width="304px">

  Use RREF to evaluate

  $ 
  \begin{bmatrix}
  2 & 4 & | & 8 \\
  1 & 2 & | & 4
  \end{bmatrix}
  $

In [5]:
import numpy as np
from sympy import Matrix

A = np.array([[2, 4, 8], [1, 2, 4]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 2 4]
 [0 0 0]]


$ 
\begin{bmatrix}
1 & 2 & | & 4 \\
0 & 0 & | & 0
\end{bmatrix}
$

This is a consistent RREF

$ \implies x+2y=4 \ \ \implies y = \dfrac{1}{2}(4-x) \ \ \ $ describes the infinitely many solutions


- For three equations with three unknowns, this occurs, for example, when all three planes intersect along a common line

  <img src="./images/three_equations_many_solutions1.png" width="350px">

  The example below shows three planes intersecting along a common line
  
  $ x + y + z = 7 $

  $ 3x - 2y - z = 4 $

  $ x + 6y + 5z = 24 $
  
  <img src="./images/three_equations_many_solutions2.png" width="350px">

  Let's obtain the solution using RREF

  $ 
  \begin{bmatrix}
  1 & 1 & 1 & | & 7 \\
  3 & -2 & -1 & | & 4 \\
  1 & 6 & 5 & | & 24 
  \end{bmatrix}
  $

In [6]:
import numpy as np
from sympy import Matrix

A = np.array([[1, 1, 1, 7], [3, -2, -1, 4], [1, 6, 5, 24]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 0 1/5 18/5]
 [0 1 4/5 17/5]
 [0 0 0 0]]


$ 
\begin{bmatrix}
1 & 0 & 1/5 & | & 18/5 \\
0 & 1 & 4/5 & | & 17/5 \\
0 & 0 & 0 & | & 0 
\end{bmatrix}
$

This is a consistent RREF

$ \implies x + \dfrac{1}{5} z = \dfrac{18}{5} \ \ \implies x = \dfrac{1}{5} (18 - z) $  

$ \implies y + \dfrac{4}{5} z = \dfrac{17}{5} \ \ \implies y = \dfrac{1}{5} (17 - 4z) $  

Setting the free variable $ z = t $, the infinitely many solutions form the line

$ x = \dfrac{1}{5} (18 - t), \ \ \ y = \dfrac{1}{5}(17 - 4t), \ \ \ z = t $
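A quick substitution check guards against sign slips in a parametric solution. Reading the two non-zero RREF rows as $ x = \frac{18}{5} - \frac{1}{5}z $ and $ y = \frac{17}{5} - \frac{4}{5}z $ with $ z = t $, the sketch below plugs a few values of $ t $ back into the original three equations. The particular values of `t` are arbitrary choices for illustration.

```python
import numpy as np

A = np.array([[1, 1, 1], [3, -2, -1], [1, 6, 5]], dtype=float)
b = np.array([7, 4, 24], dtype=float)

def point_on_line(t):
    """Point (x, y, z) read off the RREF rows, with the free variable z = t."""
    return np.array([(18 - t) / 5, (17 - 4 * t) / 5, t])

# Every point on the line should satisfy all three equations
residuals = [A @ point_on_line(t) - b for t in (-2.0, 0.0, 1.0, 3.5)]
print(all(np.allclose(r, 0.0) for r in residuals))
```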

### No Solution

- For two equations with two unknowns, no solution exists if the two lines are parallel.

  $ 3x+2y=5 $

  $ 6x+4y=8 $

  <img src="./images/two_equations_no_solution.png" width="310px">

  Use RREF to evaluate the solution

  $ 
  \begin{bmatrix}
  3 & 2 & | & 5 \\
  6 & 4 & | & 8
  \end{bmatrix}
  $

In [7]:
import numpy as np
from sympy import Matrix

A = np.array([[3, 2, 5], [6, 4, 8]])

# Convert to sympy matrix
Mat = Matrix(A)

# Get RREF (Reduced Row Echelon Form)
RREF = Mat.rref()[0]

print(np.array(RREF))

[[1 2/3 0]
 [0 0 1]]


$ 
\begin{bmatrix}
1 & 2/3 & | & 0 \\
0 & 0 & | & 1
\end{bmatrix}
$

This is an inconsistent RREF, hence no solutions.
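The inconsistency can also be detected programmatically. In the sketch below, a pivot in the last (augmented) column of the RREF signals the impossible equation $ 0 = 1 $, and SymPy's `linsolve` returns the empty set for the same system. The symbol names `x` and `y` are introduced here for illustration.

```python
from sympy import Matrix, S, linsolve, symbols

x, y = symbols("x y")

# Augmented matrix of 3x + 2y = 5 and 6x + 4y = 8
augmented = Matrix([[3, 2, 5], [6, 4, 8]])

# rref() returns the reduced matrix and the tuple of pivot column indices
rref_matrix, pivots = augmented.rref()
last_column = augmented.cols - 1
print('Pivot in augmented column:', last_column in pivots)

# linsolve accepts an augmented matrix and returns the solution set
solutions = linsolve(augmented, x, y)
print('Solution set is empty:', solutions == S.EmptySet)
```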

- For three equations with three unknowns, there is no point or line that lies on all three planes at once
  
  <img src="./images/three_equations_no_solution1.png" width="650px">

## Homogeneous Linear Systems

- A homogeneous system can be written in matrix form as $ \textbf{Ax}= \textbf{0} $. It is always consistent and has one of two possible solution sets

  - Only the trivial solution $ \textbf{x=0} $. If $ \textbf{A} $ is square and invertible (i.e. $ |\textbf{A}| \neq 0 $) then this is the only solution.

  - Infinitely many solutions. This is guaranteed when there are more unknowns than equations, i.e. $ n > m $, and it also occurs when $ \textbf{A} $ is square with $ |\textbf{A}| = 0 $.

- In the 2D case, the homogeneous equations describe either two distinct lines that cross at the origin (no offset term), or two lines that coincide (one lies exactly on top of the other).

  For the trivial case, the lines always meet at the origin (none of the lines is offset from the origin):

  <img src="./images/homogenous_trivial.png" width="300px" style="filter:invert(1)">    

  For the infinite case (the two lines coincide):

  <img src="./images/homogenous_infinite.png" width="304px" style="filter:invert(1)">

  #### Example 1:

  $ x_1 - 5 x_2 = 0 $

  $ x_1 + 2 x_2 = 0 $

  These equations represent the lines in the first image above. The lines clearly cross at the origin and therefore we expect the trivial solution $ \textbf{x} = 0 $.

  From the first equation, we have

  $ \implies x_1 = 5 x_2 $

  Substitute this into the second equation

  $ \implies 5 x_2 = -2 x_2 $

  The only way this can be true is if $ x_2 = 0 $ and $ x_1 = 0 $, so the solution is 
  $ \textbf{x} =
    \begin{bmatrix}
      0 \\
      0
    \end{bmatrix} 
  $

  In matrix form we have

  $ 
  \textbf{A} =
  \begin{bmatrix}
  1 & -5 \\
  1 & 2 
  \end{bmatrix}
  $

  with determinant $ |\textbf{A}| = 7 \neq 0 $, so $ \textbf{A} $ is invertible.

  Therefore these homogeneous equations have only the trivial solution

  #### Example 2:

  $ x_1 + 2 x_2 = 0 $

  $ 2 x_1 + 4 x_2 = 0 $

  These equations represent the lines in the second image above. The two lines coincide, since the second equation is twice the first.

  From the first equation, we have

  $ \implies x_1 = -2 x_2 $

  Substitute this into the second equation

  $ \implies 2 (-2 x_2) + 4 x_2 = 0  \implies 0 = 0 $

  This holds for every value of $ x_2 $, with $ x_1 = -2 x_2 $ in each case.

  So, setting $ x_2 = t $, the following vectors are solutions for all $ t \in \mathbb{R} $

  $ 
  \textbf{x} =
  \begin{bmatrix}
  -2t \\
  t 
  \end{bmatrix}
  $

  In matrix form we have

  $ 
  \textbf{A} =
  \begin{bmatrix}
  1 & 2 \\
  2 & 4 
  \end{bmatrix}
  $

  with determinant $ |\textbf{A}| = 0 $

  Let's check our result with $ x_2 = 5 $ and $ x_1 = -10 $

  $ 
  \begin{bmatrix}
  1 & 2 \\
  2 & 4 
  \end{bmatrix}
  \begin{bmatrix}
  -10 \\
  5  
  \end{bmatrix}
      =
  \begin{bmatrix}
  0 \\
  0  
  \end{bmatrix}
  $

  Therefore these homogeneous equations have infinitely many solutions.
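Both examples can be reproduced by computing the null space of $ \textbf{A} $, which is the set of all solutions of $ \textbf{Ax} = \textbf{0} $. The sketch below uses SymPy's `Matrix.nullspace()`: an empty basis means that only the trivial solution exists, while a non-empty basis spans the infinitely many solutions. The variable names `A1` and `A2` are introduced here for the two example matrices.

```python
from sympy import Matrix

A1 = Matrix([[1, -5], [1, 2]])   # Example 1
A2 = Matrix([[1, 2], [2, 4]])    # Example 2

# Example 1: non-zero determinant, so the null space contains only the zero vector
basis_1 = A1.nullspace()
print(A1.det(), basis_1)

# Example 2: zero determinant, so the null space is spanned by one basis vector
basis_2 = A2.nullspace()
print(A2.det(), basis_2)

# Any multiple t * basis vector solves the system, e.g. t = 5 gives (-10, 5)
print(A2 * (5 * basis_2[0]))
```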

## Non-Homogeneous Linear Systems

- A system of linear equations $ \textbf{Ax=b} $ is consistent (has solutions) if and only if $ \textbf{b} $ is in the column space of $ \textbf{A} $, that is, $ \textbf{b} \in col(\textbf{A}) $

  #### proof: 
  
  Write the equation $ \textbf{Ax=b} $ as a sum of column vectors, with $ \textbf{A} = [\textbf{a}_1 \ \textbf{a}_2 \cdots \textbf{a}_n] $   
  
  $ 
  \begin{bmatrix}
  a_{11} & a_{12} & \cdots & a_{1n} \\
  a_{21} & a_{22} & \cdots & a_{2n} \\
  \vdots & \vdots & \ddots & \vdots \\
  a_{m1} & a_{m2} & \cdots & a_{mn}
  \end{bmatrix}
  \begin{bmatrix}
  x_{1} \\
  x_{2} \\
  \vdots \\
  x_{n}
  \end{bmatrix}
      =
  \begin{bmatrix}
  b_{1} \\
  b_{2} \\
  \vdots \\
  b_{m}
  \end{bmatrix}
  $
  
  $ \implies x_1\textbf{a}_1 + x_2\textbf{a}_2 + \cdots + x_n\textbf{a}_n = \textbf{b} $

  Therefore $ \textbf{Ax=b} $ has a solution exactly when $ \textbf{b} $ is a linear combination of the column vectors of $ \textbf{A} $, that is, when $ \textbf{b} \in col(\textbf{A}) $.

- Suppose that the linear system $ \textbf{Ax = b} $ is consistent and let $ \textbf{p} $ be a solution. Then any other solution $ \textbf{q} $ of the system $ \textbf{Ax = b} $ can be written in the form $ \textbf{q = p + v} $, for some vector $ \textbf{v} $ that is a solution to the homogeneous system $ \textbf{Ax = 0} $.
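The structure $ \textbf{q = p + v} $ can be illustrated with the three-plane system $ x + y + z = 7 $, $ 3x - 2y - z = 4 $, $ x + 6y + 5z = 24 $ from the multiple-solutions example. In this sketch the particular solution `p` is taken from the RREF with the free variable set to zero, and `v` is a null space basis vector computed by SymPy. The values of `t` are arbitrary choices for illustration.

```python
from sympy import Matrix

A = Matrix([[1, 1, 1], [3, -2, -1], [1, 6, 5]])
b = Matrix([7, 4, 24])

# Particular solution: RREF values with the free variable z set to 0
p = Matrix([18, 17, 0]) / 5

# Solution of the homogeneous system A v = 0
v = A.nullspace()[0]
print(A * p == b, A * v == Matrix([0, 0, 0]))

# Every q = p + t v is again a solution of A x = b (exact rational arithmetic)
checks = [A * (p + t * v) == b for t in (-3, 0, 2)]
print(all(checks))
```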
