<a href="https://colab.research.google.com/github/mdallas1/shared_code/blob/main/linear_algebra_review.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This Colab notebook reviews essential material from linear algebra and demonstrates some basic matrix operations with Octave. To run the code, click the play button that appears to the left when you place your mouse over the gray code cells. The one immediately below this cell may take a minute to run, but you're good to go once you see a green check to the left of the cell. Aside from that first cell, code cells in this notebook always come in adjacent pairs. Always run the one that says %%writefile first, and then run the one below it to see the output.

In [None]:
!apt install octave

# Linear Algebra Review

For the purposes of this class, a *matrix* is an $m\times n$ array of real or complex numbers with $m$ rows and $n$ columns. If $A$ is an $m\times n$ matrix, its entry in the $i$th row and $j$th column is denoted $a_{ij}$. For a concrete example, if $A$ is a $3\times 5$ matrix (3 rows and 5 columns), then $A$ can be written as
$$
\begin{align}
A = \begin{bmatrix} a_{11} &a_{12} &a_{13} &a_{14} &a_{15} \\ a_{21} &a_{22} &a_{23} &a_{24} &a_{25} \\ a_{31} &a_{32} &a_{33} &a_{34} &a_{35}\end{bmatrix}
\end{align}
$$

In Octave, a matrix is denoted by square brackets, with semicolons indicating the end of a row. The code cells below demonstrate how to define a matrix entrywise.

In [None]:
#@title Define a matrix entrywise
%%writefile a_mat.m
A = [1 2 3; 4 5 6; 7 8 9]

In [None]:
!octave -W a_mat.m

We can also define a matrix via its rows or columns. For example, we can write $A = [\mathbf{a}_1 \hspace{0.25em} \mathbf{a}_2\hspace{0.25em} \cdots \hspace{0.25em}\mathbf{a}_n]$, where $\mathbf{a}_j$ is the $j$th column of $A$. Alternatively, we can stack the rows and write $A = \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{bmatrix}$, where $\mathbf{a}^i$ is the $i$th row of $A$. We can define $A$ by rows or columns in Octave as demonstrated below.


In [None]:
#@title Define a matrix by rows or columns

%%writefile a_mat2.m

% Define A by rows
a1 = [1 2 3]; a2 = [4 5 6]; a3 = [7 8 9];
A_rows = [a1 ; a2 ; a3]

% Define A by columns
a1 = [1 ; 4 ; 7]; a2 = [2 ; 5 ; 8]; a3 = [3 ; 6; 9];
A_cols = [a1 a2 a3]

In [None]:
!octave -W a_mat2.m

A *vector* is an $m\times 1$ matrix or $1\times n$ matrix. The latter case is a row vector, and the former is a column vector. If we just say "vector," we typically mean a column vector.

A $1\times 1$ matrix is a *scalar*. Scalars will always be denoted by lower case letters, and matrices will be denoted with upper case letters. We reserve **bold-faced** letters for **vectors**.

# Matrix Operations

To multiply a matrix by a scalar, simply multiply each entry in the matrix by the given scalar.

In [None]:
#@title Matrix times a scalar
%%writefile mat_scalar.m
A = [1 2 3; 4 5 6]
two_times_A = 2*A

In [None]:
!octave -W mat_scalar.m

To add two matrices, you must first check that they have the same dimensions. If they do, then you just add the corresponding entries.

In [None]:
#@title Matrix plus a matrix
%%writefile mat_p_mat.m
A = [1 2 3; 4 5 6], B = [1 1 -1; 0 2 4],

% Add matrices of equal dimensions
A_plus_B = A + B

% Adding a 2x3 matrix and a 3x1 vector is undefined (Octave throws an error)
A_plus_column_vec = A + [1 ; 2 ;3]

In [None]:
!octave -W mat_p_mat.m

Let 0 denote the matrix of all zeros. We can define the $m\times n$ matrix of all zeros in Octave with the command zeros(m,n).

In [None]:
#@title Define zero matrix

%%writefile zero_mat.m
% matrix of all zeros with 5 rows and 8 columns.
Z = zeros(5,8)

In [None]:
!octave -W zero_mat.m

If $A$, $B$, and $C$ are matrices of equal dimensions, 0 the zero matrix, and $\alpha$ and $\beta$ are scalars, then the following hold.
$$
\begin{align}
A+0 &= A = 0+A \\
A+B &= B+A\\
(A+B)+C &= A+(B+C)\\
\alpha(A+B) &= \alpha A+\alpha B\\
(\alpha+\beta) A &= \alpha A + \beta A
\end{align}
$$
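
These identities can be spot-checked numerically. The sketch below uses arbitrary illustrative values for $A$, $B$, $C$, $\alpha$, and $\beta$ (none of them come from the text above), and a check on one example is of course not a proof. Each line should print 1 (true).

```octave
% Spot-check the addition properties on small example matrices.
% A, B, C, alpha, and beta are arbitrary illustrative values.
A = [1 2 3; 4 5 6]; B = [1 1 -1; 0 2 4]; C = [2 0 1; -1 3 5];
Z = zeros(2,3); alpha = 2; beta = -3;

identity_ok     = isequal(A + Z, A) && isequal(Z + A, A)
commutative_ok  = isequal(A + B, B + A)
associative_ok  = isequal((A + B) + C, A + (B + C))
distributive_ok = isequal(alpha*(A + B), alpha*A + alpha*B)
scalar_sum_ok   = isequal((alpha + beta)*A, alpha*A + beta*A)
```

The entries are small integers, so the floating point arithmetic is exact and isequal is a safe comparison here.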

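An operation that comes up frequently is the *transpose* of a matrix. Given $A$, its transpose $A^T$ is obtained by defining row $i$ of $A^T$ to be column $i$ of $A$. In Octave, $A'$ is the transpose of $A$ when $A$ is real. For a complex matrix, $A'$ also conjugates each entry, and $A.'$ gives the transpose without conjugation.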


In [None]:
#@title Compute the transpose
%%writefile mat_transpose.m
A = [1 2 3; 4 5 6]

% Compute transpose
A_transpose = A'

In [None]:
!octave -W mat_transpose.m
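
Since matrices in this class may have complex entries, it can be worth checking how the apostrophe behaves on complex input. The sketch below uses an assumed $2\times 2$ example with one complex entry and compares the two transpose operators.

```octave
% Assumed example: a 2x2 matrix with one complex entry.
A = [1+2i 3; 4 5]

conj_transpose  = A'    % transposes and conjugates: entry (1,1) becomes 1-2i
plain_transpose = A.'   % transposes only: entry (1,1) stays 1+2i
```

For real matrices the two operators give the same result.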

# Matrix Multiplication

We will build matrix multiplication up from the dot product, to a matrix-vector product, and then finally to a matrix-matrix product. Given two vectors of equal length $n$, $\mathbf{a}$ and $\mathbf{b}$, we define their *dot product* as
$$
\begin{align}
\mathbf{a}\cdot \mathbf{b} = \sum_{i=1}^n a_ib_i
\end{align}
$$
An example with $n=3$ is given below.

In [None]:
#@title Compute dot product
%%writefile dot_prod.m
a = [1 ; 2 ; 3], b = [3 ; 2 ; 1]
dot1 = a'*b, dot2 = dot(a,b)

In [None]:
!octave -W dot_prod.m

Notice in the code above that we compute the dot product of $\mathbf{a}$ and $\mathbf{b}$ in two different ways. dot2 is computed using the built-in function *dot*, and dot1 takes the transpose of the column vector $\mathbf{a}$ and multiplies it by the column vector $\mathbf{b}$. In general, a row vector on the left times a column vector of the same length on the right is always defined, and equals the dot product of the two vectors.

## Matrix-Vector products

Now we can define matrix-vector products, which take a particularly nice form if we express the $m\times n$ matrix $A$ with rows:
$$
\begin{align}
A = \begin{bmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{bmatrix} = \begin{bmatrix}  a^1_1 &a_2^1 &\cdots &a_n^1\\
a_1^2 & a_2^2 & \cdots & a_n^2\\
\vdots  \\
a_1^m &a_2^m &\cdots &a_n^m\end{bmatrix}
\end{align}
$$
Let $\mathbf{v} = [v_1 \hspace{0.25em} v_2 \hspace{0.25em}\cdots \hspace{0.25em} v_n]^T$ denote a general column vector of length $n$. We then define
$$
\begin{align}
A\mathbf{v} = \begin{bmatrix} \mathbf{a}^1\cdot \mathbf{v} \\ \mathbf{a}^2\cdot \mathbf{v} \\ \vdots \\ \mathbf{a}^m\cdot \mathbf{v} \end{bmatrix} =  \begin{bmatrix} a_1^1v_1 + a_2^1v_2 + \cdots + a_n^1v_n \\
\vdots \\
a_1^mv_1 + a_2^mv_2 + \cdots + a_n^mv_n
 \end{bmatrix}
\end{align}
$$
Note that the result of multiplying an $m\times n$ matrix with an $n\times 1$ vector is an $m\times 1$ vector.

**Warning**: if the length of $\mathbf{v}$ does not equal the number of columns of $A$, then $A\mathbf{v}$ is not defined.

We'll also note here that the dot product $\mathbf{a}\cdot\mathbf{b}$ for two vectors of length $n$ requires $n$ multiplications and $n-1$ additions for a total of $2n-1$ flops, or floating point operations.

Since matrix-vector multiplication of an $m\times n$ matrix with an $n\times 1$ vector requires $m$ dot products to be computed, it follows that a matrix-vector product requires $2nm-m$ flops.

In [None]:
#@title Compute matrix vector product
%%writefile mat_vec.m
A = [1 2 3 ; 4 5 6], v = [-1 ; 2 ; 1]
A_times_v = A*v

In [None]:
!octave -W mat_vec.m
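
To connect the flop count to the definition, here is a minimal sketch that forms $A\mathbf{v}$ one row dot product at a time and counts operations along the way. The counter variable flops and the loop structure are illustrative choices, not how Octave computes A*v internally.

```octave
% Matrix-vector product built from row dot products, with a flop counter.
A = [1 2 3; 4 5 6]; v = [-1; 2; 1];
[m, n] = size(A);
Av = zeros(m, 1);
flops = 0;
for i = 1:m
  s = A(i,1)*v(1);          % first multiplication of the dot product
  flops = flops + 1;
  for j = 2:n
    s = s + A(i,j)*v(j);    % one multiplication and one addition
    flops = flops + 2;
  end
  Av(i) = s;
end
Av                          % agrees with A*v
flops                       % counted operations
expected_flops = 2*n*m - m  % formula from the text
```

With $m=2$ and $n=3$, both the counter and the formula give 10.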

## Matrix-Matrix multiplication

Let $A$ be an $m\times n$ matrix with rows $\mathbf{a}^i$, and $B$ be an $n\times p$ matrix with columns $\mathbf{b}_j$. The $ij$th entry of the product $AB$ is defined as $\mathbf{a}^i\cdot\mathbf{b}_j$. In other words, the $j$th column of $AB$ is $A\mathbf{b}_j$. We can express this as
$$
\begin{align}
AB = A[\mathbf{b}_1 \hspace{0.25em}\mathbf{b}_2 \hspace{0.25em}\cdots \hspace{0.25em}\mathbf{b}_p] =[A\mathbf{b}_1\hspace{0.25em} A\mathbf{b}_2\hspace{0.25em} \cdots \hspace{0.25em}A\mathbf{b}_p]
\end{align}
$$
The number of columns of $A$ must match the number of rows of $B$ for $AB$ to be defined. If $A$ is $m\times n$ and $B$ is $n\times p$, then $AB$ has dimensions $m\times p$.

We've seen that the product of an $m\times n$ matrix with an $n\times 1$ vector requires $2mn-m$ flops. Matrix-matrix multiplication of an $m\times n$ matrix with an $n\times p$ matrix requires $p$ matrix-vector products where each vector is $n\times 1$. Thus, a matrix-matrix product requires $(2nm-m)p = 2nmp-mp$ flops. In the case of square matrices with $n$ rows and $n$ columns, we can say that a matrix-matrix product is $\mathcal{O}(n^3)$. We remark that there are ways to [compute matrix products more efficiently](https://en.wikipedia.org/wiki/Computational_complexity_of_matrix_multiplication).

In [None]:
#@title Compute matrix-matrix product
%%writefile mat_mat.m
A = [1 2 3 ; 4 5 6], B = [1 0 -1; 1 1 1; 1 0 0]
AtimesB = A*B

In [None]:
!octave -W mat_mat.m
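
The column-by-column description of $AB$ can be sketched directly: compute each $A\mathbf{b}_j$ and place it in column $j$. The loop below is only an illustration of the definition, using the same $A$ and $B$ as the cell above, and it also evaluates the flop formula for these dimensions.

```octave
% Build AB one column at a time: column j of AB is A times column j of B.
A = [1 2 3; 4 5 6]; B = [1 0 -1; 1 1 1; 1 0 0];
[m, n] = size(A);
p = size(B, 2);
AB = zeros(m, p);
for j = 1:p
  AB(:, j) = A*B(:, j);     % one matrix-vector product per column
end
AB
matches_builtin = isequal(AB, A*B)
flop_count = (2*n*m - m)*p  % p matrix-vector products
```

Here $m=2$, $n=3$, and $p=3$, so the formula gives $(12-2)\cdot 3 = 30$ flops.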

We will let $I_n$ denote the $n\times n$ matrix whose diagonal entries are all one and whose other entries are all zero. That is, if $e_{ij}$ is the $ij$th entry of $I_n$, then $e_{ij} = \delta_{ij}$, where $\delta_{ij}$ is the [Kronecker delta](https://en.wikipedia.org/wiki/Kronecker_delta). $I_n$ acts like the number 1 for matrices with respect to multiplication. We therefore call $I_n$ the **identity matrix**. We will frequently just write $I$ for the identity and let the context inform the dimension.

In [None]:
#@title Define identity matrix
%%writefile identity_mat.m

% define identity matrix with 5 rows and 5 columns.
I = eye(5)

In [None]:
!octave -W identity_mat.m

### Summary of matrix multiplication properties

Let $A$, $B$, and $C$ be matrices whose dimensions are such that the operations below are well defined, let $I$ denote the identity matrix, and let $\alpha$ denote a scalar. The following hold.
$$
\begin{align}
A(BC) &= (AB)C\\
A(B+C) &= AB+AC\\
(B+C)A &= BA+CA\\
\alpha (AB) &= (\alpha A)B = A(\alpha B)\\
IA &= A = AI
\end{align}
$$
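
As with addition, these properties can be spot-checked on a small example. The $2\times 2$ matrices and the scalar below are arbitrary illustrative values, and each line should print 1 (true).

```octave
% Spot-check the multiplication properties on arbitrary 2x2 examples.
A = [1 2; 3 4]; B = [0 1; -1 2]; C = [2 0; 1 1];
alpha = 3; I = eye(2);

associative_ok        = isequal(A*(B*C), (A*B)*C)
left_distributive_ok  = isequal(A*(B + C), A*B + A*C)
right_distributive_ok = isequal((B + C)*A, B*A + C*A)
scalar_ok   = isequal(alpha*(A*B), (alpha*A)*B) && isequal(alpha*(A*B), A*(alpha*B))
identity_ok = isequal(I*A, A) && isequal(A*I, A)
```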


## Here be dragons

1. $AB\neq BA$ in general.


In [None]:
%%writefile ab_not_ba.m
A = [1 1 ; 1 1], B = [1 0 ; 0 2]
A_times_B = A*B, B_times_A = B*A

In [None]:
!octave -W ab_not_ba.m

2. $AB=AC$ does not, in general, imply that $B=C$.

In [None]:
%%writefile abisac_b_not_c.m

A = [1 -1 ; 2 -2], B = [1 2 ; 1 1], C = [1 0; 1 -1]
A_times_B = A*B, A_times_C = A*C

In [None]:
!octave -W abisac_b_not_c.m

3. $AB=0$ does not necessarily mean $A=0$ or $B=0$. Indeed, take
$$
\begin{align}
A = B = \begin{bmatrix} 0 &1\\ 0 &0 \end{bmatrix}
\end{align}
$$
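
A quick check of this last example in Octave, in the same spirit as the cells for (1) and (2):

```octave
% A is nonzero, yet A*A is the zero matrix.
A = [0 1; 0 0];
A_times_A = A*A
product_is_zero = isequal(A*A, zeros(2))   % prints 1 (true)
A_is_zero       = isequal(A, zeros(2))     % prints 0 (false)
```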

In (2) and (3), if we could somehow "divide" by the matrix $A$, then we could indeed say that $B=C$ and $B=0$ respectively. This brings us to the notion of *invertible matrices*.

# Matrix Inverses

If $A$ is a matrix that has the same number of rows and columns, then we say that $A$ is a *square matrix*. Let $A$ be a square matrix. If there exists a square matrix $B$ such that
$$
\begin{align}
AB = I = BA
\end{align}
$$
then we say that $A$ is *invertible*, or *nonsingular*, and we call $B$ the *inverse matrix* of $A$. The inverse $B$ is frequently denoted $A^{-1}$.

Some things to note:

1. If $A$ is not invertible, then we say $A$ is *singular*.
2. If $A$ is invertible, and $AB=AC$, then $B=C$. This can be shown using the definition of $A^{-1}$ and the properties of matrix-matrix multiplication.
$$
\begin{align}
B = IB = (A^{-1}A)B = A^{-1}(AB) = A^{-1}(AC) = (A^{-1}A)C = IC = C.
\end{align}
$$
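
The definition and the cancellation argument can be tried out numerically. The sketch below uses Octave's built-in inv on an assumed invertible $2\times 2$ example and recovers $B$ from the product $AB$. The matrices are illustrative choices, and results may differ from the exact values by rounding error.

```octave
% Assumed example: A has determinant 2*1 - 1*1 = 1, so it is invertible.
A = [2 1; 1 1];
A_inv = inv(A)              % exact inverse is [1 -1; -1 2]
left  = A_inv*A             % should be the identity (up to rounding)
right = A*A_inv             % should be the identity (up to rounding)

% Cancellation: multiplying AB on the left by A_inv recovers B.
B = [1 2; 1 1];
recovered_B = A_inv*(A*B)
```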

# The Determinant

A quantity that frequently arises when discussing square matrices is the *determinant*. For a $2\times 2$ matrix, the determinant is defined as
$$
\begin{align}
\text{det}\begin{bmatrix} a &b \\ c &d\end{bmatrix} := ad-bc.
\end{align}
$$

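Geometrically, $|\text{det}(A)|$ is the factor by which $A$ scales areas (or volumes, in higher dimensions) in the space it acts on, and a negative determinant means that $A$ also reverses orientation. In particular, $\text{det}(A)=0$ means that $A$ "squishes" the space down to a lower dimension.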

In [None]:
#@title Compute determinant
%%writefile det_a.m
A = [1 2 ; 3 4], detA = det(A) % 4 - 6 = -2

In [None]:
!octave -W det_a.m

## Determinants of larger square matrices

Let $A_{ij}$ denote the matrix obtained from $A$ by deleting its $i$th row and $j$th column. For example,
if
$$
\begin{align}
A = \begin{bmatrix} 1 &2 &5 \\ 0 &3 &2 \\ 1 &1 &-1\end{bmatrix}
\end{align}
$$
then
$$
\begin{align}
A_{23} = \begin{bmatrix} 1 &2\\ 1 &1\end{bmatrix}
\end{align}
$$
Let $C_{ij} = (-1)^{i+j}\text{det}(A_{ij})$. For a general $n\times n$ matrix and any fixed row index $i$, the determinant is
$$
\begin{align}
\text{det}(A) = \sum_{j=1}^n a_{ij}C_{ij}
\end{align}
$$
The numbers $C_{ij}$ are called *cofactors*, and this method of computing $\text{det}(A)$ is called *cofactor expansion*. It does not matter which row you expand along, and the analogous sum down any fixed column $j$, $\sum_{i=1}^n a_{ij}C_{ij}$, gives the same value.

**Example**: Here we expand along the first row.
$$
\begin{align}
\text{det}\begin{bmatrix} 1 &2 &3 \\ 4 &5 &6 \\ 7 &8 &9\end{bmatrix}  = 1\cdot \text{det}\begin{bmatrix} 5 &6\\ 8 &9\end{bmatrix} - 2\cdot\text{det}\begin{bmatrix} 4 &6\\ 7 &9\end{bmatrix}+3\cdot \text{det}\begin{bmatrix} 4 &5\\ 7 &8\end{bmatrix} = 1(-3) - 2(-6) + 3(-3) = 0.
\end{align}
$$
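
Cofactor expansion translates naturally into a recursive function. The sketch below expands along the first row. The name cofactor_det is a hypothetical helper introduced for illustration (it is not an Octave built-in), and the leading `1;` is the usual Octave idiom for defining a function inside a script file.

```octave
1;  % marks this file as a script so a function can be defined inside it

% Determinant by cofactor expansion along the first row.
% cofactor_det is a hypothetical helper name, not a built-in.
function d = cofactor_det(A)
  n = size(A, 1);
  if n == 1
    d = A(1,1);                         % a 1x1 determinant is the entry itself
    return;
  end
  d = 0;
  for j = 1:n
    minor = A(2:n, [1:j-1, j+1:n]);     % delete row 1 and column j
    d = d + (-1)^(1+j) * A(1,j) * cofactor_det(minor);
  end
end

A = [1 2 3; 4 5 6; 7 8 9];
det_by_cofactors = cofactor_det(A)      % exact integer arithmetic gives 0
det_builtin = det(A)                    % may differ from 0 by rounding error
```

The built-in det uses a different, much cheaper algorithm, so for this singular matrix it may return a tiny nonzero number rather than exactly 0.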

It is useful to know that given two square matrices $A$ and $B$, $\text{det}(AB) = \text{det}(A)\text{det}(B)$.
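
A quick numerical illustration of this product rule, using two arbitrary $2\times 2$ matrices chosen for the example (the two printed values should agree up to rounding):

```octave
% det(A) = -2 and det(B) = 1, so det(A*B) should be -2.
A = [1 2; 3 4]; B = [0 1; -1 2];
det_AB = det(A*B)
det_A_times_det_B = det(A)*det(B)
```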

## Determinants and inverses

Determinants and inverse matrices are intimately related. In fact, it can be proven that a square matrix $A$ is invertible if and only if $\text{det}(A)\neq 0$.

We just saw that
$$
\begin{align}
\text{det}\begin{bmatrix} 1 &2 &3 \\ 4 &5 &6 \\ 7 &8 &9\end{bmatrix}  = 0
\end{align}
$$
This tells us that
$$
\begin{align}
\begin{bmatrix} 1 &2 &3 \\ 4 &5 &6 \\ 7 &8 &9\end{bmatrix}
\end{align}
$$
is singular, i.e., it does not have an inverse.

Not only does $\text{det}(A)$ tell us if $A$ has an inverse, the determinant can tell us what that inverse is. If $\det(A)\neq 0$, then it can be shown (using Cramer's rule) that
$$
\begin{align}
A^{-1} = \dfrac{1}{\text{det}(A)}\begin{bmatrix}
C_{11} &\cdots &C_{n1} \\
\vdots &\ddots &\vdots \\
C_{1n} &\cdots & C_{nn} \end{bmatrix}= \dfrac{C^T}{\det(A)}
\end{align}
$$
where $C_{ij} = (-1)^{i+j}\text{det}(A_{ij})$. Note that there is no typo in the indices above. The $n1$ entry of $A^{-1}$ is indeed $C_{1n}/\det(A)$. The matrix $C$ is called the *cofactor matrix*, and the matrix $C^T$ is called the *adjugate*.
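
The adjugate formula can be sketched in a few lines of Octave. The example below reuses the $3\times 3$ matrix from the cofactor discussion (its determinant is $-16$, so it is invertible), builds the cofactor matrix entry by entry, and compares the result with the built-in inv. This is meant only to illustrate the formula, not as a practical way to invert matrices.

```octave
% Inverse via the adjugate: A^{-1} = C^T / det(A).
A = [1 2 5; 0 3 2; 1 1 -1];
n = size(A, 1);
C = zeros(n);
for i = 1:n
  for j = 1:n
    minor = A([1:i-1, i+1:n], [1:j-1, j+1:n]);   % delete row i and column j
    C(i,j) = (-1)^(i+j) * det(minor);            % cofactor C_ij
  end
end
A_inv_adjugate = C' / det(A)     % C is real, so C' is its transpose
A_inv_builtin  = inv(A)
max_difference = max(max(abs(A_inv_adjugate - A_inv_builtin)))
```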

It is quite expensive to compute a determinant using cofactor expansion. Let $c_n$ denote the cost, in flops, of computing the determinant of an $n\times n$ matrix using cofactor expansion. It follows that
$$
\begin{align}
c_n = nc_{n-1}+2n-1
\end{align}
$$
since cofactor expansion requires $n$ determinants of $(n-1) \times (n-1)$ matrices, $n$ multiplications, and $n-1$ sums. We can get a sense of how quickly $c_n$ grows just by looking at $nc_{n-1}$. Since our formula holds for any $n$, $nc_{n-1} = n(n-1)c_{n-2} +$ stuff, $n(n-1)c_{n-2} = n(n-1)(n-2)c_{n-3}+$ stuff, and so on. It appears that $c_n \sim n!$ for large $n$. With a bit more care, one can show that $c_n\sim 3n!$ for large $n$ (see exercise 5.1 in our book).
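
To see the growth concretely, one can iterate the recurrence and compare $c_n$ with $n!$. The sketch below assumes the base case $c_1 = 0$ (a $1\times 1$ determinant needs no arithmetic). That base case is an assumption made here, and a different convention changes the constant but not the factorial growth.

```octave
% Iterate c_n = n*c_{n-1} + 2n - 1 and compare with n!.
% c(1) = 0 is an assumed base case, not taken from the text.
N = 10;
c = zeros(N, 1);
for n = 2:N
  c(n) = n*c(n-1) + 2*n - 1;
end
ratio = c ./ factorial((1:N)');
for n = 1:N
  printf("%2d  %10d  %8.4f\n", n, c(n), ratio(n));   % n, c_n, c_n/n!
end
```

The last column levels off at a constant of modest size, which is what growth proportional to $n!$ looks like.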

The point is that, using cofactor expansion, the computational cost of computing $\det(A)$ for an $n\times n$ matrix $A$ is $\mathcal{O}(n!)$.

## Solutions to square linear systems

Suppose $A$ is a square matrix with $n$ columns, and consider the linear system $A\mathbf{x}=\mathbf{b}$. This system has a unique solution if and only if $A$ is nonsingular, in which case $\mathbf{x} = A^{-1}\mathbf{b}$. To solve this system, we could thus compute $A^{-1}$ and then multiply this by $\mathbf{b}$. However, the cofactor formula for $A^{-1}$ requires determinants, and we just saw that computing determinants by cofactor expansion is expensive. There are ways to compute determinants more efficiently, but as we'll see in the next lecture, computing the determinant in this more efficient way leads us to an efficient way to solve the system without computing the inverse. Hence $A^{-1}$ is rarely used explicitly to solve a linear system.
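
In Octave, the backslash operator solves $A\mathbf{x}=\mathbf{b}$ directly, without forming $A^{-1}$ explicitly. The small system below is an assumed example (its exact solution is $\mathbf{x} = [0.8 \hspace{0.25em} 1.4]^T$), shown alongside the explicit-inverse approach for comparison.

```octave
% Solve A*x = b two ways on an assumed 2x2 example.
A = [2 1; 1 3]; b = [3; 5];
x_backslash = A\b            % solves the system without forming inv(A)
x_inverse   = inv(A)*b       % forms the inverse first, then multiplies
residual = norm(A*x_backslash - b)   % should be zero up to rounding
```

Both approaches agree here, but the backslash form is the one generally preferred in practice.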

# Some Special Types of Matrices

There are a few special types of matrices that arise frequently in scientific computing. We discuss some of these below and provide examples.

1. A square matrix $A$ is said to be **diagonal** if $a_{ij} = 0$ whenever $i\neq j$. The collection of entries $a_{ii}$ for $i=1,\ldots,n$ is known as the *diagonal*. Every identity matrix is a diagonal matrix. An example of a $3\times 3$ diagonal matrix different from the identity is
$$ \begin{bmatrix} 1 &0 &0 \\ 0 &2 &0 \\ 0 &0 &3\end{bmatrix}$$

2. A square matrix $A$ is said to be **lower triangular** if $a_{ij} = 0$ for all $j > i$. In other words, all entries *above* the diagonal are zero. An example of such a matrix is
$$ \begin{bmatrix} 1 &0 &0 \\ 2 &3 &0 \\ 4 &5 &6\end{bmatrix} $$

3. An **upper triangular** matrix $A$ is a matrix such that $a_{ij} = 0$ whenever $i > j$. In other words, all entries *below* the main diagonal are zero. An example of an upper triangular matrix is the transpose of the lower triangular example above:
$$ \begin{bmatrix} 1 &0 &0 \\ 2 &3 &0 \\ 4 &5 &6\end{bmatrix}^T = \begin{bmatrix} 1 &2 &4 \\ 0 &3 &5 \\ 0 &0 &6\end{bmatrix} $$

4. An $n\times n$ matrix $A$ is **symmetric** if $A^T = A$. That is, if $a_{ij} = a_{ji}$ for all $1\leq i,j\leq n$. All diagonal matrices are symmetric, but a more interesting example of a symmetric matrix is
$$ \begin{bmatrix} 1 &2 &3 \\ 2 &4 &5 \\ 3 &5 & 6 \end{bmatrix}$$
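
Octave has built-in helpers for constructing matrices of these types. The sketch below uses diag, tril, and triu on an arbitrary example matrix M, and builds a symmetric matrix as $M + M^T$. The variable names are illustrative choices.

```octave
% Construct diagonal, triangular, and symmetric matrices from an example M.
M = [1 2 3; 4 5 6; 7 8 9];

D = diag([1 2 3])       % diagonal matrix with the given diagonal entries
L = tril(M)             % lower triangular part of M (zeros above the diagonal)
U = triu(M)             % upper triangular part of M (zeros below the diagonal)
S = M + M'              % M plus its transpose is always symmetric

is_symmetric = isequal(S, S')                   % prints 1 (true)
L_transpose_is_upper = isequal(L', triu(L'))    % prints 1 (true)
```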
