## Matrix Algebra

In [2]:
import numpy as np

In this section we look at matrix algebra and some of its common properties.  We will also see how operations involving matrices are connected to linear systems of equations.

A **matrix** is really nothing more than a rectangular array of numbers.  When we do computations with matrices using NumPy, we will be using arrays just as we did before.  Let's write down some examples of matrices and give them names.

$$
\begin{equation}
A = \left[ \begin{array}{cc} 1 & 3 \\ 2 & 1 \end{array}\right] \hspace{1cm} 
B = \left[ \begin{array}{ccc} 3 & 0 & 4 \\ -1 & -2 & 1 \end{array}\right] \hspace{1cm}
C = \left[ \begin{array}{cc} -2 & 1 \\ 4 & 1 \end{array}\right] \hspace{1cm}
D = \left[ \begin{array}{c} 2 \\ 6 \end{array}\right]
\end{equation}
$$

When discussing matrices it is common to talk about their dimensions, or shape, by specifying the number of rows and columns.  The number of rows is usually listed first.  For our examples, $A$ and $C$ are $2\times 2$ matrices, $B$ is a $2 \times 3$ matrix, and $D$ is a $2 \times 1$ matrix.  Matrices that have only 1 column, such as $D$, are commonly referred to as **vectors**.  We will adhere to this convention as well, but do be aware that when we make statements about matrices, we are also making statements about vectors even if we don't explicitly mention them.  We will also adopt the common convention of using uppercase letters to name matrices.

It is also necessary to talk about the individual entries of matrices.  The common notation for this is to use a lowercase letter with subscripts to denote the position of the entry in the matrix.  So $b_{12}$ refers to the 0 in the first row and second column of the matrix $B$.  If we are talking about generic positions, we might use variables in the subscripts, such as $a_{ij}$.

Let's create these matrices as NumPy arrays before further discussion.

In [3]:
A = np.array([[1, 3],[2,1]])
B = np.array([[3, 0, 4],[-1, -2, 1]])
C = np.array([[-2, 1],[4, 1]])
D = np.array([[2],[6]])

It will be useful for us to access the dimensions of our arrays.  When the array is created, this information gets stored as part of the array object and can be accessed with an attribute called $\texttt{shape}$.  If $\texttt{B}$ is an array, the object $\texttt{B.shape}$ is a tuple that has two entries.  The first is the number of rows, and the second is the number of columns.

In [5]:
print("Array B has",B.shape[0],"rows.")
print("Array B has",B.shape[1],"columns.")      

Array B has 2 rows.
Array B has 3 columns.
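
As a small sketch of how the shape and the subscript notation carry over to NumPy, the cell below unpacks $\texttt{B.shape}$ and reads the entry $b_{12}$.  The variable names $\texttt{rows}$, $\texttt{cols}$, and $\texttt{b\_12}$ are chosen here only for illustration.  Keep in mind that NumPy indices start at 0, so row 1, column 2 of the matrix corresponds to index $\texttt{[0, 1]}$ of the array.

```python
import numpy as np

B = np.array([[3, 0, 4], [-1, -2, 1]])

# B.shape is a tuple (rows, columns), so it can be unpacked directly.
rows, cols = B.shape
print("rows:", rows, "columns:", cols)

# NumPy counts from 0, so the entry b_{12} (first row, second column)
# is stored at index [0, 1].
b_12 = B[0, 1]
print("b_12 =", b_12)
```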


### Operations on Matrices

There are three algebraic operations for matrices that we will need to perform.  For our definitions let us suppose that $A$ and $C$ are $m \times n$ matrices, $B$ is an $n \times p$ matrix, and $c$ is a number.  When discussing algebra involving matrices and numbers, the numbers are usually referred to as **scalars**.

1. A matrix of any shape can be multiplied by a single number (a scalar).  The result is that all entries are multiplied by that number.  Using the subscript notation, we would write

$$
\begin{equation}
(cA)_{ij} = ca_{ij}
\end{equation}
$$

2. Two matrices that *have the same shape* can be added.  The result is that all corresponding entries are added.


$$
\begin{equation}
(A+C)_{ij} = a_{ij} + c_{ij}
\end{equation}
$$

3. If the number of columns of matrix $A$ is equal to the number of rows of matrix $B$, the matrices can be multiplied in that order.  The result will be a new matrix $AB$ that has the same number of rows as $A$ and the same number of columns as $B$.  The entry $(AB)_{ij}$ will be the following combination of the entries of row $i$ of $A$ and column $j$ of $B$.

$$
\begin{equation}
(AB)_{ij} = \sum_{k=1}^n a_{ik}b_{kj}
\end{equation}
$$

The last operation, known as **matrix multiplication**, is the most complex and least intuitive of the three.  No doubt this last formula is a bit intimidating the first time we read it.  Let's give some examples to clarify.

1.  The multiplication of a number and a matrix:

$$
\begin{equation}
3A = 3\left[ \begin{array}{cc} 1 & 3 \\ 2 & 1 \end{array}\right] 
= \left[ \begin{array}{cc} 3 & 9 \\ 6 & 3 \end{array}\right]
\end{equation}
$$

2. The sum of two matrices of the same shape:

$$
\begin{equation}
A + C = \left[ \begin{array}{cc} 1 & 3 \\ 2 & 1 \end{array}\right] + 
\left[ \begin{array}{cc} -2 & 1 \\ 4 & 1 \end{array}\right] 
= \left[ \begin{array}{cc} -1 & 4 \\ 6 & 2 \end{array}\right]
\end{equation}
$$

3.  The multiplication of two matrices:

$$
\begin{equation}
AB = \left[ \begin{array}{cc} 1 & 3 \\ 2 & 1 \end{array}\right]
\left[ \begin{array}{ccc} 3 & 0 & 4 \\ -1 & -2 & 1 \end{array}\right]
 = \left[ \begin{array}{ccc} 0 & -6 & 7  \\  5 & -2 & 9  \end{array}\right]
 \end{equation}
$$
 
To clarify what happens in the matrix multiplication, let's calculate two of the entries in detail.

$$
\begin{eqnarray*}
(AB)_{12} & = & 1\times 0 + 3 \times (-2) = -6 \\
(AB)_{23} & = & 2 \times 4 + 1 \times 1 = 9
\end{eqnarray*}
$$
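
The entry formula can also be written out directly as nested loops.  The function below is only a sketch for understanding the sum; the name $\texttt{matmul\_by\_formula}$ is made up for this illustration, and in practice we would rely on NumPy's built-in operation instead.

```python
import numpy as np

def matmul_by_formula(A, B):
    """Compute AB entry by entry from the sum formula (illustrative helper)."""
    m, n = A.shape
    n_rows_B, p = B.shape
    if n != n_rows_B:
        raise ValueError("columns of A must match rows of B")
    AB = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):          # row of A
        for j in range(p):      # column of B
            for k in range(n):  # index that runs along row i and column j
                AB[i, j] += A[i, k] * B[k, j]
    return AB

A = np.array([[1, 3], [2, 1]])
B = np.array([[3, 0, 4], [-1, -2, 1]])
AB = matmul_by_formula(A, B)
print(AB)
```

Because Python indices start at 0, the entry $(AB)_{12}$ computed above is stored in $\texttt{AB[0, 1]}$, and $(AB)_{23}$ is stored in $\texttt{AB[1, 2]}$.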

These matrix operations are all built into NumPy, but we have to use the symbol $\texttt{@}$ instead of $\texttt{*}$ for matrix multiplication.

In [5]:
print(3*A)
print('\n')
print(A+C)
print('\n')
print(A@B)

[[3 9]
 [6 3]]


[[-1  4]
 [ 6  2]]


[[ 0 -6  7]
 [ 5 -2  9]]
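
To see why the choice of symbol matters, here is a short sketch comparing the two operators on the $2\times 2$ matrices $A$ and $C$.  With NumPy arrays, $\texttt{*}$ multiplies corresponding entries, which is a different operation from the matrix product computed by $\texttt{@}$.  The variable names are chosen only for this illustration.

```python
import numpy as np

A = np.array([[1, 3], [2, 1]])
C = np.array([[-2, 1], [4, 1]])

# * multiplies entry by entry: result[i, j] = A[i, j] * C[i, j]
entrywise = A * C

# @ is matrix multiplication: rows of A combined with columns of C
matrix_product = A @ C

print(entrywise)
print('\n')
print(matrix_product)
```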


### Exercise

Calculate

- $-2E$ 
- $G+F$ 
- $4F-G$ 
- $HG$ 
- $GE$ 

using the following matrices.

$$
\begin{equation}
E = \left[ \begin{array}{c} 5 \\ -2 \end{array}\right] \hspace{1cm} 
F = \left[ \begin{array}{cc} 1 & 6 \\ 2 & 0 \\ -1 & -1 \end{array}\right] \hspace{1cm}
G = \left[ \begin{array}{cc} 2 & 0\\ -1 & 3 \\ -1 & 6 \end{array}\right] \hspace{1cm}
H = \left[ \begin{array}{ccc} 3 & 0 & 1 \\ 1 & -2 & 2 \\ 3 & 4 & -1\end{array}\right]
\end{equation}
$$

Do the exercise on paper first, then check by doing the calculation with NumPy arrays.

### Properties of matrix operations


$$
\begin{equation}
c(A+B) = cA + cB
\end{equation}
$$

$$
\begin{equation}
C(A+B) = CA + CB
\end{equation}
$$

$$
\begin{equation}
A(BC) = (AB)C
\end{equation}
$$

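These properties hold for any scalar $c$ and any matrices $A$, $B$, and $C$, assuming the dimensions are appropriate for each sum and product.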
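
A numerical check is not a proof, but it can make these properties more concrete.  The sketch below tests each one on random integer matrices.  The helper $\texttt{rand\_int}$, the seed, the shapes, and the names $\texttt{Q}$, $\texttt{R}$, $\texttt{S}$ are all assumptions made for this illustration.  Integer entries are used so that the comparisons are exact.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability

def rand_int(rows, cols):
    """Random integer matrix with entries from -10 to 9 (illustrative helper)."""
    return rng.integers(-10, 10, size=(rows, cols))

c = 3
A, B = rand_int(3, 4), rand_int(3, 4)   # same shape, so A + B is defined
C = rand_int(2, 3)                      # 2x3, so C @ (A + B) is defined

scalar_distributes = np.array_equal(c * (A + B), c * A + c * B)
matrix_distributes = np.array_equal(C @ (A + B), C @ A + C @ B)

# Associativity needs three matrices whose shapes chain together.
Q, R, S = rand_int(2, 3), rand_int(3, 4), rand_int(4, 5)
associates = np.array_equal(Q @ (R @ S), (Q @ R) @ S)

print(scalar_distributes, matrix_distributes, associates)
```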


### Identity Matrices

An **identity matrix** is a square matrix that behaves similarly to the number 1 with respect to ordinary multiplication.  Identity matrices, labeled with $I$, are made up of ones along the main diagonal, and zeroes everywhere else.  Below is the $4 \times 4$ version of $I$.

$$
\begin{equation}
I = \left[ \begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{array}\right]
\end{equation}
$$

If $A$ is any other $4 \times 4$ matrix, multiplication with $I$ will produce $A$.  Furthermore, in this case it doesn't matter in which order the multiplication is carried out.

$$
\begin{equation}
AI = IA = A
\end{equation}
$$

Notes:

1. Actual calculation with identity matrices is rather uncommon, but the idea is useful for symbolic calculations and progressing further with the theory.
2. If we are discussing a non-square matrix, then we must take care to use the correct size identity matrix depending on the order of multiplication.  For example, if $C$ is a $2\times 3$ matrix, $I_2$ is the $2\times 2$ identity, and $I_3$ is the $3\times 3 $ identity, we would have the following result.

$$
\begin{equation}
I_2 C = CI_3 = C
\end{equation}
$$

In [6]:
I5 = np.eye(5)

I5

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [7]:
R = np.random.randint(-10,10,size=(5,5))
print(R)
print('\n')
print(R@I5)
print('\n')
print(I5@R)

[[-7  0  7 -2 -7]
 [-3 -5  2 -6 -1]
 [-8 -9  6 -8 -5]
 [ 5  7  2 -9  5]
 [-5  6  7 -8 -8]]


[[-7.  0.  7. -2. -7.]
 [-3. -5.  2. -6. -1.]
 [-8. -9.  6. -8. -5.]
 [ 5.  7.  2. -9.  5.]
 [-5.  6.  7. -8. -8.]]


[[-7.  0.  7. -2. -7.]
 [-3. -5.  2. -6. -1.]
 [-8. -9.  6. -8. -5.]
 [ 5.  7.  2. -9.  5.]
 [-5.  6.  7. -8. -8.]]
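
The note about non-square matrices can be checked in the same way.  In the sketch below, $\texttt{M}$ is a hypothetical $2\times 3$ matrix chosen for illustration, and the identities are created with $\texttt{dtype=int}$ so that the products keep integer entries.

```python
import numpy as np

M = np.array([[3, 0, 4], [-1, -2, 1]])   # a 2x3 example matrix

I2 = np.eye(2, dtype=int)   # 2x2 identity, multiplies M from the left
I3 = np.eye(3, dtype=int)   # 3x3 identity, multiplies M from the right

left_product = I2 @ M
right_product = M @ I3

print(left_product)
print('\n')
print(right_product)
```

Swapping the two identities would fail, since $\texttt{I3 @ M}$ and $\texttt{M @ I2}$ do not have compatible shapes.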


### Matrix-vector multiplication

One special case of matrix multiplication deserves close attention: the case where one of the matrices is a vector.  This case is so important that it is commonly discussed separately and given a special name, **matrix-vector multiplication**.  Let's suppose that $P$ is our matrix and that its shape is $n \times m$, and $X$ is our vector which is $m \times 1$.  The product $PX$ then is an $n \times 1$ vector.  It is the relationship between this new vector and the columns of the matrix $P$ that makes this situation important.

Let's have a look with a specific example.

$$
\begin{equation}
P = \left[ \begin{array}{ccc} 1 & 3 & -2 \\ 5 & 2 & 0 \\ 4 & 2 & -1 \\ 2 & 2 & 0 \end{array}\right]\hspace{1cm}
X = \left[ \begin{array}{c} 2 \\ -3 \\ 4 \end{array}\right]
\end{equation}
$$

In this case of matrix-vector multiplication, we can package the calculation a bit differently to better understand what is happening.

$$
\begin{equation}
PX = \left[ \begin{array}{ccc} 1 & 3 & -2 \\ 5 & 2 & 0 \\ 4 & 2 & -1 \\ 2 & 2 & 0 \end{array}\right]
\left[ \begin{array}{c} 2 \\ -3 \\ 4 \end{array}\right]=
2\left[ \begin{array}{c} 1 \\ 5 \\ 4 \\ 2 \end{array}\right] -
3\left[ \begin{array}{c} 3 \\ 2 \\ 2 \\ 2 \end{array}\right] +
4\left[ \begin{array}{c} -2 \\ 0 \\ -1 \\ 0 \end{array}\right] =
\left[ \begin{array}{c} -15\\ 4 \\ -2 \\ -2  \end{array}\right]
\end{equation}
$$

This is the same operation that we were doing before, but now we see that this product is a result of adding the columns of $P$ after first multiplying each by the corresponding entry in $X$.
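
The column view of $PX$ can be sketched in NumPy as follows.  The expression $\texttt{P[:, j:j+1]}$ picks out column $j$ of $P$ as a $4\times 1$ array, and the name $\texttt{combination}$ is chosen only for this illustration.

```python
import numpy as np

P = np.array([[1, 3, -2], [5, 2, 0], [4, 2, -1], [2, 2, 0]])
X = np.array([[2], [-3], [4]])

# Start from a zero vector and add each column of P,
# scaled by the corresponding entry of X.
combination = np.zeros((P.shape[0], 1), dtype=int)
for j in range(P.shape[1]):
    column_j = P[:, j:j+1]             # column j of P, kept as a 4x1 array
    combination = combination + X[j, 0] * column_j

print(combination)
print('\n')
print(P @ X)
```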

### Matrix multiplication by columns

We can extend the calculation of matrix-vector multiplication to better understand exactly what is produced by matrix-matrix multiplication.  Suppose for example that $X$ from the previous calculation was actually the third column of a $3\times 4$ matrix $C$.

$$
\begin{equation}
C = \left[ \begin{array}{cccc} * & * & 2 & * \\ * & * & -3 & * \\ * & * & 4 & *\end{array}\right]
\end{equation}
$$

The third column of the product $PC$ will be exactly $PX$!  The other columns of $PC$ will be the products of $P$ with the corresponding columns of $C$.

$$
\begin{equation}
PC = \left[ \begin{array}{cccc} * & * & -15 & * \\ * & * & 4 & * \\* & * & -2 & *  \\ * & * & -2 & *\end{array}\right]
\end{equation}
$$


This discussion offers a great opportunity to learn how to perform operations on portions of NumPy arrays using a feature called **slicing**.  Let's build the matrices $P$ and $C$, and then define $X$ as a **subarray** of $C$.  To create a subarray of $C$, we use the syntax $\texttt{C[a:b,c:d]}$.  This will create an array object that has shape $(b-a)\times(d-c)$ and contains the entries of rows $a$ to $b-1$ and columns $c$ to $d-1$ of $C$.    

Specifically, we want $X$ to include all rows of $C$, but only the third column (which has Python index 2!).  

In [23]:

P = np.array([[1, 3, -2],[5, 2, 0],[4, 2, -1],[2, 2, 0]])
C = np.array([[0, 6, 2, 6],[-1, 1, -3, 4],[0, 2, 4, 8]])
X = C[0:3,2:3]
print(P)
print('\n')
print(C)
print('\n')
print(X)
print('\n')
print(P@X)

[[ 1  3 -2]
 [ 5  2  0]
 [ 4  2 -1]
 [ 2  2  0]]


[[ 0  6  2  6]
 [-1  1 -3  4]
 [ 0  2  4  8]]


[[ 2]
 [-3]
 [ 4]]


[[-15]
 [  4]
 [ -2]
 [ -2]]
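
The same slicing idea lets us assemble the whole product $PC$ one column at a time.  This is a sketch for illustration only: it uses $\texttt{np.hstack}$ to place the column products side by side, and the names $\texttt{columns}$ and $\texttt{PC\_by\_columns}$ are made up here.

```python
import numpy as np

P = np.array([[1, 3, -2], [5, 2, 0], [4, 2, -1], [2, 2, 0]])
C = np.array([[0, 6, 2, 6], [-1, 1, -3, 4], [0, 2, 4, 8]])

# Multiply P by each column of C separately; each result is a 4x1 array.
columns = [P @ C[:, j:j+1] for j in range(C.shape[1])]

# Place the four 4x1 results side by side to form a 4x4 array.
PC_by_columns = np.hstack(columns)

print(PC_by_columns)
print('\n')
print(P @ C)
```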


Notes on array slicing:

1. Another way to select all rows (or all columns) from an array is to use : without any numbers.  In our example above, we could have used the following line of code to produce the same result.  Try it out by editing the cell above!
```
X = C[:,2:3]
```
2. If we only want to select a single row or column, it is tempting to try a line of code like the following.
```
X = C[:,2]
```
This is indeed valid code, but the array $X$ is not exactly what we expect.  Instead of a $3\times 1$ array we get a one-dimensional array with the correct entries, but not the correct shape.  It is possible to make this work, but let's avoid this complication.
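
For readers who want to see the difference between the two slices, the sketch below compares their shapes.  The last line shows one possible way to recover the column shape using $\texttt{reshape}$; this is offered only as an aside, and the variable names are chosen for illustration.

```python
import numpy as np

C = np.array([[0, 6, 2, 6], [-1, 1, -3, 4], [0, 2, 4, 8]])

column_2d = C[:, 2:3]   # slice in both positions: a 3x1 array
column_1d = C[:, 2]     # single index for the column: a 1-D array of length 3

print(column_2d.shape)
print(column_1d.shape)

# One way to turn the 1-D array back into a 3x1 column.
restored = column_1d.reshape(-1, 1)
print(restored.shape)
```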


#### Exercises

Define NumPy arrays for the following matrices.  


$$
\begin{equation}
H = \left[ \begin{array}{ccc} 3 & 3 & -1  \\ -3 & 0 & 8 \\  1 & 6 & 5 \end{array}\right]\hspace{2cm}
G = \left[ \begin{array}{cccc} 1 & 5 & 2 & -3 \\ 7 & -2 & -3 & 0 \\ 2 & 2 & 4 & 6\end{array}\right]
\end{equation}
$$

1. Multiply the second and third columns of $H$ with the first and second rows of $G$.  Use slicing to make subarrays.  Does the result have any relationship to the full product $HG$?
2. Multiply the first and second rows of $H$ with the second and third columns of $G$.  Does this result have any relationship to the full product $HG$?

### Block matrix multiplication