# Matrices

```{admonition} Definition
A *matrix* is a rectangular array of mathematical objects arranged into rows and columns. The plural of matrix is *matrices*.
```

For the time being, the mathematical objects in these rectangular arrays we call matrices will be (real) numbers, but just as with vectors we want our definitions to be flexible. Our matrices are always finite dimensional; that is, they have only a finite number of rows and columns, though in the future you may encounter matrices with an infinite number of rows and/or columns. 

```{admonition} Notation
:class: important
**Notation:** Matrices are usually denoted with capital letters from the front of the English alphabet. A matrix $A$ with $m$ rows and $n$ columns is said to be an $m\times n$ dimensional matrix, and we denote the entry in the $i^{th}$ row and $j^{th}$ column of $A$ as $a_{i,j}$. We always follow a row-first, column-second convention when discussing matrices, and unless we are programming, we use 1-based indexing. 
```

**Example:**

$$
    A=\begin{bmatrix}
        1 & 2 & 3 \\
        4 & 5 & 6 \\
        7 & 8 & 9 \\
        0 & 1 & 2
      \end{bmatrix}
$$

is a $4\times 3$ matrix: it has 4 rows and 3 columns. The entry in position $2, 3$ is $a_{2,3}=6$. The entry in position $3, 2$ is $a_{3, 2}=8$. 
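When we do program, the 1-based convention has to be translated to the language's own indexing. The sketch below assumes a matrix is stored as a plain Python list of rows (one possible representation, not the only one); the helper `entry` is a hypothetical name introduced here to convert the text's 1-based positions to Python's 0-based ones.

```python
# The 4x3 example matrix, stored as a list of rows (an assumed representation).
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [0, 1, 2],
]

num_rows = len(A)      # m = 4
num_cols = len(A[0])   # n = 3


def entry(matrix, i, j):
    """Return a_{i,j} using the text's 1-based, row-first convention."""
    return matrix[i - 1][j - 1]


print(num_rows, num_cols)  # 4 3
print(entry(A, 2, 3))      # 6
print(entry(A, 3, 2))      # 8
print(A[1][2])             # 6, the same entry with 0-based indices
```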

As we will see, there are many different ways to think about what such an array of numbers represents. One particularly useful approach is to think of a matrix as a collection of (column) vectors. We use the name of the matrix with a single subscript to denote a specific column vector; the matrix above, for example, consists of the three column vectors

$$
    A_1 = \begin{bmatrix}
            1 \\
            4 \\
            7 \\
            0
          \end{bmatrix},
    A_2 = \begin{bmatrix}
            2 \\
            5 \\
            8 \\
            1
          \end{bmatrix},\text{ and }
    A_3 = \begin{bmatrix}
            3 \\
            6 \\
            9 \\
            2
          \end{bmatrix}.
$$        
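As a small illustration of the column view, the following sketch pulls the columns $A_1$, $A_2$, and $A_3$ out of the example matrix. It again assumes a list-of-rows representation, and `column` is a hypothetical helper that uses the 1-based subscripts of the text.

```python
# The 4x3 example matrix as a list of rows (an assumed representation).
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [0, 1, 2],
]


def column(matrix, j):
    """Return column A_j (1-based) as a list of its components."""
    return [row[j - 1] for row in matrix]


A_1, A_2, A_3 = (column(A, j) for j in (1, 2, 3))
print(A_1)  # [1, 4, 7, 0]
print(A_2)  # [2, 5, 8, 1]
print(A_3)  # [3, 6, 9, 2]
```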

Alternatively, we might think of a vector as a special type of matrix: a matrix with only one column. That is, if a vector $\mathbf{v}$ has $m$ components, it could just as well be viewed as an $m\times 1$ matrix. This is a very useful perspective, and the view that we take will depend on the context in which we find ourselves working.

## Matrix Arithmetic, Part 1

If we think of a matrix as simply a collection of vectors, then the rules for performing many operations on matrices follow immediately: addition, subtraction, and scalar multiplication should be done componentwise, with the caveat that the first two operations are only defined if the matrices being added or subtracted are the same size. Why we might want to do these things, and how we should interpret the results, will have to wait, but at least these initial operations are straightforward.

```{admonition} Matrix Addition/Subtraction
Given two matrices $A$ and $B$ of the same dimensions $m\times n$, the matrix $C = A \pm B$ has in position $i, j$ the entry $c_{i, j} = a_{i, j} \pm b_{i, j}$.
```

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} + 
    \begin{bmatrix}
        2 & 2 &  -1 \\
        1 & 11 & -1
    \end{bmatrix} =
    \begin{bmatrix}
    1 + 2 & 2 + 2 & 3 + (-1) \\
    0 + 1 & 0 + 11 & 1 + (-1)
    \end{bmatrix} =
    \begin{bmatrix}
        3 & 4 & 2 \\
        1 & 11 & 0
    \end{bmatrix}.
$$

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} - 
    \begin{bmatrix}
        2 & 2 &  -1 \\
        1 & 11 & -1
    \end{bmatrix} =
    \begin{bmatrix}
    1 - 2 & 2 - 2 & 3 - (-1) \\
    0 - 1 & 0 - 11 & 1 - (-1)
    \end{bmatrix} =
    \begin{bmatrix}
        -1 & \hfill0 & 4 \\
        -1 & -11 & 2
    \end{bmatrix}.
$$

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} + 
    \begin{bmatrix}
        1 & 1 \\
        1 & 0
    \end{bmatrix}
$$

is undefined, because the matrix on the left is $2\times 3$ while the matrix on the right is $2\times 2$.
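The three examples above can be reproduced with a few lines of code. This is a minimal sketch assuming matrices are stored as lists of rows; `shape` and `mat_add` are names chosen here for illustration, and the `sign` parameter is simply a convenient way to cover both $A + B$ and $A - B$.

```python
def shape(matrix):
    """Return (number of rows, number of columns)."""
    return len(matrix), len(matrix[0])


def mat_add(A, B, sign=1):
    """Return A + B (sign=1) or A - B (sign=-1), computed entry by entry."""
    if shape(A) != shape(B):
        raise ValueError(f"undefined: {shape(A)} vs {shape(B)}")
    return [
        [a + sign * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(A, B)
    ]


A = [[1, 2, 3], [0, 0, 1]]
B = [[2, 2, -1], [1, 11, -1]]

print(mat_add(A, B))      # [[3, 4, 2], [1, 11, 0]]
print(mat_add(A, B, -1))  # [[-1, 0, 4], [-1, -11, 2]]

# A 2x3 matrix cannot be added to a 2x2 matrix.
try:
    mat_add(A, [[1, 1], [1, 0]])
except ValueError as err:
    print(err)            # undefined: (2, 3) vs (2, 2)
```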

```{admonition} Note
:class: important
We call $n\times n$ matrices *square* matrices. They have many important characteristics that distinguish them from their nonsquare counterparts. The matrix on the right in the example immediately above is a square matrix.
```

```{admonition} Scalar Multiplication
Given a matrix $A$ and a scalar $s$, the entry in position $i, j$ of the matrix $sA$ is $s\cdot a_{i, j}$.
```

**Example:** 

$$
    5\cdot
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} = 
    \begin{bmatrix}
        5\cdot 1 & 5\cdot 2 & 5\cdot 3 \\
        5\cdot 0 & 5\cdot 0 & 5\cdot 1
    \end{bmatrix} = 
    \begin{bmatrix}
        5 & 10 & 15 \\
        0 & 0 & 5
    \end{bmatrix}.
$$
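In code, scalar multiplication is a single pass over the entries. The sketch below assumes the list-of-rows representation, and `scalar_mul` is a hypothetical helper name.

```python
def scalar_mul(s, A):
    """Return the matrix sA, whose (i, j) entry is s * a_{i,j}."""
    return [[s * a for a in row] for row in A]


A = [[1, 2, 3], [0, 0, 1]]
print(scalar_mul(5, A))  # [[5, 10, 15], [0, 0, 5]]
```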

```{admonition} Transposition
Given an $m\times n$ matrix $A$, the $n\times m$ matrix $A^T$ whose $i, j^{th}$ entry is $a_{j,i}$ is called the *transpose* of $A$.
```

**Example:**

Suppose that

$$  A =
        \begin{bmatrix}
            1 & 2 & 3 \\
            0 & 0 & 1
        \end{bmatrix}.
$$

Then

$$
    A^T = 
        \begin{bmatrix}
            1 & 0 \\
            2 & 0 \\
            3 & 1
    \end{bmatrix}.
$$

Transposing a matrix turns its rows into columns and vice versa. If we think of a vector as a matrix with only one column, transposing it turns it into a matrix with only one row. We might think of such a matrix as the previously mentioned but not yet studied *row vector*.

**Example:**

Let 

$$
    \mathbf{v} =
        \begin{bmatrix}
            \hfill1 \\
            -1 \\
            \hfill3
        \end{bmatrix}.
$$

Then

$$
    \mathbf{v}^T = 
        \begin{bmatrix}
            1 & -1 & 3
        \end{bmatrix}.
$$
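Both transposition examples can be checked with a short sketch. It assumes matrices are lists of rows and that a column vector is stored as an $m\times 1$ matrix, that is, a list of one-element rows; `transpose` is a name introduced here for illustration.

```python
def transpose(A):
    """Return A^T: entry (i, j) of the result is entry (j, i) of A."""
    return [list(col) for col in zip(*A)]


A = [[1, 2, 3], [0, 0, 1]]
print(transpose(A))  # [[1, 0], [2, 0], [3, 1]]

# A column vector viewed as a 3x1 matrix becomes a 1x3 row vector.
v = [[1], [-1], [3]]
print(transpose(v))  # [[1, -1, 3]]
```

Applying `transpose` twice returns the original matrix, since swapping rows and columns a second time undoes the first swap.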

## Matrix Arithmetic, Part 2: Linear Systems and Matrix-Vector Multiplication

Matrices and vectors show up naturally in linear systems; that is, when one wants to determine where two or more lines (or planes in 3 dimensions, or *hyperplanes* in dimensions higher than 3) intersect. Moreover, their appearance in this context suggests how we might define matrix-vector multiplication.

```{admonition} Definition
A *linear equation* is an equation that can be written in the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, where $b$ and each $a_i$, $1\leq i \leq n$, are constants. A linear equation written this way is said to be in *standard form*.
```

**Example:** $y = 3x+2$ is a linear equation, but it is not in standard form. If we move $3x$ from the right side of the equation to the left, it becomes an equation in standard form: $-3x + y = 2$.

In two dimensions, linear equations are rarely put in standard form until we are confronted with a system of linear equations, because in two dimensions it is easier to plot a line or analyze it in other forms, such as the *slope-intercept* form that began the last example. Here is a linear system:

$$
    \begin{aligned}
        3x + 2y &= 7 \\
        x + 4y &= 9
    \end{aligned}
$$

The connection to matrices starts with linear combinations. Consider the coefficients of $x$ and $y$ as forming vectors. Then the left-hand sides of the equations in the system taken together can be viewed as a linear combination that produces a vector whose components are the left-hand sides of the two equations:

$$
    x\begin{bmatrix}
        3 \\
        1
    \end{bmatrix} + 
    y\begin{bmatrix}
        2 \\
        4
    \end{bmatrix} =
    \begin{bmatrix}
        3x + 2y \\
        x + 4y
    \end{bmatrix} =
    \begin{bmatrix}
        7 \\
        9
    \end{bmatrix},
$$

where the scalars in the linear combination are the unknowns $x$ and $y$ and the vectors in the linear combination contain the coefficients of their respective 'scalars'. Of course, this only makes sense if we think of the right-hand sides of the two equations as forming a vector as well, but if we do so, we see that our conditions for equality between vectors give us exactly the system of equations that we started with: the first components must agree (first equation) and the second components must agree (second equation).

Now, suppose we call

$$
    \mathbf{x} = \begin{bmatrix}
                    x \\
                    y
                 \end{bmatrix}.
$$ 

Then multiplication of a matrix and a vector can be defined through this linear combination idea.

```{admonition} Definition (*Column method* of matrix multiplication)
Given a matrix $A$ with $n$ columns and a vector $\mathbf{v}$ with $n$ components, the product $A\mathbf{v}$ is the linear combination $v_1A_1 + \cdots + v_nA_n$ (recall that $A_i$ denotes the $i^{th}$ column of $A$). 
```

```{admonition} Dimensional considerations
:class: important
Note that the above definition only makes sense if the matrix has the same number of columns as the vector has components. Assuming that is so, the result of the multiplication will have the same number of components as the matrix has rows.
```

In the context of a linear system in standard form, we construct a matrix from the coefficients of the variables on the left, a vector of unknowns from the variables on the left, and the result of the matrix-vector multiplication is the vector of known quantities on the right-hand sides of the equations. This gives an equation of the form $A\mathbf{x}=\mathbf{b}$. In the current example, the left-hand side matrix multiplication looks like this:

$$
    \begin{bmatrix}
        3 & 2 \\
        1 & 4
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y
    \end{bmatrix} = 
        x\begin{bmatrix}
        3 \\
        1
    \end{bmatrix} + 
    y\begin{bmatrix}
        2 \\
        4
    \end{bmatrix} =
    \begin{bmatrix}
        3x + 2y \\
        x + 4y
    \end{bmatrix}.
$$

Again, if we write the left-hand side of a linear system as a product that results in a vector, then we must write the right-hand side as a vector too: $[7\hspace{1em}9]^T$. Vector equality only holds when the components are all equal, so the linear system must be equivalent to the matrix-vector equation

$$
    \begin{bmatrix}
        3 & 2 \\
        1 & 4
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y
    \end{bmatrix}=
    \begin{bmatrix}
        7 \\
        9
    \end{bmatrix}.
$$
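The column method translates directly into code. The sketch below assumes a list-of-rows matrix and a plain list for the vector; `mat_vec_columns` is a hypothetical name. The trial values $x = 1$, $y = 2$ are chosen here only for illustration; they happen to satisfy both equations of the system, so the product reproduces the right-hand side.

```python
def mat_vec_columns(A, v):
    """Compute A v as the linear combination v_1 A_1 + ... + v_n A_n."""
    m, n = len(A), len(A[0])
    if n != len(v):
        raise ValueError("A must have as many columns as v has components")
    result = [0] * m
    for j in range(n):
        column_j = [A[i][j] for i in range(m)]  # the column A_{j+1}
        result = [r + v[j] * c for r, c in zip(result, column_j)]
    return result


A = [[3, 2], [1, 4]]

# 1 * [3, 1] + 2 * [2, 4] = [7, 9]
print(mat_vec_columns(A, [1, 2]))  # [7, 9]
```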

Our motivation and our previous examples come from linear systems, but it is important to recognize that we have a general-purpose definition at hand, and provided the dimensions align we can multiply matrices and vectors whether they come from linear systems or not. 

In the next section we will focus on solving linear systems, but for the remainder of this one we turn our attention now to the general properties of matrix-vector multiplication and how they extend to give us matrix-matrix multiplication.

First, it is important to note that the way we performed matrix-vector multiplication above is not how matrix-vector multiplication is usually performed by hand. It is the right viewpoint for obtaining a deep understanding of why we do matrix-vector multiplication the way we do. By hand, though, it is far more common to do the computation by taking dot products: the dot product of each of the rows of the matrix with the (column) vector being multiplied produces the same result as the linear combination computation described above; that is:

```{admonition} Definition (Alternative *row method* of matrix multiplication)
Given an $m\times n$ matrix $A$ and a vector $\mathbf{x}$ with $n$ components, the product $A\mathbf{x}$ is the vector with $m$ components whose $i^{th}$ component is the dot product of the $i^{th}$ row of $A$ with $\mathbf{x}$. 
```

**Example:** (row method)

$$
    \begin{bmatrix}
        1 & 0 & \hfill2\\
        2 & 2 & -3
    \end{bmatrix}\cdot
    \begin{bmatrix}
        1 \\
        2 \\
        3
    \end{bmatrix} = 
    \begin{bmatrix}
        1\cdot 1 + 0\cdot 2 + 2\cdot 3 \\
        2\cdot 1 + 2\cdot 2 -3\cdot 3
    \end{bmatrix} = 
    \begin{bmatrix}
        1 + 6 \\ 
        2 + 4 -9
    \end{bmatrix}=
    \begin{bmatrix}
        \hfill7 \\ 
        -3
    \end{bmatrix}.
$$

Note the dimensions above: the matrix is $2\times 3$, the vector multiplying the matrix is $3\times 1$, and the result is a $2\times 1$ vector.

**Example:** 

$$
    \begin{bmatrix}
        1 & 7 \\
        2 & 5
    \end{bmatrix}\cdot
    \begin{bmatrix}
        \hfill5 \\
        -7 \\
        \hfill9
    \end{bmatrix}
$$

is undefined, because the matrix does not have the same number of columns as the vector has components, so the underlying dot product calculations themselves are undefined.
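Both row-method examples above can be sketched in code as follows. The helper names `dot` and `mat_vec_rows` are introduced here for illustration, and the list-of-rows representation is an assumption. The undefined case surfaces as an error raised by the underlying dot product.

```python
def dot(u, v):
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("dot product needs vectors of equal length")
    return sum(a * b for a, b in zip(u, v))


def mat_vec_rows(A, x):
    """Compute A x: component i is the dot product of row i of A with x."""
    return [dot(row, x) for row in A]


# The 2x3 matrix times the 3-component vector from the first example.
print(mat_vec_rows([[1, 0, 2], [2, 2, -3]], [1, 2, 3]))  # [7, -3]

# The 2x2 matrix times a 3-component vector is undefined.
try:
    mat_vec_rows([[1, 7], [2, 5]], [5, -7, 9])
except ValueError as err:
    print("undefined:", err)
```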

```{admonition} Note
:class: important
We have defined a one-sided operation so far: The discussion above tells us how to calculate $A\mathbf{x}$ for an appropriately sized matrix $A$ and vector $\mathbf{x}$, but it does not tell us how to calculate $\mathbf{x}A$, or even if such an operation is defined.
```

## The Dot Product Revisited

It was pointed out above that a (column) vector can be viewed as an $m\times 1$ matrix; so too a row vector can be viewed as a $1\times n$ matrix whose 'columns' are vectors containing only a single component. Suppose we have two vectors $\mathbf{u}$ and $\mathbf{v}$, each $m\times 1$. Then $\mathbf{u}^T$ is a $1\times m$ matrix, and either the column or row view of matrix-vector multiplication above gives

$$
    \mathbf{u}^T\mathbf{v} = \left[\sum_{i=1}^m u_iv_i\right],
$$

which is a $1\times 1$ 'vector' whose sole component is precisely the dot product of $\mathbf{u}$ and $\mathbf{v}$. Analogously, 

$$
    \mathbf{v}^T\mathbf{u} = \left[\sum_{i=1}^m v_iu_i\right].
$$

For reasons we will get into later, we often abuse notation, drop the brackets, and treat $1\times 1$ vectors as just real numbers. Since the real number in this case is precisely the dot product and the dot notation may be misinterpreted, we often prefer to notate the dot product of two vectors $\mathbf{u}$ and $\mathbf{v}$ using the transpose as just described.

```{admonition} Notation
:class: important
Let $\mathbf{u}$ and $\mathbf{v}$ be vectors with $m$ components. Then $\mathbf{u}\cdot\mathbf{v} = \mathbf{u}^T\mathbf{v} = \mathbf{v}^T\mathbf{u}$.
```
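A quick numerical check of this notation is sketched below. The vectors $\mathbf{u}$ and $\mathbf{v}$ are made-up values, and `mat_vec` is a hypothetical row-method helper. Here $\mathbf{u}^T$ is represented as a $1\times 3$ matrix, that is, a list containing the single row `u`.

```python
def mat_vec(A, x):
    """Row method: component i is the dot product of row i of A with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


u = [1, 2, 3]    # illustrative values only
v = [4, -5, 6]

u_T = [u]        # u^T as a 1x3 matrix with a single row
v_T = [v]        # v^T as a 1x3 matrix with a single row

print(mat_vec(u_T, v))  # [12], a 1x1 'vector'
print(mat_vec(v_T, u))  # [12]

# Dropping the brackets leaves the dot product itself.
print(sum(a * b for a, b in zip(u, v)))  # 12
```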

## Matrix Arithmetic, Part 3: Matrix-Matrix Multiplication

Multiplication of two matrices is just a slight extension of the matrix-vector multiplication just defined. Suppose we have two matrices $A$ and $B$ and we wish to calculate $AB$. Thinking of $B$ as a collection of column vectors $B_1,\dots,B_p$, the logical way to do this multiplication would be to compute the matrix-vector product $AB_i$ for each column $B_i$, $1\leq i\leq p$, of $B$. That would result in a new collection of column vectors, which we could organize into a new matrix:

$$
    AB = A\left[B_1\hspace{1em}B_2\dots B_p\right] = \left[AB_1\hspace{1em}AB_2\dots AB_p\right].
$$

Of course, this requires that $B$ have the same number of rows as $A$ has columns, and results in a matrix with the same number of rows as $A$ and the same number of columns as $B$. It's most common to use the row method for computation here, and the product can be written formulaically as follows:

```{admonition} Definition: (Matrix-matrix multiplication)
Let $A$ be an $m\times n$ matrix and $B$ an $n\times p$ matrix. The product $AB$ is an $m\times p$ matrix $C$ such that

$$
    c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}.
$$
```

That is, $c_{ij}$ is the dot product of the $i^{th}$ row of $A$ with the $j^{th}$ column of $B$.
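The summation formula can be transcribed almost symbol for symbol. This is a minimal sketch assuming list-of-rows matrices; `mat_mul` is a name chosen here, and the two small matrices are made up for illustration. Note that the code's indices are 0-based while the formula's are 1-based.

```python
def mat_mul(A, B):
    """Return C = AB, where c_ij is the sum over k of a_ik * b_kj."""
    m, n, p = len(A), len(B), len(B[0])
    if len(A[0]) != n:
        raise ValueError("A must have as many columns as B has rows")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]


A = [[1, 2], [3, 4]]   # illustrative values only
B = [[0, 1], [1, 0]]

# Entry (1, 1) is 1*0 + 2*1 = 2; entry (2, 2) is 3*1 + 4*0 = 3.
print(mat_mul(A, B))   # [[2, 1], [4, 3]]
```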

**Example:**

Below, a $2\times 3$ matrix (left) and a $3\times 2$ matrix (right) are multiplied. The operation is defined because the dimensions align: the matrix on the left has the same number of columns as the matrix on the right has rows. The output will be a $2\times 2$ matrix. The dot product used to calculate entry $1, 1$ of the result is highlighted: row one from the left is combined with column one on the right. The remaining three entries are calculated similarly.

$$
    \begin{bmatrix}
        2 & 3 & 4 \\
        1 & 0 & 0
    \end{bmatrix}\begin{bmatrix}
        7 & \hfill0 \\
        6 & -1 \\
        5 & -2
    \end{bmatrix} = 
    \begin{bmatrix}
        \textcolor{blue}{2} & \textcolor{blue}{3} & \textcolor{blue}{4} \\
        1 & 0 & 0
    \end{bmatrix}\begin{bmatrix}
        \textcolor{orange}{7} & \hfill0 \\
        \textcolor{orange}{6} & -1 \\
        \textcolor{orange}{5} & -2
    \end{bmatrix} =
    \begin{bmatrix}
        \textcolor{blue}{2}\cdot \textcolor{orange}{7} + \textcolor{blue}{3}\cdot \textcolor{orange}{6} + \textcolor{blue}{4}\cdot \textcolor{orange}{5} & \\
        &
    \end{bmatrix} = 
    \begin{bmatrix}
        \textcolor{blue}{2}\cdot \textcolor{orange}{7} + \textcolor{blue}{3}\cdot \textcolor{orange}{6} + \textcolor{blue}{4}\cdot \textcolor{orange}{5} & 2 \cdot 0 + 3\cdot (-1) + 4\cdot (-2) \\
        1\cdot 7 + 0\cdot 6 + 0\cdot 5 & 1\cdot 0 + 0\cdot (-1) + 0\cdot (-2)
    \end{bmatrix} = 
    \begin{bmatrix}
        14 + 18 + 20 & 0 - 3 - 8 \\
        7 + 0 + 0 & 0 + 0 + 0
    \end{bmatrix} = 
    \begin{bmatrix}
        52 & -11 \\
        7 & \hfill 0
    \end{bmatrix}.
$$

**Example:**

Now a $2\times 2$ matrix (left) and a $2\times 3$ matrix (right) are multiplied. The operation is defined because again the dimensions align: the matrix on the left has the same number of columns (2) as the matrix on the right has rows. The output will be a $2\times 3$ matrix, because there are three columns in the matrix on the right. The dot product used to calculate entry $2, 2$ of the result is highlighted: row two from the left is combined with column two on the right. The remaining five entries are calculated similarly.

$$
    \begin{bmatrix}
        2 & 1  \\
        1 & 2
    \end{bmatrix}\begin{bmatrix}
        7 & \hfill0 & 3\\
        6 & -1 & 3
    \end{bmatrix} = 
    \begin{bmatrix}
        2 & 1 \\
        \textcolor{blue}{1} & \textcolor{blue}{2}
    \end{bmatrix}\begin{bmatrix}
        7 & \hfill\textcolor{orange}{0} & 3 \\
        6 & \textcolor{orange}{-1} & 3
    \end{bmatrix} =
    \begin{bmatrix}
        &  & \\
        & \textcolor{blue}{1}\cdot \textcolor{orange}{0} + \textcolor{blue}{2}\cdot \textcolor{orange}{(-1)} &
    \end{bmatrix} = 
    \begin{bmatrix}
        2\cdot 7 + 1\cdot 6 & 2\cdot 0 + 1\cdot (-1) & 2\cdot 3 + 1\cdot 3 \\
        1\cdot 7 + 2\cdot 6 & \textcolor{blue}{1}\cdot \textcolor{orange}{0} + \textcolor{blue}{2}\cdot \textcolor{orange}{(-1)} & 1\cdot 3 + 2\cdot 3
    \end{bmatrix} = 
    \begin{bmatrix}
        14 + 6 & 0 - 1 & 6 + 3 \\
        7 + 12 & 0 - 2 & 3 + 6
    \end{bmatrix} = 
    \begin{bmatrix}
        20 & -1 & 9 \\
        19 & -2 & 9
    \end{bmatrix}.
$$
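The two products above were computed entry by entry, but the same results come out of the column-by-column description $AB = [AB_1 \; AB_2 \dots AB_p]$. The sketch below assumes list-of-rows matrices; `mat_vec` and `mat_mul_by_columns` are hypothetical helper names.

```python
def mat_vec(A, x):
    """Row method: component i is the dot product of row i of A with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def mat_mul_by_columns(A, B):
    """Build AB one column at a time: column j of AB is A times column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("A must have as many columns as B has rows")
    columns_of_B = [list(col) for col in zip(*B)]
    columns_of_AB = [mat_vec(A, b_j) for b_j in columns_of_B]
    # Reassemble the computed columns into a list of rows.
    return [list(row) for row in zip(*columns_of_AB)]


first = mat_mul_by_columns([[2, 3, 4], [1, 0, 0]], [[7, 0], [6, -1], [5, -2]])
second = mat_mul_by_columns([[2, 1], [1, 2]], [[7, 0, 3], [6, -1, 3]])

print(first)   # [[52, -11], [7, 0]]
print(second)  # [[20, -1, 9], [19, -2, 9]]
```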

**Example:**

A matrix with only one row looks like (is) a row vector. Nevertheless the same multiplication rule applies. The multiplication is defined because the number of columns on the left (3) matches the number of rows on the right. The output is $1\times 2$, because there is one row on the left and two columns on the right.

$$
    \begin{bmatrix}
        1 & 2 & 1 
    \end{bmatrix}
    \begin{bmatrix}
        3 & 2 \\
        4 & 0 \\
        8 & 1
    \end{bmatrix} = 
    \begin{bmatrix}
        1\cdot 3 + 2\cdot 4 + 1\cdot 8 & 1\cdot 2 + 2\cdot 0 + 1\cdot 1
    \end{bmatrix} = 
    \begin{bmatrix}
        3 + 8 + 8 & 2 + 0 + 1
    \end{bmatrix} = 
    \begin{bmatrix}
        19 & 3
    \end{bmatrix}.
$$


```{admonition} Note
:class: important 
A vector is just a matrix with a single column, and the definition given for matrix-matrix multiplication gives the same result when the matrix on the right is a vector.
```
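To see this concretely, the sketch below feeds single-column and single-row matrices to one general multiplication routine. The function `mat_mul` is a hypothetical helper, and storing a column vector as a list of one-element rows is an assumed representation.

```python
def mat_mul(A, B):
    """Return AB, where entry (i, j) is row i of A dotted with column j of B."""
    n = len(B)
    if len(A[0]) != n:
        raise ValueError("A must have as many columns as B has rows")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# A 2x3 matrix times a 3x1 'matrix' (a column vector) gives a 2x1 result.
column_result = mat_mul([[1, 0, 2], [2, 2, -3]], [[1], [2], [3]])
print(column_result)  # [[7], [-3]]

# A 1x3 'matrix' (a row vector) times a 3x2 matrix gives a 1x2 result.
row_result = mat_mul([[1, 2, 1]], [[3, 2], [4, 0], [8, 1]])
print(row_result)     # [[19, 3]]
```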

## Important Matrices

```{admonition} Definition
The *identity matrix* of dimension $n$, denoted $I$, is the $n\times n$ square matrix with 1's on the main diagonal and zeros everywhere else.

The *zero matrix* of dimension $n$, denoted $0$, is the $n\times n$ square matrix consisting only of zeros.
```

**Example:** Below is the six-dimensional identity matrix:

$$
    I = \begin{bmatrix}
        1 & 0 & 0 & 0 & 0 & 0 \\
        0 & 1 & 0 & 0 & 0 & 0 \\
        0 & 0 & 1 & 0 & 0 & 0 \\
        0 & 0 & 0 & 1 & 0 & 0 \\
        0 & 0 & 0 & 0 & 1 & 0 \\
        0 & 0 & 0 & 0 & 0 & 1 
        \end{bmatrix}.
$$

With both of these special matrices, the dimension is usually not specified but is simply inferred from the context. These matrices have the properties that:
- $IA = AI = A$ for any $A$ such that $IA$ and $AI$ are defined,
- $0A = A0 = 0$ for any $A$ such that $0A$ and $A0$ are defined, and
- $0 + A = A + 0 = A$ for any $A$ such that $0 + A$ and $A + 0$ are defined. 

Note that the first property above implies that $I\mathbf{v} = \mathbf{v}$ for any vector $\mathbf{v}$ such that the multiplication is defined. It is a worthwhile exercise to make up such a vector $\mathbf{v}$ with six components and work out the multiplication with the six-dimensional $I$ above to confirm this fact.
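For readers who would like to check their hand computation, here is a small sketch of these properties in the six-dimensional case. The helper names `identity`, `zeros`, and `mat_mul` are introduced here, and the vector $\mathbf{v}$ and matrix $A$ are made-up values.

```python
def identity(n):
    """n x n matrix with 1's on the main diagonal and zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def zeros(n):
    """n x n matrix consisting only of zeros."""
    return [[0] * n for _ in range(n)]


def mat_mul(A, B):
    """Return AB, where entry (i, j) is row i of A dotted with column j of B."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


I = identity(6)
Z = zeros(6)
v = [[3], [-1], [4], [1], [-5], [9]]                    # a 6x1 column vector
A = [[6 * i + j for j in range(6)] for i in range(6)]   # an arbitrary 6x6 matrix

print(mat_mul(I, v) == v)                       # True
print(mat_mul(I, A) == A and mat_mul(A, I) == A)  # True
print(mat_mul(Z, A) == Z and mat_mul(A, Z) == Z)  # True
```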