# Matrices

**Definition:** A *matrix* is a rectangular array of mathematical objects arranged into rows and columns. The plural of matrix is *matrices*.

For the time being, the mathematical objects in these rectangular arrays we call matrices will just be numbers, but just as with vectors we want our definitions to be flexible. Our matrices are always finite dimensional; that is, they have only a finite number of rows and columns, though in future courses you may encounter matrices with an infinite number of rows and/or columns. 

**Notation:** Matrices are usually denoted with capital letters from the front of the English alphabet. A matrix $A$ with $m$ rows and $n$ columns is said to be an $m\times n$ dimensional matrix, and we denote the entry in the $i^{th}$ row and $j^{th}$ column of $A$ as $a_{i,j}$. We always follow a row-first, column-second convention when discussing matrices, and unless we are programming, we use 1-based indexing. 

**Example:**

$$
    A=\begin{bmatrix}
        1 & 2 & 3 \\
        4 & 5 & 6 \\
        7 & 8 & 9 \\
        0 & 1 & 2
      \end{bmatrix}
$$

is a $4\times 3$ matrix: it has 4 rows and 3 columns. The entry in position $2, 3$ is $a_{2,3}=6$. The entry in position $3, 2$ is $a_{3, 2}=8$. 
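When programming, the indexing convention usually shifts to 0-based. The following is a minimal Python sketch, assuming the matrix is stored as a plain list of rows (one common choice, not the only one). The helper name `entry` is hypothetical; it translates the 1-based, row-first indices used in the text into Python's 0-based indices.

```python
# The 4x3 matrix A from the example, stored as a list of rows.
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [0, 1, 2],
]


def entry(matrix, i, j):
    """Return a_{i,j} using the 1-based, row-first convention of the text."""
    return matrix[i - 1][j - 1]


num_rows = len(A)      # number of rows, m
num_cols = len(A[0])   # number of columns, n

print(num_rows, num_cols)  # 4 3
print(entry(A, 2, 3))      # 6
print(entry(A, 3, 2))      # 8
```

Note that `A[1][2]` and `entry(A, 2, 3)` refer to the same entry; only the indexing convention differs.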

As we will see, there are many different ways to think about what such an array of numbers represents. One particularly useful approach is to think of a matrix as a collection of (column) vectors. We use the name of the matrix with a single subscript to denote a specific column vector; the matrix above, for example, consists of the three column vectors

$$
    A_1 = \begin{bmatrix}
            1 \\
            4 \\
            7 \\
            0
          \end{bmatrix},
    A_2 = \begin{bmatrix}
            2 \\
            5 \\
            8 \\
            1
          \end{bmatrix},\text{ and }
    A_3 = \begin{bmatrix}
            3 \\
            6 \\
            9 \\
            2
          \end{bmatrix}.
$$        
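As an illustration, the column vectors can be pulled out of a matrix stored as a list of rows. This is only a sketch in plain Python; the helper name `column` is an assumption made for this example, and it uses the 1-based column numbering of the text.

```python
# The 4x3 matrix A from the example, stored as a list of rows.
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [0, 1, 2],
]


def column(matrix, j):
    """Return column j of the matrix (1-based, as in the text) as a list."""
    return [row[j - 1] for row in matrix]


A_1 = column(A, 1)
A_2 = column(A, 2)
A_3 = column(A, 3)

print(A_1)  # [1, 4, 7, 0]
print(A_2)  # [2, 5, 8, 1]
print(A_3)  # [3, 6, 9, 2]
```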

Alternatively, we might think of a vector as a special type of matrix: a matrix with only one column. That is, if a vector $\mathbf{v}$ has $m$ components, it could just as well be viewed as an $m\times 1$ matrix. This is a very useful perspective, and the view that we take will depend on the context in which we find ourselves working.

## Matrix Arithmetic, Part 1

If we think of a matrix as simply a collection of vectors, then the rules for performing many operations on matrices follow immediately: addition, subtraction, and scalar multiplication should be done componentwise, with the caveat that the first two operations are only defined if the matrices being added or subtracted are the same size. Why we might want to do these things, and how we should interpret the results, will have to wait, but at least these initial operations are straightforward.

**Matrix Addition/Subtraction:** Given two matrices $A$ and $B$ of the same dimensions $m\times n$, the matrix $C = A \pm B$ has in position $i, j$ the entry $c_{i, j} = a_{i, j} \pm b_{i, j}$.

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} + 
    \begin{bmatrix}
        2 & 2 &  -1 \\
        1 & 11 & -1
    \end{bmatrix} =
    \begin{bmatrix}
    1 + 2 & 2 + 2 & 3 + (-1) \\
    0 + 1 & 0 + 11 & 1 + (-1)
    \end{bmatrix} =
    \begin{bmatrix}
        3 & 4 & 2 \\
        1 & 11 & 0
    \end{bmatrix}.
$$

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} - 
    \begin{bmatrix}
        2 & 2 &  -1 \\
        1 & 11 & -1
    \end{bmatrix} =
    \begin{bmatrix}
    1 - 2 & 2 - 2 & 3 - (-1) \\
    0 - 1 & 0 - 11 & 1 - (-1)
    \end{bmatrix} =
    \begin{bmatrix}
        -1 & \hfill0 & 4 \\
        -1 & -11 & 2
    \end{bmatrix}.
$$
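The two examples above can be reproduced with a short Python sketch. It assumes matrices are stored as lists of rows; the function names `shape`, `add`, and `subtract` are hypothetical helpers written for this illustration. Because the definition requires both matrices to have the same dimensions, the helpers raise an error otherwise.

```python
def shape(matrix):
    """Return (rows, columns) for a matrix stored as a list of rows."""
    return len(matrix), len(matrix[0])


def add(A, B):
    """Entrywise sum A + B; only defined when the sizes match."""
    if shape(A) != shape(B):
        raise ValueError(f"cannot add {shape(A)} and {shape(B)} matrices")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def subtract(A, B):
    """Entrywise difference A - B; only defined when the sizes match."""
    if shape(A) != shape(B):
        raise ValueError(f"cannot subtract {shape(B)} from {shape(A)} matrices")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2, 3], [0, 0, 1]]
B = [[2, 2, -1], [1, 11, -1]]

print(add(A, B))       # [[3, 4, 2], [1, 11, 0]]
print(subtract(A, B))  # [[-1, 0, 4], [-1, -11, 2]]
```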

**Example:**

$$
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} + 
    \begin{bmatrix}
        1 & 1 \\
        1 & 0
    \end{bmatrix}
$$

is undefined, because the matrix on the left is $2\times 3$ while the matrix on the right is $2\times 2$.

**Note:** We call $n\times n$ matrices *square* matrices. They have many important characteristics that distinguish them from their nonsquare counterparts. The matrix on the right in the example immediately above is a square matrix.

**Scalar Multiplication:** Given a matrix $A$ and a scalar $s$, the entry in position $i, j$ of the matrix $sA$ is $s\cdot a_{i, j}$.

**Example:** 

$$
    5\cdot
    \begin{bmatrix}
        1 & 2 & 3 \\
        0 & 0 & 1
    \end{bmatrix} = 
    \begin{bmatrix}
        5\cdot 1 & 5\cdot 2 & 5\cdot 3 \\
        5\cdot 0 & 5\cdot 0 & 5\cdot 1
    \end{bmatrix} = 
    \begin{bmatrix}
        5 & 10 & 15 \\
        0 & 0 & 5
    \end{bmatrix}.
$$
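In code, scalar multiplication is a single pass over the entries. A minimal sketch in plain Python, assuming a list-of-rows representation and a hypothetical helper named `scale`:

```python
def scale(s, A):
    """Return the matrix sA, multiplying every entry of A by the scalar s."""
    return [[s * a for a in row] for row in A]


A = [[1, 2, 3], [0, 0, 1]]

print(scale(5, A))  # [[5, 10, 15], [0, 0, 5]]
```

Unlike addition and subtraction, there is no size restriction to check: any matrix can be multiplied by any scalar.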

**Definition:** Given an $m\times n$ matrix $A$, the $n\times m$ matrix $A^T$ whose entry in position $i, j$ is $a_{j,i}$ is called the *transpose* of $A$.

**Example:**

Suppose that

$$  A =
        \begin{bmatrix}
            1 & 2 & 3 \\
            0 & 0 & 1
        \end{bmatrix}.
$$

Then

$$
    A^T = 
        \begin{bmatrix}
            1 & 0 \\
            2 & 0 \\
            3 & 1
    \end{bmatrix}.
$$

Transposing a matrix turns its rows into columns and vice versa. If we think of a vector as a matrix with only one column, transposing it turns it into a matrix with only one row. We might think of such a matrix as the previously mentioned but not yet studied *row vector*.

**Example:**

Let 

$$
    \mathbf{v} =
        \begin{bmatrix}
            \hfill1 \\
            -1 \\
            \hfill3
        \end{bmatrix}.
$$

Then

$$
    \mathbf{v}^T = 
        \begin{bmatrix}
            1 & -1 & 3
        \end{bmatrix}.
$$
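Both transpose examples can be checked with a short Python sketch. It assumes a list-of-rows representation, with a column vector written as a matrix that has one entry per row; the function name `transpose` is a hypothetical helper for this illustration.

```python
def transpose(matrix):
    """Return the transpose: entry (i, j) of the result is entry (j, i) of the input."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j] for i in range(rows)] for j in range(cols)]


A = [[1, 2, 3], [0, 0, 1]]   # a 2x3 matrix
v = [[1], [-1], [3]]         # a column vector viewed as a 3x1 matrix

print(transpose(A))  # [[1, 0], [2, 0], [3, 1]]
print(transpose(v))  # [[1, -1, 3]]
```

Transposing twice returns the original matrix, which is a quick sanity check on any implementation.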

## Matrix Arithmetic, Part 2: Linear Systems

Matrices and vectors show up naturally in linear systems; that is, when one wants to determine where two or more lines (or planes in 3 dimensions, or *hyperplanes* in dimensions higher than 3) intersect. Moreover, their appearance in this context suggests how we might define matrix-vector multiplication.

**Definition:** A *linear equation* is an equation that can be written in the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$, where $b$ and each $a_i$, $1\leq i \leq n$, are constants and $x_1, \ldots, x_n$ are the unknowns. A linear equation written this way is said to be in *standard form*.

**Example:** $y = 3x+2$ is a linear equation, but it is not in standard form. Subtracting $3x$ from both sides puts it in standard form: $-3x + y = 2$.

In two dimensions, linear equations are rarely put in standard form until we are confronted with a system of linear equations, because in two dimensions it is easier to plot a line or analyze it in other forms, such as the *slope-intercept* form that began the last example. Here is a linear system:

$$
    \begin{aligned}
        3x + 2y &= 7 \\
        x + 4y &= 9
    \end{aligned}
$$

The connection to matrices starts with vectors and dot products. Suppose we call

$$
    \mathbf{x} = \begin{bmatrix}
                    x \\
                    y
                 \end{bmatrix}.
$$

Then both left-hand sides taken together can be viewed as a *linear combination* that produces a vector whose components are the left-hand sides of the two equations:

$$
    x\begin{bmatrix}
        3 \\
        1
    \end{bmatrix} + 
    y\begin{bmatrix}
        2 \\
        4
    \end{bmatrix} =
    \begin{bmatrix}
        3x + 2y \\
        x + 4y
    \end{bmatrix} =
    \begin{bmatrix}
        7 \\
        9
    \end{bmatrix},
$$

where the scalars in the linear combination are the unknowns $x$ and $y$, and the vectors in the linear combination contain the coefficients of their respective 'scalars'. Of course, this only makes sense if we think of the right-hand sides of the two equations as forming a vector as well.

We define multiplication of a matrix and a vector through this linear combination idea: given an $m\times n$ matrix $A$ and a vector $\mathbf{v}$ with $n$ components, the product $A\mathbf{v}$ is the linear combination $v_1A_1 + \cdots + v_nA_n$ of the columns of $A$. In the context of a linear system in standard form, we construct the matrix from the coefficients of the variables on the left and the vector of unknowns from the variables on the left; the result of the matrix-vector multiplication is then the vector of known quantities on the right-hand sides of the equations. In the current example, the left-hand side matrix-vector multiplication looks like this:

$$
    \begin{bmatrix}
        3 & 2 \\
        1 & 4
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y
    \end{bmatrix} = 
        x\begin{bmatrix}
        3 \\
        1
    \end{bmatrix} + 
    y\begin{bmatrix}
        2 \\
        4
    \end{bmatrix} =
    \begin{bmatrix}
        3x + 2y \\
        x + 4y
    \end{bmatrix}.
$$

Again, if we write the left-hand side of the system as a product that results in a vector, then we must write the right-hand side as a vector too: $[7\hspace{1em}9]^T$. Vector equality only holds when the components are all equal, so the linear system must be equivalent to the matrix-vector equation

$$
    \begin{bmatrix}
        3 & 2 \\
        1 & 4
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y
    \end{bmatrix}=
    \begin{bmatrix}
        7 \\
        9
    \end{bmatrix}.
$$
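The linear-combination definition of $A\mathbf{v}$ translates directly into code. The sketch below is plain Python with a hypothetical helper named `matvec`. The values $x = 1$, $y = 2$ are not stated in the text; they are used here as a candidate solution because substituting them into the two equations gives $3 + 4 = 7$ and $1 + 8 = 9$.

```python
def matvec(A, v):
    """Compute A v as the linear combination v_1 A_1 + ... + v_n A_n."""
    m, n = len(A), len(A[0])
    if n != len(v):
        raise ValueError("A must have as many columns as v has components")
    result = [0] * m
    for j in range(n):        # one term v_j * A_j per column of A
        for i in range(m):    # add that term componentwise
            result[i] += v[j] * A[i][j]
    return result


A = [[3, 2], [1, 4]]   # coefficient matrix of the system
x = [1, 2]             # candidate solution (assumed): x = 1, y = 2

print(matvec(A, x))            # [7, 9]
print(matvec(A, x) == [7, 9])  # True
```

Checking a proposed solution of a linear system therefore amounts to one matrix-vector product followed by a componentwise comparison with the right-hand side.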

This is not usually how matrix-vector multiplication is performed by hand, but it is the right viewpoint for understanding how and why we do matrix-vector multiplication the way we do. By hand it is far more common to do the computation by taking dot products of the rows of a matrix with the (column) vector being multiplied; that is, given an $m\times n$ matrix $A$ and a vector $\mathbf{x}$ with $n$ components, the product $A\mathbf{x}$ is the vector with $m$ components whose $i^{th}$ component is the dot product of the $i^{th}$ row of $A$ with $\mathbf{x}$. This is sometimes described as the *row view* of matrix-vector multiplication, because it requires us to focus our attention on the rows of $A$.

**Example:** 

$$
    \begin{bmatrix}
        1 & 0 & \hfill2\\
        2 & 2 & -3
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y \\
        z
    \end{bmatrix} = 
    \begin{bmatrix}
        x + 0y + 2z \\
        2x + 2y -3z
    \end{bmatrix} = 
    \begin{bmatrix}
        x + 2z \\ 
        2x + 2y -3z
    \end{bmatrix}.
$$

**Example:** 

$$
    \begin{bmatrix}
        1 & 7 \\
        2 & 5
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y \\
        z
    \end{bmatrix}
$$

is undefined, because the matrix does not have the same number of columns as the vector has components, so the underlying dot product calculations themselves are undefined.
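The row view, including the undefined case, can be sketched in a few lines of plain Python. The helper names `dot` and `matvec_rows` are hypothetical, and the numbers substituted for $x$, $y$, $z$ are sample values chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("dot product needs vectors with the same number of components")
    return sum(a * b for a, b in zip(u, v))


def matvec_rows(A, x):
    """Row view: component i of A x is the dot product of row i of A with x."""
    return [dot(row, x) for row in A]


A = [[1, 0, 2], [2, 2, -3]]
x = [1, 2, 3]   # sample values for x, y, z

print(matvec_rows(A, x))  # [7, -3]

B = [[1, 7], [2, 5]]      # 2 columns, but x has 3 components
try:
    matvec_rows(B, x)
except ValueError as err:
    print("undefined:", err)
```

The error comes from the very first dot product, which mirrors the reason given above for why the product is undefined.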

**Important: The Column View of Matrix-Vector Multiplication**

There is another way to define multiplication between a matrix and a vector that is less used in practice for calculations, but is important for a deeper understanding of matrices and vectors. Suppose that we have a matrix $A$ with the same number of columns as some vector $\mathbf{x}$ has components. Think of $A$ as analogous to a row vector whose components are *column* vectors; specifically, the columns of $A$. We can then equivalently define the product of $A$ and $\mathbf{x}$ as the dot product of $A$ (viewed as a vector whose components are vectors) with $\mathbf{x}$, and we get the same result:

$$
    \begin{bmatrix}
        3 & 2 \\
        1 & 4
    \end{bmatrix}\cdot
    \begin{bmatrix}
        x \\
        y
    \end{bmatrix}=
 x\cdot\begin{bmatrix}
            3 \\
            1
          \end{bmatrix} +
    y\cdot\begin{bmatrix}
            2 \\
            4
          \end{bmatrix} =
          \begin{bmatrix}
              x\cdot 3 \\
              x\cdot 1
          \end{bmatrix} + 
          \begin{bmatrix}
              y\cdot 2 \\
              y\cdot 4
          \end{bmatrix} = 
                    \begin{bmatrix}
              3x + 2y \\
              x + 4y
          \end{bmatrix}.
$$

This is called the *column view* of matrix-vector multiplication, because even though it produces the same result as above, it requires us to work with the columns of $A$ to perform the desired operation. 
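That the two views agree can be checked numerically. The following plain-Python sketch uses hypothetical function names and arbitrary sample values for $x$ and $y$; both functions assume that $A$ has as many columns as $\mathbf{x}$ has components.

```python
def matvec_row_view(A, x):
    """Component i of the result is the dot product of row i of A with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matvec_column_view(A, x):
    """The result is the sum of x_j times column j of A."""
    result = [0] * len(A)
    for j, x_j in enumerate(x):
        column_j = [row[j] for row in A]
        result = [r + x_j * c for r, c in zip(result, column_j)]
    return result


A = [[3, 2], [1, 4]]
x = [5, -1]   # arbitrary sample values for x and y

print(matvec_row_view(A, x))     # [13, 1]
print(matvec_column_view(A, x))  # [13, 1]
```

The two functions perform the same multiplications and additions, just grouped differently: the row view finishes one output component at a time, while the column view accumulates one column's contribution at a time.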

**Note:** We have defined a one-sided operation so far: the discussion above tells us how to calculate $A\mathbf{x}$ for an appropriately sized matrix $A$ and vector $\mathbf{x}$, but it does not tell us how to calculate $\mathbf{x}A$, or even whether such an operation is defined.

## The Dot Product Revisited