# A Brief Review of Matrix Multiplication

Next we will review a few important ideas from linear algebra, namely

1. Linear combinations and the dot product
2. Matrix multiplication
3. Other matrix transforms such as the transpose and inverse.

We start by defining the fundamental structure in linear algebra, the vector.

## Column vectors and row vectors

A **vector** is an ordered sequence of values.  Vectors can be written as either **column vectors**, which stack the values vertically, or **row vectors**, which organize the values horizontally.  Below we illustrate a column vector $\mathbf{c}$ and a row vector $\mathbf{r}$.

$$\mathbf{c}= \begin{pmatrix} 
                 c_{1} \\ 
                 c_{2} \\ 
                 c_{3} 
              \end{pmatrix} \;
\mathbf{r}= \begin{pmatrix} 
                 r_{1} & r_2 & r_3\\ 
              \end{pmatrix}$$
              
Take a second to consider our typesetting.  Note that we represent a whole vector with a **bold, non-italicized font** and an entry of the vector with *a non-bold, italicized font* using the same letter.  This should help keep it clear when we are talking about a whole vector versus an entry in said vector.

## Column and row vectors in `R`

We can construct a vector in `R` using the `c` function.  

In [14]:
v1 <- c(1,2,3)
v1

We can use `str` to verify that this is an `R` numeric vector.  (Vectors are the standard data structure in `R`, so much so that `class(v1)` returns just `"numeric"` and neglects to tell us the "vector" part of numeric vector.)

In [16]:
str(v1)

 num [1:3] 1 2 3


Note that `R` will try to guess whether a vector is a row or column vector and, as we will see in the next section, the meaning of a computation will change depending on the location of the vector.  It is good practice to avoid this sort of confusion, so we will use the `rbind` and `cbind` functions to bind our vectors to rows and columns, respectively.

In [37]:
c1 <- cbind(c(1,1,3))
c1

1
1
3


In [38]:
r1 <- rbind(c(2,5,1))
r1

2,5,1


Here are some more examples that we will use in the following sections.

In [49]:
v2 <- c(1, 0, 0)
c2 <- cbind(c(1.1, 0.5, 1))
r2 <- rbind(c(1,2,1))
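
One way to see the difference between these three kinds of objects is to inspect their dimensions.  The sketch below uses base `R`'s `dim` function, which this review has not introduced, so treat it as a side note rather than part of the original material.

```r
# Plain vectors built with c() carry no dimensions at all.
v1 <- c(1, 2, 3)
dim(v1)     # NULL

# cbind() and rbind() return matrices, so the orientation is explicit.
c1 <- cbind(c(1, 1, 3))
r1 <- rbind(c(2, 5, 1))
dim(c1)     # 3 1  (3 rows, 1 column)
dim(r1)     # 1 3  (1 row, 3 columns)
```

Because `c1` and `r1` store their shape, `R` no longer has to guess how to orient them.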

## Vector arithmetic

There are two arithmetic operations defined on vectors, addition and scalar multiplication.

**Vector Addition** is only defined for vectors with the same number of entries, is accomplished by adding the corresponding entries, and results in a vector of the same size.

#### Adding column vectors

$$\mathbf{a} + \mathbf{b}= 
  \begin{pmatrix} 
     a_{1} \\ 
     a_{2} \\ 
     a_{3} 
  \end{pmatrix} + 
  \begin{pmatrix} 
     b_{1} \\ 
     b_{2} \\ 
     b_{3} 
  \end{pmatrix} = 
  \begin{pmatrix} 
     a_1 + b_1\\ 
     a_2 + b_2\\ 
     a_3 + b_3
  \end{pmatrix}$$

#### Adding row vectors

$$\mathbf{c} + \mathbf{d}= 
  \begin{pmatrix} 
     c_{1} &  c_{2} &  c_{3} 
  \end{pmatrix} + 
  \begin{pmatrix} 
     d_{1} &  d_{2} &  d_{3} 
  \end{pmatrix} = 
  \begin{pmatrix} 
     c_1 + d_1 &  c_2 + d_2 &  c_3 + d_3
  \end{pmatrix}$$
  
Vector addition is performed in `R` using the regular addition operator, `+`.

In [96]:
v1 + v2

In [51]:
c1 + c2

2.1
1.5
4.0


In [97]:
r1 + r2

3,7,2


In [98]:
v1 + c1

2
3
6


In [99]:
v1 + r1

3,7,4


In [100]:
c1 + r1

ERROR: Error in c1 + r1: non-conformable arrays
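
The mixed cases above can be summarized with a short sketch.  It assumes the same `v1`, `c1`, and `r1` as before and uses `dim` and `tryCatch` from base `R`, neither of which appears elsewhere in this review.

```r
v1 <- c(1, 2, 3)
c1 <- cbind(c(1, 1, 3))
r1 <- rbind(c(2, 5, 1))

dim(v1 + c1)   # 3 1: the plain vector takes on the column shape
dim(v1 + r1)   # 1 3: the plain vector takes on the row shape

# A 3x1 matrix and a 1x3 matrix have different shapes, so R refuses to add them.
msg <- tryCatch(c1 + r1, error = function(e) conditionMessage(e))
msg            # "non-conformable arrays"
```

In other words, a plain vector adapts to whatever it is added to, while explicit row and column vectors must agree in shape.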


## <font color="red"> Exercise 1</font>

Suppose we define the following vectors.

$$\mathbf{a} =
\begin{pmatrix} 
     1 \\ 
     2 \\ 
     3 
  \end{pmatrix} \;
\mathbf{b} =
\begin{pmatrix} 
     4 \\ 
     5 \\ 
     6 
  \end{pmatrix} \;
\mathbf{c} =
\begin{pmatrix} 
     7 \\ 
     8 \\ 
     9 \\
     10
  \end{pmatrix}$$

Compute the sum of all pairs of these vectors by hand and in `R`.  If the sum is not defined, identify the problem (in words) and illustrate what happens when you try to add these vectors in `R`.

> *Your by-hand computations here*

In [23]:
# Your code here

## Scalar multiplication

**Scalar multiplication** involves multiplying a *scalar*, or single number, times a *vector*.  The result of scalar multiplication is another vector of the same size as the original, where every entry is multiplied by the scalar.  Suppose that $c$ is a real-valued scalar, then

#### Scalar multiplication with a column vector

$$c*\mathbf{a} =c\mathbf{a}= 
  \begin{pmatrix} 
     c*a_{1} \\ 
     c*a_{2} \\ 
     c*a_{3} 
  \end{pmatrix}$$


#### Scalar multiplication with a row vector



$$c*\mathbf{b} = c\mathbf{b}= 
  \begin{pmatrix} 
     c*b_{1} &  c*b_{2} &  c*b_{3} 
  \end{pmatrix}$$
  
  
Similar to vector addition, scalar multiplication is performed in `R` using `*`.

In [101]:
2*v1

In [102]:
v1*2 # considered bad form in math classes

In [103]:
3*c1

3
3
9


In [104]:
pi*r1

6.283185,15.70796,3.141593


## <font color="red"> Exercise 2</font>

Consider the vectors defined in <font color="red">Exercise 1</font>.  Perform each of the following computations, both by-hand and using `R`

* What is $2\mathbf{a}$?
* What is $1.5\mathbf{b}$?
* What is $\pi\mathbf{c}$?

> *Your by-hand computations here*

In [23]:
# Your code here

## Linear combinations

A **linear combination** of items multiplies each item by a number (a coefficient) and then adds the results.  A good example of linear combinations in the wild is a cooking recipe, specifically a recipe for fruit salad (yummy yummy).

Suppose we have a recipe for a simple fruit salad that mixes 

* 3 bananas,
* 4 peaches,
* 15 strawberries, and
* 30 blueberries.

Mathematically, we might write this process as

$$3*\text{banana} + 4*\text{peach} + 15*\text{strawberry} + 30*\text{blueberry}$$

This is all there is to a linear combination.

Another good example of linear combinations in the wild is totaling a shopping bill.  Suppose that I went shopping for the ingredients to my fruit salad (yummy yummy) and the current prices for each ingredient are as follows.

* Bananas are \$2 each
* Peaches are \$0.3 each
* Strawberries are \$0.2 each
* Blueberries are \$0.05 each

Then the (pre-tax) total of my bill can be expressed with the following linear combination.

$$ 3*2 + 4*0.3 + 15*0.2 + 30*0.05 = 11.7$$
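
The same bill can be totaled in `R`.  This is only a sketch: the names `quantity`, `price`, `cost_per_item`, and `total` are made up for illustration, and it relies on the fact that `*` between two vectors of the same length multiplies them entry by entry.

```r
# Quantities from the recipe and unit prices from the shopping list.
quantity <- c(banana = 3, peach = 4, strawberry = 15, blueberry = 30)
price    <- c(banana = 2, peach = 0.3, strawberry = 0.2, blueberry = 0.05)

cost_per_item <- quantity * price   # entry-wise products: 6, 1.2, 3, 1.5
total <- sum(cost_per_item)         # add the products to get the bill
total                               # 11.7
```

Here the quantities play the role of the coefficients and the prices are the values being combined.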

## Linear combinations of vectors

A **linear combination of vectors** is the result of performing scalar multiplication on each vector, then adding the results.

$$u*\mathbf{a} + v*\mathbf{b}= u\mathbf{a} + v\mathbf{b}= 
  \begin{pmatrix} 
     u*a_{1} \\ 
     u*a_{2} \\ 
     u*a_{3} 
  \end{pmatrix} + 
  \begin{pmatrix} 
     v*b_{1} \\ 
     v*b_{2} \\ 
     v*b_{3} 
  \end{pmatrix} = 
  \begin{pmatrix} 
     u*a_1 + v*b_1\\ 
     u*a_2 + v*b_2\\ 
     u*a_3 + v*b_3
  \end{pmatrix}$$


It can be argued that this is the most important action that we can perform with vectors, and it turns out that all of linear algebra can be thought of as the study of linear combinations of vectors.  

Linear combinations in `R` are easy: just combine multiplication and addition.

In [105]:
v1 + 2*v2

In [106]:
2*c1 + 1.5*c2

3.65
2.75
7.5


In [107]:
12*r1 + -11*r2

13,38,1
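
To connect the `R` output back to the definition, here is the second computation, `2*c1 + 1.5*c2`, worked by hand using the entries of `c1` and `c2` defined earlier.

$$2\begin{pmatrix} 
     1 \\ 
     1 \\ 
     3 
  \end{pmatrix} + 
  1.5\begin{pmatrix} 
     1.1 \\ 
     0.5 \\ 
     1 
  \end{pmatrix} = 
  \begin{pmatrix} 
     2 \\ 
     2 \\ 
     6 
  \end{pmatrix} + 
  \begin{pmatrix} 
     1.65 \\ 
     0.75 \\ 
     1.5 
  \end{pmatrix} = 
  \begin{pmatrix} 
     3.65 \\ 
     2.75 \\ 
     7.5
  \end{pmatrix}$$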


## <font color="red"> Exercise 3</font>

Consider the vectors defined in <font color="red">Exercise 1</font>.  Perform each of the following computations, both by-hand and using `R`

* What is $2\mathbf{a} + 0.5\mathbf{b}$
* Investigate the result of linear combinations of the different types of `R` vectors
* How might we express $\mathbf{a} + \mathbf{b}$ as a linear combination?
* Notice that we haven't defined vector subtraction.  Use a linear combination to make a reasonable definition for $\mathbf{a} - \mathbf{b}$
* Why can't we perform a linear combination of $\mathbf{a}$ and  $\mathbf{c}$?

> *Your by-hand computations and written answers here*

In [25]:
# Your code here

## Linear combinations and the dot product

The **dot product** between two vectors is defined for all vectors of the same size and is computed by taking the sum of the products of corresponding entries.

$$\mathbf{a} \cdot \mathbf{b}= 
  \begin{pmatrix} 
     a_{1} \\ 
     a_{2} \\ 
     a_{3} 
  \end{pmatrix} \cdot 
  \begin{pmatrix} 
     b_{1} \\ 
     b_{2} \\ 
     b_{3} 
  \end{pmatrix} =  a_1b_1 +   a_2b_2 +   a_3b_3$$
  
In `R` we can compute the dot product between two vectors defined with `c` using the `%*%` operator.

In [27]:
c(1,2,3) %*% c(3, 2, 1)

10


Note that if we explicitly define our vectors as row and column vectors using `rbind` and `cbind`, then we can only compute the dot product of a row vector (on the left) with a column vector (on the right).

In [108]:
rbind(c(1,2,3)) %*% cbind(c(3, 2, 1))

10


Attempting to perform the dot product in the other direction will result in something *entirely different* (more on this in the next section).

In [109]:
cbind(c(1,2,3)) %*% rbind(c(3, 2, 1))

3,2,1
6,4,2
9,6,3
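
As a cross-check on the definition, the dot product can also be computed literally as "the sum of the products of corresponding entries".  The sketch below assumes the same two vectors as above; `sum` and `drop` are base `R` functions that have not been used in this review so far.

```r
a <- c(1, 2, 3)
b <- c(3, 2, 1)

sum(a * b)        # 10, a plain number
a %*% b           # the same value, returned as a 1x1 matrix
drop(a %*% b)     # 10, with the matrix wrapper removed
```

Note that `%*%` always returns a matrix, even when the result is a single number.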


### Example - Fruit Salad, Yummy Yummy

Note that we can think of a dot product as a linear combination, with the coefficients in the left-hand vector and the values being combined in the right-hand vector. If we put the fruit quantities in a (row) vector $\mathbf{c}$ and the prices in a (column) vector $\mathbf{p}$, as shown below

$$
\mathbf{c}= 
\begin{pmatrix} 
  3  &  4  &  15 &  30\\
\end{pmatrix} \;
\mathbf{p}= 
\begin{pmatrix} 
  2  \\ 
  0.3  \\ 
  0.2 \\ 
  0.05
\end{pmatrix}$$

Then the dot product of $\mathbf{c}$ and $\mathbf{p}$ gives 

$$
\mathbf{c} \cdot \mathbf{p}= 
\begin{pmatrix} 
  3  &  4  &  15 &  30\\
\end{pmatrix}\cdot
\begin{pmatrix} 
  2  \\ 
  0.3  \\ 
  0.2 \\ 
  0.05
\end{pmatrix} = 3*2 + 4*0.3 + 15*0.2 + 30*0.05 = 11.7$$

This is exactly the computation from the shopping-bill example. There are other ways of thinking about a dot product (e.g. in terms of lengths and angles), but in this class the linear-combination view will be our primary focus.
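
A minimal sketch of the fruit salad dot product in `R`, with the quantities bound to a row and the prices bound to a column.  The variable names `quantities`, `prices`, and `bill` are chosen here for illustration only.

```r
quantities <- rbind(c(3, 4, 15, 30))      # 1x4 row vector of coefficients
prices <- cbind(c(2, 0.3, 0.2, 0.05))     # 4x1 column vector of values
bill <- quantities %*% prices             # row (left) times column (right)
bill                                      # a 1x1 matrix holding 11.7
```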

## Matrices and matrix arithmetic

A matrix $\mathbf{A}$ is a collection of numbers stored in an array with $n$ rows and $m$ columns.  For convenience, we describe the size of such a matrix as $n\times m$, where the numbers represent the number of rows and columns, respectively.  For example, we might write a $2\times 2$ matrix $\mathbf{A}$ as follows.

$$\mathbf{A}= \begin{pmatrix} 
                 a_{11} & a_{12} \\ 
                 a_{21} & a_{22} 
              \end{pmatrix}$$

Matrices can be defined in `R` by either binding rows with `rbind`, or columns with `cbind`, as shown below. 

In [70]:
A = rbind(c(1,2,3),
          c(1,3,1))
A

1,2,3
1,3,1


In [77]:
B = cbind(c(1,1,2),
          c(3,3,2))
B

1,3
1,3
2,2


The author prefers `rbind`, as the matrix is represented in the correct order in the code, making it easier to read.

In [110]:
C = rbind(c(1,1),
          c(2,2),
          c(3,1))
C

1,1
2,2
3,1


#### Scalar matrix multiplication

We define scalar matrix multiplication and matrix addition in an analogous way to the vector definitions.  Again, scalar multiplication involves multiplying a scalar (single number) with a matrix; and again this computation is resolved by multiplying each element by said scalar.  

$$c\mathbf{A}= c\begin{pmatrix} 
                 a_{11} & a_{12} \\ 
                 a_{21} & a_{22} 
              \end{pmatrix} =
              \begin{pmatrix} 
                ca_{11} & ca_{12} \\ 
                ca_{21} &  ca_{22} 
              \end{pmatrix} $$

As before, scalar multiplication is performed in `R` using `*`.

In [72]:
2*A

2,4,6
2,6,2


#### Matrix-vector multiplication

Next, let's look at matrix-vector multiplication (with the matrix on the left).  As an example, if $\mathbf{A}$ is a $2\times 2$ matrix and $\mathbf{b}$ is a $2\times 1$ vector, then the product $\mathbf{A}\mathbf{b}$ is a $2\times 1$ vector, as follows:

$$\mathbf{A}\mathbf{b}= \begin{pmatrix} 
                 a_{11} & a_{12} \\ 
                 a_{21} & a_{22} \end{pmatrix}
              \begin{pmatrix} 
                  b_1\\ b_2 
              \end{pmatrix} =
              \begin{pmatrix} 
                a_{11} b_1 + a_{12}b_2 \\ 
                a_{21}b_1 + a_{22}b_2 
              \end{pmatrix} $$
              
              
<img src="./img/matrix_vector_multiplication.gif" width=600>
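
The following sketch checks the formula above on a small example.  The matrix `M` and vector `w` are hypothetical values picked for illustration.  It also shows a second way to read the result: the product is a linear combination of the columns of the matrix, with the entries of the vector as coefficients.

```r
# Hypothetical example values, chosen only for illustration.
M <- rbind(c(1, 2),
           c(3, 4))
w <- cbind(c(5, 6))

M %*% w                          # rows of M dotted with w: 17 and 39
w[1] * M[, 1] + w[2] * M[, 2]    # same numbers as a combination of M's columns
```

Both lines give 17 and 39, the first as a $2\times 1$ matrix and the second as a plain vector.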

#### Matrix multiplication

It is also important to recall the definition of matrix multiplication, which we illustrate below for two $2 \times 2$ matrices.


$$\mathbf{AB}= \begin{pmatrix} 
                 a_{11} & a_{12}\\ 
                 a_{21} & a_{22}
              \end{pmatrix}
              \begin{pmatrix} 
                 b_{11} & b_{12}\\ 
                 b_{21} & b_{22}
              \end{pmatrix} =
              \begin{pmatrix} 
                a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22}\\ 
                a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} 
              \end{pmatrix} $$

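Each entry of the product is the dot product of a row of the first matrix with a column of the second: the entry in row $i$ and column $j$ is found by multiplying the entries of row $i$ of the first matrix with the corresponding entries of column $j$ of the second, then summing these products. 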

We illustrate by adding another column to the example animated above, and picking up where we left off.

<img src="./img/matrix_multiplication.gif" width=600>

If we need to multiply larger matrices, we simply continue this process for each column in the right-hand matrix.

We can use `R` to compute the result of matrix multiplication using the `%*%` operator.  

In [75]:
A %*% B

9,15
6,14
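
For reference, here is the same product worked by hand, using the entries of `A` and `B` defined earlier in this section.

$$\mathbf{AB}= \begin{pmatrix} 
                 1 & 2 & 3\\ 
                 1 & 3 & 1
              \end{pmatrix}
              \begin{pmatrix} 
                 1 & 3\\ 
                 1 & 3\\ 
                 2 & 2
              \end{pmatrix} =
              \begin{pmatrix} 
                1*1 + 2*1 + 3*2 & 1*3 + 2*3 + 3*2\\ 
                1*1 + 3*1 + 1*2 & 1*3 + 3*3 + 1*2 
              \end{pmatrix} =
              \begin{pmatrix} 
                9 & 15\\ 
                6 & 14 
              \end{pmatrix}$$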


Note that matrix multiplication is only defined if the number of columns on the LHS matches the number of rows on the RHS.  Consequently, $BC$ is undefined for the matrices defined above, as $B$ and $C$ are both $3\times 2$.

In [87]:
B %*% C

ERROR: Error in B %*% C: non-conformable arguments


## <font color="red"> Exercise 4</font>

Compute the following products, if possible.

1. $BA$
2. $BC$
3. $CB$

If the multiplication is not defined, identify the problem.

> *Your by-hand computations and written answers here*

In [25]:
# Your code here

A nice way to check if multiplication is defined is to write out the matrix sizes, side-by-side, in the order of the multiplication.  If the multiplication is defined, the middle two numbers will match.  Furthermore, the outside numbers determine the size of the resulting matrix (when defined). 

#### Example 1

For $AB$ we get

$$ \left(2 \times 3\right) * \left(3 \times 2\right),$$

and since the inner numbers match the multiplication is defined and we will get a $2\times 2$ matrix (outer numbers).

#### Example 2

For $B$ and $C$ we have

$$ \left(3 \times 2\right) * \left(3 \times 2\right)$$

and we note that the inside numbers (2 and 3) don't match, so the multiplication is not defined.
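
The size check can also be sketched in `R`.  The helper `can_multiply` is a hypothetical function written for this illustration (it is not part of base `R`); it compares the inner numbers using `ncol` and `nrow`.  The matrices are the `A`, `B`, and `C` defined in the matrices section.

```r
# Hypothetical helper: TRUE when the product X %*% Y is defined.
can_multiply <- function(X, Y) ncol(X) == nrow(Y)

A <- rbind(c(1, 2, 3), c(1, 3, 1))      # 2x3
B <- cbind(c(1, 1, 2), c(3, 3, 2))      # 3x2
C <- rbind(c(1, 1), c(2, 2), c(3, 1))   # 3x2

can_multiply(A, B)   # TRUE:  (2 x 3) * (3 x 2), inner numbers match
dim(A %*% B)         # 2 2:   the outer numbers
can_multiply(B, C)   # FALSE: (3 x 2) * (3 x 2), inner numbers differ
```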

In [86]:
B

1,3
1,3
2,2


## <font color="red"> Exercise 5</font>

Use the method defined above to (a) determine if the multiplication is defined and if so, (b) determine the size of the output.

1. $BA$
2. $BC$
3. $CB$

> *Your by-hand computations and written answers here*

## Solving systems of equations with matrices

Next, we review the matrix inverse.  Recall that a matrix equation $\mathbf{A}\mathbf{x} = \mathbf{b}$ corresponds to a linear system of equations.  For example, the system shown below 
corresponds to the matrix equation on its right, which has the form $\mathbf{A}\mathbf{x}=\mathbf{b}$.

$$\begin{array}{ccccc}
2x & + & 3y & = & 5\\
x & - & y & = & 7
\end{array} 
\Longrightarrow
\begin{pmatrix}
2 & 3\\
1 & -1
\end{pmatrix} 
\begin{pmatrix}
  x\\
  y
\end{pmatrix} = 
\begin{pmatrix}
  5\\
  7
\end{pmatrix} .$$

If a system with as many equations as unknowns has a unique solution, then the matrix $\mathbf{A}$ will have a matrix inverse, denoted $\mathbf{A}^{-1}$, and the solution to the original equation can be written as $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ ^[Read more about computing the inverse on [Wikipedia](https://en.wikipedia.org/wiki/Invertible_matrix).].  `R` can be used to solve a system in this way, as follows.

In [111]:
A <- rbind(c(2,3),
           c(1,-1))
A

2,3
1,-1


In [112]:
b <- cbind(c(5,7))
b

5
7


In [113]:
A.inverse <- solve(A)
A.inverse

0.2,0.6
0.2,-0.4


In [114]:
x <- A.inverse %*% b
x

5.2
-1.8


We can double check this solution by computing $\mathbf{A}\mathbf{x}$ and making sure it is the same as $\mathbf{b}$.

In [115]:
A %*% x

5
7
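
As a side note, `solve` also accepts the right-hand side as a second argument, in which case it returns the solution directly without forming the inverse first.  The sketch below assumes the same `A` and `b` as above; the name `x_direct` is made up for illustration.

```r
A <- rbind(c(2, 3),
           c(1, -1))
b <- cbind(c(5, 7))

x_direct <- solve(A, b)               # solves A x = b in one step
x_direct                              # 5.2 and -1.8, matching x above
all.equal(x_direct, solve(A) %*% b)   # TRUE, up to floating-point rounding
```

Either route gives the same answer here; the two-step version simply makes the role of $\mathbf{A}^{-1}$ visible.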


## The transpose of a matrix

Finally, we note that the **transpose** of a matrix $\mathbf{A}$, denoted $\mathbf{A}^T$, is constructed using the rows of $\mathbf{A}$ as columns. 

$$\mathbf{A}= \begin{pmatrix} 
                 a_{11} & a_{12}\\ 
                 a_{21} & a_{22}
              \end{pmatrix} \Rightarrow \mathbf{A}^T = 
              \begin{pmatrix} 
                 a_{11} & a_{21}\\ 
                 a_{12} & a_{22}
              \end{pmatrix}$$

We can compute the transpose in `R` using the `t` function, as follows.

In [116]:
A <- rbind(c(2,3),
           c(1,-1))
A

2,3
1,-1


In [117]:
t(A) # Transpose

2,1
3,-1
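
The effect of the transpose is easier to see on a matrix that is not square.  The sketch below uses a hypothetical $2\times 3$ matrix `D` and the row vector `r1` from earlier, and inspects shapes with `dim`.

```r
D <- rbind(c(1, 2, 3),
           c(1, 3, 1))          # hypothetical 2x3 matrix
dim(D)                          # 2 3
dim(t(D))                       # 3 2: rows and columns trade places
identical(t(t(D)), D)           # TRUE: transposing twice restores D

r1 <- rbind(c(2, 5, 1))         # 1x3 row vector
dim(t(r1))                      # 3 1: the transpose is a column vector
```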
