# Using matrices with transformations
> TBD

## Contents
TBD

## Intro
We saw in the previous chapter that any linear transformation in 3D can be specified by knowing how it affects the three standard basis vectors $ e1 $, $ e2 $, and $ e3 $.

This means that exactly 9 numbers uniquely identify the effects of a linear transformation in 3D. When arranged appropriately, the numbers that tell us how to compute a linear transformation are called a *matrix*.

Note that these matrices should be considered *computational tools* or a *notation* that will help us calculate these transformations more efficiently. Ultimately, we will be doing the same transformations we performed in the previous chapters: rotations, scaling, reflection, etc.

The underlying idea is to arrange in the matrix the information that tells us how the linear transformation affects the standard basis vectors.

## Representing linear transformations with matrices

Let $ A $ be a linear transformation for which we know that:

$
A(e1) = (1, 1, 1) \\
A(e2) = (1, 0, -1) \\
A(e3) = (0, 1, 1)
$

This information completely specifies the transformation $ A $, so that it can be applied to any vector.
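As a minimal sketch of what "completely specifies" means here, the snippet below applies $ A $ to an arbitrary vector using only the three images listed above. The name `apply_A` is hypothetical, and the code works on plain tuples rather than relying on the `my_vectors` module.

```python
# Images of the standard basis vectors under A, as given above
A_E1 = (1, 1, 1)
A_E2 = (1, 0, -1)
A_E3 = (0, 1, 1)

def apply_A(v):
    """Apply A to v = (x, y, z) using linearity:
    A(v) = x * A(e1) + y * A(e2) + z * A(e3)."""
    x, y, z = v
    return tuple(
        x * a1 + y * a2 + z * a3
        for a1, a2, a3 in zip(A_E1, A_E2, A_E3)
    )

print(apply_A((1, 0, 0)))   # (1, 1, 1), which is A(e1)
print(apply_A((2, -1, 3)))  # (1, 5, 6)
```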

Because we will reuse this concept over and over, it warrants a special notation called *matrix notation*.



### Writing vectors and linear transformations as matrices

Matrices are rectangular grids of numbers. Their shape (i.e. arrangement) tells us how to interpret them.

For instance, a matrix that is a single column of numbers is interpreted as a vector, with its entries being the coordinates ordered from top to bottom. Such a matrix is called a *column vector*.

For example, the standard basis can be written as three column vectors as:

$
e1 = \begin{pmatrix}
        1 \\
        0 \\
        0
    \end{pmatrix}
,\hspace{1ex}
e2 = \begin{pmatrix}
        0 \\
        1 \\
        0
    \end{pmatrix}
,\hspace{1ex}
e3 = \begin{pmatrix}
        0 \\
        0 \\
        1
    \end{pmatrix}
$

For our purposes, this notation means the same as $ e1=(1, 0, 0) $, $ e2=(0, 1, 0) $, $ e3 = (0, 0, 1) $.

As a consequence, we can describe the transformation $ A $ as:

$
A(e1) = \begin{pmatrix}
        1 \\
        1 \\
        1
    \end{pmatrix}
,\hspace{1ex}
A(e2) = \begin{pmatrix}
        1 \\
        0 \\
        -1
    \end{pmatrix}
,\hspace{1ex}
A(e3) = \begin{pmatrix}
        0 \\
        1 \\
        1
    \end{pmatrix}
$

And the matrix representing the transformation $ A $ is the 3x3 grid of numbers consisting of these column vectors placed side by side:

$
A = \begin{pmatrix}
        1 & 1 & 0 \\
        1 & 0 & 1\\
        1 & -1 & 1
    \end{pmatrix}
$
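The "columns placed side by side" idea can be sketched in Python as follows. This is only an illustration on plain tuples. It relies on `zip(*...)`, which is discussed in more detail later in this chapter, and the variable names are assumptions.

```python
# Images of e1, e2, e3 under A, written as column vectors
columns = [(1, 1, 1), (1, 0, -1), (0, 1, 1)]

# zip(*columns) groups the first entries, then the second, then the third,
# which yields the rows of the matrix whose columns are the given vectors
A = tuple(zip(*columns))

for row in A:
    print(row)
# (1, 1, 0)
# (1, 0, 1)
# (1, -1, 1)
```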

In 2D, a column vector consists of two entries. As a result, the linear transformation that scales input vectors by a factor of 2 can be written as:

$
D(e1) = \begin{pmatrix}
2 \\
0
\end{pmatrix}
,\hspace{1ex} 
D(e2) = \begin{pmatrix}
0 \\
2
\end{pmatrix}
$

Or using the matrix notation:

$
D = \begin{pmatrix}
2 & 0 \\
0 & 2 
\end{pmatrix}
$
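The same pattern extends to any dimension: a uniform scaling sends each standard basis vector to a multiple of itself, so the factor ends up on the diagonal. A minimal sketch, where `scaling_matrix` is a hypothetical helper that is not part of `my_vectors`:

```python
def scaling_matrix(factor, dimension):
    """Build the matrix (as a tuple of rows) that scales every vector by `factor`."""
    return tuple(
        tuple(factor if i == j else 0 for j in range(dimension))
        for i in range(dimension)
    )

print(scaling_matrix(2, 2))  # ((2, 0), (0, 2))
print(scaling_matrix(2, 3))  # ((2, 0, 0), (0, 2, 0), (0, 0, 2))
```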

Matrices will come in other shapes and sizes, but for now, the focus is on column matrices representing vectors and square matrices representing linear transformations.

The next step is to evaluate a linear transformation given its matrix.

### Multiplying a matrix with a vector

Let $ B $ be a linear transformation represented as a matrix and $ v $ a vector also represented as a matrix:

$
B = \begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1 
\end{pmatrix}
,\hspace{1ex}
v = \begin{pmatrix}
3 \\
-2 \\
5
\end{pmatrix}
$

Then, using the already known procedure for linear transformations, with the only change that we use *column vector* notation, we can say:

$
B(v) = 3 \cdot B(e1) -2 \cdot B(e2) + 5 \cdot B(e3) \\
= 3 \cdot \begin{pmatrix}
0 \\
0 \\
1
\end{pmatrix} - 2 \cdot \begin{pmatrix}
2 \\
1 \\
0
\end{pmatrix} + 5 \cdot \begin{pmatrix}
1 \\
0 \\
-1
\end{pmatrix} = \begin{pmatrix}
0 \\
0 \\
3
\end{pmatrix} + \begin{pmatrix}
-4 \\
-2 \\
0
\end{pmatrix} + \begin{pmatrix}
5 \\
0 \\
-5
\end{pmatrix} = \begin{pmatrix}
1 \\
-2 \\
-2
\end{pmatrix}
$

This sequence of operations is a special case of an operation called *matrix multiplication*. This can be succinctly denoted as:

$
Bv = \begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1
\end{pmatrix} \begin{pmatrix}
3 \\
-2 \\
5
\end{pmatrix} = \begin{pmatrix}
1 \\
-2 \\
-2
\end{pmatrix}
$

As opposed to multiplying numbers, the order matters when you multiply matrices. In particular $ Bv $ is a valid product but $ vB $ is not.

Now, we're in a position to write Python code that multiplies a matrix by a vector:

In [3]:
from my_vectors import *

B = (
    (0, 2, 1),
    (0, 1, 0),
    (1, 0, -1)
)

v = (3, -2, 5)

# Note that the B matrix is arranged as a set of rows,
# rather than as a set of columns
# However, you can easily get the columns using `zip`

print('columns of the matrix B: {}'.format(list(zip(*B))))

def linear_combination(scalars, *vectors):
    scaled = [scale(s, v) for s,v in zip(scalars, vectors)]
    return add(*scaled)

def multiply_matrix_vector(matrix, vector):
    return linear_combination(vector, *zip(*matrix))

print('B v = {}'.format(multiply_matrix_vector(B, v)))

columns of the matrix B: [(0, 0, 1), (2, 1, 0), (1, 0, -1)]
B v = (1, -2, -2)


Let's dissect how we got to this succinct Python implementation:

Firstly, we define our matrix as a tuple of tuples, with the first tuple containing the first row, the second tuple the second row, etc.

This is just for convenience, as it is easier to write that way:

```Python
B = (
    (0, 2, 1),
    (0, 1, 0),
    (1, 0, -1)
)
```

Then we define our `v` vector the usual way:

```Python
v = (3, -2, 5)
```

Now, we need to find an efficient way to do:

```Python
# Pseudocode: 3 * (0, 0, 1) + -2 * (2, 1, 0) + 5 * (1, 0, -1)
add(scale(3, (0, 0, 1)), scale(-2, (2, 1, 0)), scale(5, (1, 0, -1)))
```

Getting the columns from B is easy using the `zip(...)` function, as it will *aggregate* the first elements of all the tuples into a first tuple, the second elements into a second tuple, etc.

Remember that `zip(...)` accepts a variable number of *iterables* and stops at the shortest one, so we pass iterables of the same size:

```Python
list(zip(('a', 'b', 'c'), (1, 2, 3), ('x', 'y', 'z')))
# [('a', 1, 'x'), ('b', 2, 'y'), ('c', 3, 'z')]
```
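A short sketch of why equal sizes matter: `zip` does not complain when the lengths differ, it just stops at the shortest iterable. The variable names below are only illustrative.

```python
# zip stops as soon as the shortest iterable is exhausted
pairs = list(zip((1, 2, 3), ('a', 'b')))
print(pairs)  # [(1, 'a'), (2, 'b')]

# A ragged "matrix" therefore silently loses entries when its rows are zipped
ragged = ((1, 2, 3), (4, 5))
columns = list(zip(*ragged))
print(columns)  # [(1, 4), (2, 5)]
```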

Note that we cannot feed `B` directly into `zip(...)` as it won't have the desired effect on account of `B` being a single tuple containing multiple tuples within it:

```Python
list(zip(((0, 2, 1), (0, 1, 0), (1, 0, -1))))
# [((0, 2, 1),), ((0, 1, 0),), ((1, 0, -1),)]
```

Fortunately, we can use the `*` operator to *unpack* `B` before passing it to `zip(...)`. This has the effect of passing the actual matrix rows as separate arguments to the `zip(...)` function:

```Python
list(zip((0, 2, 1), (0, 1, 0), (1, 0, -1)))  # [(0, 0, 1), (2, 1, 0), (1, 0, -1)]
list(zip(*B))                                # [(0, 0, 1), (2, 1, 0), (1, 0, -1)]
```

Now that we've got the matrix columns in an iterable, we can focus on writing the function that will take each of the coordinates of the vector `v` and perform the scalar multiplication of the first coordinate by the first column (*vector*), add it to the scalar multiplication of the second coordinate by the second column, etc.

But we already defined such a function in the previous chapter, namely `linear_combination(...)`:

```Python
def linear_combination(scalars, *vectors):
    scaled = [scale(s, v) for s,v in zip(scalars, vectors)]
    return add(*scaled)
```

We just need to pass `v` as `scalars` and the columns of the matrix as `vectors` (note that the second parameter is a variable argument list).

Thus, to perform the desired multiplication, we have to do:

```Python
linear_combination((3, -2, 5), (0, 0, 1), (2, 1, 0), (1, 0, -1))
# (1, -2, -2)
```

We're almost there: we just need to define a Python function that gives that operation a name:

```Python
def multiply_matrix_vector(matrix_columns, vector):
    return linear_combination(vector, matrix_columns)
```

However, if we try to invoke the function using:

```Python
multiply_matrix_vector(((0, 0, 1), (2, 1, 0), (1, 0, -1)), (3, -2, 5))
```

it will fail, because `linear_combination(...)` expects the vectors as separate arguments after the first one, and we're passing a single tuple of tuples. We need to *unpack* this argument again using the `*` operator:

```Python
def multiply_matrix_vector(matrix_columns, vector):
    return linear_combination(vector, *matrix_columns)
```

This gives us our first working version of `multiply_matrix_vector(...)`:

```Python
multiply_matrix_vector(((0, 0, 1), (2, 1, 0), (1, 0, -1)), (3, -2, 5))
# (1, -2, -2)
```

Now for the final step: it wouldn't be a good developer experience if the consumer of `multiply_matrix_vector(...)` had to convert the rows of the matrix into columns. It is much better if we do it ourselves as part of the function implementation:

```Python
def multiply_matrix_vector(matrix, vector):
    return linear_combination(vector, *(zip(*matrix)))
```

or more succinctly:
```Python
def multiply_matrix_vector(matrix, vector):
    return linear_combination(vector, *zip(*matrix))
```


Now we're ready to test the function with hardcoded arguments and variables too:



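In [1]:
# `scale` and `add` come from my_vectors; without this import the cell
# fails with "NameError: name 'scale' is not defined"
from my_vectors import *

B = ((0, 2, 1), (0, 1, 0), (1, 0, -1))
v = (3, -2, 5)

def linear_combination(scalars, *vectors):
    scaled = [scale(s, v) for s,v in zip(scalars, vectors)]
    return add(*scaled)

def multiply_matrix_vector(matrix, vector):
    return linear_combination(vector, *zip(*matrix))

# Hardcoded arguments
print(multiply_matrix_vector(((0, 2, 1), (0, 1, 0), (1, 0, -1)), (3, -2, 5)))

# Variables
print(multiply_matrix_vector(B, v))

(1, -2, -2)
(1, -2, -2)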
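The cell above depends on `scale` and `add` from `my_vectors`, which is not reproduced in this chapter. For readers who want to run the function in isolation, here is a self-contained sketch. The `scale` and `add` definitions below are assumed stand-ins written for this example, not the module's actual code.

```python
def scale(scalar, v):
    # Assumed stand-in: multiply every coordinate of v by the scalar
    return tuple(scalar * coord for coord in v)

def add(*vectors):
    # Assumed stand-in: coordinate-wise sum of any number of vectors
    return tuple(map(sum, zip(*vectors)))

def linear_combination(scalars, *vectors):
    scaled = [scale(s, v) for s, v in zip(scalars, vectors)]
    return add(*scaled)

def multiply_matrix_vector(matrix, vector):
    # zip(*matrix) turns the rows into columns; the outer * passes them one by one
    return linear_combination(vector, *zip(*matrix))

B = ((0, 2, 1), (0, 1, 0), (1, 0, -1))
v = (3, -2, 5)
print(multiply_matrix_vector(B, v))  # (1, -2, -2)
```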

Now that we have a way to do the multiplication programmatically, we can explore mnemonic recipes for multiplying a matrix by a vector.

Consider the following *prototypical* matrix multiplication:

$
\begin{pmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix} = x \cdot \begin{pmatrix}
a \\
d \\
g
\end{pmatrix} + y \cdot \begin{pmatrix}
b \\
e \\
h
\end{pmatrix} + z \cdot \begin{pmatrix}
c \\
f \\
i
\end{pmatrix} = \begin{pmatrix}
a \cdot x + b \cdot y + c \cdot z \\
d \cdot x + e \cdot y + f \cdot z \\
g \cdot x + h \cdot y + i \cdot z
\end{pmatrix}
$

The first recipe is that each coordinate of the output vector is a function of all the coordinates of the input vector:

$
f(x, y, z) = a \cdot x + b \cdot y + c \cdot z
$

Moreover, this is a linear function, in the sense that it is a sum of terms, each of which is a constant times one of the variables.

The second mnemonic recipe presents the same formula as the result of doing *dot products* of the corresponding matrix rows by the column vector:

$
\begin{pmatrix}
a & b & c \\
d & e & f \\
g & h & i
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix} = \begin{pmatrix}
(a, b, c) \cdot (x, y, z) \\
(d, e, f) \cdot (x, y, z) \\
(g, h, i) \cdot (x, y, z)
\end{pmatrix} = \begin{pmatrix}
a \cdot x + b \cdot y + c \cdot z \\
d \cdot x + e \cdot y + f \cdot z \\
g \cdot x + h \cdot y + i \cdot z
\end{pmatrix}
$
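The second recipe translates almost literally into code: one dot product per row of the matrix. The sketch below is an alternative to the column-based implementation shown earlier, not a replacement for it. `dot` is an assumed stand-in for the dot product helper, and `multiply_matrix_vector_by_rows` is a hypothetical name.

```python
def dot(u, v):
    # Assumed stand-in: sum of the products of matching coordinates
    return sum(a * b for a, b in zip(u, v))

def multiply_matrix_vector_by_rows(matrix, vector):
    # Each output coordinate is the dot product of one row with the vector
    return tuple(dot(row, vector) for row in matrix)

B = ((0, 2, 1), (0, 1, 0), (1, 0, -1))
v = (3, -2, 5)
print(multiply_matrix_vector_by_rows(B, v))  # (1, -2, -2)
```

Both recipes compute the same sums, so the result matches the one obtained with `linear_combination(...)`.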

Note that the same mnemonics will apply to 2D transformations:

$
\begin{pmatrix}
j & k \\
l & m
\end{pmatrix}
\begin{pmatrix}
x \\
y
\end{pmatrix} = x \cdot \begin{pmatrix}
j \\
l
\end{pmatrix} + y \cdot \begin{pmatrix}
k \\
m 
\end{pmatrix} = \begin{pmatrix}
(j, k) \cdot (x, y) \\
(l, m) \cdot (x, y)
\end{pmatrix} = \begin{pmatrix}
j \cdot x + k \cdot y \\
l \cdot x + m \cdot y
\end{pmatrix}
$
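As a quick numeric check of the 2D formula, here is a worked example with arbitrarily chosen numbers, computed with both recipes:

$
\begin{pmatrix}
1 & 2 \\
3 & 4
\end{pmatrix}
\begin{pmatrix}
5 \\
6
\end{pmatrix} = 5 \cdot \begin{pmatrix}
1 \\
3
\end{pmatrix} + 6 \cdot \begin{pmatrix}
2 \\
4
\end{pmatrix} = \begin{pmatrix}
(1, 2) \cdot (5, 6) \\
(3, 4) \cdot (5, 6)
\end{pmatrix} = \begin{pmatrix}
1 \cdot 5 + 2 \cdot 6 \\
3 \cdot 5 + 4 \cdot 6
\end{pmatrix} = \begin{pmatrix}
17 \\
39
\end{pmatrix}
$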


### Composing linear transformations by matrix multiplication

We know from the previous chapter that the composition of linear transformations is also a linear transformation. Because any linear transformation can be represented by a matrix, any composition of linear transformations can also be represented by a matrix.

This fact has profound consequences for the computational cost of applying transformations. If you were to apply a large number of transformations to a vector, each of the function calls in `f1(f2(f3(...fn(v))))` would add some overhead, whereas if you could represent $ f_1 \circ f_2 \circ f_3 \circ ... \circ f_n $ by a single matrix, applying it to a vector would take just a handful of simple computations (additions and multiplications).

Let $ A $ and $ B $ be two linear transformations that we want to compose for a given vector $ v = (x, y, z) $. Thus we're interested in $ A(B(v)) $.

$
A = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}, \hspace{1ex}
B = \begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1
\end{pmatrix}
$

In order to compute the composition, we would first apply $ B $ to $ v $, which yields a 3D column vector, and then apply $ A $ to that vector, which yields the final 3D column vector:

$
A(B(v)) = ABv = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1
\end{pmatrix}
\begin{pmatrix}
x \\
y \\
z
\end{pmatrix}
$

So the problem lies in determining what the result of the product $ AB $ should be.

We know that, ultimately, $ AB $ should be a 3x3 matrix, so we are interested in calculating all those question marks:

$
AB = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1
\end{pmatrix} = \begin{pmatrix}
? & ? & ? \\
? & ? & ? \\
? & ? & ?
\end{pmatrix}
$


We also know that each column of a transformation's matrix is the result of applying that transformation to the corresponding vector of the standard basis.

That is:

$
B(e1) = \begin{pmatrix}
0 \\
0 \\
1
\end{pmatrix} \\
B(e2) = \begin{pmatrix}
2 \\
1 \\
0
\end{pmatrix} \\
B(e3) = \begin{pmatrix}
1 \\
0 \\
-1
\end{pmatrix}
$

By the same fact, the first column of the resulting matrix will be $ A(B(e1)) $, the second column will be $ A(B(e2)) $, etc.

Therefore:
$
A(B(e1)) = ABe1 = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
0 \\
0 \\
1
\end{pmatrix} = \begin{pmatrix}
0 \\
1 \\
1
\end{pmatrix}
$

And similarly:
$
A(B(e2)) = ABe2 = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
2 \\
1 \\
0
\end{pmatrix} = \begin{pmatrix}
3 \\
2 \\
1
\end{pmatrix} \\
A(B(e3)) = ABe3 = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
1 \\
0 \\
-1
\end{pmatrix} = \begin{pmatrix}
1 \\
0 \\
0
\end{pmatrix}
$

And fitting all the intermediate results together into a single matrix:

$
AB = \begin{pmatrix}
0 & 3 & 1 \\
1 & 2 & 0 \\
1 & 1 & 0
\end{pmatrix}
$
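One way to gain confidence in this result is to check that applying $ AB $ in one step gives the same vector as applying $ B $ and then $ A $. The sketch below does that for the sample vector $ v = (3, -2, 5) $ used earlier. The helper is a self-contained, row-based stand-in written for this example.

```python
def multiply_matrix_vector(matrix, vector):
    # Row-based stand-in: each output coordinate is row . vector
    return tuple(sum(m * x for m, x in zip(row, vector)) for row in matrix)

A = ((1, 1, 0), (1, 0, 1), (1, -1, 1))
B = ((0, 2, 1), (0, 1, 0), (1, 0, -1))
AB = ((0, 3, 1), (1, 2, 0), (1, 1, 0))  # the matrix computed above

v = (3, -2, 5)
step_by_step = multiply_matrix_vector(A, multiply_matrix_vector(B, v))
in_one_go = multiply_matrix_vector(AB, v)

print(step_by_step)  # (-1, -1, 1)
print(in_one_go)     # (-1, -1, 1)
```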


As a takeaway, we haven't reinvented the wheel. We've just leaned once more on concepts that we already knew to arrive at *matrix multiplication*:

![Getting to matrix multiplication](../images/getting_to_matrix_multiplication.png)

As a mnemonic rule, each entry of the product of two 3x3 matrices is the *dot product* of a row of the first matrix by a column of the second:

$ 
AB = \begin{pmatrix}
1 & 1 & 0 \\
1 & 0 & 1 \\
1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
0 & 2 & 1 \\
0 & 1 & 0 \\
1 & 0 & -1
\end{pmatrix} = \\
\begin{pmatrix}
(1, 1, 0) \cdot (0, 0, 1) & (1, 1, 0) \cdot (2, 1, 0) & (1, 1, 0) \cdot (1, 0, -1) \\
(1, 0, 1) \cdot (0, 0, 1) & (1, 0, 1) \cdot (2, 1, 0) & (1, 0, 1) \cdot (1, 0, -1) \\
(1, -1, 1) \cdot (0, 0, 1) & (1, -1, 1) \cdot (2, 1, 0) & (1, -1, 1) \cdot (1, 0, -1)
\end{pmatrix} = \\
\begin{pmatrix}
0 & 3 & 1 \\
1 & 2 & 0 \\
1 & 1 & 0
\end{pmatrix}
$


That is, each element $ a_{ij} $ of the resulting matrix, with $ i $ being the row index and $ j $ being the column index, is the dot product of the *ith* row vector of $ A $ by the *jth* column vector of $ B $.

![Matrix multiplication](../images/matrix_multiplication.png)
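The rule can also be written compactly. Here $ a_{ik} $ and $ b_{kj} $ denote the entries of $ A $ and $ B $ (a naming convention assumed for this formula only), and $ k $ runs over the columns of $ A $, which are as many as the rows of $ B $:

$
(AB)_{ij} = \sum_{k} a_{ik} \cdot b_{kj}
$

For instance, the entry in the first row and second column of the example above is:

$
(1, 1, 0) \cdot (2, 1, 0) = 1 \cdot 2 + 1 \cdot 1 + 0 \cdot 0 = 3
$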

Note that the same rule applies to matrices of lower dimension:

$
\begin{pmatrix}
1 & 2 \\
3 & 4 
\end{pmatrix}
\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix} = \begin{pmatrix}
2 & -1 \\
4 & -3
\end{pmatrix}
$
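Earlier we noted that order matters when multiplying matrices. The two 2x2 matrices above make a convenient check: swapping the factors gives a different product. The sketch below uses a compact, self-contained `matrix_multiply` written for this example, ahead of the step-by-step implementation developed in the next section.

```python
def matrix_multiply(a, b):
    # Entry (i, j) is the dot product of row i of a with column j of b
    return tuple(
        tuple(sum(x * y for x, y in zip(row, col)) for col in zip(*b))
        for row in a
    )

c = ((1, 2), (3, 4))
d = ((0, -1), (1, 0))

print(matrix_multiply(c, d))  # ((2, -1), (4, -3))
print(matrix_multiply(d, c))  # ((-3, -4), (1, 2))
```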

### Implementing matrix multiplication

Learning how to do matrix multiplication by hand is fine for educational purposes, but the ultimate goal is to define a function that will do that tedious work for us.

In Python, the fact that each element is the dot product of the corresponding row and column of the given matrices lends itself well to this calculation.

Again, this will require a staged approach.

Let's denote the elements of the resulting matrix as `a00`, `a01`, etc.

$
AB = \begin{pmatrix}
a_{00} & a_{01} & a_{02} \\
a_{10} & a_{11} & a_{12} \\
a_{20} & a_{21} & a_{22}
\end{pmatrix}
$

We know that $ a_{ij} $ will be the result of the dot product of row $ i $ from $ A $ by column $ j $ from $ B $. We also know that we can get the column vectors of `b` by doing `zip(*b)`.

Thus, the first brute force approach for the `matrix_multiply(...)` function will be:


In [14]:
from my_vectors import *

a = (
    (1, 1, 0),
    (1, 0, 1),
    (1, -1, 1)
)
b = (
    (0, 2, 1),
    (0, 1, 0),
    (1, 0, -1)
)

def matrix_multiply(a, b):
    b_cols = tuple(zip(*b))
    a00 = dot(a[0], b_cols[0])
    a01 = dot(a[0], b_cols[1])
    a02 = dot(a[0], b_cols[2])
    a10 = dot(a[1], b_cols[0])
    a11 = dot(a[1], b_cols[1])
    a12 = dot(a[1], b_cols[2])
    a20 = dot(a[2], b_cols[0])
    a21 = dot(a[2], b_cols[1])
    a22 = dot(a[2], b_cols[2])        
    return (
        (a00, a01, a02),
        (a10, a11, a12),
        (a20, a21, a22)
    )


matrix_multiply(a, b)

((0, 3, 1), (1, 2, 0), (1, 1, 0))

Obviously, that's far from an ideal implementation, but it gives us some hints about the kind of loops that will be needed &mdash; we will need an inner iteration over cols of b, and an outer iteration over the rows of a.

This second implementation focuses on the inner iteration:

In [17]:
from my_vectors import *

a = (
    (1, 1, 0),
    (1, 0, 1),
    (1, -1, 1)
)
b = (
    (0, 2, 1),
    (0, 1, 0),
    (1, 0, -1)
)

def matrix_multiply(a, b):
    result_first_row = tuple(dot(a[0], col_from_b) for col_from_b in zip(*b))
    result_second_row = tuple(dot(a[1], col_from_b) for col_from_b in zip(*b))
    result_third_row = tuple(dot(a[2], col_from_b) for col_from_b in zip(*b))
    return (result_first_row, result_second_row, result_third_row)

matrix_multiply(a, b)

((0, 3, 1), (1, 2, 0), (1, 1, 0))

We're really close to our final implementation, we just need to add the outer loop, so that no variable definition is needed:

In [19]:
from my_vectors import *

a = (
    (1, 1, 0),
    (1, 0, 1),
    (1, -1, 1)
)
b = (
    (0, 2, 1),
    (0, 1, 0),
    (1, 0, -1)
)

def matrix_multiply(a, b):
    return tuple(
        tuple(dot(row_from_a, col_from_b) for col_from_b in zip(*b)) 
        for row_from_a in a
    )

matrix_multiply(a, b)

((0, 3, 1), (1, 2, 0), (1, 1, 0))

Note that this function is not limited to 3x3 matrices. For example, it also works for 2x2 matrices:

In [20]:
from my_vectors import *

c = (
    (1, 2),
    (3, 4)
)
d = (
    (0, -1),
    (1, 0)
)

def matrix_multiply(a, b):
    return tuple(
        tuple(dot(row_from_a, col_from_b) for col_from_b in zip(*b)) 
        for row_from_a in a
    )

matrix_multiply(c, d)

((2, -1), (4, -3))
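The same row-by-column rule also applies to non-square matrices, provided each row of the first matrix has as many entries as the second matrix has rows. The following is a sketch under that assumption. `matrix_multiply_checked` is a hypothetical variant that makes the size requirement explicit, and it computes the dot products inline instead of importing `dot` from `my_vectors`.

```python
def matrix_multiply_checked(a, b):
    # Every row of a must pair up with a full column of b
    if any(len(row) != len(b) for row in a):
        raise ValueError("each row of a must have as many entries as b has rows")
    return tuple(
        tuple(sum(x * y for x, y in zip(row, col)) for col in zip(*b))
        for row in a
    )

p = ((1, 2, 3), (4, 5, 6))    # 2 rows, 3 columns
q = ((1, 0), (0, 1), (1, 1))  # 3 rows, 2 columns

print(matrix_multiply_checked(p, q))  # ((4, 5), (10, 11))
```

Without the explicit check, mismatched sizes could go unnoticed if the underlying dot product relies on `zip`, which stops at the shortest input.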

### 3D animation with matrix transformations

To animate a 3D model, we redraw a transformed version of the original model in each frame. To make the model appear to move or change over time, we need to use different transformations as time progresses. If these transformations are linear transformations specified by matrices, we will need a new matrix for every new frame of the animation.

Because *PyGame*'s built-in clock keeps track of time in milliseconds, we can generate matrices whose entries depend on time. That is, we can think of every matrix as a function that takes the current time and returns a matrix for that given time:

$
t \rightarrow \begin{pmatrix}
a_{00}(t) & a_{01}(t) & a_{02}(t) \\
a_{10}(t) & a_{11}(t) & a_{12}(t) \\
a_{20}(t) & a_{21}(t) & a_{22}(t)
\end{pmatrix}
$

The following matrix is an example of such a time-dependent transformation:

$
\begin{pmatrix}
\cos(t) & 0 & -\sin(t) \\
0 & 1 & 0 \\
\sin(t) & 0 & \cos(t)
\end{pmatrix}
$

Let's use this transformation when drawing the teapot.
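Before wiring this into the drawing code, here is a minimal sketch of the matrix above as a Python function of time. The names `get_rotation_matrix` and `matrix_at` are hypothetical, and treating $ t $ as seconds (so that one second corresponds to one radian) is an assumption made for this example.

```python
from math import cos, sin

def get_rotation_matrix(t):
    """Return the time-dependent matrix shown above, evaluated at time t."""
    return (
        (cos(t), 0, -sin(t)),
        (0, 1, 0),
        (sin(t), 0, cos(t)),
    )

def matrix_at(milliseconds):
    # Assumed convention: convert a millisecond clock reading to seconds
    return get_rotation_matrix(milliseconds / 1000)

# At t = 0 the matrix leaves every vector unchanged
print(get_rotation_matrix(0))
```

A new matrix would then be computed once per frame from the current clock reading and applied to every vertex of the model.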