### Introduction

In the last lesson, we saw that we can represent our features as a matrix, and our target variables as a vector.

| ad spending | price | t-shirts |
| ----------- | ----- | :------: |
| 800         | 13    | 330      |
| 1500        | 11    | 780      |
| 2000        | 9     | 1130     |
| 3500        | 10    | 1310     |
| 4000        | 8     | 1780     |

$A = \begin{pmatrix}
    800 & 13 \\
    1500 & 11 \\
    2000 & 9 \\
    3500 & 10 \\ 
    4000 & 8 \\ 
\end{pmatrix}, \quad b =  \begin{pmatrix}
330   \\
780 \\
1130 \\
1310 \\
1780 \\
\end{pmatrix}$ 

Remember that when we perform linear regression, we multiply our features by coefficients.  This is how we make predictions.  Below we let our coefficients equal $.35$ and $-12$.  This means that we multiply ad spending by $.35$ and add $-12$ times the price to predict the number of T-shirts sold.

$$800*.35 + 13*-12 $$

$$1500*.35 + 11*-12 $$

$$2000*.35 + 9*-12 $$

$$3500*.35 + 10*-12 $$

$$4000*.35 + 8*-12 $$
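The five expressions above can be evaluated directly. The following is a minimal sketch in plain Python; the names `observations`, `ad_coef`, `price_coef`, and `predictions` are chosen here for illustration and do not come from the lesson.

```python
# Feature rows (ad spending, price) copied from the table above.
observations = [
    (800, 13),
    (1500, 11),
    (2000, 9),
    (3500, 10),
    (4000, 8),
]

# Coefficients from the text: .35 for ad spending, -12 for price.
# (Variable names are illustrative assumptions.)
ad_coef, price_coef = 0.35, -12

# One prediction per row: ad_spending * .35 + price * -12.
predictions = [ad * ad_coef + price * price_coef for ad, price in observations]

for (ad, price), predicted in zip(observations, predictions):
    print(f"{ad}*.35 + {price}*-12 = {predicted:.0f}")
```

Each printed line corresponds to one of the expressions above, evaluated for one row of the table.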

### Multiplication with matrices

We would like to translate the system of equations to use matrices.  This is our system of equations.

$$800*.35 + 13*-12 $$

$$1500*.35 + 11*-12 $$

$$2000*.35 + 9*-12 $$

$$3500*.35 + 10*-12 $$

$$4000*.35 + 8*-12 $$

And we can represent our feature variables as the following matrix.

$A = \begin{pmatrix}
    800 & 13 \\
    1500 & 11 \\
    2000 & 9 \\
    3500 & 10 \\ 
    4000 & 8 \\ 
\end{pmatrix} $ 

Now, what we would like to do is multiply the first column of matrix $A$ by $.35$, and the second column by $-12$.  We can accomplish this as follows.  First we create a new vector to represent our coefficients, $.35$ and $-12$.

$x = \begin{pmatrix}
    .35 \\
    -12 \\
\end{pmatrix} $

Then we multiply our matrix $A$ by this column vector.  The vector has one entry for each column of $A$, which is what makes the product defined.

$\begin{pmatrix}
    800 & 13 \\
    1500 & 11 \\
    2000 & 9 \\
    3500 & 10 \\ 
    4000 & 8 \\ 
\end{pmatrix} \cdot \begin{pmatrix}
    .35 \\
    -12 \\
\end{pmatrix} $

Doing so is equivalent to the following:

$.35* \begin{pmatrix}
    800  \\
    1500  \\
    2000 \\
    3500 \\ 
    4000 \\ 
\end{pmatrix} + -12* \begin{pmatrix}
     13 \\
     11 \\
     9 \\
     10 \\ 
     8 \\ 
\end{pmatrix} $



Which is precisely what we want.  We want to multiply our first vector of feature variables by our first coefficient, and our second vector of feature variables by the second.
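To connect this column-by-column view back to the row-by-row expressions from earlier, here is a small sketch, assuming NumPy is available. The names `column_view` and `row_view` are illustrative and not part of the lesson.

```python
import numpy as np

A = np.array([
    [800, 13],
    [1500, 11],
    [2000, 9],
    [3500, 10],
    [4000, 8],
])
x = np.array([0.35, -12])

# Column view (as in the text): scale each column by its coefficient, then add.
column_view = x[0] * A[:, 0] + x[1] * A[:, 1]

# Row view (the expressions from earlier): one calculation per row of A.
row_view = np.array([ad * x[0] + price * x[1] for ad, price in A])

# A has 5 rows and 2 columns, x has one entry per column,
# and the result has one entry per row.
print(A.shape, x.shape, column_view.shape)  # (5, 2) (2,) (5,)
print(np.allclose(column_view, row_view))   # True
```

Both views produce the same five predictions; they only differ in the order in which the multiplications and additions are grouped.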

### Proving it with code

Let's walk through these steps with code.  

In [36]:
import numpy as np
x = np.array([.35, -12])
A = np.array([
    [800, 13],
    [1500, 11],
    [2000, 9],
    [3500, 10],
    [4000, 8],
])

In [37]:
A.dot(x)

array([ 124.,  393.,  592., 1105., 1304.])

If we wish to break this down, we can do so with the following:

In [38]:
first_column = A[:, 0] 
first_column

array([ 800, 1500, 2000, 3500, 4000])

In [39]:
second_column = A[:, 1]
second_column

array([13, 11,  9, 10,  8])

In [40]:
x[0]*first_column + x[1]*second_column

array([ 124.,  393.,  592., 1105., 1304.])

Which is precisely what we calculated before.  

Let's see all of the steps broken down.

$\begin{pmatrix}
    800 & 13 \\
    1500 & 11 \\
    2000 & 9 \\
    3500 & 10 \\ 
    4000 & 8 \\ 
\end{pmatrix} \cdot \begin{pmatrix}
    .35 \\
    -12 \\
\end{pmatrix}  = .35* \begin{pmatrix}
    800  \\
    1500  \\
    2000 \\
    3500 \\ 
    4000 \\ 
\end{pmatrix} + -12* \begin{pmatrix}
     13 \\
     11 \\
     9 \\
     10 \\ 
     8 \\ 
\end{pmatrix} = \begin{pmatrix}
     280 \\
     525 \\
     700 \\
     1225 \\ 
     1400 \\ 
\end{pmatrix} + \begin{pmatrix}
     -156 \\
     -132 \\
     -108 \\
     -120 \\ 
     -96 \\ 
\end{pmatrix} =  \begin{pmatrix}
     124 \\
     393 \\
     592 \\
     1105 \\ 
     1304 \\ 
\end{pmatrix}  $

So matrix-vector multiplication is really just a combination of what we learned before: first scaling each vector, and then adding the two vectors.
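The same scale-then-add idea extends to a matrix with any number of columns. The sketch below is one way to write it, assuming NumPy; the helper `linear_combination_of_columns` is a hypothetical name introduced here, not a NumPy function or part of the lesson.

```python
import numpy as np

def linear_combination_of_columns(matrix, coefficients):
    """Scale each column of `matrix` by its coefficient and add the results.

    Hypothetical helper for illustration only.
    """
    matrix = np.asarray(matrix, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    if matrix.shape[1] != coefficients.shape[0]:
        raise ValueError("need exactly one coefficient per column")
    result = np.zeros(matrix.shape[0])
    for j, coefficient in enumerate(coefficients):
        # Scale column j, then accumulate it into the running sum.
        result += coefficient * matrix[:, j]
    return result

A = np.array([
    [800, 13],
    [1500, 11],
    [2000, 9],
    [3500, 10],
    [4000, 8],
])
x = np.array([0.35, -12])

combined = linear_combination_of_columns(A, x)
print(np.allclose(combined, A @ x))  # True
```

The loop makes the two steps explicit, while `A @ x` (equivalent to `A.dot(x)` here) performs them in a single call.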

### For the visual mind

Visually and conceptually, matrix-vector multiplication is essentially equivalent to what we learned before.  To keep the picture two-dimensional, let's take just the first two rows of our data.

|ad spending    | price|
| ------------- |------|
|    800        | 13  | 
|    1500        |11 | 

We again have coefficients of $.35$ and $-12$.

When we perform matrix-vector multiplication, we are really just scaling two vectors and then adding them together.

$.35* \begin{pmatrix}
    800  \\
    1500  \\
\end{pmatrix} + -12* \begin{pmatrix}
     13 \\
     11 \\
\end{pmatrix} $

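Visually, this looks like the following (the helper functions `vector_trace` and `plus_trace` are defined in the last cell of this section):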

In [48]:
import numpy as np
a_1 = np.array([800, 1500])
a_2 = np.array([13, 11])

# Trace each scaled vector from the origin.
ad_spending_trace = vector_trace(.35*a_1)
price_trace = vector_trace(-12*a_2)

In [51]:
plus_trace(.35*a_1,-12*a_2)

{'x': [280.0, 124.0],
 'y': [525.0, 393.0],
 'mode': 'lines+markers',
 'name': '',
 'text': []}

In [50]:
from graph import trace_values

def vector_trace(vector, name = '', text = ''):
    # Build a trace for a vector drawn from the origin.
    x_coord = vector[0]
    if len(vector) == 1:
        # One-dimensional vector: draw it along the x-axis and plot it right away.
        # Note: `plot` is not imported in this cell, so it must already be in scope.
        trace = {'x': [0, x_coord], 'y': [0, 0], 'mode': 'lines+markers', 'name': name, 'text': text}
        layout = {'xaxis': {'range': [0, 4]},
                  'yaxis': dict(range=[-.5, .5],
                                showgrid=False,
                                zeroline=True,
                                showline=False,
                                ticks='',
                                showticklabels=False)}
        return plot([trace], layout)
    else:
        # Two-dimensional vector: return the trace from (0, 0) to (x, y).
        y_coord = vector[1]
        return {'x': [0, x_coord], 'y': [0, y_coord], 'mode': 'lines+markers', 'name': name, 'text': text}

def plus_trace(first_array, second_array, name = ''):
    # Draw the second vector starting at the tip of the first,
    # so that it ends at the tip of their sum.
    added = first_array + second_array
    first_added = added[0]
    second_added = added[1]
    second_vector_x = [first_array[0], first_added]
    second_vector_y = [first_array[1], second_added]
    return trace_values(second_vector_x, second_vector_y, mode = 'lines+markers', name = name)
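
The helpers above depend on the lesson's `graph` module. As a rough, self-contained sketch of the same tip-to-tail idea, the following computes only the coordinates involved, assuming NumPy. The function `tip_to_tail_points` is a hypothetical name introduced for illustration.

```python
import numpy as np

def tip_to_tail_points(first_vector, second_vector):
    """Return the points visited when drawing first_vector from the origin,
    then second_vector starting at the tip of first_vector.

    Hypothetical helper for illustration only.
    """
    first_vector = np.asarray(first_vector, dtype=float)
    second_vector = np.asarray(second_vector, dtype=float)
    origin = np.zeros_like(first_vector)
    first_tip = origin + first_vector      # end of the first scaled vector
    final_tip = first_tip + second_vector  # end of the sum of both vectors
    return origin, first_tip, final_tip

a_1 = np.array([800, 1500])
a_2 = np.array([13, 11])

origin, first_tip, final_tip = tip_to_tail_points(.35 * a_1, -12 * a_2)
print(first_tip)  # tip of .35 * a_1
print(final_tip)  # tip of .35 * a_1 + (-12) * a_2
```

The second segment runs from `first_tip` to `final_tip`, which matches the `x` and `y` pairs in the `plus_trace` output shown earlier.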