In [115]:
import numpy as np

## Matrix Operations

I'll show you the old way first, and then the new way.  You used to need the .dot() method for matrix multiplication, and you'll still see it in older code.  The name might be confusing, but you can think of matrix multiplication as a set of dot products between the rows of the first matrix and the columns of the second.

In [116]:
a = np.array([
    [1,2,3],
    [4,5,6]
])

b = np.array([
    [1, 1],
    [0, 1],
    [2, 2]
])

In [117]:
a.shape, b.shape

((2, 3), (3, 2))

In [118]:
a.dot(b)

array([[ 7,  9],
       [16, 21]])

In [119]:
b.dot(a)

array([[ 5,  7,  9],
       [ 4,  5,  6],
       [10, 14, 18]])
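
To make the "set of dot products" idea concrete, here is a small sketch that rebuilds `a.dot(b)` one entry at a time. The name `manual` is only an illustration and is not part of the original notebook.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 1],
              [0, 1],
              [2, 2]])

# Entry (i, j) of the product is the dot product of
# row i of a with column j of b.
manual = np.zeros((a.shape[0], b.shape[1]), dtype=int)
for i in range(a.shape[0]):
    for j in range(b.shape[1]):
        manual[i, j] = np.dot(a[i, :], b[:, j])

print(manual)
print(np.array_equal(manual, a.dot(b)))
```

The loop is far slower than `.dot()` on large arrays; it is only meant to show where each number in the result comes from.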

Now we have an infix operator, the @ symbol, which makes this a lot nicer.

In [120]:
a @ b

array([[ 7,  9],
       [16, 21]])

In [121]:
b @ a

array([[ 5,  7,  9],
       [ 4,  5,  6],
       [10, 14, 18]])
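
As a quick sanity check, the sketch below compares `@` with `.dot()` and with `np.matmul` for these 2-D arrays, and contrasts it with `*`, which multiplies element by element and is easy to confuse with matrix multiplication. The variable names are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 1],
              [0, 1],
              [2, 2]])

# For 2-D arrays, these three spellings give the same product.
same_as_dot = np.array_equal(a @ b, a.dot(b))
same_as_matmul = np.array_equal(a @ b, np.matmul(a, b))
print(same_as_dot, same_as_matmul)

# The * operator is elementwise, so it needs matching (or broadcastable)
# shapes and does not compute a matrix product.
elementwise = a * a
print(elementwise)
```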

What about vectors?  Matrices always have two dimensions, but there's more than one way to store a vector.  If you pass a flat list of numbers to the array constructor, you get what we call a one-dimensional array.

In [122]:
v = np.array([1,2])

In [123]:
v.shape

(2,)

If you've studied linear algebra, you might wonder whether this is a row vector or a column vector.  It's neither one!  It has one dimension, and NumPy might interpret it as a row or as a column, depending on the operation.  If you want to specify one or the other, you have to make it two-dimensional.

In [124]:
v_row = v.reshape(1,2)
v_row

array([[1, 2]])

We can use -1 for one argument to reshape, and NumPy will figure out what that number needs to be.

In [125]:
v_row = v.reshape(1,-1)
v_row

array([[1, 2]])
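
Here is a small sketch of how the -1 gets filled in, using a hypothetical six-element array `x` that is not part of the notebook. NumPy divides the total number of elements by the sizes you did specify, and it raises an error if that division does not come out evenly.

```python
import numpy as np

x = np.arange(6)  # six elements: 0, 1, 2, 3, 4, 5

# The -1 is replaced by whatever size makes the element count work out.
print(x.reshape(2, -1).shape)  # 6 / 2 = 3 columns
print(x.reshape(-1, 2).shape)  # 6 / 2 = 3 rows
print(x.reshape(1, -1).shape)  # a single row holding all six elements

# Six elements cannot be split into 4 equal rows, so this fails.
try:
    x.reshape(4, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```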

In [126]:
v_col = v.reshape(-1,1)
v_col

array([[1],
       [2]])

This is pretty standard: weights usually go into a column vector.

If we have a matrix of width 2, we can multiply it by this column vector.

In [127]:
b.shape, v_col.shape

((3, 2), (2, 1))

In [128]:
b @ v_col

array([[3],
       [2],
       [6]])

Notice the result is a column vector.

But the row vector won't work, because the inner dimensions (2 and 1) don't match.

In [130]:
b @ v_row

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 2)
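
The rule behind that error can be sketched in a few lines: for two-dimensional arrays, the number of columns on the left must equal the number of rows on the right. The helper `can_multiply` below is a hypothetical function written for this illustration, not something NumPy provides.

```python
import numpy as np

b = np.array([[1, 1],
              [0, 1],
              [2, 2]])
v_row = np.array([1, 2]).reshape(1, -1)

def can_multiply(left, right):
    """Return True when the inner dimensions of two 2-D arrays agree."""
    return left.shape[1] == right.shape[0]

print(can_multiply(b, v_row))    # (3, 2) and (1, 2): inner sizes 2 and 1
print(can_multiply(b, v_row.T))  # (3, 2) and (2, 1): inner sizes 2 and 2

# Transposing the row vector turns it into a column, so the product works.
result = b @ v_row.T
print(result)
```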

It turns out that NumPy will also let us use the one-dimensional vector in this multiplication.

In [131]:
b @ v

array([3, 2, 6])

The result is a one-dimensional vector.  What happens is that NumPy automatically adds an extra dimension to v, and after the multiplication, it takes the extra dimension away again.

NumPy always adds the dimension on the outside of the multiplication.  In the previous example, v came second, so its shape became (2,1).  If v comes first, its shape becomes (1,2).

In [132]:
v.shape, a.shape

((2,), (2, 3))

In [133]:
v @ a

array([ 9, 12, 15])
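
The promotion described above can be imitated by hand. This sketch reshapes `v` explicitly, multiplies, and then flattens the result with `ravel()`, which should match what `@` gives for the one-dimensional vector. The names `right_explicit` and `left_explicit` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[1, 1],
              [0, 1],
              [2, 2]])
v = np.array([1, 2])

# v on the right: treat it as a (2, 1) column, multiply, then flatten.
right_explicit = (b @ v.reshape(-1, 1)).ravel()

# v on the left: treat it as a (1, 2) row, multiply, then flatten.
left_explicit = (v.reshape(1, -1) @ a).ravel()

print(np.array_equal(b @ v, right_explicit))
print(np.array_equal(v @ a, left_explicit))
```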

Most of the time, even if you're not paying close attention, this process works and gives you the result that you're looking for.  If you're a careful person, you should turn your one-dimensional vectors into column vectors or row vectors, so that you know exactly which multiplication is happening.

## A Linear Model

Just for fun, let's do something useful with these matrix operations.  I put some data from a made-up experiment in a file, experiment.csv.

In [134]:
!cat experiment.csv

caffeine_mg, sleep_hours, score
2, 8.2, 84
32, 7.6, 90
15, 7.1, 92
14, 5.5, 79
18, 6.8, 93

The caffeine and sleep variables are our inputs, which we label X, and the score is the output Y, which we'd like to predict.  Let's get these numbers into numpy arrays.

Even though it's a future topic, I'm going to use the read_csv function in the pandas library.

In [135]:
import pandas as pd

sleep_df= pd.read_csv("experiment.csv")

In [136]:
sleep_df

|   | caffeine_mg | sleep_hours | score |
|---|-------------|-------------|-------|
| 0 | 2           | 8.2         | 84    |
| 1 | 32          | 7.6         | 90    |
| 2 | 15          | 7.1         | 92    |
| 3 | 14          | 5.5         | 79    |
| 4 | 18          | 6.8         | 93    |


In [137]:
data = sleep_df.values
data 

array([[ 2. ,  8.2, 84. ],
       [32. ,  7.6, 90. ],
       [15. ,  7.1, 92. ],
       [14. ,  5.5, 79. ],
       [18. ,  6.8, 93. ]])

We have to build the design matrix from this array: the input columns, plus a leading column of ones for the intercept.  Here's how you can do it.

In [110]:
# Keep the two input columns (caffeine_mg, sleep_hours)
X = data[:,:2]
# Prepend a column of ones for the intercept term
X = np.hstack((np.ones((X.shape[0],1)),X))
X

array([[ 1. ,  2. ,  8.2],
       [ 1. , 32. ,  7.6],
       [ 1. , 15. ,  7.1],
       [ 1. , 14. ,  5.5],
       [ 1. , 18. ,  6.8]])
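
As an alternative sketch, `np.column_stack` accepts a one-dimensional array of ones and turns it into a column for you, which avoids the explicit `(n, 1)` shape. The names `X_alt` and `X_hstack` are illustrative, and the data is typed in by hand here so the block runs on its own.

```python
import numpy as np

# Same values as the data array above, entered by hand.
data = np.array([[ 2. ,  8.2, 84. ],
                 [32. ,  7.6, 90. ],
                 [15. ,  7.1, 92. ],
                 [14. ,  5.5, 79. ],
                 [18. ,  6.8, 93. ]])

n_rows = data.shape[0]

# column_stack treats the 1-D array of ones as a single column.
X_alt = np.column_stack((np.ones(n_rows), data[:, :2]))

# The hstack version needs the ones to be 2-D already.
X_hstack = np.hstack((np.ones((n_rows, 1)), data[:, :2]))

print(X_alt)
print(np.array_equal(X_alt, X_hstack))
```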

In [111]:
Y = data[:, -1:]

In [112]:
Y

array([[84.],
       [90.],
       [92.],
       [79.],
       [93.]])

What if we had a linear model for this data?

$$\widehat{\text{score}} = 60 + .2 \text{caffeine_mg} + 2.5 \text{sleep_hours}$$

We'd have to put the weights into a column vector.

In [138]:
beta = np.array([60, .2, 2.5]).reshape((-1,1))

Now we can create our predictions by multiplying the design matrix X by beta.

In [140]:
predicted_scores = X @ beta
predicted_scores

array([[80.9 ],
       [85.4 ],
       [80.75],
       [76.55],
       [80.6 ]])
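
To see that the matrix product really is the model formula applied to every row, this sketch computes the first prediction by hand and then the residuals (actual score minus predicted score). The names `first_by_hand` and `residuals` are illustrative and do not appear in the original notebook.

```python
import numpy as np

X = np.array([[ 1. ,  2. ,  8.2],
              [ 1. , 32. ,  7.6],
              [ 1. , 15. ,  7.1],
              [ 1. , 14. ,  5.5],
              [ 1. , 18. ,  6.8]])
Y = np.array([[84.], [90.], [92.], [79.], [93.]])
beta = np.array([60, .2, 2.5]).reshape((-1, 1))

predicted_scores = X @ beta

# First row by hand: intercept + 0.2 * caffeine_mg + 2.5 * sleep_hours
first_by_hand = 60 + 0.2 * 2 + 2.5 * 8.2
print(first_by_hand, predicted_scores[0, 0])

# Residuals show how far each made-up prediction is from the real score.
residuals = Y - predicted_scores
print(residuals)
```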

In this case, we just made up the model weights, so these predicted scores probably aren't very good.  A good next question, then, is how we can make these weights better.
Can we make them optimal in some way?
You might realize that that's exactly what a linear regression is designed to do.

We won't get into the theory of linear regression, but just to show you the power of numpy, here's the formula for computing the weights for an ordinary least squares regression:

$$\hat \beta = (X^T X)^{-1}X^T Y$$

We can translate that into an elegant line of numpy code.

In [113]:
np.linalg.inv(X.T @ X) @ X.T @ Y

array([[64.86880758],
       [ 0.27072887],
       [ 2.60587851]])

As a check, we can fit the same model with the OLS class in the statsmodels library, which returns the same weights.

In [114]:
import statsmodels.api as sm

mod = sm.OLS(Y, X)
res = mod.fit()
res.params

array([64.86880758,  0.27072887,  2.60587851])
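
The explicit inverse mirrors the textbook formula, but it is not the only way to get these weights in numpy. The sketch below, which is an addition and not part of the original notebook, compares it with `np.linalg.solve` on the normal equations and with `np.linalg.lstsq`, which solves the least squares problem directly. These are generally preferred over forming an inverse for numerical reasons, although on a tiny dataset like this all three should agree.

```python
import numpy as np

X = np.array([[ 1. ,  2. ,  8.2],
              [ 1. , 32. ,  7.6],
              [ 1. , 15. ,  7.1],
              [ 1. , 14. ,  5.5],
              [ 1. , 18. ,  6.8]])
Y = np.array([[84.], [90.], [92.], [79.], [93.]])

# Textbook formula with an explicit inverse.
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ Y

# Solve (X^T X) beta = X^T Y without forming the inverse.
beta_solve = np.linalg.solve(X.T @ X, X.T @ Y)

# Let numpy solve the least squares problem directly.
beta_lstsq, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)

print(np.allclose(beta_inv, beta_solve))
print(np.allclose(beta_inv, beta_lstsq))
```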