# Vectors and matrices

This supplements the notes at https://people.duke.edu/~ccc14/sta-663-2018/notebooks/S03_Numpy_Annotated.html to go over more challenging concepts.

In [192]:
import numpy as np

## Vectors

In mathematics, the default is to use a column vector $x$. To denote a row vector, we use $x^T$.

In [193]:
x = np.arange(1, 5).reshape(-1,1)
x

array([[1],
       [2],
       [3],
       [4]])

In [194]:
y = np.arange(4, 0, -1).reshape(-1,1)
y

array([[4],
       [3],
       [2],
       [1]])

## Simple operations on vectors

### Element-wise operations

In [195]:
x + y

array([[5],
       [5],
       [5],
       [5]])

In [196]:
x * y

array([[4],
       [6],
       [6],
       [4]])

In [197]:
x / y

array([[0.25      ],
       [0.66666667],
       [1.5       ],
       [4.        ]])

In [198]:
x ** y

array([[1],
       [8],
       [9],
       [4]])

### Universal functions

In [199]:
x**2

array([[ 1],
       [ 4],
       [ 9],
       [16]])

In [200]:
np.log(x)

array([[0.        ],
       [0.69314718],
       [1.09861229],
       [1.38629436]])

In [201]:
np.cos(x)

array([[ 0.54030231],
       [-0.41614684],
       [-0.9899925 ],
       [-0.65364362]])

### Vector reductions

In [202]:
x.sum()

10

In [203]:
x.mean()

2.5

In [204]:
x.max()

4

### Transpose

In [205]:
x.transpose()

array([[1, 2, 3, 4]])

In [206]:
x.T

array([[1, 2, 3, 4]])

## Vector multiplication


Standard inner or dot product

In [207]:
x.T.dot(x)

array([[30]])

In [208]:
x.T @ x

array([[30]])

If you just want the scalar result

In [209]:
(x.T @ x).item()

30

Contrast with the outer product

In [210]:
x @ x.T

array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12],
       [ 4,  8, 12, 16]])

In [211]:
np.squeeze(x.T @ x)

array(30)
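
The examples above use 2-D column vectors, which is why the inner product comes back as a 1-by-1 array. As a small sketch for comparison, the same products on a 1-D array give a scalar directly, and `np.outer` gives the outer product. The names `v`, `inner` and `outer` are introduced here only for illustration.

```python
import numpy as np

v = np.arange(1, 5)        # 1-D array with shape (4,)

inner = v @ v              # inner product of two 1-D arrays is a scalar
outer = np.outer(v, v)     # outer product with shape (4, 4)

print(inner)
print(outer.shape)
```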

## Matrices

### Basically 2D arrays

In [212]:
A = np.arange(1, 13).reshape(3,4)

In [213]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [214]:
A.shape

(3, 4)

### Most things work just like for vectors

In [215]:
A + A

array([[ 2,  4,  6,  8],
       [10, 12, 14, 16],
       [18, 20, 22, 24]])

In [216]:
A**2

array([[  1,   4,   9,  16],
       [ 25,  36,  49,  64],
       [ 81, 100, 121, 144]])

In [217]:
np.log(A)

array([[0.        , 0.69314718, 1.09861229, 1.38629436],
       [1.60943791, 1.79175947, 1.94591015, 2.07944154],
       [2.19722458, 2.30258509, 2.39789527, 2.48490665]])

In [218]:
A.T

array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]])

## Matrix multiplication

In [219]:
A @ A.T

array([[ 30,  70, 110],
       [ 70, 174, 278],
       [110, 278, 446]])

In [220]:
A.T @ A

array([[107, 122, 137, 152],
       [122, 140, 158, 176],
       [137, 158, 179, 200],
       [152, 176, 200, 224]])

### Matrix vector multiplication

With column vectors

In [221]:
A @ x

array([[ 30],
       [ 70],
       [110]])

With row vectors

In [222]:
x.T @ A.T

array([[ 30,  70, 110]])

## Extracting vectors from matrices

In [223]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

### Get column vectors

In [224]:
c0 = np.array([1,0,0,0]).reshape(-1, 1)
c0

array([[1],
       [0],
       [0],
       [0]])

In [225]:
c2 = np.array([0,0,1,0]).reshape(-1, 1)
c2

array([[0],
       [0],
       [1],
       [0]])

In [226]:
A @ c0

array([[1],
       [5],
       [9]])

In [227]:
A @ c2

array([[ 3],
       [ 7],
       [11]])

### So multiplying a matrix with a column vector gives a weighted sum of the matrix columns

In [228]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [229]:
x

array([[1],
       [2],
       [3],
       [4]])

In [230]:
A @ x

array([[ 30],
       [ 70],
       [110]])

Explain how we got the result above.
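
One way to see where the result comes from, as a small sketch: each entry of `x` acts as the weight on the matching column of `A`, and the weighted columns are added up. The name `weighted_sum` is introduced here only for illustration.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)
x = np.arange(1, 5).reshape(-1, 1)

# Weight column j of A by x[j] and add the weighted columns together.
# A[:, [j]] keeps the column as a (3, 1) array.
weighted_sum = sum(x[j, 0] * A[:, [j]] for j in range(A.shape[1]))

print(weighted_sum)
print(np.array_equal(weighted_sum, A @ x))
```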

Similar things can be done with rows.

In [231]:
r2 = np.array([0,0,1])
r2

array([0, 0, 1])

In [232]:
r2 @ A

array([ 9, 10, 11, 12])

Multiplying two matrices can be seen as generating the weighted sums column by column (or row by row)

In [233]:
B = np.array([
    [1,2,3],
    [1,2,3],
    [1,2,3],
    [1,2,3]
])
B

array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

In [234]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Column interpretation

In [235]:
A @ B[:, 0].reshape(-1,1)

array([[10],
       [26],
       [42]])

In [236]:
A @ B[:, 1].reshape(-1,1)

array([[20],
       [52],
       [84]])

In [237]:
A @ B[:, 2].reshape(-1,1)

array([[ 30],
       [ 78],
       [126]])

In [238]:
A @ B

array([[ 10,  20,  30],
       [ 26,  52,  78],
       [ 42,  84, 126]])

Row interpretation

In [239]:
A[0,:]

array([1, 2, 3, 4])

In [240]:
B

array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

In [241]:
A[0,:] @ B

array([10, 20, 30])

How did we get this result?
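
A possible answer, sketched in code: the entries of the row `A[0, :]` act as weights on the rows of `B`, and the weighted rows are added up. The names `row` and `combo` are introduced here only for illustration.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)
B = np.array([
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3]
])

row = A[0, :]
# Weight row k of B by row[k] and add the weighted rows together
combo = sum(row[k] * B[k, :] for k in range(B.shape[0]))

print(combo)
print(np.array_equal(combo, A[0, :] @ B))
```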

In [242]:
A[1, :]

array([5, 6, 7, 8])

In [243]:
A[1,:] @ B

array([26, 52, 78])

In [244]:
A[2,:] @ B

array([ 42,  84, 126])

In [245]:
A @ B

array([[ 10,  20,  30],
       [ 26,  52,  78],
       [ 42,  84, 126]])

### Permutations can be seen as rearranging columns

In [246]:
A.shape

(3, 4)

In [247]:
I = np.eye(A.shape[1], dtype='int')
I

array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])

In [248]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [249]:
A @ I

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

We can interpret the post- (or right-) multiplication of A by the identity matrix like so:

- The 1st column of I takes 1 of the 1st column of A and 0 of everything else
- The 2nd column of I takes 1 of the 2nd column of A and 0 of everything else
- The 3rd column of I takes 1 of the 3rd column of A and 0 of everything else
- The 4th column of I takes 1 of the 4th column of A and 0 of everything else

In [250]:
I1 = I[:, [2,0,3,1]]
I1

array([[0, 1, 0, 0],
       [0, 0, 0, 1],
       [1, 0, 0, 0],
       [0, 0, 1, 0]])

In [251]:
A @ I1

array([[ 3,  1,  4,  2],
       [ 7,  5,  8,  6],
       [11,  9, 12, 10]])

In [252]:
A[:, [2,0,3,1]]

array([[ 3,  1,  4,  2],
       [ 7,  5,  8,  6],
       [11,  9, 12, 10]])

Pre-multiplication by a permuted identity matrix does the same thing to the *rows* of A (not shown).
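
For completeness, here is a minimal sketch of the row case. The row order `perm` and the names `I3` and `P` are assumptions chosen for illustration. Note that the *rows* of the identity are permuted here, and that the identity must be 3-by-3 to match the number of rows of A.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)

perm = [2, 0, 1]                       # hypothetical row order
I3 = np.eye(A.shape[0], dtype='int')   # 3 x 3 identity
P = I3[perm, :]                        # permute the rows of the identity

print(P @ A)
print(np.array_equal(P @ A, A[perm, :]))
```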

## Broadcasting

A scalar can be *promoted* or *broadcast* to be a vector or matrix

In [253]:
x

array([[1],
       [2],
       [3],
       [4]])

In [254]:
x + 1

array([[2],
       [3],
       [4],
       [5]])

In [255]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [256]:
A + 1

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13]])

A vector can be promoted to a matrix if the *shape* is right

In [257]:
u = np.ones(3, dtype='int').reshape(-1,1)
u

array([[1],
       [1],
       [1]])

In [258]:
A.shape

(3, 4)

In [259]:
u.shape

(3, 1)

In [260]:
A + u

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13]])

In [261]:
x.T.shape

(1, 4)

In [262]:
A.shape

(3, 4)

In [263]:
A + x.T

array([[ 2,  4,  6,  8],
       [ 6,  8, 10, 12],
       [10, 12, 14, 16]])

### Broadcasting rules

- Look at the shapes of the two arrays from right to left
- If the numbers are the same, it is ok
- If one number is 1, it will be promoted to the size of the other
- If one number is missing, it will be promoted to the size of the other
- Everything else will be an error
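
The rules above can be sketched as a small function on shape tuples. `broadcast_shape` is a hypothetical helper written for illustration, not part of `numpy`; the last line cross-checks one case against what `numpy` produces.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Apply the broadcasting rules to two shape tuples (illustrative helper)."""
    result = []
    # Walk both shapes from right to left; a missing dimension counts as 1
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))

print(broadcast_shape((3, 4), (3, 1)))
print(broadcast_shape((3, 4, 1), (4, 1)))
print(broadcast_shape((12, 1), (1, 12)))

# Cross-check against an actual numpy operation
print((np.ones((3, 4)) + np.ones((3, 1))).shape)
```

Shapes such as `(3, 4)` and `(3,)` fall under the last rule: reading from the right, 4 and 3 differ and neither is 1, so the helper raises a `ValueError`.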

In [264]:
A.shape, u.shape

((3, 4), (3, 1))

In [265]:
A + u

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13]])

Promotion means `numpy` behaves as if it did something like this under the hood (it does not actually copy the data).

In [266]:
u1 = u @ np.ones((1,4), dtype='int')
u1

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [267]:
A.shape, u1.shape

((3, 4), (3, 4))

In [268]:
A + u1

array([[ 2,  3,  4,  5],
       [ 6,  7,  8,  9],
       [10, 11, 12, 13]])
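
The explicit matrix `u1` is a mental model. As a sketch of what is closer to the actual mechanism, `np.broadcast_to` returns a read-only view with the promoted shape that reuses the original data instead of copying it. The name `u_view` is introduced here only for illustration.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)
u = np.ones(3, dtype='int').reshape(-1, 1)

# A read-only view with the broadcast shape; no data is copied
u_view = np.broadcast_to(u, (3, 4))

print(u_view.shape)
print(u_view.strides)               # the stride along the broadcast axis is 0
print(np.shares_memory(u, u_view))
print(np.array_equal(A + u_view, A + u))
```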

### To enable broadcasting, you can add *dummy* dimensions

In [269]:
A.shape

(3, 4)

In [270]:
A[:, None].shape

(3, 1, 4)

In [271]:
A[:, None]

array([[[ 1,  2,  3,  4]],

       [[ 5,  6,  7,  8]],

       [[ 9, 10, 11, 12]]])

In [272]:
A[:, :, None].shape

(3, 4, 1)

Some people prefer `np.newaxis` to `None` — they are identical

In [273]:
A[:, np.newaxis, :].shape

(3, 1, 4)

In [274]:
A[np.newaxis, ...].shape

(1, 3, 4)

In [275]:
x

array([[1],
       [2],
       [3],
       [4]])

In [276]:
x.shape, A[:, :, None].shape

((4, 1), (3, 4, 1))

In [277]:
A[:, :, None] + x

array([[[ 2],
        [ 4],
        [ 6],
        [ 8]],

       [[ 6],
        [ 8],
        [10],
        [12]],

       [[10],
        [12],
        [14],
        [16]]])

## Broadcasting example

In [278]:
w = np.arange(1, 13)

In [279]:
w.shape

(12,)

In [280]:
w[None, :].shape

(1, 12)

Note that `w[None, :]` converts the 1D array `w` into a row vector.

In [281]:
w[None, :]

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])

In [282]:
w[:, None].shape

(12, 1)

Note that `w[:, None]` converts the 1D array `w` into a column vector.

In [283]:
w[:, None]

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12]])

What does this do?

In [284]:
w[:, None] * w[None, :]

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])

How does this work?

In [285]:
w[None, :].shape, w[:, None].shape

((1, 12), (12, 1))

We work from right to left in the shapes — so the first numbers to match are 12 and 1. Broadcasting thus promotes the second array to be (12, 12). Now we need to match 1 to 12, so the first array is also promoted to be (12, 12).

Now the first array looks like this

In [286]:
w

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [287]:
w1 = np.tile(w[None, :], (12, 1))
w1

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])

And the second array looks like this

In [288]:
w2 = np.tile(w[:, None], (1, 12))
w2

array([[ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
       [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
       [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
       [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5],
       [ 6,  6,  6,  6,  6,  6,  6,  6,  6,  6,  6,  6],
       [ 7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7,  7],
       [ 8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8,  8],
       [ 9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9,  9],
       [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
       [11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11],
       [12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12]])

And element-wise multiplication takes care of the rest.

In [289]:
w1 * w2

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])
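
As a cross-check, and only as a sketch: `np.broadcast_arrays` returns the two operands expanded to their common shape without building them by hand, and the resulting multiplication table matches the outer product of `w` with itself. The names `table`, `col` and `row` are introduced here only for illustration.

```python
import numpy as np

w = np.arange(1, 13)
table = w[:, None] * w[None, :]

# Expand both operands to the common (12, 12) shape
col, row = np.broadcast_arrays(w[:, None], w[None, :])
print(col.shape, row.shape)

# The broadcast product equals the outer product of w with itself
print(np.array_equal(table, col * row))
print(np.array_equal(table, np.outer(w, w)))
```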

## Vectorizing loops

Learn to recognise how to convert loops into vectorized `numpy` operations and vice versa.

### Universal functions

In [292]:
x

array([[1],
       [2],
       [3],
       [4]])

In [291]:
m, n = x.shape
s = x.copy()
for i in range(m):
    s[i] = x[i]**2
s

array([[ 1],
       [ 4],
       [ 9],
       [16]])

Can just be written as

In [190]:
s = x**2
s

array([[ 1],
       [ 4],
       [ 9],
       [16]])

### Scalar product

In [294]:
x

array([[1],
       [2],
       [3],
       [4]])

In [293]:
y

array([[4],
       [3],
       [2],
       [1]])

Vectorized form

In [299]:
x.T @ y

array([[20]])

Unrolled version

In [298]:
s = 0
for i in range(m):
    s += x[i] * y[i]
s

array([20])

### Matrix vector product

In [303]:
A

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [304]:
x

array([[1],
       [2],
       [3],
       [4]])

Vectorized form

In [300]:
A @ x

array([[ 30],
       [ 70],
       [110]])

Unrolled version

In [310]:
s = np.zeros(A.shape[0]).reshape(-1, 1)
m, n = A.shape
for i in range(m):
    for j in range(n):
        s[i] += A[i, j] * x[j]
s

array([[ 30.],
       [ 70.],
       [110.]])
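
The same unrolling extends to a matrix-matrix product by adding one more loop over the columns of the second matrix. This is a sketch that reuses `A` and the matrix `B` from earlier in these notes; the names `C`, `p` and `n_rows_B` are introduced here only for illustration.

```python
import numpy as np

A = np.arange(1, 13).reshape(3, 4)
B = np.array([[1, 2, 3]] * 4)        # 4 x 3, four rows of [1, 2, 3]

m, n = A.shape
n_rows_B, p = B.shape                # n_rows_B must equal n

# Unrolled version: C[i, j] is the dot product of row i of A and column j of B
C = np.zeros((m, p), dtype=A.dtype)
for i in range(m):
    for j in range(p):
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]

print(C)
print(np.array_equal(C, A @ B))      # vectorized form
```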

### Solving a set of linear equations

See example 3 at https://people.duke.edu/~ccc14/sta-663-2018/notebooks/S03_Numpy_Annotated.html on vectorizing a set of linear equations.