# Numpy

We've already seen the `math` library that we used to compute the square root. It has several other built-in functions like logarithms, sine, cosine and so on.

`numpy` is a more extensive library, which covers most of what `math` has to offer and much more. Here we'll take a look at `numpy`'s array functions, since these are the ones we'll be using the most.

A numpy array can work both as a list and as a vector/matrix. In list form, we'll use it mostly for statistical computations. In vector/matrix form, we'll use it for linear algebra.

You create a numpy array by calling the function `np.array` on a (1D or 2D) list.

In [22]:
import numpy as np

data = np.array([6, 565, 64, 65, 65, 456, 66, 879, 9, -54, -645, 489, -7])
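
As a small sketch of the difference between the two cases, we can look at the `ndim` and `shape` attributes of an array. The names `data_1d` and `data_2d` are made up for this illustration.

```python
import numpy as np

# A 1D list becomes a 1D array (used like a list or a vector).
data_1d = np.array([6, 565, 64])

# A 2D list (a list of equally long rows) becomes a 2D array (a matrix).
data_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(data_1d.ndim, data_1d.shape)  # 1 (3,)
print(data_2d.ndim, data_2d.shape)  # 2 (2, 3)
```

The shape lists the size along each dimension, so `(2, 3)` means 2 rows and 3 columns.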


`argmax` returns the index of the largest number in the array. This may seem like a strange thing to compute, but if an array holds a list of probabilities, `argmax` gives the index of the most likely outcome. This is used a lot in classification.

In [37]:

print(data.argmax())

7
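
To make the classification use a bit more concrete, here is a minimal sketch. The labels and probabilities are invented for illustration and do not come from any real model.

```python
import numpy as np

# Hypothetical class labels and predicted probabilities (made up for this example).
labels = ["cat", "dog", "bird"]
probabilities = np.array([0.2, 0.7, 0.1])

# argmax gives the position of the largest probability ...
best_index = probabilities.argmax()
print(best_index)          # 1

# ... which we can use to look up the matching label.
print(labels[best_index])  # dog
```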


numpy has built-in functions for computing the mean and median.

In [24]:
print(np.mean(data))
print(np.median(data))


150.6153846153846
65.0


The mode is a bit trickier. We can use `unique` to get a count of how many we have of each distinct value in the array. Then `argmax` will tell us the index of the most common value. We can use this to look up the mode.

In [25]:
value, count = np.unique(data, return_counts=True)
print(value)
print(count)

mode_index = count.argmax()
mode = value[mode_index]
print(mode)

[-645  -54   -7    6    9   64   65   66  456  489  565  879]
[1 1 1 1 1 1 2 1 1 1 1 1]
65
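
One thing to be aware of: if several values are equally common, `argmax` only reports the first of them. The sketch below uses a made-up array, `tied`, to show this, along with one possible way of getting all the modes.

```python
import numpy as np

# Made-up data where 1 and 3 both occur twice.
tied = np.array([3, 1, 3, 1, 2])

values, counts = np.unique(tied, return_counts=True)
print(values)  # [1 2 3]
print(counts)  # [2 1 2]

# argmax returns the first index holding the largest count,
# so only the smallest of the tied values is reported.
print(values[counts.argmax()])  # 1

# Comparing against the largest count keeps every tied value.
modes = values[counts == counts.max()]
print(modes)   # [1 3]
```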


Computing the variance is easy with the built-in `var` function. Doing it ourselves takes a bit more work.

In [26]:
print(np.var(data))  # Variance

mean = np.mean(data)  # Compute the mean once instead of once per element
squares = [(x - mean) ** 2 for x in data]
print(np.sum(squares) / len(data))

129232.85207100592
129232.85207100592
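
The same computation can also be written without a loop, since arithmetic on a numpy array is applied to every element. The sketch below additionally shows the `ddof` parameter of `np.var`, which changes the divisor from n to n - 1 (the sample variance). Whether we will need the sample variance later is an assumption here; it is included only for comparison.

```python
import numpy as np

data = np.array([6, 565, 64, 65, 65, 456, 66, 879, 9, -54, -645, 489, -7])

# Subtracting a number from an array subtracts it from every element.
deviations = data - data.mean()

# Population variance: the mean of the squared deviations (divides by n).
population_variance = np.mean(deviations ** 2)
print(np.isclose(population_variance, np.var(data)))      # True

# Sample variance: divides by n - 1 instead, which np.var does with ddof=1.
sample_variance = np.sum(deviations ** 2) / (len(data) - 1)
print(np.isclose(sample_variance, np.var(data, ddof=1)))  # True
```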


For standard deviation, we can take the square root of the variance. It's easier to just call `std`, though.

In [27]:
print(np.sqrt(np.var(data)))
print(np.std(data)) # Standard Deviation


359.4897106608281
359.4897106608281


## Linear algebra

A vector in numpy is a one-dimensional array.

In [28]:
v1 = np.array([1, -2])
v2 = np.array([-4, 2])


We can compute the scalar product (dot product) of the two vectors.

In [29]:
print(v1.dot(v2))

-8
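
To see what the dot product does, we can compute it by hand: multiply the vectors element by element and add up the results. The sketch below also shows two other spellings of the same operation, `np.dot` and the `@` operator.

```python
import numpy as np

v1 = np.array([1, -2])
v2 = np.array([-4, 2])

# By hand: 1 * (-4) + (-2) * 2 = -8
by_hand = np.sum(v1 * v2)
print(by_hand)         # -8

# Equivalent ways of writing v1.dot(v2)
print(np.dot(v1, v2))  # -8
print(v1 @ v2)         # -8
```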


A matrix is a 2-dimensional array.

In [30]:
M = np.array([
  [1, 2, 3, 4],
  [1, 2, 4, 8]
])


We cannot multiply this matrix onto the vectors because it has the wrong shape (it should have width 2). We can transpose it and then multiply, though.

In [31]:

print(M.T @ v1)

[ -1  -2  -5 -12]
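
If in doubt, the `shape` attribute tells us whether a product is possible. Here is a minimal sketch that checks the shapes and shows what happens when we try the multiplication without transposing.

```python
import numpy as np

M = np.array([
    [1, 2, 3, 4],
    [1, 2, 4, 8],
])
v1 = np.array([1, -2])

print(M.shape)    # (2, 4): height 2, width 4
print(M.T.shape)  # (4, 2): height 4, width 2
print(v1.shape)   # (2,): a vector with 2 entries

# Width 4 does not match the 2 entries of v1, so numpy refuses.
try:
    M @ v1
    mismatch_detected = False
except ValueError as error:
    mismatch_detected = True
    print("Shapes do not match:", error)

# After transposing, the width is 2 and the product is defined.
result = M.T @ v1
print(result)     # [ -1  -2  -5 -12]
```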


We can also see vectors as matrices with width 1.

In [32]:
Mv1 = np.array([
  [1],
  [-2]
])

Mv2 = np.array([
  [-4],
  [2]
])

Again we need to transpose the first vector before multiplying them (remember, the width of the first should equal the height of the second).

This is just another way of doing the dot-product. We'll be doing that a lot in this course.

In [33]:
print(Mv1.T @ Mv2)

[[-8]]


This is technically a 1x1 matrix, but we'll treat it as a number in the math.
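
If we need an actual number in code, one option is the `item` method, which pulls the single value out of a one-element array. The sketch below also uses `reshape(-1, 1)` as a shortcut for building the width-1 matrices from the 1D vectors; both are just one possible way of doing this.

```python
import numpy as np

v1 = np.array([1, -2])
v2 = np.array([-4, 2])

# reshape(-1, 1) turns a 1D array into a matrix with width 1.
Mv1 = v1.reshape(-1, 1)
Mv2 = v2.reshape(-1, 1)
print(Mv1.shape)      # (2, 1)

product = Mv1.T @ Mv2
print(product.shape)  # (1, 1)

# item() extracts the single value as a plain Python number.
scalar = product.item()
print(scalar)         # -8
```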

numpy has a syntax for accessing elements in matrices more easily. (Remember, with lists we had to write `M[1][3]`.)

In [34]:
print(M[1, 3])

8
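
For comparison, here is a small sketch with the same numbers stored both as a list of lists (called `rows` here, a name made up for this example) and as a numpy matrix.

```python
import numpy as np

rows = [
    [1, 2, 3, 4],
    [1, 2, 4, 8],
]
M = np.array(rows)

# Lists need one pair of brackets per level.
print(rows[1][3])  # 8

# numpy accepts row and column in a single pair of brackets.
print(M[1, 3])     # 8

# The comma syntax does not work on plain lists.
try:
    rows[1, 3]
    list_accepts_comma = True
except TypeError:
    list_accepts_comma = False
print(list_accepts_comma)  # False
```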


Remember that we could take sub-lists out of lists by writing things like `l[1:3]`? That also works for numpy matrices.

In [35]:
print(M[0:2, 2:])

[[3 4]
 [4 8]]


Remember that leaving one end open means that we go all the way to or from that end.

We can use that to easily pick out rows and columns.

In [36]:
print(M[0, :])
print(M[:, 2])

[1 2 3 4]
[3 4]
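
Note that picking out a single row or column this way gives a one-dimensional array, so a column comes back looking just like a row. As a minimal sketch, the shapes below show the difference between indexing with a number and with a slice.

```python
import numpy as np

M = np.array([
    [1, 2, 3, 4],
    [1, 2, 4, 8],
])

row = M[0, :]
column = M[:, 2]
print(row.shape)     # (4,)
print(column.shape)  # (2,)

# Using a slice (2:3) instead of a number (2) keeps the result
# two-dimensional: a matrix with height 2 and width 1.
column_matrix = M[:, 2:3]
print(column_matrix.shape)  # (2, 1)
```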
