# Introducing `numpy`

**September 08 2020**  
*Vincenzo Perri*

This notebook introduces the numerical package `numpy`, which allows us to perform mathematical operations on vectors and matrices.

## `numpy` arrays

One of the reasons why `python` has become one of the most popular languages for data science and scientific computing is the package `numpy`, which provides classes and functions to perform advanced mathematical and statistical analyses. One of its key features is its support for vector and matrix algebra, which is built on the concept of `numpy` arrays.

Let us first understand how the standard `list` class in python differs from a `numpy` array. Consider the following example of two lists containing numerical values. If we perform an addition operation on those two lists we get:

In [7]:
python_arr_1 = [0,1,0,1]
python_arr_2 = [0,1,2,1]

python_arr_1 + python_arr_2

[0, 1, 0, 1, 0, 1, 2, 1]

For `python` lists, the mathematical operators are overloaded in such a way that they help us to create or merge lists, but there is no implementation of mathematical operations that would allow us to perform vector or matrix-based operations. For instance, if we multiply a list with a scalar value, the list is simply repeated and we get:


In [8]:
[0, 1, 2]*5

[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]

A `numpy` array is fundamentally different from a `python` list in the sense that it represents a mathematical vector on which we can perform arithmetic operations. 

To use numpy arrays, we first have to import the package. We can then directly initialise them from a python list as follows:

In [9]:
import numpy as np

np_arr_1 = np.array(python_arr_1)
np_arr_2 = np.array(python_arr_2)

np_arr_1 + np_arr_2

array([0, 2, 2, 2])
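The same contrast holds for multiplication with a scalar. The following is a minimal sketch (the names `values`, `vec`, `repeated` and `scaled` are illustrative and do not appear elsewhere in this notebook): multiplying a list repeats it, while multiplying an array scales each element.

```python
import numpy as np

values = [0, 1, 2]        # plain python list (illustrative name)
vec = np.array(values)    # the same numbers as a numpy array

repeated = values * 5     # list: the content is repeated five times
scaled = vec * 5          # array: each element is multiplied by five

print(len(repeated))      # 15
print(scaled)             # [ 0  5 10]
```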

If we pass a simple list, we get a one-dimensional array whose length matches the number of elements in the list. To change the structure or dimensionality of a numpy array we can use the method `reshape`. We pass one integer per dimension, which specifies the length of the array in that dimension. A value of -1 means that the length of that dimension is determined automatically from the total number of elements and the lengths of the other dimensions. The consequences of a `reshape` call can be hard to see at first, so let us look at some examples:

The array above is a one-dimensional array with 4 elements. We can turn this into a 2x2 array as follows:

In [10]:
x = np_arr_1.reshape(2, 2)
print(x)

[[0 1]
 [0 1]]


We can also change the one-dimensional array to a two-dimensional array, where the first dimension contains 4 elements, each being an array with a single element. Rather than specifying a value of four for the first dimension, we can pass a value of -1, which tells `numpy` to infer the length of that dimension from the number of elements (in this case four). The following call effectively transposes a row vector to a matrix that consists of a single column vector.

In [11]:
x = np_arr_1.reshape(-1, 1)
print(x)

[[0]
 [1]
 [0]
 [1]]


If we want to reshape this column vector into a one-dimensional row vector, we can call:

In [12]:
x = x.reshape(-1)
print(x)

[0 1 0 1]
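To see how the -1 placeholder behaves on a slightly larger array, here is a small sketch. The array `a` and the target shapes are chosen purely for illustration. Since the total number of elements must be preserved, a shape that does not divide the array evenly raises a `ValueError`.

```python
import numpy as np

a = np.arange(12)          # illustrative array: 0, 1, ..., 11

b = a.reshape(3, -1)       # 3 rows; numpy infers 4 columns
c = a.reshape(-1, 2, 3)    # numpy infers 2 blocks of shape 2x3

print(b.shape)             # (3, 4)
print(c.shape)             # (2, 2, 3)

# 12 elements cannot be split into 5 rows of equal length
try:
    a.reshape(5, -1)
except ValueError as err:
    print("cannot reshape:", err)
```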


Instead of passing a one-dimensional list and then reshaping the resulting array, we can also initialise numpy arrays from nested lists. The dimensions of the resulting array will be automatically inferred from the lengths of the lists.

In [13]:
matrix = np.array([[1,0,1], [0,1,0], [0,0,1]])
print(matrix)

[[1 0 1]
 [0 1 0]
 [0 0 1]]


If we are unsure about the shape of an array, we can look at the `shape` property, which returns a tuple that contains the number of elements in each dimension:

In [14]:
print(matrix.shape)

(3, 3)
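Besides `shape`, the attributes `ndim` and `size` are often handy when inspecting an array. The sketch below re-creates the matrix from above so that it runs on its own:

```python
import numpy as np

matrix = np.array([[1, 0, 1], [0, 1, 0], [0, 0, 1]])

print(matrix.shape)   # (3, 3)  number of elements per dimension
print(matrix.ndim)    # 2       number of dimensions
print(matrix.size)    # 9       total number of elements
print(len(matrix))    # 3       length of the first dimension only
```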


Naturally, initialisation from nested lists works with arbitrary levels of nesting, i.e. the following initialisation generates a 3x3x3 tensor, which we can visualise as a cube consisting of three layers of two-dimensional matrices.

In [15]:
tensor = np.array([[[0,1,0],[1,0,1],[1,1,1]], [[0,0,0],[1,0,1],[1,1,0]], [[1,0,0],[0,0,0],[1,1,1]]])
print(tensor)
print(tensor.shape)

[[[0 1 0]
  [1 0 1]
  [1 1 1]]

 [[0 0 0]
  [1 0 1]
  [1 1 0]]

 [[1 0 0]
  [0 0 0]
  [1 1 1]]]
(3, 3, 3)


## Indexing and slicing `numpy` arrays

An important `numpy` feature that you need to understand if you want to work with arrays is so-called **array slicing**, a special type of indexing that we can use to select certain values (or subsets of values) in different dimensions. First, we can directly use standard python indexing to access elements in a numpy array. For instance, we can access the top-left element (zero) in the first matrix in the tensor above as follows:

In [16]:
print(tensor[0][0][0])

0


Alternatively, we can also use a single bracket, separating indices in multiple dimensions through commas, i.e. if we want to access the bottom-right element in the third matrix, we could write:

In [17]:
print(tensor[2,2,2])

1
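Both notations address the same element. If we specify fewer indices than there are dimensions, we get a sub-array instead of a single value. The following sketch re-creates the tensor from above to illustrate this:

```python
import numpy as np

tensor = np.array([[[0, 1, 0], [1, 0, 1], [1, 1, 1]],
                   [[0, 0, 0], [1, 0, 1], [1, 1, 0]],
                   [[1, 0, 0], [0, 0, 0], [1, 1, 1]]])

# chained brackets and comma-separated indices select the same element
print(tensor[2][2][2] == tensor[2, 2, 2])   # True

# fewer indices than dimensions return a sub-array
print(tensor[1].shape)    # (3, 3)   the second matrix
print(tensor[1, 0])       # [0 0 0]  first row of the second matrix
```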


We can use slicing to specify a sequence of indices that we seek to return using the notation `start:stop:step`, where `start` is the inclusive start index, `stop` is the exclusive stop index, and `step` is the increment. For example, we can use the following to retrieve five elements in the middle of a one-dimensional array with seven elements:

In [18]:
x = np.array([1,2,3,4,5,6,7])
print(x[1:6:1])

[2 3 4 5 6]


We can select every second element by changing the step:

In [19]:
print(x[1:6:2])

[2 4 6]


If we omit the `step` parameter, a default step value of 1 is assumed, i.e. we can write:

In [20]:
print(x[1:6])

[2 3 4 5 6]


If we additionally omit the `start` or `stop` index, they default to zero and to the length of the array respectively, i.e. the slice starts at the first element or runs up to and including the last element:

In [21]:
print(x[:6])

[1 2 3 4 5 6]


In [22]:
print(x[1:])

[2 3 4 5 6 7]
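The defaults can be made explicit. As a small sketch, the following compares the short forms with their fully written counterparts. Because `stop` is exclusive, it is the length of the array, not the last index, that includes the final element:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7])

# omitted start defaults to 0, omitted step to 1
print(np.array_equal(x[:6], x[0:6:1]))        # True

# omitted stop behaves like len(x)
print(np.array_equal(x[1:], x[1:len(x):1]))   # True

# using the last index as stop drops the final element
print(x[1:len(x) - 1])                        # [2 3 4 5 6]
```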


This implies that, if we omit the `start`, `stop` and `step` parameters altogether and simply write a colon `:`, we retrieve all elements of the array:

In [23]:
print(x[:])

[1 2 3 4 5 6 7]


In fact, slicing is a standard feature that is provided by `python`, so we can do the same in a `python` list:

In [24]:
y = [1,2,3,4,5,6,7]
print(y[:])

[1, 2, 3, 4, 5, 6, 7]


However, `numpy` takes this powerful concept one step further by generalising it to arrays with arbitrary dimensions, where we can simply separate the slicing expressions for individual dimensions by commas.
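To make the difference concrete, the following sketch tries the comma notation on a nested python list (the name `nested` is illustrative). The list rejects it with a `TypeError`, while the equivalent numpy array returns the first column:

```python
import numpy as np

nested = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]   # illustrative nested list

# python lists do not support comma-separated indices
try:
    nested[:, 0]
except TypeError as err:
    print("list:", err)

# the same expression works on a numpy array
first_column = np.array(nested)[:, 0]
print(first_column)   # [1 0 0]
```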

Let's play with this in the matrix example from above. First, we can get the full matrix by using an empty slicing operator on both dimensions:

In [25]:
print(matrix[:,:])

[[1 0 1]
 [0 1 0]
 [0 0 1]]


What if we want to extract the first row of the matrix, i.e. [1 0 1]? For the first dimension, we can pass the index 0, which selects the first row. For the second dimension, we specify an empty slice, which returns all values in the rows selected in the first dimension:

In [26]:
print(matrix[0,:])

[1 0 1]


Clearly, this is just a complicated way to write:

In [27]:
print(matrix[0])

[1 0 1]


However, using the slicing notation we can also efficiently extract elements that would otherwise require us to write a loop. For instance, we can get the first column vector in the matrix above by (i) specifying an empty slice in the first dimension (this retrieves all rows) and (ii) specifying index zero for the second dimension, which selects the first element in each of the rows selected by the first dimension:

In [28]:
print(matrix[:,0])

[1 0 0]
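For comparison, here is a sketch of the loop that the slice replaces. The names `column_loop` and `column_slice` are illustrative:

```python
import numpy as np

matrix = np.array([[1, 0, 1], [0, 1, 0], [0, 0, 1]])

# explicit loop: collect the first element of every row
column_loop = []
for row in matrix:
    column_loop.append(row[0])

# slicing: the same column in a single expression
column_slice = matrix[:, 0]

print(np.array_equal(column_loop, column_slice))   # True
```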


Finally, we can also do more complicated things, e.g. extract the top-left and bottom-left 2x2 block matrices from the matrix above:

In [29]:
print(matrix[:2,:2])
print(matrix[1:,:2])

[[1 0]
 [0 1]]
[[0 1]
 [0 0]]


When working with multi-dimensional arrays, slicing can turn into a brain twister and thus requires a bit of practice. But once you have mastered slicing in `numpy`, you will not want to do without it in the analysis of multi-variate data.

## Vector and matrix algebra

Apart from storing multi-dimensional data, we can perform advanced algebraic operations like powers, matrix multiplications, or eigenvector calculations. In numpy, many of these operations are implemented in the module `linalg`. To compute the k-th power of the matrix above (i.e. the product of k copies of the matrix), we can use the `matrix_power` function:

In [30]:
print(np.linalg.matrix_power(matrix, 2))

[[1 0 2]
 [0 1 0]
 [0 0 1]]


The (dot) product of two matrices (as well as of a matrix and a vector) is implemented in the `dot` function, i.e. the following yields the same result as the `matrix_power` function:

In [31]:
print(matrix.dot(matrix))

[[1 0 2]
 [0 1 0]
 [0 0 1]]
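Note that the `*` operator on numpy arrays multiplies element by element and is therefore not a matrix product. Python's `@` operator, in contrast, computes the same product as `dot` for two-dimensional arrays. A brief sketch using the matrix from above:

```python
import numpy as np

matrix = np.array([[1, 0, 1], [0, 1, 0], [0, 0, 1]])

product = matrix @ matrix        # matrix product, same as matrix.dot(matrix)
elementwise = matrix * matrix    # element-wise product

print(np.array_equal(product, matrix.dot(matrix)))   # True
print(np.array_equal(product, elementwise))          # False
```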


If you recall the definition of matrix multiplication (i.e. the dot product), you will remember that the product AB is only defined if the number of **columns** in A equals the number of **rows** in B. The resulting product will then have the same number of rows as A and the same number of columns as B. Let us try this in `numpy`:

In [32]:
v = np.array([2,3,1])
M = np.array([[1,2,3], [2,4,2], [1,1,0]])

Let us consider `v` as a row vector (i.e. we have three columns and one row). `M` is a 3x3 matrix, i.e. the number of columns in `v` matches the number of rows in M and the result is again a row vector with three columns.

In [33]:
print(v.dot(M))

[ 9 17 12]


If we change the order of the multiplication, the result should not be defined. Let us try this:

In [34]:
print(M.dot(v))

[11 18  5]


That's a surprise! The reason for this is that `numpy` does not distinguish between a row and a column vector as long as the array is one-dimensional. That is, in this case it has interpreted `v` as a column vector, and the result is a (different) vector with three entries. If we explicitly reshape the vector into a column as described above, we get the same values, but the result is now a 3x1 column vector, i.e. it has the same shape as the reshaped `v`:

In [35]:
print(M.dot(v.reshape(-1, 1)))

[[11]
 [18]
 [ 5]]
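If we make the orientation explicit by reshaping `v` into a two-dimensional array, `numpy` enforces the usual rule for matrix products. In the following sketch, `row` and `col` are illustrative names for the 1x3 and 3x1 versions of `v`. Multiplying `M` with the row vector from the right is then indeed undefined and raises a `ValueError`:

```python
import numpy as np

v = np.array([2, 3, 1])
M = np.array([[1, 2, 3], [2, 4, 2], [1, 1, 0]])

row = v.reshape(1, -1)     # explicit row vector, shape (1, 3)
col = v.reshape(-1, 1)     # explicit column vector, shape (3, 1)

print(row.dot(M))          # [[ 9 17 12]]
print(M.dot(col).shape)    # (3, 1)

# (3, 3) times (1, 3): inner dimensions do not match
try:
    M.dot(row)
except ValueError as err:
    print("undefined product:", err)
```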


A number of important methods in data analytics are based on the calculation of eigenvectors and eigenvalues, i.e. for a given matrix A we are interested in solutions to the eigenvalue equation $\mathbf{A} v = \lambda v$, where $v$ is called an eigenvector and $\lambda$ is called an eigenvalue of $\mathbf{A}$.

A prominent example of an eigenvector-based method is PageRank, which is one factor used by Google to rank search results. The PageRank values of web pages are actually the entries of the leading eigenvector of a matrix that encodes the hyperlink structure of web pages, i.e. entries in the matrix capture which web pages refer to each other.

To calculate eigenvalues and eigenvectors, we can use the function `eig` in the module `linalg`:

In [36]:
np.linalg.eig(M)

(array([ 5.74872177, -1.2886655 ,  0.53994373]),
 array([[-0.49838336, -0.82177082,  0.75029071],
        [-0.83533743,  0.09852589, -0.59738112],
        [-0.23200302,  0.56123558,  0.28319543]]))

You see that in the example above the output of the function consists of two arrays: the first contains the eigenvalues and the second contains the eigenvectors of the matrix. It is important to highlight that those are the **right** eigenvectors and their associated eigenvalues, i.e. these are the solutions of the equation $\mathbf{A} v = \lambda v$ where the vector $v$ is multiplied from the right.

An easier way to access the eigenvalues and eigenvectors is to unpack the tuple-valued return value into two numpy arrays as follows:

In [37]:
w, v = np.linalg.eig(M)
print(w)
print(v)

[ 5.74872177 -1.2886655   0.53994373]
[[-0.49838336 -0.82177082  0.75029071]
 [-0.83533743  0.09852589 -0.59738112]
 [-0.23200302  0.56123558  0.28319543]]


The numpy array `w` contains the three eigenvalues of matrix `M` and the two-dimensional numpy array `v` contains the three eigenvectors. It is important to know that the eigenvectors are stored as the columns of `v`: the first component of each of the three inner arrays contained in `v` belongs to the first eigenvector, the second component belongs to the second, and so on. That is, we can use the following numpy slicing to access the i-th eigenvector:

In [38]:
i = 0
v[:,i]

array([-0.49838336, -0.83533743, -0.23200302])

We can easily confirm that `v[:,i]` and `w[i]` are indeed the i-th eigenvector and the i-th eigenvalue of `M`. According to the eigenvalue equation above, both sides must be equal and we find:

In [39]:
for i in range(3):
    print(np.dot(M, v[:,i]))
    print(np.dot(w[i], v[:,i]))
    print()

[-2.86506728 -4.80212248 -1.33372079]
[-2.86506728 -4.80212248 -1.33372079]

[ 1.0589877  -0.12696692 -0.72324493]
[ 1.0589877  -0.12696692 -0.72324493]

[ 0.40511476 -0.32255219  0.15290959]
[ 0.40511476 -0.32255219  0.15290959]



Comparing the two lines printed for each eigenvector above, it is easy to confirm visually that the multiplication with M has simply scaled each eigenvector by a factor that corresponds to its associated eigenvalue.
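Instead of comparing the printed numbers by eye, we can let `numpy` do the comparison. The sketch below uses `np.allclose`, which tolerates small floating-point differences, and the descriptive names `eigenvalues` and `eigenvectors` in place of `w` and `v`. It also checks that the returned eigenvectors are normalised to unit length:

```python
import numpy as np

M = np.array([[1, 2, 3], [2, 4, 2], [1, 1, 0]])
eigenvalues, eigenvectors = np.linalg.eig(M)

# check the eigenvalue equation for each eigenvector (column) separately
for i in range(len(eigenvalues)):
    lhs = M @ eigenvectors[:, i]
    rhs = eigenvalues[i] * eigenvectors[:, i]
    print(i, np.allclose(lhs, rhs))

# all at once: broadcasting scales column j by eigenvalue j
print(np.allclose(M @ eigenvectors, eigenvectors * eigenvalues))

# each column has Euclidean length one
print(np.allclose(np.linalg.norm(eigenvectors, axis=0), 1.0))
```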