# Vectors and arrays

This section reviews how to construct appropriate data structures in Python to represent vectors and more complex arrays.

The simplest way of storing a vector in Python is to use a list. For example, the row vector $a = (7, 3, 8, 11)$ can be stored as follows:

In [9]:
a = [7, 3, 8, 11]

print(a)
print(len(a))

print(a * 5)  # Repeats the list five times; it does not scale the elements

[7, 3, 8, 11]
4
[7, 3, 8, 11, 7, 3, 8, 11, 7, 3, 8, 11, 7, 3, 8, 11, 7, 3, 8, 11]


A vector like the one above is a one-dimensional array of numbers: it has only one row, so each element is characterised by a single index marking the column it is in. If we increase the number of rows we get a matrix, or two-dimensional array, in which each element has two indices: one for the row number and one for the column number. We could go on to construct three-, four- and higher-dimensional arrays. Multi-dimensional arrays can be constructed from nested lists.
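
As a minimal sketch of the nested-list idea, a matrix can be stored as a list of rows. The name `matrix` and its values are made up for illustration.

```python
# A 2x3 matrix stored as a list of rows (illustrative values)
matrix = [[1, 2, 3],
          [4, 5, 6]]

n_rows = len(matrix)       # number of inner lists
n_cols = len(matrix[0])    # length of the first inner list

print(n_rows, n_cols)      # 2 3
print(matrix[1][2])        # row index 1, column index 2 -> 6
```

Note that nothing in the list itself guarantees that every row has the same length; that has to be checked separately.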

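While lists may work just fine for very simple cases, any manipulation of vectors stored as lists soon becomes counter-intuitive and complicated. Even scaling a vector is not as simple as we would like it to be: as the output of a * 5 above shows, multiplying a list by a number repeats the list instead of scaling its elements.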
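
To make the difficulty concrete, the sketch below scales and adds vectors stored as lists. Every operation has to visit the elements explicitly, here with list comprehensions. The names `scaled`, `b_vec` and `summed` are introduced only for this illustration.

```python
a = [7, 3, 8, 11]

# a * 5 would repeat the list, so each element has to be scaled explicitly
scaled = [5 * x for x in a]
print(scaled)  # [35, 15, 40, 55]

# Adding two vectors element by element also needs a loop or comprehension
b_vec = [1, 1, 1, 1]
summed = [x + y for x, y in zip(a, b_vec)]
print(summed)  # [8, 4, 9, 12]
```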

It is almost always more appropriate to use Numerical Python (NumPy) arrays to represent this kind of data structure. The reasons include faster computation and the ability to operate on whole arrays without writing loops.

There are many ways that we can construct NumPy arrays. One is to create an array from existing data, e.g. by using NumPy's array() function on a list.

In [27]:
import numpy as np

b = [[3, 5, 17], [1, 0, 6], [5, 3, 0]]

# Creates an array based on the list b
b_array = np.array(b)  
print(b_array)

# Creates an array based on a new list
c = np.array([0, 6, 7])
print(c)

[[ 3  5 17]
 [ 1  0  6]
 [ 5  3  0]]
[0 6 7]


Notice how b_array is now automatically formatted as a 3x3 matrix, rather than a nested list (in this case one big list containing three smaller lists). This would not have happened if the rows did not all have the same length.

In [29]:
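# The second row has 4 elements; dtype=object is needed for this to run
# on NumPy 1.24 and later, which otherwise raise a ValueError
a = np.array([[1, 6, 4],
              [3, 8, 1, 1],
              [4, 0, 3],
              [7, 4, 2]], dtype=object)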

print(a)

[list([1, 6, 4]) list([3, 8, 1, 1]) list([4, 0, 3]) list([7, 4, 2])]


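We see that our array a, rather than consisting only of numbers, now consists of four lists. This is because three of the lists have 3 elements, while one has 4. NumPy arrays require every row to have the same length; when that is not the case, NumPy can only store the inner lists as generic Python objects. Note that NumPy 1.24 and later raise a ValueError for such input unless dtype=object is passed explicitly.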
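
One way to check whether an array really is rectangular is to inspect its shape and dtype attributes. The sketch below compares a regular array with a ragged one; the names `regular` and `ragged` are chosen for this example, and dtype=object is passed explicitly so that the code also runs on recent NumPy versions.

```python
import numpy as np

regular = np.array([[3, 5, 17], [1, 0, 6], [5, 3, 0]])
print(regular.shape, regular.dtype)  # shape (3, 3), an integer dtype

# Rows of unequal length: stored as a 1-D array of four Python lists
ragged = np.array([[1, 6, 4], [3, 8, 1, 1], [4, 0, 3], [7, 4, 2]], dtype=object)
print(ragged.shape, ragged.dtype)    # (4,) object
```

A shape with a single entry and an object dtype is a sign that the input was not a proper matrix.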

### Indexing and slicing

Indexing an array is very similar to indexing a list or nested lists. A nested list b is indexed as b[i][j], which will work on an array as well, b_array[i][j]. Arrays can be indexed in an additional way, by using only one pair of square brackets and separating the indices by a comma. This does not work on lists, as nested lists are indexed list by list.

In [38]:
print(b[0][1])  # Second element in the first list

print(b_array[0][1])  # First row, second column in the array
print(b_array[0, 1])

5
5
5
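
The difference between the two indexing styles can be seen by trying the comma form on the nested list itself. This is a small sketch; the variable `list_error` is only there to record what happened.

```python
import numpy as np

b = [[3, 5, 17], [1, 0, 6], [5, 3, 0]]
b_array = np.array(b)

list_error = None
try:
    b[0, 1]  # a tuple of indices is not a valid list index
except TypeError as err:
    list_error = err
    print("TypeError:", err)

print(b_array[0, 1])    # 5
print(b_array[-1, -1])  # 0, negative indices count from the end
```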


Array slicing works in a similar way to list slicing, but is more intuitive. Elements in nested lists have to be considered as a multi-level structure, where an element is in one list, but that whole list is then an element of another list, and so on. This makes indexing and slicing more complicated for lists than it is for arrays.
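
A quick illustration of this point is extracting a single column. With a nested list every inner list has to be visited in turn, whereas the array needs only one slice. The names `second_col_list` and `second_col_array` are made up for this sketch.

```python
import numpy as np

b = [[3, 5, 17], [1, 0, 6], [5, 3, 0]]
b_array = np.array(b)

# Nested list: pick element 1 out of each inner list
second_col_list = [row[1] for row in b]
print(second_col_list)   # [5, 0, 3]

# Array: all rows (:), column 1
second_col_array = b_array[:, 1]
print(second_col_array)  # [5 0 3]
```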

Let us construct a two-dimensional array (a matrix), starting from a one-dimensional array created with the numpy.arange() function.

In [52]:
M = np.arange(0, 16)  # Constructs an array with integer values from 0 to 15

print(M)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]


We created a one-dimensional array with 16 numbers, which we can now reshape into a 4x4 array using the reshape() function. Reshaping must preserve the total number of elements, so for a two-dimensional array the number of rows times the number of columns has to equal 16. That means the only possible two-dimensional shapes of our array are 1x16, 2x8, 4x4, 8x2 and 16x1.

In [53]:
M = np.reshape(M, (4, 4))

print(M)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
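
The constraint on the number of elements can be checked directly. The following sketch tries a valid shape, lets NumPy infer one dimension by passing -1, and shows that an incompatible shape is rejected. The variable `reshape_error` is only used to record the failure.

```python
import numpy as np

M = np.arange(0, 16)

print(M.reshape(2, 8).shape)   # (2, 8)
print(M.reshape(4, -1).shape)  # (4, 4): -1 lets NumPy infer the missing dimension

reshape_error = None
try:
    M.reshape(3, 5)  # 3 * 5 = 15 elements, but M has 16
except ValueError as err:
    reshape_error = err
    print("ValueError:", err)
```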


Let's say we want to take a slice of this array that excludes the first and last rows and the first two columns. With arrays this is easily achieved in a single pair of brackets, using the same [start:stop:step] notation as for lists, once per dimension.

In [54]:
M = M[1:-1, 2:]
print(M)

[[ 6  7]
 [10 11]]
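
For comparison, here is a sketch of the same selection made on a nested list, together with an array slice that uses the step part of the notation. The names `M_list`, `inner_list` and `strided` are introduced for this example only.

```python
import numpy as np

M = np.arange(16).reshape(4, 4)
M_list = M.tolist()  # the same data as a nested list

# Nested list: slice the outer list, then slice every remaining row
inner_list = [row[2:] for row in M_list[1:-1]]
print(inner_list)    # [[6, 7], [10, 11]]

# Array with a step: every second row, every second column starting at column 1
strided = M[::2, 1::2]
print(strided)       # rows 0 and 2, columns 1 and 3
```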


### Pros and cons

One of the reasons behind the speed of NumPy operations is that many of them are not implemented in Python itself, but in compiled languages such as C and Fortran. However, this comes at a price: some very handy features of lists are not available for NumPy arrays.

For example, NumPy arrays cannot be appended to in place the way Python lists can. Suppose you have a list of 5 numbers and want to append 3 more. If the list has run out of reserved space, Python allocates a larger block of memory, copies the 5 existing numbers into it and then writes the 3 new numbers into the remaining positions. Although this copying is potentially expensive, Python lists reserve some spare capacity each time they grow, so most appends are cheap. NumPy instead reserves exactly as much memory as an array of the specified size needs, so the size of an array cannot be changed in place. There are, of course, ways of achieving something similar to list appending (e.g. the numpy.concatenate() function), but they create a new array and copy all the data each time, and they are not as user-friendly as list appending.
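
The sketch below shows the two usual workarounds under these constraints: building a new array with numpy.concatenate(), and, when the final size is known in advance, reserving the full array once and filling it by index. The array names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
extra = np.array([6, 7, 8])

# concatenate returns a new, larger array; x itself is unchanged
y = np.concatenate((x, extra))
print(y)       # [1 2 3 4 5 6 7 8]
print(x.size)  # 5

# If the final size is known, reserve it once and fill by index
z = np.zeros(8, dtype=int)
z[:5] = x
z[5:] = extra
print(z)       # [1 2 3 4 5 6 7 8]
```

Preallocating avoids copying the whole array every time new values arrive, which matters when values are added repeatedly inside a loop.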


### Operations on arrays

Mathematical operations on arrays, unlike on lists, do not require loops but are instead performed on the entire array. This is called **vectorisation**. For example, scaling a vector is now as simple as:

In [61]:
v = np.linspace(0, 9, 10)  # Creates an array of 10 elements, evenly spaced from 0 to 9

print('v = ', v)

print('2v = ', v * 2)

v =  [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
2v =  [ 0.  2.  4.  6.  8. 10. 12. 14. 16. 18.]
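
The same applies to operations between two arrays of equal shape, which are carried out element by element. A short sketch, with `u` and `w` as made-up example vectors:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

print(u + w)         # [5. 7. 9.]
print(u * w)         # [ 4. 10. 18.]  element-wise product, not a dot product
print(np.dot(u, w))  # 32.0, the scalar (dot) product
```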


Furthermore, we can pass a whole array as an argument to functions such as numpy.sin, which returns a new array in which each element is the sine of the corresponding element of the input array.

In [62]:
sin_array = np.sin(v)

print(sin_array)

[ 0.          0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427
 -0.2794155   0.6569866   0.98935825  0.41211849]


This is not limited to NumPy functions: a user-defined function built from operations that arrays support is also applied element by element.

In [63]:
def sqrt(x):
    return x ** 0.5

sqrt_array = sqrt(v)

print(sqrt_array)

[0.         1.         1.41421356 1.73205081 2.         2.23606798
 2.44948974 2.64575131 2.82842712 3.        ]
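
The user-defined function works because the ** operator is defined for arrays. A function that expects a single number, such as math.sqrt from the standard library, does not accept a whole array. The sketch below contrasts the two; `scalar_error` is just a variable used to record the failure.

```python
import math
import numpy as np

v = np.linspace(0, 9, 10)

def sqrt(x):
    return x ** 0.5  # ** is defined for arrays, so this works element-wise

print(np.allclose(sqrt(v), np.sqrt(v)))  # True

scalar_error = None
try:
    math.sqrt(v)  # the standard-library function expects a single number
except TypeError as err:
    scalar_error = err
    print("TypeError:", err)
```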
