Numpy
====================



Now that we have a basic grasp of Python, we can move to one of the most useful (and widely used) Python packages:
[NumPy](https://numpy.org). Numpy lets us efficiently manipulate "arrays" of numbers (e.g. vectors, matrices) without needing explicit loops, in a way that is similar to scientific computation packages such as Matlab.


## Numpy (number crunching)



Numpy is a numerical computation package that revolves around one important
object: an `array`. It is a convention to import numpy as follows:



In [1]:
import numpy as np

In English: "import the `numpy` package and (for brevity) refer to it with the
identifier `np`".

We will refer to numpy specific objects with the `np` namespace hereafter, so
for example we will use `np.array` (internally the type of this object is
actually `np.ndarray`, but this syntax is hardly used unless we want to test the
type of an object).
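As a quick check of the distinction above, the following minimal sketch (the variable names are only illustrative) shows that `np.array` is the function we call to build an array, while `np.ndarray` is the type of the object we get back.

```python
import numpy as np

v = np.array([0, 1, 2, 3])

# np.array is the function we call; np.ndarray is the type of what it returns
print(type(v))
is_array = isinstance(v, np.ndarray)
is_list = isinstance(v, list)
print(is_array, is_list)
```

This is the kind of type test where the `np.ndarray` name is typically needed.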

Technically the `np.array` behaves similarly to a Python `list`, but
it is definitely a `list` on steroids and specifically tailored to operate on numbers.

Let's start by creating an array from a list:



In [2]:
x = np.array([0, 1, 2, 3])
print(x)

[0 1 2 3]


We can use this array object similarly to lists, so for example



In [3]:
print(len(x)) # Query its length
print(x[1:-1]) # Slice it

4
[1 2]


etc.

Things get more interesting if we create multidimensional arrays. Let's start by
automatically creating a **2d** array (say 5 rows, 10 columns) filled with zeros.
This can be done with



In [4]:
x = np.zeros((5, 10))
print(x)

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


As you can see, the input to the function is a tuple (or a list) with the number
of elements along each dimension of the array. The format for a 2d array is
`(rows, columns)`, so here we have an array (filled with zeros) of 5 rows and 10
columns. Imagine this as a list of lists, but with additional functionality that
we will see soon.

Note that we are not limited to 2d arrays. For example, an R,G,B image could be
represented as a `(rows, columns, 3)` array, where the last dimension has size 3
and each of its entries represents one channel of the image... more on this later
when we look at plotting.

Now if we use the `len` function on the previously created array, we will just
know the number of **rows**



In [5]:
print(len(x))

5


Again, this is like taking the `len` of a "list of lists", which would simply
give us the number of lists (i.e. the number of rows in the array).

To actually know both the number of rows and the number of columns we can use
the `shape` property of `np.array`:



In [6]:
rows, cols = x.shape

and to know only one or the other we could do



In [7]:
rows = x.shape[0]
cols = x.shape[1]

This can be interpreted as the rows being the "height" of the array and the
columns the "width". But note that the order is rows first and columns after.
This has to do with how the elements of the array would ideally be stored in the
computer's memory.
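To make the role of `shape` concrete, here is a small sketch comparing a 2d and a 3d array. It also prints `ndim` and `size`, two further array attributes that are not otherwise used in this section; the array names are only for illustration.

```python
import numpy as np

grid = np.zeros((5, 10))      # 2d array: 5 rows, 10 columns
image = np.zeros((4, 6, 3))   # 3d array: 4 rows, 6 columns, 3 channels

# shape: size along each dimension, ndim: number of dimensions,
# size: total number of elements
print(grid.shape, grid.ndim, grid.size)
print(image.shape, image.ndim, image.size)

# len only reports the size of the first dimension (the rows)
print(len(image))
```

Note that `len` keeps returning only the first entry of `shape`, regardless of how many dimensions the array has.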

Alternatively we can create a 1d array by passing a single number instead of a
tuple. Let's say we want an array containing `1.0` ten times; we can use the
`np.ones` function instead:



In [8]:
y = np.ones(10)
y

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

and this has exactly the same syntax as `np.zeros`.

Another very frequently used function is `np.linspace(start, end, num)`. This
gives us an array with `num` equally spaced elements between the number `start`
and the number `end`, e.g.:



In [9]:
np.linspace(-1, 1, 9)

array([-1.  , -0.75, -0.5 , -0.25,  0.  ,  0.25,  0.5 ,  0.75,  1.  ])

and since `np.array` is iterable we can concisely write a for loop doing things
(here quite pointless) with these numbers:



In [10]:
txt = ''
for t in np.linspace(-1, 1, 9):
    txt += str(t+10) + ' '
print(txt)

9.0 9.25 9.5 9.75 10.0 10.25 10.5 10.75 11.0 


Arrays are a very convenient representation to store the vectors we have seen in
Week 3, and allow us to easily do operations on those.



#### Random number generation



Numpy also has a powerful random number generation submodule, `numpy.random`.
Because we imported Numpy as `np`, we can directly access the functionality of this module with `np.random`.
We can for example generate a uniform random number within a range with the `np.random.uniform(min, max)` function:



In [11]:
np.random.uniform(-100, 100)

13.531505255673991

Or we can generate arrays by specifying the shape similarly to the `np.zeros` function, e.g.



In [12]:
np.random.uniform(-10, 10, (5, 2))

array([[-3.77988965, -8.26130979],
       [ 5.50020222,  2.43260077],
       [-5.45919219, -9.19517769],
       [-3.76811803,  2.05573769],
       [-4.3194171 ,  1.26570007]])

for a 2d array, or



In [13]:
np.random.uniform(-1, 1, 10)

array([ 0.65701107, -0.90477405, -0.69358584,  0.40301229,  0.30739354,
        0.45820759,  0.67942518,  0.91952451,  0.02171611, -0.92444478])

for a 1d array.

If we want to always get the same random values, we can set a number as a **seed**
to the random number generator. For example, running this code multiple times
will always return the same two random sequences:



In [14]:
np.random.seed(100)
print(np.random.uniform(-1, 1, 3))
print(np.random.uniform(-1, 1, 5))

[ 0.08680988 -0.44326123 -0.15096482]
[ 0.68955226 -0.99056229 -0.75686176  0.34149817  0.65170551]
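The effect of the seed can also be checked within a single run. The sketch below (variable names are illustrative) resets the seed before each draw and compares the results with `np.array_equal`, a NumPy function that tests whether two arrays have the same shape and values.

```python
import numpy as np

np.random.seed(100)
first = np.random.uniform(-1, 1, 3)

np.random.seed(100)              # reset the generator to the same state
second = np.random.uniform(-1, 1, 3)

np.random.seed(101)              # a different seed gives a different sequence
third = np.random.uniform(-1, 1, 3)

same = np.array_equal(first, second)
print(same)
print(np.array_equal(first, third))
```

Without the second call to `np.random.seed(100)`, the generator would simply continue its sequence and `second` would differ from `first`.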


### Indexing and slicing arrays



Numpy arrays are iterable objects and behave similarly to lists: using the
exact syntax we used to index and slice lists will result in the same behaviour.
For instance the following will give us the first 3 rows of the array



In [15]:
x[:3]

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

while this will give us the third row



In [16]:
x[2]

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
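Since `x` only contains zeros, it can be hard to see which rows were actually selected. The following sketch repeats the same two operations on a small array with distinct values (the array `m` is only for illustration).

```python
import numpy as np

# Illustrative 3x4 array whose entries encode their position:
# the value at (row, col) is 10*row + col
m = np.array([[ 0,  1,  2,  3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

first_two_rows = m[:2]   # rows 0 and 1, still a 2d array
third_row = m[2]         # row 2, returned as a 1d array

print(first_two_rows)
print(third_row)
```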

Things start to differ from lists if we want to access a single element:



In [17]:
x[0,2]

0.0

this could also be done with the same syntax that would work with the "list of lists":



In [18]:
x[0][2]

0.0

but the former is more concise and more frequently used. We then can also use
the slicing syntax in a similar way, say to get the last three columns of the
array we can do



In [19]:
x[:,-3:]

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

where the `:` alone can be read as "give me all the rows".
Now, what would you do to get only the first three rows?
Insert the code here:



### Adding elements, combining and "transposing" arrays



One downside of `np.array` with respect to a `list` is that we cannot really use
`append` to add elements dynamically. Well, that is not a problem for a "true
Pythonista", who would probably disdain doing so. We can however use the
`np.concatenate` function to concatenate multiple arrays, or even simple Python
lists. For example, if we wanted to add a `1.0` to a 1d array of 3 zeros we
could do



In [27]:
y = np.zeros(3)
print(y)
y = np.concatenate([y, [1.0]])
print(y)

[0. 0. 0.]
[0. 0. 0. 1.]


We can concatenate any number of arrays, e.g. let's make an array that looks
like `[0.0, 0.0, 1.0, 1.0, 0.0, 0.0]`:



In [28]:
np.concatenate([np.zeros(2), np.ones(2), np.zeros(2)])

array([0., 0., 1., 1., 0., 0.])

Now let's say we want to concatenate 2d arrays or combine 1d arrays to create a
2d array. Here we can use the `np.hstack` and `np.vstack` functions, which
respectively concatenate arrays in the "horizontal" and "vertical" directions.
Observe this code:



In [29]:
np.vstack([np.ones(3), np.ones((2, 3))])

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

Note here that for NumPy, a 1d array is a "row" of elements (that can get a bit
ambiguous in certain cases, but we will not cover it here).

Similarly, we can stack arrays in the horizontal direction



In [30]:
A = np.hstack([np.ones((2, 1)), np.zeros((2, 2))])
A

array([[1., 0., 0.],
       [1., 0., 0.]])

These functions will only work if the sizes of the arrays we want to combine are
"compatible", i.e. we can only stack horizontally if arrays have the same number
of rows, and we can only stack vertically if arrays have the same number of columns.
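The compatibility rule can be seen in a short sketch. The arrays `a`, `b` and `c` are made up for illustration; when the sizes do not match, NumPy raises a `ValueError`, which we catch here so the example runs to the end.

```python
import numpy as np

a = np.ones((2, 3))
b = np.zeros((2, 1))   # same number of rows as `a`
c = np.zeros((1, 3))   # same number of columns as `a`

print(np.hstack([a, b]).shape)   # side by side
print(np.vstack([a, c]).shape)   # on top of each other

# Incompatible sizes: `c` has 1 row while `a` has 2, so hstack fails
try:
    np.hstack([a, c])
    failed = False
except ValueError as err:
    failed = True
    print("Cannot stack:", err)
```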

Now what if I wanted to append a column to `A` with the values `[2, 3]`? Doing
this will not work because, as we previously mentioned, a "1d array" (a list is
equivalent to one), is considered as a row:



In [31]:
#np.hstack([A, [2, 3]])

One can use this kind of syntax



In [32]:
np.hstack([A, np.array([2, 3]).reshape(-1,1)])

array([[1., 0., 0., 2.],
       [1., 0., 0., 3.]])

and [others](https://stackoverflow.com/questions/5954603/transposing-a-1d-numpy-array), or use an operation known as "transpose", which can be handy also in
other cases.



#### Transposing



For 2d arrays we can do a transpose operation, a term that comes from "matrices"
in linear algebra, which are also represented as 2d grids of numbers. Indeed 2d
numpy arrays are a convenient representation of these mathematical objects, which
are fundamental to most machine learning techniques we will be using. We won't
cover these mathematical details but the interested reader can refer for
example to [https://www.statlect.com/matrix-algebra/](https://www.statlect.com/matrix-algebra/) for a primer.

Anyhow, transposing an array simply means "transforming" it so rows become
columns and columns become rows. Say we create an array as follows:



In [33]:
B = np.vstack([np.linspace(0, 4, 5),
               np.linspace(1, 5, 5)])
B

array([[0., 1., 2., 3., 4.],
       [1., 2., 3., 4., 5.]])

The transpose is given by the "property" `.T` as follows:



In [34]:
B.T

array([[0., 1.],
       [1., 2.],
       [2., 3.],
       [3., 4.],
       [4., 5.]])
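A short sketch of what the transpose does to the shape and to individual elements, using the same `B` as above (the indices `i` and `j` are arbitrary choices for illustration):

```python
import numpy as np

B = np.vstack([np.linspace(0, 4, 5),
               np.linspace(1, 5, 5)])

print(B.shape)     # (rows, columns) of the original
print(B.T.shape)   # the two sizes are swapped

# The element at (row i, column j) of B sits at (row j, column i) of B.T
i, j = 1, 3
print(B[i, j], B.T[j, i])
```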

Now to add a column to the previous `A` array, we could do:



In [35]:
np.vstack([A.T, [2, 3]]).T

array([[1., 0., 0., 2.],
       [1., 0., 0., 3.]])
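As a sanity check, the sketch below confirms that the reshape and transpose approaches give the same result. It also tries `np.column_stack`, a NumPy function not covered in this section that treats 1d inputs as columns, as one further alternative.

```python
import numpy as np

A = np.hstack([np.ones((2, 1)), np.zeros((2, 2))])
new_col = [2, 3]

via_reshape = np.hstack([A, np.array(new_col).reshape(-1, 1)])
via_transpose = np.vstack([A.T, new_col]).T
via_column_stack = np.column_stack([A, new_col])  # 1d input becomes a column

print(via_column_stack)
print(np.array_equal(via_reshape, via_transpose))
print(np.array_equal(via_reshape, via_column_stack))
```

Which form to prefer is mostly a matter of readability; all three produce a `(2, 4)` array here.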

### Reshaping arrays
In machine learning applications we will often encounter cases where we need to reshape a numpy array so it is compatible with the inputs of a given machine learning model.
For example, we will see image-based models that are trained on batches of multiple RGB images with shape `(height, width, 3)`, which are stored as a single array with shape `(number_of_images, height, width, 3)`. If we want to use such a model with a single image, we need to reshape an array by adding a dummy dimension. This can be done with the `np.expand_dims` function or the `reshape` method of the array itself. As long as we add *one* dimension, the order of the data in the array will remain unchanged.
So for example given an array:

In [36]:
img = np.zeros((600, 800, 3))
print(img.shape)

(600, 800, 3)


We can equivalently do:

In [37]:
img2 = img.reshape((1, 600, 800, 3))
print(img2.shape)

(1, 600, 800, 3)


Or

In [38]:
img2 = np.expand_dims(img, 0)
print(img2.shape)

(1, 600, 800, 3)


In another instance we will see grayscale images that are loaded as arrays of shape `(height, width)`. In this case we will need to add two dummy dimensions, one at the beginning and one at the end, resulting in a shape of `(number_of_images, height, width, 1)`. With a single image, this stands for "one image of size height by width with one channel". Again, since we only add dimensions of size one, we are not modifying the number of elements in the array. So in this instance, given an image

In [39]:
img = np.zeros((600, 800))
img.shape

(600, 800)

We would do:

In [40]:
img2 = img.reshape((1, 600, 800, 1))
img2.shape

(1, 600, 800, 1)
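The same result can be obtained with two calls to `np.expand_dims`. The sketch below uses a much smaller stand-in image (the sizes are arbitrary) and checks that the number of elements does not change. It also shows that `reshape` accepts `-1` for one dimension, in which case that size is inferred from the others.

```python
import numpy as np

img = np.zeros((6, 8))   # small stand-in for a (height, width) grayscale image

by_reshape = img.reshape((1, 6, 8, 1))
by_expand = np.expand_dims(np.expand_dims(img, 0), -1)

print(by_reshape.shape, by_expand.shape)
print(img.size, by_reshape.size)   # number of elements is unchanged

# -1 asks NumPy to infer that dimension from the number of elements
inferred = img.reshape((1, -1, 8, 1))
print(inferred.shape)
```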

### Mathematical operations on arrays



Ok, now to some more "meaty" things we can do with arrays. Differently from
lists, we can do operations with arrays such as multiplication, addition, etc.

Multiplying/dividing/adding/subtracting/raising-to-a-power with a single number
applies the operation to all elements in an array. So now, finally, we can make
an array all of 9's!



In [41]:
np.ones((5,3))*9

array([[9., 9., 9.],
       [9., 9., 9.],
       [9., 9., 9.],
       [9., 9., 9.],
       [9., 9., 9.]])

As a small exercise, try creating the same array with addition.

We can also apply other operations in "batch" form, for example let's take the square root of
a sequence of numbers:



In [42]:
np.sqrt(np.linspace(0, 7, 3))

array([0.        , 1.87082869, 2.64575131])

See [https://numpy.org/doc/stable/reference/routines.math.html](https://numpy.org/doc/stable/reference/routines.math.html) for a list of
available operations.

As another example we can very rapidly get the values of a
cosine wave with an expression such as



In [43]:
np.cos(np.linspace(0, np.pi*2, 20))

array([ 1.        ,  0.94581724,  0.78914051,  0.54694816,  0.24548549,
       -0.08257935, -0.40169542, -0.67728157, -0.87947375, -0.9863613 ,
       -0.9863613 , -0.87947375, -0.67728157, -0.40169542, -0.08257935,
        0.24548549,  0.54694816,  0.78914051,  0.94581724,  1.        ])

...we will plot this one soon.

As long as two arrays have the same shape, we can also
multiply/add/subtract/divide **between them**, e.g:



In [44]:
np.linspace(0, 1, 5) + np.linspace(-2, -1, 5)

array([-2. , -1.5, -1. , -0.5,  0. ])

Or



In [45]:
np.array([[1, 2, 3],
          [4, 2, 4],
          [3, 3, 5]])*np.eye(3)

array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 5.]])

where `np.eye(num)` gives us a square array with `num` columns and ones along
the diagonal. Again these operations are element-wise and, as used here, require
the arrays to have the same shape. When dealing with 2d arrays, this form of
multiplication is "unusual", as typically 2d arrays represent matrices and
multiplication between matrices is not element-wise and obeys specific rules and
restrictions (it is fundamental to artificial neural networks). If you want to
sound sophisticated with nerd friends, this element-wise form of multiplication
has a special name: it is called the ["Hadamard product"](https://en.wikipedia.org/wiki/Hadamard_product_(matrices)) (say it with a serious
face while stroking your chin), which also shows how math sometimes sounds
scarier than it actually is.
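To see the difference between the two kinds of multiplication, here is a minimal sketch on two small matrices chosen only for illustration. The `*` operator is the element-wise product discussed above, while the `@` operator (not otherwise covered in this section) performs matrix multiplication.

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])

elementwise = P * Q   # Hadamard product: multiply matching entries
matrix = P @ Q        # matrix product: rows of P combined with columns of Q

print(elementwise)
print(matrix)
```

With this particular `Q`, the matrix product swaps the two columns of `P`, while the element-wise product zeroes out its diagonal.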



### More operations on arrays



Numpy provides some useful functions to examine and manipulate arrays. The
`np.max` and `np.min` functions give us the maximum and minimum value of an
array (or even a list). For example



In [46]:
z = np.random.uniform(-10, 10, 10) # generate an array with 10 random values between -10 and 10
print(z)
print('Maximum is: ' + str(np.max(z)))

[-7.26586821  1.50186659  7.82643909 -5.81595756 -6.29343561 -7.83246219
 -5.60605015  9.57247569  6.23366298 -6.56117975]
Maximum is: 9.572475694147393


This could be used for example to normalize the array



In [47]:
(z - np.min(z)) / (np.max(z) - np.min(z))

array([0.03255363, 0.53630348, 0.89968154, 0.11585819, 0.08842471,
       0.        , 0.12791841, 1.        , 0.80816865, 0.07304148])
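The normalization expression above can be wrapped in a small helper. This is only a sketch: the function name `normalize` is made up for illustration, and it assumes the values are not all equal (otherwise the denominator would be zero).

```python
import numpy as np

def normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = np.min(values), np.max(values)
    return (values - lo) / (hi - lo)   # assumes hi > lo (not all values equal)

z = np.array([-5.0, 0.0, 2.5, 10.0])
n = normalize(z)
print(n)
print(np.min(n), np.max(n))
```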

Another couple of particularly useful functions are `np.argmin` and `np.argmax`. These give us **the index** of the minimum and maximum value in an array (or a list).
One example use of these functions is in a classification setting. Say our classifier gives us a series of probabilities, one for each class. We can concisely find the most probable class (as an index) with



In [48]:
p = [0.3, 0.5, 0.1, 0.1]
print(p)
print("Maximum is " + str(np.argmax(p)))

[0.3, 0.5, 0.1, 0.1]
Maximum is 1


Note that here we used a simple list and not an array.
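In practice the index is usually mapped back to a class name. In the sketch below the `labels` list is hypothetical and only serves to illustrate the lookup.

```python
import numpy as np

labels = ["cat", "dog", "bird", "fish"]   # hypothetical class names
p = [0.3, 0.5, 0.1, 0.1]                  # one probability per class

best = np.argmax(p)    # index of the largest probability
worst = np.argmin(p)   # index of the smallest; ties return the first occurrence

print(labels[best], p[best])
print(labels[worst])
```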

One more useful function is `np.sum`. It sums all the values in an array and returns the result.
E.g.



In [49]:
z = np.linspace(0, 5, 4)
print(z)
print(np.sum(z))

[0.         1.66666667 3.33333333 5.        ]
10.0


For multi-dimensional arrays it is possible to also sum along rows, or columns, etc., by specifying the `axis` optional argument:



In [50]:
z = np.ones((3, 4))
print(z)
print(np.sum(z, axis=0)) # collapse the rows: one sum per column
print(np.sum(z, axis=1)) # collapse the columns: one sum per row

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[3. 3. 3. 3.]
[4. 4. 4.]


In this case the result is an array and not a number. The same "axis" trick also holds for `np.min` and `np.max` and many other functions.
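For example, a small sketch applying the `axis` argument to `np.max` and `np.argmax` (the `scores` array is made up for illustration):

```python
import numpy as np

scores = np.array([[1, 9, 3],
                   [7, 2, 8]])

print(np.max(scores))             # largest value overall
print(np.max(scores, axis=0))     # one maximum per column
print(np.max(scores, axis=1))     # one maximum per row
print(np.argmax(scores, axis=1))  # column index of each row's maximum
```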

