# Numpy

Numpy is a Python library widely used for mathematics, computer vision, and many other applications.

The main data structure in Numpy is the `array`. This tutorial focuses on this important data structure, and covers:
* Importing `numpy`
* Creating arrays
* Accessing values in arrays
* Mathematics with arrays (e.g. multiplication)

First of all we need to import the numpy library. Here I've used the `as` syntax to create an alias for `numpy`. This means we can refer to it using the shorthand `np` in the code.

In [3]:
import numpy as np

A numpy array stores a sequence of elements that all share the same datatype. It is similar to a Python list in that it is an ordered, indexable container, but it addresses some of the shortcomings of lists.

Whilst lists are one-dimensional, numpy arrays can have any number of dimensions, so they can represent vectors, matrices or tensors.

Unlike lists, numpy arrays have a set datatype and will only contain elements of that type. This is more restrictive, but means that arrays are more memory-efficient and performing operations on them is faster.
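
As a minimal sketch of the fixed-datatype point above (the variable names are illustrative and are not used elsewhere in this tutorial): when a list mixes ints and floats, numpy converts every element to one common type, and each element then occupies the same number of bytes.

```python
import numpy as np

# A Python list can hold values of different types side by side.
mixed_list = [1, 2.5, 3]

# Converting to an array upcasts every element to one common dtype.
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)     # float64

# Every element takes the same number of bytes.
print(mixed_array.itemsize)  # 8 bytes per float64 element
print(mixed_array.nbytes)    # 24 bytes for 3 elements
```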

Arrays can be created in a number of ways:

In [18]:
# Creating from python lists:
vector = np.array([0,1,2,3])
matrix = np.array([ [1, 0, 0, 0],
                    [0, 1, 0, 0],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1]])

# Making arrays of ones or zeros
ones = np.ones([10,10])
zeros = np.zeros([3,5,2])

# Making a 4x4 identity matrix
# Note the identity matrix function is called `eye`
identity = np.eye(4)

# Making random arrays
rand_array = np.random.random([5,2]) # Drawn from uniform random distribution on [0,1)
rand_array2 = np.random.normal(size=[3,3,3]) #Drawn from a standard normal distribution (mean 0, stdev 1)
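
A few other creation functions are often useful alongside the ones above. This is a short sketch rather than a complete list, and the variable names are illustrative.

```python
import numpy as np

# Evenly stepped integers: start, stop (exclusive), step
evens = np.arange(0, 10, 2)
print(evens)            # [0 2 4 6 8]

# A fixed number of evenly spaced points, with both endpoints included
points = np.linspace(0.0, 1.0, 5)
print(points)

# An array of a given shape filled with one value
sevens = np.full([2, 3], 7)
print(sevens.shape)     # (2, 3)
```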

Arrays are indexed in the same way as lists, using square brackets. When the array has more than one dimension, the indices are separated by commas.

In [19]:
a = vector[2]
print(a)
b = matrix[3,3]
print(b)

2
1


Slices also work on numpy arrays, just like lists.

In [20]:
print(vector[1:])
print(matrix[:,2])

[1 2 3]
[0 0 1 0]
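
One difference from lists deserves a quick sketch: slicing a list makes a copy, whereas a basic slice of a numpy array is a view onto the same data, so writing to the slice also changes the original. The names below are illustrative only.

```python
import numpy as np

original = np.array([0, 1, 2, 3])

view = original[1:]      # a basic slice is a view, not a copy
view[0] = 99             # writing through the view...
print(original)          # [ 0 99  2  3]  ...changes the original too

independent = original[1:].copy()  # call .copy() for an independent array
independent[0] = -1
print(original)          # [ 0 99  2  3]  unchanged this time
```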


Numpy also supports an extra indexing option, `...` (the Ellipsis), which stands in for as many full slices (`:`) as are needed to cover the remaining dimensions. This is handy when you have lots of dimensions:

In [2]:
big_array = np.ones([5,5,5,5,5])

slice_1 = big_array[...,0] # Keeps all of the first 4 dimensions of big_array, and takes the first entry along the last dimension
print(slice_1.shape) # This will be (5,5,5,5) as we only sampled one entry from the last dimension

slice_2 = big_array[:2, ...] # You can also use ... at the end of the slice. This takes the first 2 entries from the first dimension of the array.
print(slice_2.shape) # This will be (2,5,5,5,5) as we took 2 entries in the first dimension.

(5, 5, 5, 5)
(2, 5, 5, 5, 5)
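
To make the meaning of `...` concrete, here is a small sketch on a 3-dimensional array (the name `demo` is illustrative). It checks that `...` gives the same result as writing out the `:` slices by hand.

```python
import numpy as np

demo = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# `...` expands to as many `:` as are needed to cover the missing dimensions.
print(np.array_equal(demo[..., 0], demo[:, :, 0]))   # True
print(np.array_equal(demo[1, ...], demo[1, :, :]))   # True

# It can also sit in the middle of an index.
print(demo[0, ..., 1].shape)                         # (3,)
```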


You can find out the dimensions of an array using the `shape` property.

In [21]:
print(rand_array.shape)
print(rand_array.shape[1]) # shape is a tuple, so you can access elements of it

(5, 2)
2
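
Alongside `shape`, arrays have a couple of related properties, and the shape can be changed with `reshape`. A brief sketch, using an illustrative array named `table`:

```python
import numpy as np

table = np.zeros([5, 2])
print(table.ndim)   # 2, the number of dimensions
print(table.size)   # 10, the total number of elements

# reshape returns the same elements arranged in a new shape
reshaped = table.reshape(2, 5)
print(reshaped.shape)   # (2, 5)

# One dimension can be given as -1 and numpy works it out
flat = table.reshape(-1)
print(flat.shape)       # (10,)
```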


Arrays have a set datatype, and all elements of an array must be of the same type. However, we can convert arrays from one type to another.

In [22]:
print(vector.dtype) # vector was made from a list of ints, so it gets numpy's default integer type (int32 here; int64 on many platforms)
float_vector = vector.astype(np.float32) # but we can convert it to an array of floats
print(float_vector.dtype)

int32
float32
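
Converting in the other direction, from floats to ints, discards the fractional part instead of rounding. A minimal sketch with illustrative names:

```python
import numpy as np

float_values = np.array([1.7, -1.7, 2.0])

# astype returns a new array; the fractional part is dropped (truncation toward zero)
int_values = float_values.astype(np.int32)
print(int_values)      # [ 1 -1  2]

# The original array is left unchanged
print(float_values.dtype)   # float64
```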


The names of numpy's types start with the kind of number they hold, like `int`, `uint` (unsigned int), `float` or `complex`. They end in an integer that says how many bits each item in the array takes up.

For example, `np.float32` is a 32-bit (single-precision) floating point datatype. `np.uint8` is an 8-bit unsigned integer datatype.
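
The bit width matters in practice because it limits the values a type can hold. The sketch below (with illustrative names) checks the sizes of two types and shows that `uint8` array arithmetic wraps around when a result exceeds 255.

```python
import numpy as np

print(np.dtype(np.float32).itemsize)  # 4 bytes = 32 bits
print(np.dtype(np.uint8).itemsize)    # 1 byte = 8 bits

# uint8 can only hold 0..255, so array arithmetic wraps around on overflow.
pixels = np.array([250, 5], dtype=np.uint8)
step = np.array([10, 10], dtype=np.uint8)
wrapped = pixels + step
print(wrapped)                        # [ 4 15]
```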

Mathematical operations work on arrays, and all of the standard operations `+ - * /` work element-wise.

This means they're equivalent to running through the arrays in a for loop and applying the operation to each pair of elements, like this:

In [23]:
array_1 = np.array([1,1,1])
array_2 = np.array([2,2,2])

# You could multiply these arrays element-wise with a for loop like this:
result = np.zeros([3])
for i in range(array_1.shape[0]):
    result[i] = array_1[i] * array_2[i]
print(result)

# But using the operator * is quicker to write, and runs faster too:
result = array_1 * array_2

[2. 2. 2.]
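
The other operators behave the same way. As a short sketch (names are illustrative), note that `/` produces floats even when both inputs are integer arrays:

```python
import numpy as np

left = np.array([1, 2, 3])
right = np.array([2, 2, 2])

print(left + right)   # [3 4 5]
print(left - right)   # [-1  0  1]

quotient = left / right
print(quotient)       # [0.5 1.  1.5]
print(quotient.dtype) # float64
```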


Note that `array_1 * array_2` *does not* perform matrix multiplication. For this you must use `np.matmul()`, or the `@` operator.

In [2]:
array_1 = np.eye(3)
array_2 = np.ones([3,3])*2.0

# This will do element-wise multiplication, and give a diagonal matrix.
print("Multiply\n", array_1 * array_2)

# This does proper matrix multiplication with an identity matrix, so will just output array_2.
print("Matrix multiply\n", np.matmul(array_1, array_2))

# The @ operator does the same thing, but is a bit quicker to write:
print("Alternative matrix multiply\n", array_1 @ array_2)

Multiply
 [[2. 0. 0.]
 [0. 2. 0.]
 [0. 0. 2.]]
Matrix multiply
 [[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 2.]]
Alternative matrix multiply
 [[2. 2. 2.]
 [2. 2. 2.]
 [2. 2. 2.]]
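
The difference is easier to see with non-square matrices. In this sketch (illustrative names), matrix multiplication of a 2x3 by a 3x4 array works and gives a 2x4 result, while element-wise multiplication of the same arrays fails because their shapes do not match.

```python
import numpy as np

mat_a = np.ones([2, 3])
mat_b = np.ones([3, 4])

product = mat_a @ mat_b
print(product.shape)    # (2, 4)
print(product[0, 0])    # 3.0, the sum of three 1*1 terms

# Element-wise * needs compatible shapes, so this raises a ValueError.
try:
    mat_a * mat_b
    elementwise_failed = False
except ValueError as error:
    elementwise_failed = True
    print("Element-wise multiply failed:", error)
```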


### Note: Broadcasting

So far we have applied operations like multiplication and division to arrays with exactly the same dimensions. In some cases, you can apply them to arrays with different dimensions, and numpy will repeat elements as necessary to apply the operation.

For example, if you multiply an array of shape `[1]` with an array of shape `[3]`:


In [4]:
array_a = np.array([1,2,3])
array_b = np.array([2])

print(array_a * array_b) # Here array_b is repeated 3 times and then multiplied with array_a.

[2 4 6]


This can be very handy - suppose you have an RGB image, which is stored in an array of shape `[128, 128, 3]`. You want to add a bit to the red channel of the image. You can do this easily:

In [7]:
my_image = np.ones([128, 128, 3]) # This is of shape [128, 128, 3]
color_offset = np.array([0.1, 0, 0]) # This is of shape [3]
my_image += color_offset # This adds dimensions to the color_offset array and repeats 128 times on the first 2 dimensions, then adds to the image.
print(my_image.shape)

(128, 128, 3)


In this case, numpy first added two leading dimensions to `color_offset` (treating it as shape `[1, 1, 3]`), then repeated it along those two dimensions to match the image, and finally performed the addition.
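
If you want to see what broadcasting does to an array, `np.broadcast_to` performs the expansion explicitly. The sketch below uses illustrative names and checks that adding the expanded array gives the same result as relying on broadcasting.

```python
import numpy as np

image = np.ones([128, 128, 3])
offset = np.array([0.1, 0, 0])

# Explicitly expand the offset to the image's shape (what broadcasting does implicitly).
expanded = np.broadcast_to(offset, image.shape)
print(expanded.shape)   # (128, 128, 3)

# Adding the expanded array matches the broadcast result.
same = np.array_equal(image + offset, image + expanded)
print(same)             # True
```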

Be careful - broadcasting can be very useful, but can sometimes give confusing results, or hide errors where you unintentionally multiply the wrong things. 
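
As an example of a confusing result, here is a sketch (with illustrative names) where adding a row of 3 numbers to a column of 3 numbers silently produces a 3x3 array rather than 3 sums. Shapes that cannot be matched at all raise an error.

```python
import numpy as np

row = np.array([1, 2, 3])               # shape (3,)
column = np.array([[10], [20], [30]])   # shape (3, 1)

# You might expect 3 sums, but broadcasting builds a 3x3 table of all pairs.
total = row + column
print(total.shape)    # (3, 3)

# Shapes that cannot be matched raise an error instead of broadcasting.
try:
    np.ones([3]) + np.ones([4])
    broadcast_failed = False
except ValueError as error:
    broadcast_failed = True
    print("Cannot broadcast:", error)
```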

For more information on broadcasting, see the [numpy documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html).

# Further Reading

See ["Numpy for Absolute Beginners"](https://numpy.org/doc/stable/user/absolute_beginners.html) if you'd like to see the concepts from this notebook introduced in a different way. For more in-depth information, see [Numpy Fundamentals](https://numpy.org/doc/stable/user/basics.html).

If you're experienced with MATLAB, you might want to check out the [Numpy for MATLAB users guide](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html).