In this notebook, we're going to talk about [`numpy`](https://numpy.org/), one of the most commonly used Python libraries for scientific and numerical computing. It provides high performance when working with multidimensional arrays of data.

To use this library, we first need to import it into our runtime environment.

In [None]:
# Installs the library into your Python environment if it doesn't exist already
!pip install numpy
# Loads the library and gives it a name (np) so that you can conveniently reference it in your code
import numpy as np

# A Note on Numbers

`numpy` is primarily intended to work with numerical data of type `int` and/or `float`. However, you might also encounter special values that represent unusual situations. These include the following:

| Value | Name | Usage |
|-------|---------|--------|
| `np.inf` | Mathematical infinity | Outcome of dividing a nonzero number by zero or exceeding the maximum representable value |
| `np.nan` | Not a number | Placeholder for a non-existing value or the outcome of an invalid calculation (e.g., dividing zero by zero) |
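
The cell below is a minimal sketch of how these special values behave in practice. The array `values` is made up for illustration, and `np.isnan()` and `np.nansum()` are standard `numpy` functions that are not covered elsewhere in this notebook.

```python
import numpy as np

# Infinity compares greater than any finite number
inf_is_larger = np.inf > 1e308
print(inf_is_larger)            # True

# NaN is not equal to anything, including itself,
# so use np.isnan() to detect it instead of ==
print(np.nan == np.nan)         # False
print(np.isnan(np.nan))         # True

# A single NaN usually "poisons" the result of a calculation
values = np.array([1.0, np.nan, 3.0])  # illustrative data
total = np.sum(values)
print(total)                    # nan

# np.nansum() skips the NaN entries when adding
clean_total = np.nansum(values)
print(clean_total)              # 4.0
```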

# Creating Arrays

An ***array*** is a way of storing multiple values of the same data type in a structured manner.

Arrays can have one or many ***dimensions***. A one-dimensional (1D) array is a linear sequence of numbers similar to a list. A two-dimensional (2D) array is a grid of numbers similar to a table or spreadsheet. A three-dimensional (3D) array can be imagined as a cube of numbers. We can even have arrays with higher numbers of dimensions.

We can create arrays from scratch by specifying their structure using lists. Here is how we can create 1D, 2D, and 3D `numpy` arrays from scratch:

In [None]:
# 1D arrays
x = np.array([2, 3, 4, 5, 6, 7])
print(x)
print(x.shape)

In [None]:
# 2D arrays
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
print(x)
print(x.shape)

In [None]:
# 3D arrays
x = np.array([[[2, 3, 4, 5, 6, 7],
               [4, 9, 16, 25, 36, 49]],
              [[4, 6, 8, 10, 12, 14],
               [16, 36, 64, 100, 144, 196]]])
print(x)
print(x.shape)

We can keep going and create `numpy` arrays with any number of dimensions, but we'll generally stick with at most three dimensions in this course. Notice how we have been printing the shape of the arrays using the `.shape` attribute. For a 1D array, `.shape` contains a single entry that tells us the number of elements in the array. Once we get to multiple dimensions, `.shape` tells us the size of the array along each ***axis*** (i.e., direction). Going up to 3D, the axes correspond to rows, columns, and depth.
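
As a quick sketch of how shape relates to the number of dimensions, the cell below prints `.ndim`, `.shape`, and `.size` for one array of each kind. The names `a1`, `a2`, and `a3` are illustrative, and `.ndim` and `.size` are standard array attributes that are not discussed elsewhere in this notebook.

```python
import numpy as np

a1 = np.array([2, 3, 4, 5, 6, 7])
a2 = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
a3 = np.zeros((2, 2, 6))

for arr in (a1, a2, a3):
    # .ndim is the number of axes, .shape has one entry per axis,
    # and .size is the total number of elements
    print(arr.ndim, arr.shape, arr.size)
# 1 (6,) 6
# 2 (2, 6) 12
# 3 (2, 2, 6) 24
```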

`numpy` also provides some helper functions to help you generate arrays with a particular structure.

In [None]:
# Array full of zeros
np.zeros((5, 2))

In [None]:
# Array full of ones
np.ones((5, 2))

In [None]:
# Array with linearly spaced values according to the desired number of elements
np.linspace(0, 20, num=5)

In [None]:
# Array with linearly spaced values according to the desired step size
np.arange(0, 20, step=5)
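
One difference between these two helpers is easy to miss: with their default settings, `np.linspace()` includes the stop value while `np.arange()` excludes it. The cell below is a small sketch comparing the two calls from above; the variable names are illustrative.

```python
import numpy as np

by_count = np.linspace(0, 20, num=5)  # 5 evenly spaced values, stop included
by_step = np.arange(0, 20, step=5)    # steps of 5, stop excluded

print(by_count)  # [ 0.  5. 10. 15. 20.]
print(by_step)   # [ 0  5 10 15]
```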

In [None]:
# Array with random values between 0 and 1
# (range can be adjusted by scaling and offsetting)
np.random.rand(2, 6)
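
The scaling and offsetting mentioned in the comment above can be sketched as follows. The bounds `low` and `high` are hypothetical values chosen only for illustration.

```python
import numpy as np

low, high = 5.0, 10.0  # hypothetical range for illustration

# rand() draws values between 0 and 1; multiplying stretches the range
# to the desired width and adding shifts it to start at `low`
samples = low + (high - low) * np.random.rand(2, 6)

print(samples.shape)  # (2, 6)
print(samples.min(), samples.max())
```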

We can also load arrays from files, but we are going to skip that for now since there is another library better suited for doing that.

# Accessing Array Values

We can do many of the same operations that we would do with standard Python lists. For example, we can access and assign values by using indices. Let's start off with a 1D array:

In [None]:
x_1d = np.array([2, 3, 4, 5, 6, 7])

In [None]:
# Retrieve a single value from a 1D array
print(x_1d[0])
print(x_1d[1])
print(x_1d[3])
print(x_1d[-1]) # Same as x_1d[len(x_1d) - 1]

In [None]:
# Retrieve a slice of values from a 1D array
print(x_1d[1:3]) # From index 1 up to but not including index 3
print(x_1d[:2]) # From index 0 up to but not including index 2
print(x_1d[2:]) # From index 2 up to and including the last index

In [None]:
# Modifying a value in a 1D array
x_1d[0] = 30
print(x_1d)

Now let's look at indexing into 2D arrays:

In [None]:
x_2d = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])

In [None]:
print(x_2d[1, 4]) # Index 1 along axis 0, index 4 along axis 1
print(x_2d[0, :]) # Index 0 along axis 0, all of the elements across axis 1
print(x_2d[:, 0]) # All of the elements across axis 0, index 0 along axis 1

This pattern extends to 3D arrays, 4D arrays, etc.
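
As a brief sketch of that extension, the cell below indexes into the 3D array from the "Creating Arrays" section using one index per axis. The names `x_3d`, `single`, `row`, and `front` are illustrative.

```python
import numpy as np

x_3d = np.array([[[2, 3, 4, 5, 6, 7],
                  [4, 9, 16, 25, 36, 49]],
                 [[4, 6, 8, 10, 12, 14],
                  [16, 36, 64, 100, 144, 196]]])

# Index 1 along axis 0, index 0 along axis 1, index 2 along axis 2
single = x_3d[1, 0, 2]
print(single)       # 8

# Index 0 along axis 0, index 1 along axis 1, everything along axis 2
row = x_3d[0, 1, :]
print(row)          # [ 4  9 16 25 36 49]

# Everything along axes 0 and 1, index 0 along axis 2
front = x_3d[:, :, 0]
print(front.shape)  # (2, 2)
```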

# Changing the Shape of Arrays

You will occasionally find yourself in situations where you need to change the dimensions of your array. We won't go over those situations for now, but we will cover how you can use the `.reshape()` method to change the shape of an array.

In [None]:
# Reshaping while keeping the same number of dimensions
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
x.reshape((6, 2))

Notice how we need to know the number of elements ahead of time in order to do this reshaping. Fortunately, we can use `-1` in place of one of the dimensions and `numpy` will figure out the rest of the math for us.

In [None]:
# Reshaping with -1 so that numpy infers the number of rows
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
x.reshape((-1, 2))
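
To illustrate how `-1` is resolved, the sketch below tries a few target shapes on the same 12-element array. It also shows what happens when the requested shape cannot hold exactly 12 elements; the specific shapes are chosen only for illustration.

```python
import numpy as np

x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])  # 12 elements

# -1 is replaced by whatever size keeps the total at 12 elements
shape_a = x.reshape((-1, 2)).shape
shape_b = x.reshape((3, -1)).shape
shape_c = x.reshape((2, 2, -1)).shape
print(shape_a)  # (6, 2)
print(shape_b)  # (3, 4)
print(shape_c)  # (2, 2, 3)

# 12 elements cannot be split evenly into 5 rows, so this raises a ValueError
failed = False
try:
    x.reshape((5, -1))
except ValueError as err:
    failed = True
    print(f'Reshape failed: {err}')
```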

There are also some dedicated methods and functions that we can call to perform common kinds of transformations.

In [None]:
# Reducing the array to a 1D array
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
x.flatten() # x.reshape(-1)

In [None]:
# Expanding the array to a 3D array
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
np.expand_dims(x, axis=2) # x.reshape((2, 6, -1))

In [None]:
# Transposing
x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])
x.transpose()
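
Transposing and reshaping can produce arrays with the same shape but different contents, which is a common source of confusion. The sketch below compares the two on the same array; the names `transposed` and `reshaped` are illustrative.

```python
import numpy as np

x = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])

transposed = x.transpose()    # swaps rows and columns (x.T is a shorthand)
reshaped = x.reshape((6, 2))  # refills a 6x2 grid in the original reading order

print(transposed.shape, reshaped.shape)  # (6, 2) (6, 2)
print(transposed[0])  # [2 4]
print(reshaped[0])    # [2 3]

same = np.array_equal(transposed, reshaped)
print(same)           # False
```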

# Combining Arrays

You might also find situations when you need to combine arrays to suit your needs. For example, you might have each patient's data in an array and want to combine them all together into a single array for the entire dataset. Let's look at the `concatenate()` function:

In [None]:
# Appending two 1D arrays along the "0th axis"
x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])
np.concatenate((x1, x2), axis=0)

In [None]:
# DOESN'T WORK: Appending two 1D arrays along the "1st axis"
x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])
np.concatenate((x1, x2), axis=1)

Notice that we cannot concatenate two 1D arrays to make a 2D array; in other words, `concatenate()` cannot be used to create new dimensions. If we want to do that, we need to create a new dimension in advance. The examples below show different ways of doing this:

In [None]:
# Appending two 2D arrays along the "0th axis"
x1 = np.array([[2, 3, 4, 5, 6, 7]])
x2 = np.array([[4, 9, 16, 25, 36, 49]])
np.concatenate((x1, x2), axis=0)

In [None]:
# Extend the dimensionality of each array using np.newaxis
x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])
np.concatenate((x1[np.newaxis, :], x2[np.newaxis, :]), axis=0)

In [None]:
# Extend the dimensionality of each array using .reshape()
x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])
np.concatenate((x1.reshape(1, -1), x2.reshape(1, -1)), axis=0)
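
As an aside, `numpy` also has a `stack()` function that creates the new dimension for you. It is not used elsewhere in this notebook, so treat the cell below as an optional sketch of an alternative rather than the approach this notebook relies on.

```python
import numpy as np

x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])

# stack() joins arrays along a NEW axis, so 1D inputs produce a 2D output
stacked_rows = np.stack((x1, x2), axis=0)  # each input becomes a row
stacked_cols = np.stack((x1, x2), axis=1)  # each input becomes a column
print(stacked_rows.shape)  # (2, 6)
print(stacked_cols.shape)  # (6, 2)

# The axis=0 result matches the concatenate() examples above
manual = np.concatenate((x1.reshape(1, -1), x2.reshape(1, -1)), axis=0)
print(np.array_equal(stacked_rows, manual))  # True
```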

# Other Operations

`numpy` provides hundreds of methods and functions to help you process the data within an array. We will only go over a small handful of them using our arrays from the previous section:

In [None]:
x1 = np.array([2, 3, 4, 5, 6, 7])
x2 = np.array([4, 9, 16, 25, 36, 49])
x1and2 = np.concatenate((x1.reshape(1, -1), x2.reshape(1, -1)), axis=0)

Let's start with some of the built-in methods:

In [None]:
# Compute the min, max, mean
print(f'Min: {x1.min()}')
print(f'Mean: {x1.mean()}')
print(f'Max: {x1.max()}')

In [None]:
# Find the index of the min and max values
print(f'Argmin: {x1.argmin()}')
print(f'Argmax: {x1.argmax()}')

In [None]:
# Sort the array
# Note: This does not return a new array, but rather sorts the array it was called on
x1.sort()
print(x1)
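
Because `x1` was already in ascending order, the in-place behavior is hard to see in the cell above. The sketch below uses a made-up unsorted array, `values`, and contrasts the `.sort()` method with the `np.sort()` function, which returns a sorted copy instead.

```python
import numpy as np

values = np.array([5, 2, 7, 3])  # illustrative unsorted data

# np.sort() returns a sorted copy and leaves the original untouched
sorted_copy = np.sort(values)
print(sorted_copy)  # [2 3 5 7]
print(values)       # [5 2 7 3]

# The .sort() method rearranges the array in place and returns None
result = values.sort()
print(result)       # None
print(values)       # [2 3 5 7]
```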

There are even more functions that take `numpy` arrays as input and return something useful. Some of these functions accept one or many arrays with the same shape as input, perform a mathematical operation at each array index, and then return a new array with the same shape. These operations are often said to function ***element-wise***.

In [None]:
# Multiplies each element by 2
2 * x1

In [None]:
# Adds elements at the same position across two arrays
x1 + x2

In [None]:
# Calculate sin(x) for each element
np.sin(x1)

There are also functions that accept a single array and return a single value:

In [None]:
# Calculate the sum of all the elements
np.sum(x1)

These functions also work with multidimensional arrays. However, different things will happen depending on whether or not you specify the axis along which you do the operation.

In [None]:
print(f'Sum over all elements: {np.sum(x1and2)}')
print(f'Sum of each column (axis=0): {np.sum(x1and2, axis=0)}')
print(f'Sum of each row (axis=1): {np.sum(x1and2, axis=1)}')
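
One way to keep the `axis` argument straight is to remember that the axis you name is the one that disappears from the result. The sketch below rebuilds the same 2x6 array directly and checks the output shapes; the names `col_sums` and `row_sums` are illustrative.

```python
import numpy as np

x1and2 = np.array([[2, 3, 4, 5, 6, 7], [4, 9, 16, 25, 36, 49]])  # shape (2, 6)

# axis=0 collapses the rows, leaving one sum per column
col_sums = np.sum(x1and2, axis=0)
print(col_sums)        # [ 6 12 20 30 42 56]
print(col_sums.shape)  # (6,)

# axis=1 collapses the columns, leaving one sum per row
row_sums = np.sum(x1and2, axis=1)
print(row_sums)        # [ 27 139]
print(row_sums.shape)  # (2,)

# Both sets of partial sums add up to the overall total
total = np.sum(x1and2)
print(total)           # 166
```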