## Prelude: NumPy

NumPy underpins most scientific computing in Python, and we'll read and write HDF5 data as NumPy arrays. If you already know NumPy, you can skip this section. If not, this is a very brief introduction; there's much more to learn about it elsewhere.

In [1]:
import numpy as np

NumPy lets us work with *arrays* of numbers (and arrays of other data, but numbers are the main use case).

An array has a shape, which can include any number of dimensions. Here's a 2D array:

In [2]:
a = np.arange(30).reshape((5, 6))
a

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29]])

In [3]:
a.shape

(5, 6)
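
Since a shape can have any number of dimensions, it may help to see a couple of other cases. This is a minimal sketch; the names `v` and `cube` are made up for illustration:

```python
import numpy as np

# A 1D array: the shape is a tuple with a single entry
v = np.arange(4)

# A 3D array of zeros: 2 blocks, each with 3 rows and 4 columns
cube = np.zeros((2, 3, 4))

print(v.shape, v.ndim)        # (4,) 1
print(cube.shape, cube.ndim)  # (2, 3, 4) 3
```

The `ndim` attribute is simply the length of the shape tuple.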

Arrays also have a dtype, which describes the type of data in each cell. This array contains 64-bit integers (the default integer type on most platforms):

In [4]:
a.dtype

dtype('int64')
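
You don't have to accept the default dtype. As a rough sketch (the names `small` and `as_float` are only for illustration), you can request a dtype when creating an array, or convert an existing array with `astype`:

```python
import numpy as np

# Ask for 8-bit unsigned integers instead of the default integer type
small = np.arange(5, dtype=np.uint8)

# astype() makes a new array with the values converted to another dtype
as_float = small.astype(np.float32)

print(small.dtype, small.itemsize)        # uint8 1
print(as_float.dtype, as_float.itemsize)  # float32 4
```

Here `itemsize` is the number of bytes used to store each element.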

Mathematical operations on an array are applied to every element, producing a new array as the result. This is much faster than looping through each number in Python:

In [5]:
a / 10

array([[0. , 0.1, 0.2, 0.3, 0.4, 0.5],
       [0.6, 0.7, 0.8, 0.9, 1. , 1.1],
       [1.2, 1.3, 1.4, 1.5, 1.6, 1.7],
       [1.8, 1.9, 2. , 2.1, 2.2, 2.3],
       [2.4, 2.5, 2.6, 2.7, 2.8, 2.9]])
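
To make the comparison with a Python loop concrete, here is a small sketch that computes the same result both ways and checks that they agree. The names `looped` and `vectorised` are made up for this example, and it doesn't measure the speed difference:

```python
import numpy as np

a = np.arange(30).reshape((5, 6))

# Pure-Python version: visit every number one at a time
looped = [[x / 10 for x in row] for row in a.tolist()]

# NumPy version: one expression operates on the whole array
vectorised = a / 10

print(np.allclose(vectorised, looped))  # True

# Operations between two arrays of the same shape also work element by element
doubled = a + a
print(doubled[4, 5])  # 58
```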

You can select a smaller part of the array by slicing some or all of the dimensions:

In [6]:
a[:2, :4]

array([[0, 1, 2, 3],
       [6, 7, 8, 9]])
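
Slices follow the same `start:stop:step` rules as Python lists, applied to each dimension separately. A few more patterns, as an illustrative sketch (the variable names are hypothetical):

```python
import numpy as np

a = np.arange(30).reshape((5, 6))

# A bare ':' keeps the whole dimension, so this is every row of column 1
col = a[:, 1]

# A step of 2 takes rows 0, 2 and 4
every_other = a[::2, :]

# Negative numbers count from the end; omitted dimensions are kept whole
last_two = a[-2:]

print(col)                # [ 1  7 13 19 25]
print(every_other.shape)  # (3, 6)
print(last_two.shape)     # (2, 6)
```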

Or pick out an individual number by giving an index for every dimension:

In [7]:
a[2, 4]

16
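
Indexing and slicing can be mixed, and negative indices count from the end. A brief sketch, with variable names chosen only for illustration:

```python
import numpy as np

a = np.arange(30).reshape((5, 6))

# Negative indices count back from the end of each dimension
last = a[-1, -1]

# Indexing only the first dimension gives a whole row as a 1D array
row = a[2]

print(last)       # 29
print(row)        # [12 13 14 15 16 17]
print(row.shape)  # (6,)
```

Each dimension you index with a single number is dropped from the result, which is why `a[2]` is 1D and `a[2, 4]` is a single number.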