## An introduction to NumPy

NumPy is a set of Python modules for storing and manipulating numeric data.

As with matplotlib, there's a convention to abbreviate it on import:

```python
import numpy as np
```

Do this now.

The fundamental unit of data in NumPy is an array of numerical data. Imagine you want to store some numerical data in Python. This could be:

* A regularly spaced timeseries (1 dimension)
* An image (2 dimensions)
* An MRI scan (3 dimensions)

In each of these cases the data are arranged regularly -- in a list (1D), grid (2D) or volume (3D).

NumPy can handle all of these, and higher dimensions too.
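As a rough illustration of the three cases above, here is a minimal sketch with made-up values. The variable names are hypothetical, `np.zeros` is only introduced properly later in this worksheet, and the `ndim` attribute (which reports the number of dimensions) is not otherwise covered here.

```python
import numpy as np

# Hypothetical example data, made up purely for illustration
timeseries = np.array([0.5, 0.7, 0.6, 0.9])   # 1D: four samples
image = np.array([[0, 255], [255, 0]])        # 2D: 2 rows x 2 columns
scan = np.zeros((4, 3, 2))                    # 3D: a 4 x 3 x 2 volume of zeros

# ndim reports the number of dimensions of each array
print(timeseries.ndim, image.ndim, scan.ndim)
```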

Let's create an example 2D array. There are several ways to create one; first, let's do so from some starting data:

```python
A = np.array([[1, 36],[6, 98],[3, 55]])
```

To display this quickly, you can just type the name of the variable

```python
A
```

at the end of a cell (or as the only thing in the cell) and run it. Try this now.

You'll find that for a lot of Python objects, running e.g.

```python
print(my_stuff)
```

and just

```python
my_stuff
```

have very different results. The first works everywhere, in programs and in interactive environments like Jupyter and the Python prompt. The second only works in interactive environments, but will sometimes produce a much more informative result.
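For NumPy arrays, the difference can be sketched like this. The example assumes the array `A` defined earlier, and uses `repr()` to stand in for what an interactive environment displays when a cell ends with a bare name.

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])

# print() uses str(), which shows just the values
print(A)

# A bare name at the end of a cell is displayed using repr(),
# which also tells you that this is an array
print(repr(A))
```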

## Array operations

As you might expect, you can do maths with these!

For example:

```python
B = np.array([[1, 2],[3, 4],[5, 6]])
A + B
```

Each of add (`+`), subtract (`-`), multiply (`*`) and divide (`/`) works element by element: the operation is applied to each pair of corresponding elements in the two arrays.
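The other three operators can be tried in the same way. This is a small sketch using the same `A` and `B` as above:

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])
B = np.array([[1, 2], [3, 4], [5, 6]])

print(A - B)   # each element of B subtracted from its counterpart in A
print(A * B)   # element-by-element products, not matrix multiplication
print(A / B)   # element-by-element division; the result holds floats
```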

Try this:

```python
C = np.array([[0.1, 0.2],[0.3, 0.4]])
A + C
```

This gives an error that mentions the shapes of the two arrays. Every array has a shape, which can be queried with (e.g.)

```python
A.shape
```

(Note that this doesn't have brackets, as it's not a function but an attribute of A).

Look at the shapes of `A`, `B` and `C`. When something doesn't work in NumPy, it's often useful to look at the shapes of the arrays involved.
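One way to do that check is sketched below. It prints the three shapes and then catches the error from `A + C` so that the message can be read alongside them.

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])
B = np.array([[1, 2], [3, 4], [5, 6]])
C = np.array([[0.1, 0.2], [0.3, 0.4]])

print(A.shape, B.shape, C.shape)   # (3, 2) (3, 2) (2, 2)

try:
    A + C
except ValueError as err:
    # NumPy raises ValueError when the shapes cannot be combined
    print("Could not add A and C:", err)
```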

## Broadcasting

*Broadcasting* is NumPy's term for the rules that decide whether two array shapes are compatible, so that it makes sense to combine them with addition, subtraction and other operations. If two shapes are exactly the same, they trivially match and we can (for example) add each element in one array to its counterpart in the other.

If you give a single numeric value<sup>1</sup> to NumPy then this is "broadcast" to the whole array -- in other words, it will (for example) add that number to each element in the whole array.

Try this:

```python
A + 100
```

<sup>1</sup> .... called a *scalar* in contrast to an array

That's not the only thing we can do with broadcasting though. If the input is a row or column vector, this will be added as if it was repeated along the other axis.

Broadcasting rows:

```python
A + [100, 1000]
```

which will work as if you had added:

```python
array([[ 100, 1000],
       [ 100, 1000],
       [ 100, 1000]])
```

and broadcasting columns:

```python
A + [[10], [100], [1000]]
```

which will work as if you had added:

```python
array([[  10,   10],
       [ 100,  100],
       [1000, 1000]])
```
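To convince yourself that the two forms really are equivalent, you could compare the results directly. This sketch writes out the repeated arrays by hand and uses `np.array_equal`, a NumPy function not otherwise covered in this worksheet, to check that both routes give the same answer.

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])

# The row [100, 1000] written out in full, once per row of A
row_repeated = np.array([[100, 1000], [100, 1000], [100, 1000]])
# The column [[10], [100], [1000]] written out in full, once per column of A
col_repeated = np.array([[10, 10], [100, 100], [1000, 1000]])

print(np.array_equal(A + [100, 1000], A + row_repeated))           # True
print(np.array_equal(A + [[10], [100], [1000]], A + col_repeated)) # True
```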


## Indexing NumPy arrays

The slightly mindbending thing about NumPy indexing is that rows always come before columns. This seems OK, until you find yourself working with images and writing code that looks something like:

```python
my_data[y, x]
```

This comes from the world of maths, where a matrix might be written like this:

$$
\begin{bmatrix}
a_{11}&a_{12}&\cdots &a_{1n} \\
a_{21}&a_{22}&\cdots &a_{2n} \\
\vdots & \vdots & \ddots & \vdots\\
a_{n1}&a_{n2}&\cdots &a_{nn}
\end{bmatrix}
$$

As with (almost) all things Python, rather than starting at 1, the indices start at 0. So for example,

```python
A[0,0]
```

gives the value in the first row and first column of `A`,

```python
A[0,1]
```

gives the value in the first row and second column, and so on.
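A few more lookups on the same example array, as a quick sketch of the row-then-column order:

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])

print(A[0, 0])   # first row, first column: 1
print(A[0, 1])   # first row, second column: 36
print(A[2, 0])   # third row, first column: 3
print(A[2, 1])   # third row, second column: 55
```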

## Slicing and :

Just like normal Python lists, you can use `:` to get a range within an array. For example,

```python
A[0:2,1]
```

will give the first and second rows, in the second column. (Like list slices, `0:2` means from 0 *up to but not including* 2, so it will get row 0 and row 1 but not row 2.)

You can also use `:` on its own to give the whole of a row or column. For example,

```python
A[0:2,:]
```

means "first and second row, all columns".
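Here is a short sketch of these slices on the example array `A`, plus one extra case (a whole column) that follows the same pattern:

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])

print(A[0:2, 1])   # rows 0 and 1, second column
print(A[0:2, :])   # rows 0 and 1, all columns
print(A[:, 0])     # all rows, first column
```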

## Arrays and images

Let's look at this with some images. We can use the `matplotlib` function `imshow()` to plot an array, mapping the values to colours on the screen.

Run the cell below, which does some setup.

```python
%matplotlib inline
# Import NumPy and matplotlib
import numpy as np
import matplotlib.pyplot as plt
# Set a greyscale colourmap (we want white for 0 and black for 1)
plt.set_cmap('Greys')
# Remove scale labels and tick marks from the plot
plt.rc('xtick', color='none')
plt.rc('ytick', color='none')
```

## A blank canvas

The NumPy function `zeros` gives an array filled with zeros of the specified shape. So this:

```python
canvas = np.zeros((100,50))
```

gives an array with 100 rows and 50 columns. Try this now. If you're confused by the double brackets, it might help to have a look at the value of 

```python
canvas.shape
```

Now let's show this canvas:

```python
plt.imshow(canvas, interpolation='none')
```

(The `interpolation='none'` argument is there because we want to see the pixels in the image sharply; otherwise matplotlib may smooth the image to make it more pleasing to the eye.)

Unsurprisingly the image is blank!

Let's set a value in the middle of the canvas to 1 and take a look:

```python
canvas[50,25] = 1
plt.imshow(canvas, interpolation='none')
```

## Exercise: The ends of the Earth

The way that `matplotlib` shows the array might not be what you expect. Use indexing to set the bottom left and top right of the image to black, so it looks like this:

![white rectangle with small black squares at the corners](http://softdev.ppls.ed.ac.uk/static/images/corners.png)

If you need to you can always reset the canvas with:

```python
canvas = np.zeros((100,50))
```

It's useful to remember that NumPy indexing works like this:

```python
canvas[row_number, column_number]
```

## Exercise: Draw lines

Reset your canvas with:

```python
canvas = np.zeros((100,50))
```

Now use slicing to set parts of `canvas` to 1, and:

* draw a horizontal line all the way across the canvas
* draw a vertical line all the way down the canvas

Be sure that you know which one is which.

Remember that when you're indexing, `:` on its own can be used to select all rows or all columns.

## Exercise: More advanced slices

You can also specify a "skip" in your slices. So for example:

```python
my_array[3:50:5,:]
```

means "give me every 5th row, starting at 3, and stopping before 50".
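To see exactly which rows such a slice picks out, it can help to apply it to a list of row numbers. This sketch uses `np.arange`, a NumPy function not otherwise covered in this worksheet, to build a hypothetical 1D array holding the numbers 0 to 49:

```python
import numpy as np

# A 1D array holding the numbers 0, 1, 2, ..., 49
row_numbers = np.arange(50)

# Start at 3, stop before 50, take every 5th element
selected = row_numbers[3:50:5]
print(selected)   # rows 3, 8, 13 and so on, up to 48
```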

Again, reset your canvas. Use slicing to draw even black and white stripes.

## Exercise: dots!

One last exercise: can you use slicing with a skip of 5 to get the image below?

![white rectangle with dots](http://softdev.ppls.ed.ac.uk/static/images/dots.png)

## Showing things the right way up

As you've almost certainly gathered by now, the images you create with `imshow()` have the origin in the top-left corner, and are arranged such that in NumPy, you have to give the `y` coordinate first:

```python
canvas[y,x]
```

It doesn't have to be like this! An array is just data -- and we can choose to interpret it any way we wish. So we could decide that we want the first coordinate to represent columns and the second rows in the image, so we can reference them like this:

```python
canvas[x,y]
```

If we choose this interpretation, we then need to fix it so that `imshow()` shows us the right result. The first step is to transpose the array, which swaps over its rows and columns. Remind yourself of the contents of our example array `A` by just running:

```python
A
```

The transpose of an array is a data attribute (like `shape`), not a method, so it needs no brackets. You can get it with (for example):

```python
A.T
```
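A quick way to check what the transpose has done is to compare shapes and individual elements. This is a small sketch using the example array `A`:

```python
import numpy as np

A = np.array([[1, 36], [6, 98], [3, 55]])

print(A.shape)     # (3, 2): 3 rows, 2 columns
print(A.T.shape)   # (2, 3): rows and columns have swapped

# The element at [row, column] in A is at [column, row] in A.T
print(A[2, 1] == A.T[1, 2])   # True
```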

## Extra Extra Exercise

The worksheet `04Life` defines the framework for a cellular automaton, Conway's Game of Life.