## Numpy

`Numpy` is the main way to work with numerical data in Python. Plain Python can be slow for numerical work (explicit `for` loops over many elements, for example). Numpy avoids this problem by running efficient compiled C code behind the scenes.
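
To make the speed argument concrete, here is a minimal sketch that computes the same sum of squares twice: once with an explicit Python loop and once with a single Numpy expression. The function names `loop_sum_of_squares` and `numpy_sum_of_squares` and the array size are illustrative choices, not part of Numpy.

```python
import numpy as np

# 100,000 integers; int64 is requested explicitly so the result cannot overflow
values = np.arange(100_000, dtype=np.int64)

def loop_sum_of_squares(arr):
    """Visit every element in an interpreted Python loop."""
    total = 0
    for x in arr:
        total += int(x) * int(x)
    return total

def numpy_sum_of_squares(arr):
    """Square and sum all elements in compiled code."""
    return int(np.sum(arr * arr))

# Both approaches give the same answer; only the running time differs
assert loop_sum_of_squares(values) == numpy_sum_of_squares(values)
```

In a notebook, timing each call with the `%timeit` magic should show the Numpy version running much faster, though the exact ratio depends on the machine.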

Numpy was developed to be Matlab-like, so many functions share the same names. However, there are important differences. I will try to point them out here, but a good reference is the Numpy guide for Matlab users: https://numpy.org/doc/stable/user/numpy-for-matlab-users.html
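
As a small illustration of such differences, the sketch below shows two that commonly trip up Matlab users: indexing starts at 0, and `*` multiplies element by element while `@` performs matrix multiplication. The variable names are illustrative, and the Matlab equivalents in the comments are given only for comparison.

```python
import numpy as np

m = np.asarray([[1, 2],
                [3, 4]])

# Indexing starts at 0 and uses square brackets (Matlab: m(1, 1))
first = m[0, 0]          # 1

# * multiplies element by element (Matlab: m .* m)
elementwise = m * m      # [[1, 4], [9, 16]]

# @ is matrix multiplication (Matlab: m * m)
matrix_product = m @ m   # [[7, 10], [15, 22]]
```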

Also note that much of this tutorial is a pared-down version of Jake VanderPlas' excellent guide *Python Data Science Handbook*, which is available free online here: https://jakevdp.github.io/PythonDataScienceHandbook

Let's import the numpy package.

Numpy is conventionally renamed to `np` on import. Nearly all Numpy code and documentation uses this alias, so it is good practice to follow it.

In [1]:
import numpy as np

### Arrays

The basic unit of numpy is the **numpy array**. Like a `list`, it is a container of objects. Unlike a list, all of its elements share a single type (usually a numeric one), and this is part of what makes it efficient to use.

We can make a numpy array from a list of numbers as follows:

In [2]:
a = np.asarray([10, 20, 30])
a

array([10, 20, 30])

Arrays have several important properties:
+ `size` is the total number of elements in the array.
+ `ndim` is the number of dimensions of the array.
+ `shape` is a tuple giving the number of elements along each dimension.
+ `dtype` is the type of element in the array. Here are the most common types:
    + `int64`: 64-bit integers
    + `float64`: double precision floating point numbers
    + `bool`: boolean
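
The element type can also be chosen explicitly or converted later. The following is a minimal sketch using the `dtype` argument of `np.asarray` and the `astype` method; the variable names are illustrative.

```python
import numpy as np

# Request a specific element type when creating the array
ints = np.asarray([1, 2, 3], dtype=np.int64)
floats = np.asarray([1, 2, 3], dtype=np.float64)
flags = np.asarray([True, False, True])

print(ints.dtype, floats.dtype, flags.dtype)  # int64 float64 bool

# Mixing integers and floats in one list gives a float64 array
mixed = np.asarray([1, 2.5])
print(mixed.dtype)                            # float64

# astype returns a new array converted to the requested type
converted = ints.astype(np.float64)
print(converted.dtype)                        # float64
```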

In [3]:
a.size

3

In [4]:
a.ndim

1

In [5]:
a.shape

(3,)

In [6]:
a.dtype

dtype('int64')

We can also look at the size of the array in memory using the `nbytes` property:

In [7]:
f'{a.nbytes} bytes'

'24 bytes'

Let's make a multidimensional array and compare:

In [8]:
b = np.asarray([[10.0, 11.0, 12.0],
                [20.0, 21.0, 22.0],
                [30.0, 31.0, 32.0]])
b

array([[10., 11., 12.],
       [20., 21., 22.],
       [30., 31., 32.]])

In [9]:
b.size

9

In [10]:
b.ndim

2

In [11]:
b.shape

(3, 3)

In [12]:
b.dtype

dtype('float64')

In [13]:
f'{b.nbytes} bytes'

'72 bytes'

We can see that this array has two dimensions with three elements each. The array takes up more space in memory because it has more elements: 9 instead of 3. The element type makes no difference here, since `int64` and `float64` values both occupy 8 bytes each.
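
The memory figures can be checked with `itemsize`, the number of bytes per element: `nbytes` is `size * itemsize`. In the sketch below, `a` is created with an explicit `dtype=np.int64` (an assumption made so the result does not depend on the platform default), and both `int64` and `float64` elements turn out to occupy 8 bytes.

```python
import numpy as np

a = np.asarray([10, 20, 30], dtype=np.int64)
b = np.asarray([[10.0, 11.0, 12.0],
                [20.0, 21.0, 22.0],
                [30.0, 31.0, 32.0]])

print(a.itemsize, b.itemsize)            # 8 8
print(a.nbytes == a.size * a.itemsize)   # True
print(b.nbytes == b.size * b.itemsize)   # True

# A smaller element type halves the memory for the same number of elements
b32 = b.astype(np.float32)
print(b32.nbytes)                        # 36
```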

### Indexing into numpy arrays

Indexing into numpy arrays is very similar to indexing into lists, except we have to deal with the dimensions of the array.

In [14]:
b

array([[10., 11., 12.],
       [20., 21., 22.],
       [30., 31., 32.]])

In [15]:
b[0, 0]

10.0

In [16]:
b[1, 0]

20.0

In [17]:
b[2, 0]

30.0

In [18]:
b[0, 1]

11.0

In [19]:
b[0, 2]

12.0

As with lists, we can use colons to select several elements along a dimension using `array[start:stop:step]`.

We can also omit start, stop, and step for convenience. For example:

In [20]:
b[0, 0:3:1]

array([10., 11., 12.])

This is the same as:

In [21]:
b[0, :]

array([10., 11., 12.])

We can do it for the other dimensions as well:

In [22]:
b[:, 0]

array([10., 20., 30.])

In [23]:
b[:, 1]

array([11., 21., 31.])

You can omit trailing colons: indexing only the first dimension selects everything along the remaining dimensions.

In [24]:
b[0]

array([10., 11., 12.])

In [25]:
b[0, :]

array([10., 11., 12.])

You can also reverse the elements of an array, as with lists, by using a negative step:

In [26]:
b[0]

array([10., 11., 12.])

In [27]:
b[0, ::-1]

array([12., 11., 10.])

You can also slice along two dimensions at once:

In [28]:
b

array([[10., 11., 12.],
       [20., 21., 22.],
       [30., 31., 32.]])

In [29]:
b[::2, ::2]

array([[10., 12.],
       [30., 32.]])

In [30]:
b[:2, :2]

array([[10., 11.],
       [20., 21.]])

### Modifying existing arrays

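As with lists, you can modify elements of an array using indexing. A single value is assigned to every selected element, and a sequence of matching length is assigned element by element.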

In [31]:
b

array([[10., 11., 12.],
       [20., 21., 22.],
       [30., 31., 32.]])

In [32]:
b[0, 0] = 9
b

array([[ 9., 11., 12.],
       [20., 21., 22.],
       [30., 31., 32.]])

In [33]:
b[:, 2] = 2
b

array([[ 9., 11.,  2.],
       [20., 21.,  2.],
       [30., 31.,  2.]])

In [34]:
b[:, 2] = [1, 2, 3]
b

array([[ 9., 11.,  1.],
       [20., 21.,  2.],
       [30., 31.,  3.]])
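
One caveat when modifying arrays: unlike a list slice, which is a copy, a basic Numpy slice is a view onto the same data, so writing to the slice also changes the original array. The sketch below illustrates this with hypothetical variable names; call `.copy()` when an independent array is needed.

```python
import numpy as np

# A list slice is a copy: changing it leaves the original list alone
numbers = [10, 20, 30]
list_slice = numbers[:2]
list_slice[0] = -1
print(numbers)        # [10, 20, 30]

# An array slice is a view: changing it also changes the original array
arr = np.asarray([10, 20, 30])
view = arr[:2]
view[0] = -1
print(arr)            # [-1 20 30]

# An explicit copy is independent of the original
other = np.asarray([10, 20, 30])
independent = other[:2].copy()
independent[0] = -1
print(other)          # [10 20 30]
```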

You can add a new dimension of length 1 to the array by indexing with `np.newaxis`:

In [35]:
b.shape

(3, 3)

In [36]:
b[:, :, np.newaxis].shape

(3, 3, 1)
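
The position of `np.newaxis` in the index decides where the new dimension goes. As a short sketch with an illustrative 1-D array `v`, the same vector can be turned into either a column or a row:

```python
import numpy as np

v = np.asarray([1, 2, 3])

column = v[:, np.newaxis]   # new axis last: one column
row = v[np.newaxis, :]      # new axis first: one row

print(column.shape)         # (3, 1)
print(row.shape)            # (1, 3)

# np.newaxis is simply an alias for None
print(np.newaxis is None)   # True

# reshape gives the same result when the target shape is known
print(np.array_equal(column, v.reshape((3, 1))))  # True
```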

#### Reshaping arrays

In [37]:
c = np.asarray([[0, 0, 0],
                [1, 1, 1]])
c

array([[0, 0, 0],
       [1, 1, 1]])

In [38]:
c.shape

(2, 3)

In [39]:
c.reshape((3, 2))

array([[0, 0],
       [0, 1],
       [1, 1]])

In [40]:
c.reshape((3, 2)).shape

(3, 2)

In [41]:
c.reshape((6,))

array([0, 0, 0, 1, 1, 1])

In [42]:
c.flatten()

array([0, 0, 0, 1, 1, 1])
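
A few details of reshaping are worth a sketch: the new shape must hold exactly the same number of elements, one dimension may be given as `-1` to let Numpy work it out, and `flatten` always returns a copy. The variable names below are illustrative.

```python
import numpy as np

c = np.asarray([[0, 0, 0],
                [1, 1, 1]])

# -1 asks numpy to infer that dimension from the total size (6 / 3 = 2)
print(c.reshape((3, -1)).shape)   # (3, 2)

# A shape with a different total number of elements is rejected
try:
    c.reshape((4, 2))
except ValueError:
    print("cannot reshape 6 elements into shape (4, 2)")

# flatten returns a copy, so changing it leaves c untouched
flat = c.flatten()
flat[0] = 99
print(c[0, 0])                    # 0
```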

### Joining arrays

Use the functions `np.concatenate` and `np.stack` to join arrays together.

Notice that both take an `axis` keyword argument that specifies the dimension along which the arrays are joined.

In [48]:
a = np.asarray([1, 1, 1])
b = np.asarray([0, 0, 0])

In [49]:
a

array([1, 1, 1])

In [50]:
b

array([0, 0, 0])

In [51]:
a.shape

(3,)

In [52]:
b.shape

(3,)

In [54]:
c = np.concatenate((a, b), axis=0)

c

array([1, 1, 1, 0, 0, 0])

In [55]:
c.shape

(6,)

In [61]:
# np.concatenate((a, b), axis=1)  # would raise AxisError: 1-D arrays have no axis 1

In [56]:
np.concatenate((a[:, np.newaxis], b[:, np.newaxis]), axis=1)

array([[1, 0],
       [1, 0],
       [1, 0]])

In [57]:
np.concatenate((a[:, np.newaxis], b[:, np.newaxis]), axis=1).shape

(3, 2)
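
With two-dimensional inputs the effect of `axis` is easier to see. In this sketch, `x` and `y` are illustrative 2 x 3 arrays: joining along axis 0 adds rows, and joining along axis 1 adds columns.

```python
import numpy as np

x = np.zeros((2, 3))
y = np.ones((2, 3))

# axis=0: the arrays are placed one below the other
rows_joined = np.concatenate((x, y), axis=0)
print(rows_joined.shape)   # (4, 3)

# axis=1: the arrays are placed side by side
cols_joined = np.concatenate((x, y), axis=1)
print(cols_joined.shape)   # (2, 6)
```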

`np.stack` is like `np.concatenate`, but it joins the arrays along a new axis rather than an existing one.

In [58]:
np.stack((a, b), axis=0)

array([[1, 1, 1],
       [0, 0, 0]])

In [59]:
np.stack((a, b), axis=1)

array([[1, 0],
       [1, 0],
       [1, 0]])
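
The relationship between the two functions can be checked directly. As a sketch, stacking along an axis gives the same result as adding that axis by hand with `np.newaxis` and then concatenating along it:

```python
import numpy as np

a = np.asarray([1, 1, 1])
b = np.asarray([0, 0, 0])

# Stack as columns: new axis in position 1
stacked_cols = np.stack((a, b), axis=1)
manual_cols = np.concatenate((a[:, np.newaxis], b[:, np.newaxis]), axis=1)
print(np.array_equal(stacked_cols, manual_cols))   # True

# Stack as rows: new axis in position 0
stacked_rows = np.stack((a, b), axis=0)
manual_rows = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]), axis=0)
print(np.array_equal(stacked_rows, manual_rows))   # True
```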

### Universal Functions

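Universal functions (ufuncs) such as `np.log` apply an operation to every element of an array at once. Arithmetic operators such as `*` and `-` work element by element in the same way.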

In [62]:
a

array([1, 1, 1])

In [63]:
np.log(a)

array([0., 0., 0.])

In [64]:
a * 3

array([3, 3, 3])

In [65]:
a - 3

array([-2, -2, -2])
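
Element-wise operations also work between two arrays of the same shape. The sketch below uses illustrative arrays `x` and `y` and compares `np.sqrt` with the equivalent explicit Python loop.

```python
import math
import numpy as np

x = np.asarray([1.0, 4.0, 9.0])
y = np.asarray([10.0, 20.0, 30.0])

# Operators combine the arrays element by element
print(x + y)           # [11. 24. 39.]
print(x * y)           # [ 10.  80. 270.]

# The operators are shorthand for ufuncs such as np.add
print(np.array_equal(x + y, np.add(x, y)))   # True

# A ufunc replaces an explicit loop over the elements
roots = np.sqrt(x)
roots_loop = [math.sqrt(value) for value in x]
print(roots)           # [1. 2. 3.]
```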