# Numpy Workshop

### Australian Synchrotron - 15/12/2016

## Resources

[Numpy Seminar Notebook](https://github.com/AustralianSynchrotron/intro-numpy-seminar/blob/master/index.ipynb)

## Prerequisites

### Software
- Python 3.5
- Jupyter notebook
- Alternatively: Text editor or Python IDE

### Python Packages (via conda install or pip install)
- numpy
- matplotlib

## The Basics

It is common practice to import `numpy` under the alias `np`, which keeps later references to the package short.

In [1]:
import numpy as np

The main data structure you will deal with in numpy is the homogeneous, fixed-size, multidimensional **array**. All of its elements (usually numbers) have the same type, which lets numpy store them compactly and process them in fast compiled loops. This is the main reason why it is so fast.

Numpy's array class is called **ndarray**. It is usually created with the function `numpy.array`, so you will often see it referred to simply as an **array**.

**Please note**: `numpy.array` is not the same as the Python standard library class `array.array`, which only handles one-dimensional arrays and offers less functionality.

There are several ways to create an array. For example, you can create a numpy array from a standard Python list:

In [3]:
np_array_1 = np.array([1.0, 4.0, 9.0])
print(np_array_1)

[ 1.  4.  9.]


The type of the resulting array is deduced from the type of the elements in the list.
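
As a minimal sketch of that deduction (the variable names are illustrative and not part of the seminar notebook), the `dtype.kind` code shows which family of types numpy picked:

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers -> integer array
floats = np.array([1.0, 2.0, 3.0])  # all floats -> floating point array
mixed = np.array([1, 2.5, 3])       # integers are upcast to float

# dtype.kind is 'i' for signed integers and 'f' for floating point;
# the exact width (e.g. int64 vs int32) depends on the platform.
print(ints.dtype.kind, floats.dtype.kind, mixed.dtype.kind)
```

Mixing integers and floats yields a floating point array because every element must share one type.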

**Task**: Create a numpy array from a mixed-type Python list, for example by mixing floating point numbers and strings. Check the content of the array.

The most commonly used attributes for inspecting the properties of an array are:

**`ndarray.ndim`**
the number of axes (dimensions) of the array.

**`ndarray.shape`**
the dimensions of the array, as a tuple giving the size of the array along
each dimension. For example, a matrix with n rows and m columns has the shape `(n, m)`.

**`ndarray.size`**
the total number of elements of the array.

**`ndarray.dtype`**
an object describing the type of the elements in the array.

Applied to our simple array:

In [4]:
np_array_1.ndim

1

**Task**: Try out the other attributes.
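
For comparison, here is a sketch of all four attributes on a small two-dimensional array. The array `grid` is a hypothetical example and not part of the seminar notebook:

```python
import numpy as np

# A hypothetical 2 x 3 array built from a list of lists
grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(grid.ndim)   # 2
print(grid.shape)  # (2, 3)
print(grid.size)   # 6
print(grid.dtype)  # float64
```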

As stated above, an `ndarray` has a fixed size once it is created. Unlike a Python list, it has no `append` method:

In [5]:
np_array_1.append(5.0)

AttributeError: 'numpy.ndarray' object has no attribute 'append'
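
If you do need a longer array, one option is the function `np.append`, which leaves the original untouched and returns a new array instead. A minimal sketch (the name `extended` is illustrative):

```python
import numpy as np

np_array_1 = np.array([1.0, 4.0, 9.0])

# np.append copies the data into a new, longer array
extended = np.append(np_array_1, 5.0)

print(extended)          # the new array with four elements
print(np_array_1.shape)  # (3,) - the original is unchanged
```

Because every call copies the whole array, appending repeatedly in a loop tends to be slow. Collecting values in a Python list and converting once is usually the better choice.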

## Array creation methods

When creating an array, you can specify the element type with the `dtype` parameter. A complete list of the available types can be found here: https://docs.scipy.org/doc/numpy/reference/arrays.scalars.html

In [7]:
np.array([1, 2, 3], dtype='float16')

array([ 1.,  2.,  3.], dtype=float16)
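
The choice of `dtype` determines how much memory each element occupies. The following sketch, with illustrative variable names, inspects this through the `itemsize` attribute and converts the array to another type with `astype`:

```python
import numpy as np

half = np.array([1, 2, 3], dtype='float16')
print(half.itemsize)   # 2 bytes per element

# astype returns a converted copy; the original array keeps its dtype
double = half.astype('float64')
print(double.dtype, double.itemsize)   # float64 8
print(half.dtype)                      # float16
```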

In order to create a 2-dimensional numpy array, supply a list of lists to the `array` function:

In [8]:
np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype='float16')

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]], dtype=float16)

**Task**: What does a 3-dimensional array look like? Can you create one?

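Often the size of an array is known in advance, but its elements are not. For this case, numpy offers functions that create arrays with placeholder content: `zeros` and `ones` fill the array with 0 or 1, `empty` leaves the memory uninitialized (so its contents are arbitrary), and `identity` creates a square identity matrix: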

In [9]:
np.zeros((3,4))

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

In [10]:
np.ones((2,3,4))

array([[[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]],

       [[ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.],
        [ 1.,  1.,  1.,  1.]]])

In [11]:
np.empty((2,3))

array([[  2.58082344e-316,   6.94391548e-310,   5.39111785e-317],
       [  5.39113366e-317,   6.94387721e-310,   6.94390685e-310]])

In [15]:
np.identity(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
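
The values printed for `np.empty` above are whatever happened to be in memory, so such an array should be filled before it is read. As a sketch with illustrative names, the array can be filled in place with its `fill` method, or created with a chosen value in one step with `np.full`:

```python
import numpy as np

buffer = np.empty((2, 3))   # contents are arbitrary until assigned
buffer.fill(7.0)            # overwrite every element in place

filled = np.full((2, 3), 7.0)   # same result in a single call

print(np.array_equal(buffer, filled))   # True
```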

To create sequences of numbers you can use the `arange` function:

In [12]:
np.arange(10, 30, 5)

array([10, 15, 20, 25])

In [13]:
np.arange(0, 2, 0.3)

array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8])

**Task**: Play around with the values and check the output.

When `arange` is used with floating point arguments, it is generally not possible to predict the number of elements due to the finite floating point precision. Thus, it is usually better to use the function `linspace` that takes the number of elements that we want as an argument:

In [14]:
np.linspace(0, 2, 9)

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])
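
The following sketch illustrates the difference. The end points 1 and 1.3 are an arbitrary choice for illustration. Because the number of elements `arange` produces depends on a rounded floating point division, the count is printed rather than assumed, whereas `linspace` returns exactly the requested number of elements and includes the end point by default:

```python
import numpy as np

# The length of this array depends on how (1.3 - 1) / 0.1 is rounded,
# so it may contain 3 or 4 elements.
stepped = np.arange(1, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the number of elements explicitly and includes the end point.
spaced = np.linspace(1, 1.3, 4)
print(len(spaced), spaced)
```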

In [17]:
np.logspace(-2, 2, 5)

array([  1.00000000e-02,   1.00000000e-01,   1.00000000e+00,
         1.00000000e+01,   1.00000000e+02])
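
`logspace` works like `linspace`, but on a logarithmic scale: the first two arguments are exponents of the base (10 by default) rather than the end points themselves. A short sketch of that relationship, with illustrative variable names:

```python
import numpy as np

powers = np.logspace(-2, 2, 5)            # 10**-2 ... 10**2, five elements
by_hand = 10.0 ** np.linspace(-2, 2, 5)   # the same values built explicitly

print(np.allclose(powers, by_hand))   # True
```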

## Printing Arrays

Numpy prints arrays as follows:

- the last axis is printed from left to right
- the second-to-last is printed from top to bottom
- the rest are also printed from top to bottom, with each slice separated from the next by an empty line

This means that 1-dimensional arrays are printed as rows, 2-dimensional arrays as matrices and 3-dimensional arrays as lists of matrices.

In order to illustrate this, we will use the function `reshape`, which gives a new shape to the array without changing its data:

In [18]:
np.arange(12).reshape(4, 3)

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [19]:
np.arange(24).reshape(2, 3, 4)

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
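
Two details of `reshape` are worth a quick sketch (variable names are illustrative): one dimension may be given as `-1`, in which case numpy works it out from the total size, and the original array keeps its own shape.

```python
import numpy as np

flat = np.arange(12)
table = flat.reshape(3, -1)   # -1 lets numpy work out the remaining dimension

print(table.shape)   # (3, 4)
print(flat.shape)    # (12,) - the original array keeps its shape

# For a contiguous array like this one, reshape returns a view on the same data.
print(np.shares_memory(flat, table))   # True
```

Since the two arrays share their data here, assigning to an element of `table` also changes `flat`.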

If the array is too large to be printed, numpy will skip the central part of the array and only print the corners:

In [20]:
np.arange(10000)

array([   0,    1,    2, ..., 9997, 9998, 9999])
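
The point at which numpy starts to abbreviate is controlled by a print threshold. As a sketch, the function `np.array2string` accepts a `threshold` argument, which makes the effect easy to check without changing any global settings (the variable names are illustrative):

```python
import sys
import numpy as np

values = np.arange(2000)

# By default, arrays with more than 1000 elements are summarized with '...'
summary = np.array2string(values)

# Raising the threshold makes numpy format every element
complete = np.array2string(values, threshold=sys.maxsize)

print('...' in summary)    # True
print('...' in complete)   # False
```

`np.set_printoptions` accepts the same `threshold` argument and applies it to all subsequent printing.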

## Arithmetic operations

Arithmetic operations are applied elementwise, with a new array being created and filled with the result:

In [26]:
a = np.array([20, 30, 40, 50])
b = np.arange(4)

print(a)
print(b)

[20 30 40 50]
[0 1 2 3]


In [27]:
c = a - b
print(c)

[20 29 38 47]


In [28]:
b**2

array([0, 1, 4, 9])

In [30]:
10 * np.sin(a)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [31]:
np.log(a)

array([ 2.99573227,  3.40119738,  3.68887945,  3.91202301])

In [32]:
a < 35

array([ True,  True, False, False], dtype=bool)
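
The cells above can be condensed into one short sketch that checks two things: the operands are left untouched by an arithmetic operation, and a comparison produces a boolean array whose `True` entries can be counted. The name `mask` is illustrative:

```python
import numpy as np

a = np.array([20, 30, 40, 50])
b = np.arange(4)

c = a - b          # a new array; a and b are left untouched
mask = a < 35      # comparisons are elementwise too and give booleans

print(a)                        # [20 30 40 50]
print(mask.dtype)               # bool
print(np.count_nonzero(mask))   # 2
```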

Unlike in many matrix languages, the product operator `*` operates elementwise on numpy arrays:

In [34]:
A = np.array([[1,1],
              [0,1]])

B = np.array([[2,0],
              [3,4]])
A * B

array([[2, 0],
       [0, 4]])

If you would like to use the matrix product, use numpy's `dot` function:

In [35]:
np.dot(A, B)

array([[5, 4],
       [3, 4]])
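
With Python 3.5, the version listed in the prerequisites, the `@` operator can be used as a shorthand for the same matrix product, assuming numpy 1.10 or later. The sketch below also shows that, unlike the elementwise product, the matrix product depends on the order of its operands:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

print(A @ B)   # same result as np.dot(A, B)
print(B @ A)   # the matrix product is not commutative, so this differs
```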