In [1]:
import numpy as np

## NumPy Array

The main object in NumPy is the multi-dimensional array data structure called `ndarray`, or simply `array`. It differs from the standard Python list, tuple, and `array.array` objects in that it offers far more functionality. This notebook lists the more important concepts and functions involving `ndarray` that we are going to use in this course, so that you can quickly refer back to it when you need to. For a more complete introduction to NumPy, please refer to the official quickstart page at https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html.

To simplify the discussion, we are going to refer to the `ndarray` object simply as array. 

## Contents:

* [Array Creation](#array-creation)
* [Basic Array Operations](#Basic-array-operations)


## Array creation

Main reference: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.creation.html

There are two main ways to create an array: we can convert a Python list or tuple into one, or we can construct one from scratch using one of the array-creation functions.

Conversion from other Python structures is straightforward:

In [2]:
a = np.array([1,2,3,4])
a

array([1, 2, 3, 4])

Otherwise, we can use one of many array-creation functions, which are mostly self-explanatory:
* `np.ones([m,n])` - creates an array of 1s
* `np.zeros([m,n])` - creates an array of 0s
* `np.full([m,n], x)` - creates an array filled with `x`
* `np.empty([m,n])` - creates an array without initialising the values, marginally faster but can contain garbage
* `np.identity(m)` - creates an m x m identity matrix

The first four functions above create an array with m rows and n columns. You can add more dimensions by adding more values to the list. For example:

In [3]:
a = np.zeros([2,3,3])
a

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

The first four functions mentioned above also have a 'like' version (`np.ones_like`, `np.zeros_like`, `np.full_like`, `np.empty_like`) that takes an existing array as an input parameter and constructs a new array with the same shape and data type. This is useful if you are going to be working with several arrays of the same shape.

In [4]:
a = np.zeros([3,3])
b = np.full_like(a,9)
b

array([[9., 9., 9.],
       [9., 9., 9.],
       [9., 9., 9.]])
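
Note that the 'like' functions reuse the data type of the input array as well as its shape, which is why the 9s above are floats. A minimal sketch illustrating this (the names `template`, `ones` and `sevens` are made up for this example):

```python
import numpy as np

# A template array whose shape and dtype we want to reuse.
template = np.zeros([2, 3], dtype=np.int16)

ones = np.ones_like(template)       # same shape and dtype, filled with 1
sevens = np.full_like(template, 7)  # same shape and dtype, filled with 7

print(ones.shape, ones.dtype)
print(sevens)
```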

To create an identity matrix, you can use the `identity` function:

In [5]:
a = np.identity(3)
a

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

or you can use the more general `eye` function (see https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.eye.html). I have personally not found much use for this function.
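
For reference, here is a short sketch of what makes `eye` more general than `identity`: it accepts a separate number of columns and a diagonal offset `k`. The variable names are chosen for this example only.

```python
import numpy as np

# Non-square case: 1s on the main diagonal of a 3 x 4 array.
rect = np.eye(3, 4)

# k shifts the diagonal: k=1 is the first diagonal above the main one.
upper = np.eye(3, k=1)

print(rect)
print(upper)
```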

You can specify the type of the elements inside the array using the `dtype` parameter:

In [6]:
a = np.full([2,3],5,dtype=np.complex64)
a

array([[5.+0.j, 5.+0.j, 5.+0.j],
       [5.+0.j, 5.+0.j, 5.+0.j]], dtype=complex64)

In [7]:
a = np.identity(3,dtype=np.int16)
a

array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]], dtype=int16)

The basic data types in NumPy can be found here: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.types.html. 

We are not going to care much about data types for this part of the course, but it is an important topic because it can help speed up your computation. For a more thorough discussion on data types, you can refer to https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html.
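
As a rough illustration of why the data type matters, the sketch below compares the memory used by the same values stored as 64-bit and 32-bit floats. The array names and the size of 1000 elements are arbitrary choices for this example.

```python
import numpy as np

big = np.ones(1000, dtype=np.float64)
small = np.ones(1000, dtype=np.float32)

# itemsize is the number of bytes per element, nbytes the total for the array.
print(big.itemsize, big.nbytes)      # 8 8000
print(small.itemsize, small.nbytes)  # 4 4000

# astype returns a converted copy; the original array is left unchanged.
as_int = big.astype(np.int32)
print(as_int.dtype, big.dtype)
```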

### Array creation using numerical ranges

The most common functions are:
* `np.arange(start, stop, step)` - creates an array of evenly spaced values, `step` apart, from `start` up to but not including `stop`
* `np.linspace(start, stop, num)` - as above, but specifies the number of elements instead of the step (and `stop` is included by default)
* `np.logspace(start, stop, num, base)` - creates an array of `num` elements evenly spaced on a log scale (from base^start to base^stop)
* `np.geomspace(start, stop, num)` - creates an array of `num` elements evenly spaced on a log scale (from start to stop)

I have provided some examples below to show how these functions work, in particular `logspace` and `geomspace`, which are easier to understand by example.
See the reference at https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.array-creation.html if you need more details. Each of these functions accepts more input parameters, but in my experience the ones listed above are the most commonly used. `np.arange()` is often used to quickly generate a list of integers, for example:

In [8]:
np.arange(21) # note that 21 is not included in the list

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20])

The step parameter can be a floating point number:

In [9]:
np.arange(0,5,step=0.5)

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])

and doesn't even need to divide the range evenly:

In [10]:
np.arange(0,5,step=0.8)

array([0. , 0.8, 1.6, 2.4, 3.2, 4. , 4.8])

In contrast, with `np.linspace()`, we control how many elements there are:

In [11]:
np.linspace(0,5,num=9)

array([0.   , 0.625, 1.25 , 1.875, 2.5  , 3.125, 3.75 , 4.375, 5.   ])

In essence, use `arange()` if you know the size of the intervals and use `linspace()` if you know how many points you want (`num` points give `num - 1` intervals when the end value is included).
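
The relationship between the two can be checked with a small sketch. It relies on the `retstep` and `endpoint` parameters of `np.linspace`, which are not discussed above; the variable names are only for illustration.

```python
import numpy as np

# num counts points, so 9 points over [0, 5] give 8 intervals of width 5/8.
points, step = np.linspace(0, 5, num=9, retstep=True)
print(len(points), step)

# endpoint=False leaves out the stop value, just like arange does.
half_open = np.linspace(0, 5, num=10, endpoint=False)
same = np.allclose(half_open, np.arange(0, 5, step=0.5))
print(same)
```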

`logspace()` and `geomspace()` both create an array with values on an exponential scale:

In [12]:
np.logspace(0.0,9.0, num=10, base=2.0)

array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128., 256., 512.])

In [13]:
np.geomspace(1.0,512.0,num=10)

array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128., 256., 512.])

The difference is that, as you can see above, for `logspace()` you give the exponent range (as well as the base), whereas for `geomspace()` you just give the start and end values.
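
To make the connection explicit, the sketch below checks that the two calls above agree, and that `logspace` amounts to raising the base to evenly spaced exponents. The variable names are hypothetical.

```python
import numpy as np

by_exponent = np.logspace(0.0, 9.0, num=10, base=2.0)  # 2**0 up to 2**9
by_value = np.geomspace(1.0, 512.0, num=10)            # 1 up to 512
print(np.allclose(by_exponent, by_value))

# The same values written out by hand with linspace.
manual = 2.0 ** np.linspace(0.0, 9.0, num=10)
print(np.allclose(by_exponent, manual))
```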

### 2D Array

The simplest approach to create a 2D array is to pass `np.array` a list of lists (or a list of 1D arrays).

In [15]:
np.array([np.linspace(0,1,num=5),np.zeros(5)])

array([[0.  , 0.25, 0.5 , 0.75, 1.  ],
       [0.  , 0.  , 0.  , 0.  , 0.  ]])

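which is fine, and in most cases this is how we define matrices. However, be careful when the arrays have different shapes: older versions of NumPy silently build a 1D array of objects, as in the output below, while recent versions (1.24 and later) raise a `ValueError` unless you pass `dtype=object`: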

In [16]:
np.array([np.linspace(0,1,num=5),np.zeros([2,3])])

array([array([0.  , 0.25, 0.5 , 0.75, 1.  ]),
       array([[0., 0., 0.],
       [0., 0., 0.]])], dtype=object)
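
One way to guard against mismatched shapes, offered here as a suggestion rather than something this notebook relies on, is to build the 2D array with `np.stack`, which refuses inputs whose shapes differ. A minimal sketch (the flag `stacked_ragged` is only there to record what happened):

```python
import numpy as np

rows = [np.linspace(0, 1, num=5), np.zeros(5)]
matrix = np.stack(rows)  # both rows have 5 elements, so the shape is (2, 5)
print(matrix.shape)

try:
    # The second input is 2 x 3, which does not match the first.
    np.stack([np.linspace(0, 1, num=5), np.zeros([2, 3])])
    stacked_ragged = True
except ValueError as err:
    stacked_ragged = False
    print("refused:", err)
```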

Another approach is to use the `reshape` function to convert a linear array into a 2D array. Here is an example:

In [17]:
np.linspace(0.05,1,num=20).reshape(4,5)

array([[0.05, 0.1 , 0.15, 0.2 , 0.25],
       [0.3 , 0.35, 0.4 , 0.45, 0.5 ],
       [0.55, 0.6 , 0.65, 0.7 , 0.75],
       [0.8 , 0.85, 0.9 , 0.95, 1.  ]])

The `reshape` function is much more versatile, since it can reshape an array from one shape to any other as long as the total number of elements stays the same. For example, we can convert a 1D array into a 3D array:

In [21]:
np.arange(48).reshape(3,4,4)

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39],
        [40, 41, 42, 43],
        [44, 45, 46, 47]]])
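
The same call converts a 3D array back into a 2D one. The sketch below also shows `-1`, which asks NumPy to infer one dimension from the others, and what happens when the element counts do not match. The variable names are made up for this example.

```python
import numpy as np

cube = np.arange(48).reshape(3, 4, 4)

flat2d = cube.reshape(12, 4)    # 3D -> 2D, still 48 elements
inferred = cube.reshape(-1, 8)  # -1 is inferred as 48 / 8 = 6 rows
print(flat2d.shape, inferred.shape)

try:
    cube.reshape(5, 10)  # 50 elements needed, only 48 available
    reshaped_ok = True
except ValueError:
    reshaped_ok = False
print(reshaped_ok)
```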

## Basic array operations

### Arithmetic operations

The basic operations that can be performed on arrays include the standard mathematical operations `+`, `-`, `*`, `/`, and `%`. If you perform these operations on two arrays, then these will be done element by element, so
```
  [1,1] + [2,2] = [3,3]
```
and 
```
   8 * [1,1,1] = [8,8,8].
```
More importantly however,
```
  [1,1] * [3,3] = [3,3]
```
not `[6]`, as it would be in a matrix multiplication. To perform matrix multiplication, you need to call the `np.dot` function (or use the `@` operator).

In [27]:
np.array([1,1])*np.array([3,3])

array([3, 3])

In [24]:
8*np.array([1,1,1])

array([8, 8, 8])

In [31]:
np.dot(np.array([1,1]),np.array([3,3]))

6
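
The distinction is easier to see with 2D arrays. The following sketch contrasts the element-wise product with the matrix product on a small example matrix `m`, which is not taken from the cells above:

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
v = np.array([1, 1])

elementwise = m * m     # each entry squared
matrix_product = m @ m  # rows times columns, same result as np.dot(m, m)
matrix_vector = m @ v   # sums along each row

print(elementwise)
print(matrix_product)
print(matrix_vector)
```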

### Common statistical and mathematical functions

In [None]:
np.mean(a)

In [None]:
a.ravel()

In [None]:
a.flat

In [None]:
a.shape
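
The cells above are shown without output. As a minimal sketch of what each call returns, here they are applied to a small array defined just for this example (it is not an array from earlier in the notebook):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

shape = a.shape                 # tuple of dimension sizes: (2, 3)
mean_all = np.mean(a)           # mean over every element: 2.5
mean_cols = np.mean(a, axis=0)  # mean of each column: [1.5, 2.5, 3.5]
flattened = a.ravel()           # 1D array of the same elements
from_iter = list(a.flat)        # a.flat is an iterator over the elements

print(shape, mean_all, mean_cols)
print(flattened, from_iter)
```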

### Looping through an array

## Advanced array operations

### Broadcasting

### Array stacking