# NumPy - Routines

## Table of contents

* [Mathematical functions](#Mathematical-functions)
* [Linear algebra: `np.linalg`](#Linear-algebra:-np.linalg)
* [Statistics](#Statistics)
* [Array manipulation](#Array-manipulation)
    * [Transposing arrays: `np.transpose(<array>)` or `<array>.T`](#Transposing-arrays:-np.transpose(<array>)-or-<array>.T)
    * [Raveling (flattening) arrays: `np.ravel(<array>)`](#Raveling-(flattening)-arrays:-np.ravel(<array>))
    * [Reshaping arrays: `np.reshape(<array>, <new_shape>)` or `np.ndarray.reshape(<new_shape>)`](#Reshaping-arrays:-np.reshape(<array>,-<new_shape>)-or-np.ndarray.reshape(<new_shape>))
        * [`np.reshape()` on a *contiguous* array returns a *view* object](#np.reshape()-on-a-contiguous-array-returns-a-view-object)
        * [`np.reshape()` on a *non-contiguous* array returns a *copy*](#np.reshape()-on-a-non-contiguous-array-returns-a-copy)
        * [`np.reshape(<array>, <new_shape>)` vs `np.ndarray.shape = <new_shape>`](#np.reshape(<array>,-<new_shape>)-vs-np.ndarray.shape-=-<new_shape>)
    * [Combining arrays](#Combining-arrays)
        * [Concatenate arrays: `np.concatenate((<array1>, <array2>), axis=0)`](#Concatenate-arrays:-np.concatenate((<array1>,-<array2>),-axis=0))
        * [Vertical stacking: `np.ndarray.vstack()`](#Vertical-stacking:-np.ndarray.vstack())
        * [Horizontal stacking: `np.ndarray.hstack()`](#Horizontal-stacking:-np.ndarray.hstack())
    * [Splitting arrays](#Splitting-arrays)
        * [Vertical split: `np.vsplit(<array>, <indices_or_sections>)`](#Vertical-split:-np.vsplit(<array>,-<indices_or_sections>))
        * [Horizontal split: `np.hsplit(<array>, <indices_or_sections>)`](#Horizontal-split:-np.hsplit(<array>,-<indices_or_sections>))
* [Logic functions](#Logic-functions)
    * [Truth value testing](#Truth-value-testing)
        * [`np.any(<array>, <axis=None>)`](#np.any(<array>,-<axis=None>))
        * [`np.all(<array>, <axis=None>)`](#np.all(<array>,-<axis=None>))
* [Miscellaneous](#Miscellaneous)
    * [Load data from file: `np.genfromtxt()`](#Load-data-from-file:-np.genfromtxt())
* [Challenge](#Challenge)
    * [Task](#Task)
    * [Solution](#Solution)

***

NumPy Reference - Routines: https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.html

In the chapter linked above, routine docstrings are presented, grouped by functionality. 

In [502]:
import numpy as np

In [503]:
# 1D array
a = np.array([1,2,3])

In [504]:
# 2D array
b = np.array([[1,2,3],[4,5,6]])

In [505]:
# 3D array
c = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])

## Mathematical functions

NumPy Reference - Mathematical functions: https://docs.scipy.org/doc/numpy/reference/routines.math.html

In [506]:
b

array([[1, 2, 3],
       [4, 5, 6]])

In [507]:
b + 2

array([[3, 4, 5],
       [6, 7, 8]])

In [508]:
b - 2

array([[-1,  0,  1],
       [ 2,  3,  4]])

In [509]:
b * 2

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [510]:
b / 2

array([[0.5, 1. , 1.5],
       [2. , 2.5, 3. ]])

In [511]:
b ** 2

array([[ 1,  4,  9],
       [16, 25, 36]], dtype=int32)

In [512]:
# Add 2 arrays of the same shape
d = np.array([[1,0,1],[0,1,0]])
b + d

array([[2, 2, 4],
       [4, 6, 6]])

In [513]:
# Apply sin()
np.sin(b)

array([[ 0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ]])

In [514]:
# Apply cos()
np.cos(b)

array([[ 0.54030231, -0.41614684, -0.9899925 ],
       [-0.65364362,  0.28366219,  0.96017029]])

In [515]:
# Round to the given number of decimal places
np.round(np.cos(b), 2)

array([[ 0.54, -0.42, -0.99],
       [-0.65,  0.28,  0.96]])
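
The arithmetic operators used above are shorthand for NumPy's element-wise functions. The following is a minimal sketch of that correspondence, reusing the same 2D array; the variable names (`added`, `scaled`, and so on) are illustrative and not part of the original notebook.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

# Each arithmetic operator has an element-wise function counterpart
added = np.add(b, 2)          # same result as b + 2
scaled = np.multiply(b, 2)    # same result as b * 2
halved = np.divide(b, 2)      # same result as b / 2 (true division, float result)
squared = np.power(b, 2)      # same result as b ** 2

# Floor division keeps an integer dtype, unlike true division
floored = b // 2
print(halved.dtype, floored.dtype)
```

True division always produces floats, which is why `b / 2` above prints `0.5, 1. , 1.5` while the other results stay integers.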

## Linear algebra: `np.linalg`

To read more on linear algebra in NumPy: https://docs.scipy.org/doc/numpy/reference/routines.linalg.html


In [516]:
a = np.ones((2,3))
a

array([[1., 1., 1.],
       [1., 1., 1.]])

In [517]:
b = np.full((3,2), 2)
b

array([[2, 2],
       [2, 2],
       [2, 2]])

In [518]:
# multiply matrices
np.matmul(a, b)

array([[6., 6.],
       [6., 6.]])

In [519]:
# find determinant of a matrix (ad-bc for a 2x2 matrix)
a = np.array([[2,3],[-1,5]])
np.linalg.det(a)

13.0
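
As a small cross-check of the two cells above, the sketch below uses the `@` operator (which performs the same matrix product as `np.matmul()`) and computes the 2x2 determinant by hand. The names `product`, `m`, and `manual_det` are illustrative.

```python
import numpy as np

a = np.ones((2, 3))
b = np.full((3, 2), 2)

# The @ operator performs the same matrix product as np.matmul()
product = a @ b
print(product.shape)   # (2, 2): a (2x3) matrix times a (3x2) matrix

# Determinant of a 2x2 matrix [[p, q], [r, s]] is p*s - q*r
m = np.array([[2, 3], [-1, 5]])
manual_det = m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0]   # 2*5 - 3*(-1)

# np.linalg.det() works in floating point, so compare with a tolerance
matches = np.isclose(np.linalg.det(m), manual_det)
print(matches)
```

Comparing with `np.isclose()` rather than `==` is the safer habit, since the determinant is computed numerically and may differ from the exact value by a rounding error.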

## Statistics

NumPy Reference - Statistics: https://docs.scipy.org/doc/numpy/reference/routines.statistics.html


In [520]:
stats = np.array([[1,3],[2,5]])
stats

array([[1, 3],
       [2, 5]])

In [521]:
np.min(stats)

1

In [522]:
np.max(stats)

5

In [523]:
np.min(stats, axis=0)          # will return an array with the min of each 'column'

array([1, 3])

In [524]:
np.min(stats, axis=1)          # will return an array with the min of each 'row'

array([1, 2])

In [525]:
np.sum(stats)

11

In [526]:
np.sum(stats, axis=0)          # will return an array with the sum of each 'column'

array([3, 8])

In [527]:
np.sum(stats, axis=1)          # will return an array with the sum of each 'row'

array([4, 7])
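
The `axis` argument works the same way for the other reductions. The sketch below applies it to `np.max()` and `np.mean()` on the same `stats` array, and also shows the `keepdims` option; the result names are illustrative.

```python
import numpy as np

stats = np.array([[1, 3], [2, 5]])

# axis=0 collapses the rows, leaving one value per column
col_max = np.max(stats, axis=0)
# axis=1 collapses the columns, leaving one value per row
row_max = np.max(stats, axis=1)

col_mean = np.mean(stats, axis=0)
overall_mean = np.mean(stats)          # axis=None: mean of all elements

# keepdims=True keeps the reduced axis with length 1
row_sum_2d = np.sum(stats, axis=1, keepdims=True)
print(row_sum_2d.shape)   # (2, 1)
```

A useful way to remember it: the axis you pass is the one that disappears from the result's shape.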

## Array manipulation

NumPy Reference - Array manipulation routines: https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html

### Transposing arrays: `np.transpose(<array>)` or `<array>.T`

In [528]:
a_test = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
a_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [529]:
np.transpose(a_test)

array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]])

In [530]:
a_test.T

array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]])
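
A transposed array is a view of the original data rather than a copy. The sketch below illustrates this with `np.shares_memory()`, and also shows `np.transpose()` with an explicit axis order on a 3D array; `t`, `x`, and `moved` are illustrative names.

```python
import numpy as np

a_test = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

t = a_test.T
print(a_test.shape, t.shape)   # (3, 4) (4, 3)

# The transpose shares its data with the original array
shares = np.shares_memory(a_test, t)

# Writing through the transpose therefore changes the original:
# t[0, 1] refers to the same element as a_test[1, 0]
t[0, 1] = 500

# For arrays with more than 2 dimensions, an explicit axis order can be given
x = np.zeros((2, 3, 4))
moved = np.transpose(x, (2, 0, 1))   # new axis i is taken from old axis axes[i]
print(moved.shape)   # (4, 2, 3)
```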

### Raveling (flattening) arrays: `np.ravel(<array>)`

`np.ravel()` returns a contiguous flattened (1D) array. A copy is made only if needed.

In [531]:
a_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [532]:
np.ravel(a_test)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [533]:
a_test.ravel()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
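
Whether `np.ravel()` returns a view or a copy depends on the memory layout of its input. The sketch below compares it with `<array>.flatten()`, which always returns a copy; the variable names are illustrative.

```python
import numpy as np

a_test = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

flat_view = a_test.ravel()      # contiguous input: no copy is needed
flat_copy = a_test.flatten()    # flatten() always returns a copy

view_shares = np.shares_memory(a_test, flat_view)
copy_shares = np.shares_memory(a_test, flat_copy)

# Raveling the transpose cannot reuse the memory layout, so a copy is made
flat_t = a_test.T.ravel()
t_shares = np.shares_memory(a_test, flat_t)

print(view_shares, copy_shares, t_shares)
```

If the flattened result must never affect the original, `flatten()` is the more predictable choice.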

### Reshaping arrays: `np.reshape(<array>, <new_shape>)` or `np.ndarray.reshape(<new_shape>)`
 
> `np.reshape()` gives a new shape to an array without changing its data.<br>
`np.reshape()` **returns** a new **view object** if possible, otherwise, it will return a **copy**.

#### `np.reshape()` on a *contiguous* array returns a *view* object

A **contiguous array** is just an array stored in an unbroken block of memory: to access the next value in the array, we just move to the next memory address.

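Applying `np.reshape()` to a **contiguous array** returns a ***view* object**: the returned object has its own *shape*, which can be changed without affecting the initial object, but it shares its *data* with the initial object, so modifying an element through the view also modifies the initial array.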

In [534]:
a_test = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
a_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [535]:
a_test.shape

(3, 4)

In [536]:
b_test = a_test.view()                           # b_test is a contiguous array
b_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [537]:
b_test.shape

(3, 4)

In [538]:
c_test = np.reshape(b_test, (6,2))              # .reshape() on a contiguous array will return a view object

In [539]:
c_test

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])

In [540]:
c_test[0,0] = 100                              # b_test changes because c_test is a view of b_test.
b_test

array([[100,   2,   3,   4],
       [  5,   6,   7,   8],
       [  9,  10,  11,  12]])

#### `np.reshape()` on a *non-contiguous* array returns a *copy*

Transposing an array with `<array>.T` means that the contiguity is lost because adjacent row entries are no longer in adjacent memory addresses.

Applying `np.reshape()` to a **non-contiguous array** generally returns a ***copy***, because the new shape cannot be laid over the existing memory layout. Both the *shape* and the *data* of the returned object can then be modified without modifying the initial object.

In [541]:
a_test = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
a_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [542]:
a_test.shape

(3, 4)

In [543]:
b_test = a_test.T                               # makes b_test a non-contiguous array
b_test

array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]])

In [544]:
b_test.shape

(4, 3)

In [545]:
c_test = np.reshape(b_test, (6,2))              # .reshape() on a non-contiguous array will return a copy

In [546]:
c_test

array([[ 1,  5],
       [ 9,  2],
       [ 6, 10],
       [ 3,  7],
       [11,  4],
       [ 8, 12]])

In [547]:
c_test[0,0] = 100                              # will NOT change b_test because c_test is a 'copy' of b_test.
b_test

array([[ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11],
       [ 4,  8, 12]])
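
The view-versus-copy behaviour of the two cases above can be checked directly instead of inferred from a modified element. The sketch below inspects the `flags` attribute and uses `np.shares_memory()`; the array is built with `np.arange()` for brevity, and the names `as_view` and `as_copy` are illustrative.

```python
import numpy as np

a_test = np.arange(1, 13).reshape(3, 4)

# Contiguity can be inspected through the flags attribute
print(a_test.flags['C_CONTIGUOUS'])     # True
print(a_test.T.flags['C_CONTIGUOUS'])   # False

# Reshaping the contiguous array gives a view onto the same data
as_view = np.reshape(a_test, (6, 2))
view_shares = np.shares_memory(a_test, as_view)

# Reshaping the transposed (non-contiguous) array gives a copy
as_copy = np.reshape(a_test.T, (6, 2))
copy_shares = np.shares_memory(a_test, as_copy)

print(view_shares, copy_shares)
```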

#### `np.reshape(<array>, <new_shape>)` vs `np.ndarray.shape = <new_shape>`

As seen in the previous notebook, we could have used `np.ndarray.shape = <new_shape>` to reshape the arrays above, instead of using `np.reshape(<array>, <new_shape>)`.

The difference, compared to the above cases would have been:
- `np.ndarray.shape = <new_shape>` changes the array shape **in place**, whereas `np.reshape(<array>, <new_shape>)` **returns** a reshaped *view* or *copy* of the input array and leaves the input array's shape unchanged.
- `np.ndarray.shape = <new_shape>` on a **non-contiguous array** would have raised an `AttributeError` such as `AttributeError: incompatible shape for a non-contiguous array` (the exact wording depends on the NumPy version), whereas `np.reshape(<array>, <new_shape>)` on a **non-contiguous array** **returns** a reshaped *copy* of the input array without raising any error.
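
The two differences listed above can be sketched as follows. This assumes a NumPy version in which assigning to `.shape` is still supported; the flag `raised` and the other variable names are illustrative.

```python
import numpy as np

a_test = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Assigning to .shape modifies the array itself and returns nothing
in_place = a_test.copy()
in_place.shape = (6, 2)

# np.reshape() leaves its input untouched and returns a new object
reshaped = np.reshape(a_test, (6, 2))
print(a_test.shape, reshaped.shape)   # (3, 4) (6, 2)

# On a non-contiguous array, assigning an incompatible shape fails ...
b_test = a_test.T
try:
    b_test.shape = (6, 2)
    raised = False
except AttributeError as err:
    raised = True
    print(err)   # message wording varies between NumPy versions

# ... whereas np.reshape() falls back to making a copy
c_test = np.reshape(b_test, (6, 2))
```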

### Combining arrays

#### Concatenate arrays: `np.concatenate((<array1>, <array2>), axis=0)`

`np.concatenate()` joins a sequence of arrays along an existing axis. 

In [548]:
c1 = np.array([[1,2,3,4],[5,6,7,8]])
c2 = np.array([[9,10,11,12],[13,14,15,16]])

In [549]:
# concatenate along axis=0
np.concatenate((c1, c2), axis=0)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [550]:
# concatenate along axis=1
np.concatenate((c1, c2), axis=1)

array([[ 1,  2,  3,  4,  9, 10, 11, 12],
       [ 5,  6,  7,  8, 13, 14, 15, 16]])
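
`np.concatenate()` requires the arrays to have the same shape except along the axis being joined. The sketch below illustrates that rule with two arrays of different widths; `wide` and `mismatch_raised` are illustrative names.

```python
import numpy as np

c1 = np.ones((2, 4))
c3 = np.zeros((2, 2))

# Both arrays have 2 rows, so joining along axis=1 (columns) works
wide = np.concatenate((c1, c3), axis=1)
print(wide.shape)   # (2, 6)

# Joining along axis=0 (rows) fails because the column counts differ (4 vs 2)
try:
    np.concatenate((c1, c3), axis=0)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
```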

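#### Vertical stacking: `np.ndarray.vstack()`

`np.vstack()` stacks arrays vertically (row-wise). It is called as a module-level function, `np.vstack(<tuple_of_arrays>)`; `np.ndarray` objects have no `vstack()` method.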

In [551]:
v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])
np.vstack((v1, v2))

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

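#### Horizontal stacking: `np.ndarray.hstack()`

`np.hstack()` stacks arrays horizontally (column-wise). Like `np.vstack()`, it is called as a module-level function, `np.hstack(<tuple_of_arrays>)`, not as an array method.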

In [552]:
h1 = np.ones((2,4))
h2 = np.zeros((2,2))
np.hstack((h1,h2))

array([[1., 1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0., 0.]])
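
For 2D inputs, the two stacking functions give the same results as `np.concatenate()` along `axis=0` and `axis=1` respectively; for 1D inputs they behave differently from each other. A minimal sketch, with illustrative variable names:

```python
import numpy as np

v1 = np.array([1, 2, 3, 4])
v2 = np.array([5, 6, 7, 8])

# 1D inputs: vstack promotes each array to a row, hstack joins them end to end
stacked_v = np.vstack((v1, v2))
stacked_h = np.hstack((v1, v2))
print(stacked_v.shape, stacked_h.shape)   # (2, 4) (8,)

c1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
c2 = np.array([[9, 10, 11, 12], [13, 14, 15, 16]])

# 2D inputs: same results as concatenating along axis 0 or axis 1
same_v = np.array_equal(np.vstack((c1, c2)), np.concatenate((c1, c2), axis=0))
same_h = np.array_equal(np.hstack((c1, c2)), np.concatenate((c1, c2), axis=1))
print(same_v, same_h)
```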

### Splitting arrays

#### Vertical split: `np.vsplit(<array>, <indices_or_sections>)`

>*Docstring*<br>
`np.vsplit()` splits an array into multiple sub-arrays vertically (row-wise).<br>

>*Parameters*<br>
- `<array>`: `np.ndarray`<br>
&emsp;&emsp;Array to be divided into sub-arrays<br>
- `<indices_or_sections>`: `int` or 1D array<br>
&emsp;&emsp;If *indices_or_sections* is an **integer**, N, the array will be divided into N equal arrays along axis.<br>
&emsp;&emsp;If *indices_or_sections* is a **1-D array** of sorted integers, the entries indicate where along axis the array is split.<br>
For example, `[2, 3]` would result in:<br>
    - `<array>[:2]`
    - `<array>[2:3]`
    - `<array>[3:]`
    
>*Returns*<br>
List of `np.ndarray` objects.

`np.vsplit()` is equivalent to `np.split()` with `axis=0`. 
    

In [553]:
a_test = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
a_test

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [554]:
# <indices_or_sections> = int
np.vsplit(a_test, 2)

[array([[1, 2, 3, 4],
        [5, 6, 7, 8]]),
 array([[ 9, 10, 11, 12],
        [13, 14, 15, 16]])]

In [555]:
# <indices_or_sections> = 1D array
np.vsplit(a_test, np.array([1,2]))

[array([[1, 2, 3, 4]]),
 array([[5, 6, 7, 8]]),
 array([[ 9, 10, 11, 12],
        [13, 14, 15, 16]])]
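
The equivalence with `np.split()` and the "equal arrays" requirement for an integer argument can be sketched as follows. `np.array_split()` is shown as one possible alternative when the rows do not divide evenly; the variable names are illustrative.

```python
import numpy as np

a_test = np.arange(1, 17).reshape(4, 4)

# vsplit is split along axis=0
parts = np.vsplit(a_test, 2)
same = np.split(a_test, 2, axis=0)
equivalent = all(np.array_equal(p, s) for p, s in zip(parts, same))

# 4 rows cannot be divided into 3 equal sections
try:
    np.vsplit(a_test, 3)
    uneven_raised = False
except ValueError:
    uneven_raised = True

# np.array_split() allows sections of unequal size
uneven = np.array_split(a_test, 3, axis=0)
row_counts = [part.shape[0] for part in uneven]
print(equivalent, uneven_raised, row_counts)
```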

#### Horizontal split: `np.hsplit(<array>, <indices_or_sections>)`

`np.hsplit()` is similar to `np.vsplit()`, except that it splits the array *horizontally* (column-wise).

`np.hsplit()` is equivalent to `np.split()` with `axis=1`. 
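
A minimal sketch of `np.hsplit()` on the same 4x4 array used for `np.vsplit()` above, with both an integer and a list of indices; `halves` and `pieces` are illustrative names.

```python
import numpy as np

a_test = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

# <indices_or_sections> = int: two equal blocks of columns
halves = np.hsplit(a_test, 2)
print([h.shape for h in halves])   # [(4, 2), (4, 2)]

# <indices_or_sections> = list: split before column 1 and before column 3,
# i.e. columns [:1], [1:3] and [3:]
pieces = np.hsplit(a_test, [1, 3])
print([p.shape for p in pieces])   # [(4, 1), (4, 2), (4, 1)]
```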
    

## Logic functions

NumPy Reference - Logic functions: https://docs.scipy.org/doc/numpy/reference/routines.logic.html

### Truth value testing

#### `np.any(<array>, <axis=None>)`

`np.any()` tests whether ANY array element along a given axis evaluates to `True`.

`np.any()` **returns** a single boolean when `axis` is `None` (the default), and a boolean array when an axis is given.

In [556]:
# 2D array
b = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
b

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [557]:
# axis=None: looks in the entire array, and returns single boolean
np.any(b > 10)

True

In [558]:
# axis=0: looks in each column, and returns a boolean array
np.any(b > 10, axis=0)

array([False, False,  True,  True])

In [559]:
# axis=1: looks in each row, and returns a boolean array
np.any(b > 10, axis=1)

array([False, False,  True])

#### `np.all(<array>, <axis=None>)`

`np.all()` tests whether ALL array elements along a given axis evaluate to `True`.

In [560]:
b

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [561]:
# axis=0: looks in each column, and returns a boolean array
np.all(b > 2, axis=0)

array([False, False,  True,  True])
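
For completeness, `np.all()` accepts the same `axis` values as `np.any()`. The sketch below covers the cases not shown above, on the same array `b`; the result names are illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# axis=None: checks the entire array and returns a single boolean
all_gt_2 = np.all(b > 2)    # the first row contains 1 and 2
all_gt_0 = np.all(b > 0)

# axis=1: checks each row and returns a boolean array
rows_gt_2 = np.all(b > 2, axis=1)

print(all_gt_2, all_gt_0, rows_gt_2)
```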

## Miscellaneous

### Load data from file: `np.genfromtxt()`

In [562]:
filearray = np.genfromtxt('work_directory/numpy/data/data.txt', delimiter=',')
filearray

array([[  1.,  13.,  21.,  11., 196.,  75.,   4.,   3.,  34.,   6.,   7.,
          8.,   0.,   1.,   2.,   3.,   4.,   5.],
       [  3.,  42.,  12.,  33., 766.,  75.,   4.,  55.,   6.,   4.,   3.,
          4.,   5.,   6.,   7.,   0.,  11.,  12.],
       [  1.,  22.,  33.,  11., 999.,  11.,   2.,   1.,  78.,   0.,   1.,
          2.,   9.,   8.,   7.,   1.,  76.,  88.]])

In [563]:
filearray = filearray.astype('int32')
filearray

array([[  1,  13,  21,  11, 196,  75,   4,   3,  34,   6,   7,   8,   0,
          1,   2,   3,   4,   5],
       [  3,  42,  12,  33, 766,  75,   4,  55,   6,   4,   3,   4,   5,
          6,   7,   0,  11,  12],
       [  1,  22,  33,  11, 999,  11,   2,   1,  78,   0,   1,   2,   9,
          8,   7,   1,  76,  88]])
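
Since `data.txt` is not included here, the sketch below reproduces the same loading steps on a small in-memory stand-in. The text content is hypothetical, and `io.StringIO` is used only to avoid needing a file on disk.

```python
import io
import numpy as np

# Hypothetical comma-separated content standing in for data.txt
text = "1,13,21\n3,42,12\n1,22,33"

# By default, genfromtxt returns floats
arr = np.genfromtxt(io.StringIO(text), delimiter=',')
print(arr.dtype)   # float64

# Convert afterwards, as in the cell above ...
ints = arr.astype('int32')

# ... or request the dtype while loading
direct = np.genfromtxt(io.StringIO(text), delimiter=',', dtype='int32')

# With the default float dtype, a missing field is filled with nan
gaps = np.genfromtxt(io.StringIO("1,,3\n4,5,6"), delimiter=',')
print(np.isnan(gaps[0, 1]))
```

Loading as floats first is the more forgiving route when the file may contain missing fields, because `nan` has no integer representation.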

In [564]:
# return a boolean array for values >50
filearray > 50

array([[False, False, False, False,  True,  True, False, False, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False,  True,  True, False,  True, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False,  True, False, False, False,  True,
        False, False, False, False, False, False, False,  True,  True]])

In [565]:
# return a boolean array for values (>50 and <100)
(filearray > 50) & (filearray < 100)

array([[False, False, False, False, False,  True, False, False, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False,  True, False,  True, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False,  True,
        False, False, False, False, False, False, False,  True,  True]])

In [566]:
# return a boolean array for values that are NOT (>50 and <100)
~((filearray > 50) & (filearray < 100))

array([[ True,  True,  True,  True,  True, False,  True,  True,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True, False,  True, False,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True, False,
         True,  True,  True,  True,  True,  True,  True, False, False]])
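
Boolean arrays like the ones above are typically used as masks. The sketch below shows three common uses on a small made-up array (`data` is hypothetical and not the content of `data.txt`).

```python
import numpy as np

# Hypothetical data, small enough to check by eye
data = np.array([[1, 13, 196, 75], [3, 42, 766, 55]])

# Parentheses are required: & binds more tightly than the comparisons
mask = (data > 50) & (data < 100)

# Select the matching values (returned as a 1D array, in row-major order)
selected = data[mask]

# Count how many values match
count = np.count_nonzero(mask)

# Replace matching values with 0 and keep the rest
replaced = np.where(mask, 0, data)

print(selected, count)
print(replaced)
```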

## Challenge

### Task

1. Create the following array
2. Index the part of the array outlined in <font color=blue>blue</font>
3. Index the part of the array outlined in <font color=green>green</font>
4. Index the part of the array outlined in <font color=red>red</font>

![asdasd.png](attachment:asdasd.png)

### Solution

#### 1. Create the main array

In [567]:
# starting 1D array
start_array = np.arange(1,31)
start_array

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30])

In [568]:
# reshape start_array to create main_array
main_array = np.reshape(start_array, (6,5))                 # np.reshape() on contiguous array will return view object         
main_array

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25],
       [26, 27, 28, 29, 30]])

In [569]:
main_array.base is start_array                              # main_array is a reshaped 'view' of the start_array

True

#### 2. Index the part of the array outlined in <font color=blue>blue</font>

In [570]:
blue_array = main_array[2:4,0:2]
blue_array


array([[11, 12],
       [16, 17]])

#### 3. Index the part of the array outlined in <font color=green>green</font>

In [571]:
green_array = main_array[np.array([0,1,2,3]), np.array([1,2,3,4])]
green_array

array([ 2,  8, 14, 20])
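
The green selection pairs row `i` with column `i + 1`, so the index arrays can also be generated rather than typed out. The sketch below shows that variant and, as an alternative that is not part of the original solution, `np.diagonal()` with an offset.

```python
import numpy as np

main_array = np.arange(1, 31).reshape(6, 5)

# Paired integer arrays: element k of the result is main_array[rows[k], cols[k]]
rows = np.arange(4)
green_array = main_array[rows, rows + 1]

# The same elements lie on the first diagonal above the main one
green_diag = np.diagonal(main_array, offset=1)

print(green_array)
print(green_diag)
```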

#### 4. Index the part of the array outlined in <font color=red>red</font>

In [572]:
# using indexing with an iterable (tuple) passed for the rows, and a slice passed for the columns
red_array = main_array[(0,4,5), 3:]
red_array

array([[ 4,  5],
       [24, 25],
       [29, 30]])

In [573]:
# using np.vstack()
red_array = np.vstack((main_array[0,3:5], main_array[4:6,3:5]))
red_array

array([[ 4,  5],
       [24, 25],
       [29, 30]])
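
The solutions above mix slicing and indexing with integer sequences, which differ in whether they return a view or a copy. A minimal sketch of the difference, using `np.shares_memory()`; a list is used for the row indices here, which selects the same rows as the tuple in the solution.

```python
import numpy as np

main_array = np.arange(1, 31).reshape(6, 5)

# Slicing only: the result is a view onto main_array
blue_array = main_array[2:4, 0:2]
blue_is_view = np.shares_memory(main_array, blue_array)

# Integer sequence for the rows: the result is a copy
red_array = main_array[[0, 4, 5], 3:]
red_is_view = np.shares_memory(main_array, red_array)

print(blue_is_view, red_is_view)

# Modifying the copy leaves main_array untouched
red_array[0, 0] = -1
print(main_array[0, 3])   # 4
```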