### 6️⃣ 👩‍💻 📚 NumPy for Data-Driven Engineering

<img width=550px align=left src='https://apmonitor.com/dde/uploads/Main/Python_6Numpy.png'>

Data-driven engineering relies on information, often stored as collections of numbers in matrices and arrays. `NumPy` (Numerical Python extensions) is a library for data processing with multi-dimensional arrays and mathematical functions that operate on the arrays. This series is an introduction to the Python NumPy library.

<html>
<ul>
<li> 6️⃣.1️⃣ NumPy Install and Import
<li> 6️⃣.2️⃣ NumPy Arrays
<li> 6️⃣.3️⃣ Import and Export Data
<li> 6️⃣.4️⃣ Unary Operations
<li> 6️⃣.5️⃣ Binary Operations
</ul>
</html>

### 6️⃣.1️⃣ 📒 NumPy Install and Import

Anaconda comes with `numpy`, `scipy`, and other foundational libraries. If a library is not installed, it can be added by using the name of the library with `pip` in a Jupyter Notebook cell or from the computer command line. Additional information on managing packages is available in the [Data-Driven Engineering course](https://apmonitor.com/dde/index.php/Main/InstallPythonPackages).

```python
pip install numpy
```

Once a library is installed, functions are imported in one of many ways:

```python
import numpy
import numpy as np
from numpy import array
```

The first option is rarely used because the full `numpy` name would need to be used on every function call. The second option shortens the library name and is the most popular way to import all `numpy` functions and attributes. The third method imports only the specific function `array` instead of all functions. Never use `from numpy import *` because it clutters the namespace and the source of the function is unclear when multiple libraries are used.

In [None]:
import numpy as np

There are NumPy functions for statistical analysis, linear algebra operations, and generating summary information. NumPy can be slower than base Python for simple operations, but is much faster for large-scale transformations. Many of the functions are written in compiled C code, with linear algebra routines that rely on BLAS and LAPACK libraries, and are called from a Python interface.

- `array`: create a NumPy array
- `hstack`: join arrays horizontally (column-wise)
- `linspace`: generate evenly spaced numbers
- `max` and `min`: maximum and minimum values
- `mean`: mean (average) value
- `ones`: generate an array of ones
- `reshape`: reshape an array
- `sort`: sort an array
- `std`: standard deviation
- `shape`: return the shape of an array 
- `transpose`: reverse (transpose) the axes of an array 
- `vstack`: join arrays vertically (row-wise)
- `zeros`: generate an array of zeros
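
As a brief sketch of several functions from the list above, the following example builds small arrays and summarizes them. The variable names (`x`, `o`, `v`, `h`, `r`) are chosen here only for illustration.

```python
import numpy as np

x = np.linspace(0, 1, 5)     # 5 evenly spaced values from 0 to 1 (inclusive)
o = np.ones(5)               # array of five ones
v = np.vstack((x, o))        # stack as rows -> shape (2, 5)
h = np.hstack((x, o))        # join end to end -> shape (10,)
r = np.reshape(h, (5, 2))    # same 10 values arranged as 5 rows, 2 columns

print(np.shape(v), np.shape(h), np.shape(r))
print(np.mean(x), np.min(x), np.max(x))
print(np.sort(np.array([3, 1, 2])))
print(np.transpose(v).shape)
```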

A first step in working with NumPy is to create an array. The next section demonstrates how to create a NumPy array, a fundamental data structure in the NumPy package.

### 6️⃣.2️⃣ 📒 NumPy Arrays

NumPy arrays are collections of numbers or objects stored as a vector (1D array), matrix (2D array), or multi-dimensional array (3D+ array). A tensor is a type of multi-dimensional array with certain transformation properties. 

#### 🔢 NumPy Scalar

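A single number is called a scalar. Use the `np.array()` function to create a new array. The input argument of the function is the array contents as a Python `list` (e.g. `[7]`) or `tuple` (e.g. `(7,)`). Note that `np.array([7])` creates a 1D array with one element rather than a true zero-dimensional scalar.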

In [None]:
y = np.array([7])
print(y)
print(type(y))
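
The difference between a zero-dimensional value and a one-element array can be checked with the `ndim` and `shape` attributes. This is a minimal sketch, with `a0` and `a1` as illustrative names.

```python
import numpy as np

a0 = np.array(7)      # zero-dimensional array holding a single value
a1 = np.array([7])    # one-dimensional array with a single element

print(a0.ndim, a0.shape)   # 0 ()
print(a1.ndim, a1.shape)   # 1 (1,)
```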

#### 🔢 NumPy 1D Array

A 1-dimensional array is a vector in NumPy. It has a single axis and is displayed as a row.

In [None]:
y = np.array([0,1,2,3,4])

#### 📝 Print Array and Length

Print the `array` with the `print()` function.

In [None]:
print(y)

The length of the array is obtained with `len(y)`.

In [None]:
len(y)

#### 🔢 NumPy 2D Array

A 2-dimensional array is a matrix in NumPy. Each row of the 2D array is input as a list (e.g. `[0,1,2,3]`). The row lists are separated by commas inside an outer list as `[[],[],[]]` to create the matrix.

In [None]:
z = np.array([[0,1,2,3],
              [10,11,12,13],
              [20,21,22,23]])
print(z)
print(type(z))

📏 The `len()` function returns the number of rows.

In [None]:
len(z)

⬛ Use `np.size()` to get the total number of array elements.

In [None]:
np.size(z)

📐 Use `np.shape()` to get the number of rows and columns as a `tuple`.

In [None]:
np.shape(z)
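
The same information is also available as attributes of the array itself. The sketch below repeats the definition of `z` so that it runs on its own.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

print(z.ndim)    # 2  (number of dimensions)
print(z.shape)   # (3, 4)  (rows, columns)
print(z.size)    # 12  (total number of elements)
print(len(z))    # 3  (length of the first dimension)
print(z.dtype)   # integer type, the exact width depends on the platform
```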

#### 🔢 NumPy 3D Array

A 3-dimensional array is used for color images with the pixel row (vertical position), pixel column (horizontal position), and color intensity (0-255) for red, green, and blue (RGB). Each pixel is stored as `[R,G,B]` with `Red=[255,0,0]`, `Green=[0,255,0]`, `Blue=[0,0,255]`, `White=[255,255,255]` and `Black=[0,0,0]`. Common image processing packages `Matplotlib` and `Pillow` use RGB while `OpenCV` uses the opposite order with blue first and red last (BGR).

In [None]:
import matplotlib.pyplot as plt

R = [255,0,0]; G = [0,255,0]; B = [0,0,255]
W = [255,255,255]; K = [0,0,0]

img = np.array([[R,R,B,B],
                [R,R,B,B],
                [G,G,B,B],
                [G,G,B,B],
                [K,K,W,W]])

plt.imshow(img)
plt.grid()

Use `np.shape()` to get the shape of the array with `row=height (h)`, `columns=width (w)`, `RGB=color (c)` of the image.

In [None]:
h,w,c = np.shape(img)
print(h,w,c)
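
Individual pixels and color channels are accessed with the three indices of the array. The following sketch rebuilds the same `img` array and shows one possible way to reverse the channel order from RGB to BGR with a slice. The names `top_left`, `red`, and `bgr` are illustrative.

```python
import numpy as np

R = [255, 0, 0]; G = [0, 255, 0]; B = [0, 0, 255]
W = [255, 255, 255]; K = [0, 0, 0]
img = np.array([[R, R, B, B],
                [R, R, B, B],
                [G, G, B, B],
                [G, G, B, B],
                [K, K, W, W]])

top_left = img[0, 0]        # pixel at row 0, column 0 (red)
red = img[:, :, 0]          # red channel only, shape (5, 4)
bgr = img[:, :, ::-1]       # reverse the color axis: RGB -> BGR

print(top_left)
print(red.shape)
print(bgr[0, 0])            # the red pixel is now stored as blue-green-red
```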

### 6️⃣.3️⃣ Export and Import Data

Importing and exporting data to files is an important task for data-driven engineering. There are many methods to import and export data in Python, such as the built-in `open()` function and the `close()` method of file objects in base Python. There are specialized functions to import and export large data sets in Python with NumPy, Pandas, and other packages.

One way to create a file in NumPy is with the `np.save()` function that saves the data in binary form (not human readable). Opening the `img.npy` file with a text editor shows some header information about the array, but the data itself is stored in an uncompressed binary format that is not human readable.

***Binary img.npy file***

```
“NUMPY v {'descr': '<i4', 'fortran_order': False, 'shape': (8, 7, 3), }                                                       
ÿ   ÿ   ÿ
ÿ   ÿ  
ÿ   ÿ  
ÿ   ÿ  
```

In [None]:
np.save('img',img)

#### 🗄 Save Multiple Arrays

Multiple arrays can be saved to a single data zip file. Any name can be provided instead of `f1`, `f2`, or `f3`. These names become the `key` when the file is loaded. 

In [None]:
np.savez('dz',f1=img,f2=z,f3=y)

#### 📖 Load Data

Load the arrays with `np.load()` after they are saved with `np.savez()`. The values are available by using the name of the string key `data['f3']`.

In [None]:
data = np.load('dz.npz')
print(data['f3'])
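
A round trip through `np.savez()` and `np.load()` can be checked as in the sketch below. It writes to a temporary folder (an assumption made here so the example does not leave files behind) and uses the `files` attribute to list the stored keys.

```python
import os
import tempfile
import numpy as np

y = np.array([0, 1, 2, 3, 4])
z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'dz.npz')
    np.savez(path, f1=z, f2=y)         # keyword names become the keys
    with np.load(path) as data:        # closes the file when done
        keys = list(data.files)        # names of the stored arrays
        z2 = data['f1']
        y2 = data['f2']

print(keys)
print(np.array_equal(z, z2), np.array_equal(y, y2))
```

The function `np.savez_compressed()` takes the same arguments and compresses the arrays in the archive.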

#### 🗒 Human Readable Text Files

Save the array as a text file with `np.savetxt()` to create the human-readable file `z.txt`.

```
0.000000000000000000e+00 1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00
1.000000000000000000e+01 1.100000000000000000e+01 1.200000000000000000e+01 1.300000000000000000e+01
2.000000000000000000e+01 2.100000000000000000e+01 2.200000000000000000e+01 2.300000000000000000e+01
```

Attempting to save an array with more than two dimensions produces an error: `ValueError: Expected 1D or 2D array, got 3D array instead`.

In [None]:
np.savetxt('z.txt',z)

Switch to comma-separated values (CSV) instead of the default space delimiter with `delimiter=','`, add a heading with `header`, remove the leading `#` comment marker from the header with `comments=''`, and change the format to 8 decimal places with `fmt='%.8e'`.

```
Col0,Col1,Col2,Col3
0.00000000e+00,1.00000000e+00,2.00000000e+00,3.00000000e+00
1.00000000e+01,1.10000000e+01,1.20000000e+01,1.30000000e+01
2.00000000e+01,2.10000000e+01,2.20000000e+01,2.30000000e+01
```

Disadvantages of using text files to store data are that the file sizes are larger and the numbers may not be loaded exactly because of potential truncation.

In [None]:
np.savetxt('z.txt',z,
           header='Col0,Col1,Col2,Col3',comments='',
           delimiter=',',fmt='%.8e')
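
A text file written this way can be read back with `np.loadtxt()`. The sketch below is one possible round trip, assuming a temporary folder and illustrative column names in the header. The `skiprows=1` argument skips the header line.

```python
import os
import tempfile
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'z.txt')
    np.savetxt(path, z, header='Col0,Col1,Col2,Col3', comments='',
               delimiter=',', fmt='%.8e')
    z_in = np.loadtxt(path, delimiter=',', skiprows=1)

print(z_in.dtype)              # values are read back as floats
print(np.allclose(z, z_in))    # same values within floating point tolerance
```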

Other common format specifiers include:

* `d`: signed integer
* `e` or `E`: floating point exponential format (`e`=lowercase, `E`=uppercase)
* `f` or `F`: floating point decimal format
* `g` or `G`: same as `e/E` if the exponent is less than -4 or not less than the precision (6 by default), `f` otherwise
* `s`: string

The number `8` indicates how many decimal places are displayed.
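
The format specifiers follow the same rules as Python `%` string formatting, so they can be tried on a single number before they are used in `fmt`. The value of `x` below is an arbitrary example.

```python
x = 1234.56789

s_d = '%d' % 42            # signed integer
s_e = '%.8e' % x           # exponential with 8 decimal places
s_f = '%.2f' % x           # fixed decimal with 2 decimal places
s_g = '%g' % x             # general format with 6 significant digits
s_g_small = '%g' % 0.00001234   # exponent below -4 switches to exponential

print(s_d)         # 42
print(s_e)         # 1.23456789e+03
print(s_f)         # 1234.57
print(s_g)         # 1234.57
print(s_g_small)   # 1.234e-05
```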

### 6️⃣.4️⃣ Unary Operations

Unary operations are those performed on a single array. An example is to reverse the sign of all numbers in an array.

In [None]:
-z

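Mathematical operations act on each entry of the array separately. The expression `1/(z+1)` is equivalent to `(z+1.0)**-1`, where the float `1.0` is needed because NumPy does not allow an integer array to be raised to a negative integer power. This is not the inverse of the matrix but only the reciprocal of each element in the matrix.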

In [None]:
1/(z+1)
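
The element-wise reciprocal can be compared with the power form as a check. This sketch also shows that an integer array raised to a negative integer power raises a `ValueError`. The names `r1`, `r2`, and `raised` are illustrative.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

r1 = 1 / (z + 1)          # true division returns floats
r2 = (z + 1.0) ** -1      # float base, so a negative power is allowed
print(np.allclose(r1, r2))

# Integer arrays cannot be raised to negative integer powers
try:
    (z + 1) ** -1
    raised = False
except ValueError:
    raised = True
print(raised)
```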

#### 🔃 Transpose Matrix

A common unary operation is to transpose a matrix by reversing the order of the axes.

In [None]:
z.T

#### 🆎 Convert to Another Data Type

Convert to `int`, `float`, or `str` by calling the `astype()` method on the array. Switching from `int` to `float` does not change the numerical values, but some functions require one type or the other.

In [None]:
z.astype(str)
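
Converting in the other direction, from `float` to `int`, discards the fractional part. The following sketch uses an arbitrary array `f` to show the truncation and one way to round first with `np.round()`.

```python
import numpy as np

f = np.array([1.7, -1.7, 2.0])

t = f.astype(int)              # truncates toward zero: 1, -1, 2
r = np.round(f).astype(int)    # rounds first: 2, -2, 2

print(t)
print(r)
```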

#### 📇 Array Index

An array index refers to the location of the data. Python is zero-indexed so `z[0,1]` refers to the value in the first row and second column at `row=0` and `column=1`.

In [None]:
z[0,1]

The last value has an index of `-1`, the second to last value has an index of `-2`, and so on.

In [None]:
z[-1,-2]

#### 🔪 Array Slicing

A subset of the array is returned by slicing with a range `start:end` instead of a single index. The `end` index is not included, so the slice `z[0:2]` returns the first two rows.

In [None]:
z[0:2]

The last two columns of any matrix are available with `z[:,-2:]`. A blank `start` or `end` value indicates that it should start at the beginning or proceed to the end.

In [None]:
z[:,-2:]

A third value sets the step size with `start:end:step`. Return every other row with `z[0:5:2]` or a shortened form with `z[::2]`.

In [None]:
z[::2]

Reverse the row order with `z[::-1]`.

In [None]:
z[::-1]
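
Negative values in a slice count from the end, and a basic slice is a view of the original data rather than a copy. The sketch below illustrates both points, with `last_two`, `all_but`, `v`, and `c` as illustrative names.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

last_two = z[:, -2:]    # last two columns
all_but = z[:, :-2]     # everything except the last two columns
print(last_two)
print(all_but)

v = z[0:2]              # view: shares memory with z
v[0, 0] = 99            # also changes z[0, 0]
print(z[0, 0])          # 99

c = z[0:2].copy()       # independent copy
c[0, 0] = -1            # z is not affected
print(z[0, 0])          # 99
```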

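A matrix inverse is only defined for square, non-singular matrices. The slice `z[:,1:]` creates a `3x3` matrix that is passed to `np.linalg.inv()`. This particular matrix is singular because its rows are linearly dependent, so the call either raises a `LinAlgError` or returns very large values that are not a meaningful inverse.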

In [None]:
np.linalg.inv(z[:,1:])
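
One way to check an inverse is to multiply it by the original matrix and compare the result with the identity matrix. The matrix `A` below is an arbitrary non-singular example, and `np.linalg.matrix_rank()` is used as a hedged check of whether the `3x3` slice of `z` has full rank.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Ainv = np.linalg.inv(A)
print(np.allclose(A @ Ainv, np.eye(2)))   # True for a valid inverse

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])
S = z[:, 1:]                              # 3x3 slice
rank = np.linalg.matrix_rank(S)
print(rank)                               # rank below 3 means S is singular
```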

### 6️⃣.5️⃣ Binary Operations

Binary operations are those that involve 2 arrays. The operators available for scalars are also available for NumPy arrays.

#### ⚙ Operators

- `+` `-` `*` `/` addition, subtraction, multiplication, division
- `%` modulo (remainder after division)
- `//` floor division (discard the fraction, no rounding)
- `**` exponentiation (power)
- `@` matrix multiplication (dot product)

In [None]:
z+z

In [None]:
z/(z+1)
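
The remaining operators also act element by element. As a small sketch, floor division and modulo can be combined to recover the original array. The divisor `3` and the names `q`, `m`, and `p` are arbitrary choices.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

q = z // 3     # floor division of every element
m = z % 3      # remainder of every element
p = z ** 2     # square of every element

print(q)
print(m)
print(np.array_equal(3 * q + m, z))   # quotient and remainder rebuild z
```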

#### 🔢🔢 Array Multiplication

Array multiplication is available with the cross product `np.cross()` or dot product `np.dot()` or shortened notation `@`. The dot product of `z.T` and `z` is `z.T@z`.

In [None]:
z.T@z
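
The `*` operator multiplies element by element, while `@` performs matrix multiplication and changes the shape of the result. The sketch below compares the two and includes a short cross product of two unit vectors. The names `e`, `g`, `k`, and `c` are illustrative.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])

e = z * z        # element-wise product, shape (3, 4)
g = z.T @ z      # (4, 3) times (3, 4) gives shape (4, 4)
k = z @ z.T      # (3, 4) times (4, 3) gives shape (3, 3)
print(e.shape, g.shape, k.shape)
print(np.array_equal(g, np.dot(z.T, z)))   # @ and np.dot agree for 2D arrays

c = np.cross([1, 0, 0], [0, 1, 0])         # x cross y gives z direction
print(c)
```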

#### 🧭 Comparison Operators

Comparison operators return a boolean `True` or `False` for each element of the array as a new array.

- `>` greater than, `>=` or equal to
- `<` less than, `<=` or equal to
- `==` equal to (notice the double equal sign, single assigns a value)
- `!=` not equal to (the older `<>` form is not valid in Python 3)



In [None]:
b=(z-5)>(z/2)
b

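The `np.where()` function returns the indices where a condition is `True`, and indexing `z` with those indices returns the matching values as a flattened 1D array. The `~` operator inverts the boolean array, so `z[np.where(~b)]` selects the values where `b` is `False`. Use `np.reshape()` to get the flattened array into a desirable form, if needed.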

In [None]:
z[np.where(~b)]
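
A boolean array can also be used directly as an index, and `np.where()` accepts two additional arguments to choose between values while keeping the original shape. This is a minimal sketch that repeats the definitions of `z` and `b`. The names `sel_false`, `sel_true`, and `masked` are illustrative.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])
b = (z - 5) > (z / 2)          # True where z is greater than 10

sel_false = z[~b]              # boolean mask, same result as z[np.where(~b)]
sel_true = z[b]
masked = np.where(b, z, 0)     # keep z where b is True, otherwise 0

print(sel_false)
print(sel_true)
print(masked)
```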

#### ➕ Add to Array

Items are added to an array using the `np.append()` function. Without an `axis` argument, the inputs are flattened and a new 1D array is returned.

In [None]:
np.append(z,[1,1,1,1])

Preserve the dimensions of the matrix by using `np.vstack()` for vertical stacking.

In [None]:
np.vstack((z,[1,1,1,1]))

Use `np.hstack()` for horizontal placement. Convert `[-1,-1,-1]` from a row vector to a column vector with `np.reshape([-1,-1,-1],(-1,1))` or `np.array([-1,-1,-1]).reshape(-1,1)`.

```
array([[-1],
       [-1],
       [-1]])
```

In [None]:
a = np.reshape([-1,-1,-1],(-1,1))
np.hstack((z,a))
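
The same results can be obtained with the `axis` argument of `np.append()` or with `np.concatenate()`. The sketch below compares the shapes, assuming the same `z` and column vector `a` as above. The names `flat`, `rows`, and `cols` are illustrative.

```python
import numpy as np

z = np.array([[0, 1, 2, 3],
              [10, 11, 12, 13],
              [20, 21, 22, 23]])
a = np.reshape([-1, -1, -1], (-1, 1))          # column vector, shape (3, 1)

flat = np.append(z, [1, 1, 1, 1])              # flattened, shape (16,)
rows = np.append(z, [[1, 1, 1, 1]], axis=0)    # new row, shape (4, 4)
cols = np.concatenate((z, a), axis=1)          # new column, shape (3, 5)

print(flat.shape, rows.shape, cols.shape)
print(np.array_equal(rows, np.vstack((z, [1, 1, 1, 1]))))
print(np.array_equal(cols, np.hstack((z, a))))
```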

#### 💻 Exercise 6A

Create `y` as a `5x5` NumPy array of ones. Modify the array to place zeros along the diagonal.

#### 💻 Exercise 6B

Create 20 uniformly distributed random numbers between 0 and 1 with `np.random.rand(20)`.

```python
z = np.random.rand(20)
```

Find the array index of the highest number with `np.argmax(z)`.

```python
i = np.argmax(z)
```

Using the index `i`, print the highest number.

#### 💻 Exercise 6C

Create 20 uniformly distributed random numbers between 0 and 1 with `np.random.rand(20)`.

```python
z = np.random.rand(20)
```

Use the `np.where()` function to create a new array with values from `z` that are greater than 0.7.

#### 💻 Exercise 6D

Save the array `z=np.random.rand(20)` as a text file. 

#### 💻 Exercise 6E

Calculate the dot product of matrices `A` and `B`.

$A = \begin{bmatrix}1 & -1\\ 5 & -7\end{bmatrix}$    $B = \begin{bmatrix}3 & -4\\ -2 & 2\end{bmatrix}$