# Appendix 2: Mathematical operations

[*NumPy*](https://numpy.org/) is a first-rate library for numerical programming. It is widely used in academia, finance and industry.

The *Pandas* library introduced in [Chapter 9](09_tabular.ipynb) is built on top of *NumPy*. It provides high-performance, easy-to-use data structures and data analysis tools that make data manipulation and visualization more convenient.

## How to install numpy?

If you have Anaconda installed, then numpy was already installed together with it.

If you have a standalone Python3 and Jupyter Notebook installation, open a command prompt / terminal and type in:
```
pip3 install numpy
```

## How to use numpy?

The numpy package is a module which you can simply import. It is usually aliased with the `np` abbreviation:
```python
import numpy as np
```

---

## NumPy Arrays

The most important structure that NumPy defines is an array data type formally called a [`numpy.ndarray`](https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html), short for *N-dimensional array*.

In [None]:
import numpy as np

a = np.zeros(3)
a

In [None]:
type(a)

NumPy arrays are somewhat like native Python lists, except that:
 * data must be homogeneous (all elements of the same type);
 * these types must be one of the data types (dtypes) provided by NumPy.

The most important of these dtypes are:
 * `float64`: 64 bit floating-point number
 * `int64`: 64 bit integer
 * `bool`: 8 bit True or False

There are also dtypes to represent complex numbers, unsigned integers, etc.

The default dtype for arrays is `float64`:

In [None]:
a = np.zeros(3)
type(a[0])

If we want to use integers we can specify it:

In [None]:
a = np.zeros(3, dtype=int)
type(a[0])
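
The homogeneity rule can be seen directly. The following is a small sketch (the names `mixed` and `flags` are chosen only for this illustration) showing that NumPy converts mixed input to one common dtype, and that the dtype is a property of the whole array:

```python
import numpy as np

# Mixing integers and floats: NumPy upcasts everything to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)        # float64

# The dtype is stored once per array, in the `dtype` attribute.
flags = np.zeros(3, dtype=bool)
print(flags.dtype)        # bool
print(flags.itemsize)     # 1 (bytes per element, i.e. 8 bits)
```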

---

## Shape and Dimension

A flat array such as `b` below is one-dimensional: it is neither a row vector nor a column vector.

The size along each dimension is recorded in the `shape` attribute, which is a tuple.

In [None]:
b = np.zeros(10)
b.shape

To give it a second dimension, we can assign a new tuple to the `shape` attribute:

In [None]:
b.shape = (10, 1)
b

Make it a 2 by 2 array:

In [None]:
b = np.zeros(4)
b.shape = (2, 2)
b

The shape can also be specified at creation time by passing a tuple to the `np.zeros()` function.

In [None]:
b = np.zeros((2, 2))
b

You can probably guess what `np.ones` creates.

In [None]:
b = np.ones(10)
b
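
As an alternative to assigning to `shape`, the `reshape()` method can be used. The sketch below (with the illustrative names `flat` and `grid`) assumes we want to keep the original array's shape untouched:

```python
import numpy as np

flat = np.zeros(6)
print(flat.ndim, flat.shape)      # 1 (6,)

# reshape() returns a reshaped array and leaves the shape of `flat` unchanged;
# -1 asks NumPy to infer that dimension from the total number of elements.
grid = flat.reshape(2, -1)
print(grid.ndim, grid.shape)      # 2 (2, 3)
print(flat.shape)                 # (6,)
```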

---

## Creating Arrays

We have already discussed `np.zeros()` and `np.ones()`.

Set up a grid of evenly spaced numbers.

In [None]:
b = np.linspace(2, 4, 5)
b

Create an identity matrix.

In [None]:
b = np.identity(3)
b

NumPy arrays can be created from Python lists, tuples, etc.

In [None]:
b = np.array([10, 20])
b

The data type can also be configured, here `float` is equivalent to `np.float64`:

In [None]:
b = np.array((10, 20), dtype=float)
b

Create a 2 dimensional, 2 by 2 array:

In [None]:
b = np.array([[1, 2], [3, 4]])
b
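
To make the behaviour of `np.linspace()` concrete, here is a short sketch (the variable names are only for illustration) showing that both endpoints are included in the grid:

```python
import numpy as np

start, stop, num = 2, 4, 5
grid = np.linspace(start, stop, num)

# Both endpoints are included, so the spacing is (stop - start) / (num - 1).
step = (stop - start) / (num - 1)
print(grid)    # [2.  2.5 3.  3.5 4. ]
print(step)    # 0.5
```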

---

## Array indexing

For a flat array, indexing is the same as Python sequences.

In [None]:
c = np.linspace(1, 2, 5)
c

In [None]:
c[0]

In [None]:
c[1:3]

In [None]:
c[-1]

For 2D arrays we use an index position for each dimension.

In [None]:
d = np.array([[1, 2], [3, 4]])
d

In [None]:
d[0, 1]

Note that indices are still zero-based, to maintain compatibility with Python sequences.

Rows and columns can be extracted as follows:

In [None]:
d[0, :]

In [None]:
d[:, 1]

NumPy arrays of integers can also be used to extract elements.

In [None]:
indices = np.array((0, 2, 3))
c[indices]

A NumPy array of boolean values can be used to filter elements at the `True` locations.

In [None]:
e = np.array([0, 1, 1, 0, 0], dtype=bool)
e

In [None]:
c[e]
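
The indexing styles above can be summarized in one self-contained sketch. It rebuilds the same `c` and `d` arrays; `picked`, `mask` and `kept` are names introduced only for this illustration:

```python
import numpy as np

c = np.linspace(1, 2, 5)          # [1.   1.25 1.5  1.75 2.  ]

picked = c[np.array([0, 2, 3])]   # integer array: elements at positions 0, 2, 3
print(picked)                     # [1.   1.5  1.75]

mask = np.array([False, True, True, False, False])
kept = c[mask]                    # boolean array: elements where mask is True
print(kept)                       # [1.25 1.5 ]

d = np.array([[1, 2], [3, 4]])
print(d[0, :])                    # first row:     [1 2]
print(d[:, 1])                    # second column: [2 4]
```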

---

## Array Methods

NumPy arrays have many useful methods; most of them should be familiar from previous lectures.

In [None]:
f = np.array((3, 2, 4, 1))
f

In [None]:
f.sort() # Sorts f in place
f

In [None]:
f.sum() # Sum

In [None]:
f.mean() # Mean

In [None]:
f.max() # Max

In [None]:
f.argmax() # Returns the index of the maximal element

In [None]:
f.cumsum() # Cumulative sum of the elements

In [None]:
f.cumprod() # Cumulative product of the elements

In [None]:
f.var() # Variance

In [None]:
f.std() # Standard deviation

In [None]:
f.shape = (2, 2)
f

In [None]:
f.transpose() # or simply f.T

Many of the methods discussed above have equivalent functions in the NumPy namespace, e.g.:

In [None]:
print("Sum: {0}".format(np.sum(f)))
print("Mean: {0:.2f}".format(np.mean(f)))
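
A method and its function counterpart do not always behave identically. As a small sketch of one such difference (the name `sorted_copy` is only illustrative), `np.sort()` returns a sorted copy, while the `sort()` method modifies the array in place:

```python
import numpy as np

f = np.array((3, 2, 4, 1))

sorted_copy = np.sort(f)   # function: returns a sorted copy
print(f)                   # [3 2 4 1]  (unchanged)
print(sorted_copy)         # [1 2 3 4]

f.sort()                   # method: sorts f in place and returns None
print(f)                   # [1 2 3 4]
print(f.argmax())          # 3, the index of the largest element (4)
```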

---

## Arithmetic Operations

The operators `+`, `-`, `*`, `/` and `**` all act **elementwise** on NumPy arrays.

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
a + b

In [None]:
a * b

In [None]:
a + 10

In [None]:
a * 10

Multidimensional arrays follow the same general rules.

In [None]:
a.shape = (2, 2)
b.shape = (2, 2)
a + b

In [None]:
a + 10

In [None]:
a * b

Calculate the *dot product* of two NumPy arrays. For two 2D arrays, `np.dot()` computes the matrix product.

In [None]:
np.dot(a, b)

The `@` operator does the same thing.

In [None]:
a @ b
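
Since `*` and `@` are easy to confuse, the following sketch contrasts them on the same 2 by 2 arrays as above (`elementwise` and `matrix_product` are illustrative names):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

elementwise = a * b        # multiplies matching entries
matrix_product = a @ b     # rows of a combined with columns of b

print(elementwise)
# [[ 5 12]
#  [21 32]]
print(matrix_product)
# [[19 22]
#  [43 50]]
```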

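Calculate the *cross product* of two NumPy arrays. Here each row of `a` and `b` is treated as a 2D vector; note that NumPy 2.0 and later emit a deprecation warning for 2D vectors and recommend 3D vectors instead.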

In [None]:
np.cross(a, b)
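
The cross product is most commonly used with 3D vectors. As a minimal sketch with two unit vectors (`u`, `v` and `w` are names chosen for this illustration), the result is perpendicular to both inputs:

```python
import numpy as np

u = np.array([1, 0, 0])    # unit vector along the x axis
v = np.array([0, 1, 0])    # unit vector along the y axis

w = np.cross(u, v)
print(w)                             # [0 0 1]
print(np.dot(w, u), np.dot(w, v))    # 0 0  (w is perpendicular to u and v)
```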

---

## Random generation

Generate random numbers from the *standard normal* distribution:

In [None]:
g = np.random.randn(3)
g

Generate random integers between a lower (inclusive) and a higher (exclusive) bound:

In [None]:
g = np.random.randint(0, 100, 5)
g
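
When results need to be reproducible, a seeded generator can be used. The sketch below relies on `np.random.default_rng()`, which is available in NumPy 1.17 and later; the seed value is arbitrary and chosen only for this illustration:

```python
import numpy as np

seed = 12345   # arbitrary value chosen for this illustration

rng = np.random.default_rng(seed)
normals = rng.standard_normal(3)      # same distribution as np.random.randn(3)
integers = rng.integers(0, 100, 5)    # same bounds as np.random.randint(0, 100, 5)

# A second generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(seed)
print(np.array_equal(normals, rng_again.standard_normal(3)))   # True
print(integers.min() >= 0 and integers.max() < 100)            # True
```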

---

## Mutability and Copying Arrays

NumPy arrays are mutable data types, like Python lists.
In other words, their contents can be altered (mutated) in memory after initialization.

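Plain assignment does not copy an array; it only creates a second name for the same data. To make an independent copy of a NumPy array, use its `copy()` method (or the equivalent `np.copy()` function).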

In [None]:
h = g          # h refers to the same array as g
i = g.copy()   # i is an independent copy
h[0] = 42      # changes g as well, but not i

print(g)
print(h)
print(i)
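
The same behaviour can be checked with fixed values instead of random ones. This sketch uses a small hand-written array and `np.shares_memory()` to confirm which names refer to the same data:

```python
import numpy as np

g = np.array([1, 2, 3])
h = g            # second name for the same array
i = g.copy()     # independent copy

h[0] = 42
print(g)                        # [42  2  3]
print(i)                        # [1 2 3]
print(h is g)                   # True
print(np.shares_memory(g, i))   # False
```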

---

## Vectorized Functions

The `np.vectorize()` function creates a *vectorized* version of a function, which can then be applied to a NumPy array elementwise.

In [None]:
# is_even() can be called on an integer number
def is_even(x):
    return x % 2 == 0

# is_even_vectorized() can be called on an array of integers
is_even_vectorized = np.vectorize(is_even)
is_even_vectorized(g)

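The NumPy function `np.where()` provides a vectorized alternative: element by element, it picks the second argument where the condition is `True` and the third argument where it is `False`.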

In [None]:
np.where(g % 2 == 0, 1, 0)
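
For simple expressions like this one, `np.vectorize()` is often unnecessary, because the operators already act elementwise. The sketch below uses a fixed sample array in place of the random `g`, and the labels `"even"` and `"odd"` are chosen only for illustration:

```python
import numpy as np

g = np.array([3, 8, 15, 22, 42])   # fixed sample values for illustration

def is_even(x):
    return x % 2 == 0

via_vectorize = np.vectorize(is_even)(g)
via_expression = g % 2 == 0          # operators already act elementwise
labels = np.where(via_expression, "even", "odd")

print(via_expression)   # [False  True False  True  True]
print(labels)           # ['odd' 'even' 'odd' 'even' 'even']
```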

---

## Comparisons

As a rule, comparisons on arrays are done elementwise.

In [None]:
z = np.array([2, 3])
y = np.array([2, 3])
z == y

In [None]:
y[0] = 5
z == y

In [None]:
z != y

The situation is similar for `>`, `<`, `>=` and `<=`.

We can also do comparisons against scalars:

In [None]:
x = np.linspace(0, 10, 5)
x

In [None]:
x > 3

This is particularly useful for *conditional extraction*:

In [None]:
cond = x > 3
x[cond]

Of course we can - and frequently do - perform this in one step:

In [None]:
x[x > 3]
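
Elementwise comparisons can also be combined or reduced to a single answer. The following sketch uses `&` and `|` (with parentheses around each comparison) and the `all()` / `any()` methods; the names `both` and `either` are illustrative:

```python
import numpy as np

x = np.linspace(0, 10, 5)         # [ 0.   2.5  5.   7.5 10. ]

both = x[(x > 3) & (x < 8)]       # elementwise "and"
either = x[(x < 3) | (x > 8)]     # elementwise "or"
print(both)                       # [5.  7.5]
print(either)                     # [ 0.   2.5 10. ]

z = np.array([2, 3])
y = np.array([5, 3])
print((z == y).any())             # True  (at least one pair matches)
print((z == y).all())             # False (not every pair matches)
print(np.array_equal(z, y))       # False
```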

---

## Linear algebra

In [None]:
k = np.array([[1, 2], [3, 4]])
k

Compute the determinant:

In [None]:
np.linalg.det(k)  

Compute the inverse:

In [None]:
np.linalg.inv(k)
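
Because these results are floating-point numbers, it is usually safer to check them with a tolerance than with exact equality. The sketch below verifies the determinant and the inverse, and additionally solves a small linear system with `np.linalg.solve()`; the right-hand side `rhs` is a made-up example:

```python
import numpy as np

k = np.array([[1, 2], [3, 4]])

det = np.linalg.det(k)
k_inv = np.linalg.inv(k)

print(np.isclose(det, -2.0))                     # True: 1*4 - 2*3 = -2
print(np.allclose(k @ k_inv, np.identity(2)))    # True: k times its inverse is I

# Solve k @ solution = rhs without forming the inverse explicitly.
rhs = np.array([5, 11])
solution = np.linalg.solve(k, rhs)
print(np.allclose(solution, [1, 2]))             # True: 1*1 + 2*2 = 5, 3*1 + 4*2 = 11
```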

---

## Interpolation

Generate 20 evenly spaced numbers between 0 and 10 and store them in `x`. Then compute the sine of each element of `x` and store the results in `y`.

In [None]:
x = np.linspace(0, 10, 20)
y = np.sin(x)
print(x)
print(y)

Generate 100 evenly spaced numbers between 0 and 10 and store them in `xvals`. Then, based on the known points `x` and `y`, compute the linearly interpolated value for each element of `xvals` and store the results in `yinterp`.

In [None]:
xvals = np.linspace(0, 10, 100)
yinterp = np.interp(xvals, x, y)
print(xvals)
print(yinterp)
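
`np.interp()` performs piecewise linear interpolation. To illustrate what that means, this sketch compares it with a hand-computed value on three made-up sample points (`xp`, `fp` and `x_new` are names chosen for this example):

```python
import numpy as np

xp = np.array([0.0, 1.0, 2.0])    # known sample positions (must be increasing)
fp = np.array([0.0, 10.0, 40.0])  # known sample values

x_new = 1.5
# Manual linear interpolation between the neighbours (1, 10) and (2, 40)
manual = fp[1] + (x_new - xp[1]) * (fp[2] - fp[1]) / (xp[2] - xp[1])

print(manual)                     # 25.0
print(np.interp(x_new, xp, fp))   # 25.0
```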

Visualize the results on a plot. *(For plotting, see [Chapter 10](10_plotting.ipynb).)*

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.plot(x, y, 'o')
plt.plot(xvals, yinterp, '-x')
plt.show()