<a href="https://colab.research.google.com/github/weymouth/NumericalPython/blob/main/03NumpyAndPlotting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Numerical Python and Plotting

We have established the main features of Python, but up to this point we have only created simple functions and applied them to create lists of numbers. Even for simple tasks like this, the programming approach is **highly** preferred to using spreadsheets (like `Excel`) which are [extremely dangerous to use for important work](https://www.forbes.com/sites/timworstall/2013/02/13/microsofts-excel-might-be-the-most-dangerous-software-on-the-planet/?sh=536d1fa0633d). Well-documented spreadsheet errors have led to catastrophes in [business](https://www.marketwatch.com/story/88-of-spreadsheets-have-errors-2013-04-17), [economic policy](https://www.bloomberg.com/news/articles/2013-04-18/faq-reinhart-rogoff-and-the-excel-error-that-changed-history) and [healthcare](https://www.theguardian.com/politics/2020/oct/05/how-excel-may-have-caused-loss-of-16000-covid-tests-in-england). This is because:
1. Spreadsheets *hide their methodology* behind the data. This makes it extremely difficult to transfer methods to new data and to find errors. In contrast, a program *is* the methodology, making testing and reproduction much easier.  
1. Spreadsheets are not extensible. Their available numerical methods make them inappropriate for any advanced engineering analysis.

In contrast we can easily extend the base-language of Python for arbitrarily advanced numerical work. We only need to import some additional libraries to address key issues with the base-language:
 - There are no built-in data structures for arrays, matrices, and tables (unlike, say, `Matlab` or `Julia`).
 - Using lists of `float` numbers is generally very slow and lacks useful built-in features like matrix multiplication.
 - There is no built-in method to visualize data in plots.

This notebook will introduce the `NumPy` and `PyPlot` libraries to address these issues.

# NumPy

The numerical Python, or [NumPy](https://numpy.org/), library enables fast and simple numerical methods in Python. To start using this library (or any other) we need to use a new Python keyword, `import`:

In [1]:
import numpy as np

This gives us access to all the methods and functions in `NumPy` using the short name `np`. 

There are [far too many](https://numpy.org/doc/stable/reference/routines.html) new methods available to go through in this introduction, but most can be grouped into a few basic categories:

| Category       | Sub module   | Description                                                 |
|----------------|--------------|-------------------------------------------------------------|
| math           | numpy        | Scientific operations like $\sqrt{a},\log(a),\sin(a)$, etc  |
| arrays         | numpy        | Array and matrix creation, and array operations like multiplication |
| linear algebra | numpy.linalg | Matrix decomposition and solving linear systems             |
| fft            | numpy.fft    | Discrete Fourier Transforms (of many types) and their inverses |
| random sampling| numpy.random | Create samples from different random variable distributions |

<span style="display:none"></span>
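
To make the table concrete, here is a minimal sketch that calls one function from each category. The names `A`, `rhs`, `signal` and `rng` are made up for this illustration, and the seed value is arbitrary.

```python
import numpy as np

# math: element-wise scientific functions
print(np.sqrt(16.0))                      # 4.0

# linear algebra: solve the 2x2 system A x = rhs
A = np.array([[2.0, 0.0],
              [0.0, 4.0]])
rhs = np.array([2.0, 8.0])
x = np.linalg.solve(A, rhs)
print(x)                                  # [1. 2.]

# fft: transform a signal, then invert the transform
signal = np.array([1.0, 0.0, -1.0, 0.0])
recovered = np.fft.ifft(np.fft.fft(signal)).real
print(np.allclose(recovered, signal))     # True

# random sampling: draw 3 samples from a standard normal distribution
rng = np.random.default_rng(0)            # the seed 0 is an arbitrary choice
normal_samples = rng.normal(size=3)
print(normal_samples.shape)               # (3,)
```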

Notice we use the `np.*` notation to access things inside NumPy. Just as an example, let's see what is in the `numpy.random` submodule.

In [2]:
print("it contains methods such as...",dir(np.random)[30:])

it contains methods such as... ['binomial', 'bytes', 'chisquare', 'choice', 'default_rng', 'dirichlet', 'division', 'exponential', 'f', 'gamma', 'geometric', 'get_state', 'gumbel', 'hypergeometric', 'laplace', 'logistic', 'lognormal', 'logseries', 'mtrand', 'multinomial', 'multivariate_normal', 'negative_binomial', 'noncentral_chisquare', 'noncentral_f', 'normal', 'pareto', 'permutation', 'poisson', 'power', 'print_function', 'rand', 'randint', 'randn', 'random', 'random_integers', 'random_sample', 'ranf', 'rayleigh', 'sample', 'seed', 'set_state', 'shuffle', 'standard_cauchy', 'standard_exponential', 'standard_gamma', 'standard_normal', 'standard_t', 'test', 'triangular', 'uniform', 'vonmises', 'wald', 'weibull', 'zipf']


Let's get help on one of those...

In [3]:
?np.random.randint

All NumPy functions are documented like this, including examples. Using this function, we can generate a sample of what might happen if we, say, roll a 20-sided die 4 times.

In [4]:
samples = np.random.randint(1,21,size=4) # values 1 to 20; the upper bound is excluded
samples

array([ 9, 13, 11, 10])

Every time you run this code, the result will be different. Try it!
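
When a result needs to be reproducible, the random number generator can be seeded. The sketch below assumes the newer `np.random.default_rng` interface (it appears in the `dir` listing above); the seed value 42 and the names `rng_a`, `rng_b`, `rolls_a` and `rolls_b` are arbitrary choices for this illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Four rolls of a 20-sided die: integers from 1 to 20 (21 is excluded)
rolls_a = rng_a.integers(1, 21, size=4)
rolls_b = rng_b.integers(1, 21, size=4)

print(rolls_a)
print(np.array_equal(rolls_a, rolls_b))   # True
```

Re-running this cell gives the same rolls each time, which is useful when testing code that depends on random input.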

Also, note that `samples` is a new type called `array` (formally `numpy.ndarray`). This is the building block for everything in NumPy. The easiest way to make arrays from scratch is to pass a list (or a list of lists) to `np.array`.

In [14]:
r_array = np.array(range(4))
r_matrix = np.array([[1,2],[3,4],[5,6]])
print(r_array)
print(r_matrix)

[0 1 2 3]
[[1 2]
 [3 4]
 [5 6]]
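
Every array carries a few attributes that describe its layout. As a short sketch, the block below recreates the two arrays from the cell above and inspects them; the last line is an extra illustration of how the element type is chosen.

```python
import numpy as np

r_array = np.array(range(4))
r_matrix = np.array([[1, 2], [3, 4], [5, 6]])

print(type(r_matrix))    # <class 'numpy.ndarray'>
print(r_array.shape)     # (4,)
print(r_matrix.shape)    # (3, 2): 3 rows, 2 columns
print(r_matrix.ndim)     # 2
print(r_matrix.size)     # 6

# A list containing a float produces a floating-point array
print(np.array([1, 2.5]).dtype)   # float64
```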


NumPy's math functions work on arrays just as they do on numbers. Guess what these statements will print before running them:

In [6]:
np.set_printoptions(precision=3)   # This sets numpy printing precision 
np.set_printoptions(suppress=True) # Don't use scientific notation by default
print(np.sqrt(9))
print(np.sqrt(r_array))
print(np.log(np.e))
print(np.log(samples))
print(np.sin(np.pi/2))
print(np.sin(r_matrix*np.pi/4))

3.0
[0.    1.    1.414 1.732]
1.0
[2.197 2.565 2.398 2.303]
1.0
[[ 0.707  1.   ]
 [ 0.707  0.   ]
 [-0.707 -1.   ]]


Notice that NumPy defines some important constants like $e,\pi$ that we can use as well. 

More importantly, notice that we didn't need to use a list comprehension to create these new arrays. This is called _operator broadcasting_ (or vectorizing); the operation is applied to each element in the array or matrix individually. This wouldn't work for lists:

In [7]:
print(2.*r_array-samples)
2.*[0,1,2,3]-[10,5,11,2]

[ -9. -11.  -7.  -4.]


TypeError: can't multiply sequence by non-int of type 'float'
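
For comparison, here is a sketch of the same arithmetic done both ways, using the list values from the failing statement above. The names `values`, `offsets`, `from_lists`, `from_arrays` and `shifted` are made up for this illustration.

```python
import numpy as np

values = [0, 1, 2, 3]
offsets = [10, 5, 11, 2]

# Plain lists need an explicit loop or comprehension
from_lists = [2.0 * v - o for v, o in zip(values, offsets)]
print(from_lists)                    # [-10.0, -3.0, -7.0, 4.0]

# Arrays apply the same arithmetic element by element
from_arrays = 2.0 * np.array(values) - np.array(offsets)
print(from_arrays)

# Broadcasting also stretches a row across every row of a matrix
r_matrix = np.array([[1, 2], [3, 4], [5, 6]])
shifted = r_matrix + np.array([10, 20])
print(shifted)
# [[11 22]
#  [13 24]
#  [15 26]]
```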

There are also a number of operations that only make sense to apply to vectors and matrices:

In [8]:
a = np.array([1,0])
b = np.array([4,5])
print(r_matrix.T)    # transpose
print(r_matrix @ a)  # matrix multiplication
print(np.inner(a,b)) # inner product
print(np.cross(a,b)) # cross product (scalar z-component for 2D vectors)

[[1 3 5]
 [2 4 6]]
[1 3 5]
4
5
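
A common source of confusion is the difference between `*` and `@`. The sketch below uses a small square matrix `M` and vector `v`, both made up for this illustration, to show that `*` is element-wise while `@` is the matrix product.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
v = np.array([1, 0])

print(M * v)    # element-wise: v is broadcast across each row
# [[1 0]
#  [3 0]]
print(M @ v)    # matrix-vector product
# [1 3]
print(M * M)    # element-wise squares
# [[ 1  4]
#  [ 9 16]]
print(M @ M)    # matrix-matrix product
# [[ 7 10]
#  [15 22]]
```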


Looping and slicing work on arrays just like they did on lists. They also work on matrices (and higher-dimensional arrays), but there are more options:

In [9]:
for x in r_array:
    print(x)
print('')
print(r_array[-2:])

0
1
2
3

[2 3]


In [10]:
for row in r_matrix: # row by row
    print(row)
    for element in row: # element by element
        print(element)
print('')
print(r_matrix[0,0]) # first element
print(r_matrix[0])   # first row
print(r_matrix[:,0]) # first column

[1 2]
1
2
[3 4]
3
4
[5 6]
5
6

1
[1 2]
[1 3 5]
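
A few more of those options are sketched below on a separate matrix, here called `grid` (a name made up for this illustration, so that `r_matrix` is left untouched). The last lines show one way arrays differ from lists: a slice is a view onto the same data, not a copy.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4], [5, 6]])

print(grid[1:, :])        # every row after the first
# [[3 4]
#  [5 6]]
print(grid[-1, -1])       # last element: 6
print(grid[::2, 1])       # second column of every other row: [2 6]
print(grid[grid > 3])     # boolean mask selects matching elements: [4 5 6]

# Writing through a slice changes the original array
first_column = grid[:, 0]
first_column[0] = 100
print(grid[0, 0])         # 100
```

Use `grid[:, 0].copy()` when an independent copy is needed.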


As an example, let's create a few functions to rotate a point $p$ in 2D space by an angle $q$ around the origin.

In [11]:
def rotationMatrix(q):
    return np.array([[np.cos(q),-np.sin(q)],
                     [np.sin(q), np.cos(q)]])

def rotatePoint(p,q):
    return rotationMatrix(q) @ p

for q in np.linspace(0,2*np.pi,9):
    print("q={:.3g} rad, new p={}".format(q,rotatePoint(a,q)))

q=0 rad, new p=[1. 0.]
q=0.785 rad, new p=[0.707 0.707]
q=1.57 rad, new p=[0. 1.]
q=2.36 rad, new p=[-0.707  0.707]
q=3.14 rad, new p=[-1.  0.]
q=3.93 rad, new p=[-0.707 -0.707]
q=4.71 rad, new p=[-0. -1.]
q=5.5 rad, new p=[ 0.707 -0.707]
q=6.28 rad, new p=[ 1. -0.]


The `numpy.linspace` function makes an array of equally spaced numbers across a range, including both endpoints.
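
As a small sketch of how `linspace` compares with the related `np.arange` function: `linspace` takes the number of points, while `arange` takes the step size and leaves out the end of the range. The variable names are made up for this illustration.

```python
import numpy as np

by_count = np.linspace(0, 1, 5)      # 5 points, both endpoints included
print(by_count)
# [0.   0.25 0.5  0.75 1.  ]

by_step = np.arange(0, 1, 0.25)      # fixed step of 0.25, endpoint excluded
print(by_step)
# [0.   0.25 0.5  0.75]

# linspace can also leave out the endpoint
no_end = np.linspace(0, 1, 4, endpoint=False)
print(np.allclose(no_end, by_step))  # True
```

In the rotation example, 9 points from 0 to $2\pi$ give 8 equal steps of $\pi/4$.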

The `rotatePoint` function seems to have worked, but it is a little hard to tell...
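
One way to check the result without reading every printed number is to test properties that any rotation must have. The sketch below repeats the two functions so that it runs on its own; the particular checks, and the names `p` and `rotated`, are choices made for this illustration.

```python
import numpy as np

def rotationMatrix(q):
    return np.array([[np.cos(q), -np.sin(q)],
                     [np.sin(q),  np.cos(q)]])

def rotatePoint(p, q):
    return rotationMatrix(q) @ p

p = np.array([1, 0])
for q in np.linspace(0, 2*np.pi, 9):
    rotated = rotatePoint(p, q)
    # a rotation must not change the distance from the origin
    assert np.isclose(np.linalg.norm(rotated), np.linalg.norm(p))
    # rotating back by -q must recover the starting point
    assert np.allclose(rotatePoint(rotated, -q), p)

# a quarter turn takes the x-axis onto the y-axis
print(np.allclose(rotatePoint(p, np.pi/2), [0, 1]))   # True
```

`np.allclose` and `np.isclose` compare values within a small tolerance, which avoids false failures from floating-point rounding.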

# PyPlot

Even for the previous simple example, it would be much easier to check the results if we could plot them. Luckily, there's a library for that: [Matplotlib](https://matplotlib.org/). This is the most developed plotting library in Python, and since it was designed using the Matlab plotting interface as a model, it may seem familiar.