# Scientific Computing in Python
An introduction to scientific computing in Python by [Dr. Yi-Xin Liu](http://www.yxliu.group) at Fudan University (lyx@fudan.edu.cn).  
This is a part of the course: *Road to Scientific Research: Powerful Computer Applications* (XDSY118019.01).  
Lecture date: 2022.09.22

#### Resources
- [Numpy quickstart](https://numpy.org/doc/stable/user/quickstart.html)
- [Numpy absolute beginners guide](https://numpy.org/doc/stable/user/absolute_beginners.html)
- [Scipy official tutorial](https://docs.scipy.org/doc/scipy/tutorial/index.html)

*Side note*

- [Google CoLab: Online Notebook](https://colab.research.google.com/)
- [YouTube: Playing with Data in Jupyter Notebooks with VS Code](https://www.youtube.com/watch?v=r0wLl_rfxRs)


## Essential packages

Python lists are useful, but they are neither efficient nor convenient for scientific computing, which involves many linear algebra calculations. For example, if we want to multiply two vectors of the same size element by element using Python lists, a naive first attempt fails:

In [192]:
v1 = [1, 2, 3]
v2 = [4, 5, 6]

In [193]:
v1 * v2

TypeError: can't multiply sequence by non-int of type 'list'

Instead, we have to implement a function like:

In [194]:
def vec_multiply(v1, v2):
    v = []
    for e1, e2 in zip(v1, v2):
        v.append(e1 * e2)

    return v

In [195]:

vec_multiply(v1, v2)

[4, 10, 18]

Or use a list comprehension:

In [196]:
[e1 * e2 for e1, e2 in zip(v1, v2)]

[4, 10, 18]

In practice, we use packages that make linear algebra calculations easier. In Python, the two most prominent such packages are `numpy` and `scipy`. With numpy, we can simply do

In [197]:
import numpy as np

# convert Python lists to numpy arrays
v1 = np.array(v1)
v2 = np.array(v2)

v = v1 * v2
v

array([ 4, 10, 18])

### Numpy

Numpy is the fundamental package for scientific computing with Python.

It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

Numpy comes with Anaconda by default.

Tutorials for learning Numpy:
- https://numpy.org/doc/stable/user/quickstart.html
- https://numpy.org/doc/stable/user/absolute_beginners.html

### SciPy

SciPy provides fundamental algorithms for scientific computing in Python.

SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. It adds significant power to the interactive Python session by providing the user with high-level commands and classes for manipulating and visualizing data. With SciPy, an interactive Python session becomes a data-processing and system-prototyping environment rivaling systems, such as MATLAB, IDL, Octave, R-Lab, and SciLab.

The additional benefit of basing SciPy on Python is that this also makes a powerful programming language available for use in developing sophisticated programs and specialized applications. Scientific applications using SciPy benefit from the development of additional modules in numerous niches of the software landscape by developers across the world. Everything from parallel programming to web and data-base subroutines and classes have been made available to the Python programmer. All of this power is available in addition to the mathematical libraries in SciPy.

SciPy also comes with Anaconda by default.

Tutorial for learning SciPy:
- https://docs.scipy.org/doc/scipy/tutorial/index.html

## Numpy array

### Python list vs. Numpy array

NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would be extremely inefficient if the arrays weren’t homogeneous.

**Why use NumPy?** NumPy arrays are faster and more compact than Python lists. An array stores its data in much less memory and is convenient to use. NumPy also provides a mechanism for specifying the data types, which allows the code to be optimized even further.
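The memory claim can be checked with a rough sketch. It assumes CPython, where `sys.getsizeof` reports the size of each object, and it fixes the array dtype to `int64` so that the element size is known; the variable names are only illustrative.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so both parts are counted.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values contiguously: 8 bytes per int64 element.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```

The exact list size depends on the Python version, but the array always needs `n * 8` bytes of data here.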

### What is an array?
An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways. The elements are all of the same type, referred to as the array dtype.
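As a small illustration of the array dtype (a sketch with made-up values): when the dtype is given explicitly it is used as is, and when the inputs mix integers and floats NumPy picks one common type for all elements.

```python
import numpy as np

# Explicit dtype: every element is stored as a 4-byte integer.
ints = np.array([1, 2, 3], dtype=np.int32)
print(ints.dtype, ints.itemsize)

# Mixed integers and floats are stored with a single common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# astype returns a new array with the requested dtype.
as_float = ints.astype(np.float64)
print(as_float.dtype)
```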

### Dimension, shape, and size

- **vector** - One-dimensional (1D) array

In [20]:
v1 = np.array([1, 2, 3])
v1

array([1, 2, 3])

In [24]:
v1.ndim, v1.shape, v1.size

(1, (3,), 3)

- **matrix** - Two-dimensional (2D) array

In [19]:
v2 = np.array([[1, 2, 3],
               [4, 5, 6]])
v2

array([[1, 2, 3],
       [4, 5, 6]])

In [25]:
v2.ndim, v2.shape, v2.size

(2, (2, 3), 6)

- **tensor** - Three-dimensional (3D) arrays and above

In [198]:
v3 = np.array([[[1, 2, 3], [4, 5, 6]],
               [[7, 8, 9], [0, 1, 2]]])
v3

array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [0, 1, 2]]])

In [199]:
v3.ndim, v3.shape, v3.size

(3, (2, 2, 3), 12)
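The three attributes are related: `ndim` is the length of `shape`, and `size` is the product of the entries of `shape`. A minimal check, using an arbitrary array `t` chosen for illustration:

```python
import numpy as np

t = np.zeros((2, 2, 3))

# ndim counts the axes, i.e. the number of entries in shape.
print(t.ndim == len(t.shape))

# size is the total number of elements, i.e. the product of shape.
print(t.size == int(np.prod(t.shape)))

# len() only reports the length of the first axis.
print(len(t))
```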

### Array creation

In [200]:
np.arange(10, dtype=float)  # vector

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [201]:
np.linspace(0, 1, 11)  # vector

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [202]:
np.random.rand(2, 3)

array([[0.28854144, 0.1798191 , 0.38464953],
       [0.32366069, 0.4465389 , 0.6885494 ]])

In [203]:
np.random.randn(2, 3)

array([[ 0.41579741,  0.78876368, -1.15107587],
       [ 0.47760424,  0.22386297,  0.14167771]])

In [204]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [205]:
np.ones((2, 3))

array([[1., 1., 1.],
       [1., 1., 1.]])

In [206]:
np.eye(2)  # matrix

array([[1., 0.],
       [0., 1.]])
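The random arrays above differ on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch; the seed value `42` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
uniform = rng.random((2, 3))           # uniform on [0, 1), like np.random.rand(2, 3)
normal = rng.standard_normal((2, 3))   # standard normal, like np.random.randn(2, 3)

# A second generator with the same seed reproduces the same numbers.
rng_again = np.random.default_rng(seed=42)
uniform_again = rng_again.random((2, 3))

print(np.array_equal(uniform, uniform_again))
```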

### Array attributes and methods

In [207]:
v2 = np.array([[1, 2, 3],
               [4, 5, 6]])
v2

array([[1, 2, 3],
       [4, 5, 6]])

In [208]:
v2.max(), v2.argmax(), v2.min(), v2.argmin(), v2.sum(), v2.cumsum()

(6, 5, 1, 0, 21, array([ 1,  3,  6, 10, 15, 21]))
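Most of these methods also accept an `axis` argument, and the flat position returned by `argmax` can be converted back to a (row, column) pair. A short sketch on the same matrix:

```python
import numpy as np

v2 = np.array([[1, 2, 3],
               [4, 5, 6]])

col_sums = v2.sum(axis=0)   # sum down each column: [5, 7, 9]
row_sums = v2.sum(axis=1)   # sum along each row: [6, 15]
row_max = v2.max(axis=1)    # largest value in each row: [3, 6]

# argmax returns a position in the flattened array; unravel_index
# translates it into indices for the original shape.
flat_pos = v2.argmax()
row, col = np.unravel_index(flat_pos, v2.shape)

print(col_sums, row_sums, row_max, (int(row), int(col)))
```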

### Indexing and slicing

Numpy offers several ways to index into arrays and to access or change specific elements, rows, columns, etc.

#### Python-list-like indexing and slicing

In [284]:
v1 = np.arange(10)
v1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [285]:
v1[4]

4

In [290]:
v1[3:6], v1[:3], v1[7:], v1[:]

(array([3, 4, 5]),
 array([0, 1, 2]),
 array([7, 8, 9]),
 array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

_A slice is a view of the original array._ Changing the slice changes the original array.

In [291]:
slice_v1 = v1[3:6]
slice_v1[:] = 99
v1

array([ 0,  1,  2, 99, 99, 99,  6,  7,  8,  9])
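If the original array must stay untouched, take an explicit copy of the slice. The sketch below uses `np.shares_memory` to show the difference; the names `base`, `view` and `copy` are illustrative.

```python
import numpy as np

base = np.arange(10)

view = base[3:6]          # shares memory with base
copy = base[3:6].copy()   # independent data

copy[:] = 99              # only the copy changes
print(base)               # base still holds 0..9

print(np.shares_memory(base, view))
print(np.shares_memory(base, copy))
```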

In [296]:
v2 = np.arange(24).reshape(4, 6)
v2

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

In [297]:
v2[1]  # indexing a row

array([ 6,  7,  8,  9, 10, 11])

In [298]:
v2[1, :]

array([ 6,  7,  8,  9, 10, 11])

In [299]:
v2[1, 1]  # indexing a single element

7

In [300]:
v2[:, 1]  # the second column

array([ 1,  7, 13, 19])

In [301]:
v2[:2, 3:5]  # elements in the first and second rows, the fourth and fifth columns

array([[ 3,  4],
       [ 9, 10]])

#### Integer array indexing

Integer array indexing allows you to construct arbitrary arrays using the data from another array.

In [276]:
A = np.array([[1, 2, 3],
              [4, 5, 6]])
A.shape

(2, 3)

In [277]:
I = [0, 1]  # indices for the first dimension, so each must be less than 2

In [278]:
J = [2, 0]  # indices for the second dimension, so each must be less than 3

In [279]:
A[I, J]

array([3, 4])

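Reading with integer arrays returns a copy of the selected elements, not a view. Assigning through the index expression itself, however, writes to the original array, so we can use it to mutate the array's content: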

In [280]:
A[I, J] += 9
A

array([[ 1,  2, 12],
       [13,  5,  6]])
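A quick check of how integer array indexing interacts with memory: an array obtained by reading `A[I, J]` is independent of `A`, while an assignment to `A[I, J]` writes into `A`. The name `picked` is only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
I = [0, 1]
J = [2, 0]

picked = A[I, J]            # a copy holding [3, 4]
picked += 9                 # changes the copy only
A_after_edit = A.copy()     # snapshot: A is still the original matrix

print(np.shares_memory(A, picked))

A[I, J] = picked            # assignment through the index writes into A
print(A)
```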

#### Boolean array indexing

Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition.

In [260]:
a = np.arange(1, 11)
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [261]:
a > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [262]:
a[a>4]

array([ 5,  6,  7,  8,  9, 10])

In [263]:
b = np.random.rand(4, 5)
b

array([[0.08902198, 0.41935228, 0.26680782, 0.74410475, 0.39257866],
       [0.55968341, 0.61875226, 0.68150949, 0.44039012, 0.99340642],
       [0.13688794, 0.48035128, 0.94218335, 0.3552585 , 0.85673765],
       [0.23754581, 0.38956544, 0.19819892, 0.3499117 , 0.8062432 ]])

In [264]:
b[b<0.5]

array([0.08902198, 0.41935228, 0.26680782, 0.39257866, 0.44039012,
       0.13688794, 0.48035128, 0.3552585 , 0.23754581, 0.38956544,
       0.19819892, 0.3499117 ])
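Conditions can be combined with `&` (and) and `|` (or), with each condition in parentheses, and a Boolean mask can also be used for assignment. A minimal sketch with illustrative thresholds:

```python
import numpy as np

a = np.arange(1, 11)

both = a[(a > 3) & (a % 3 == 0)]   # greater than 3 AND divisible by 3
either = a[(a < 3) | (a > 8)]      # less than 3 OR greater than 8

clipped = a.copy()
clipped[clipped > 4] = 0           # overwrite the selected elements

labels = np.where(a > 4, 1, -1)    # 1 where the condition holds, -1 elsewhere

print(both, either, clipped, labels)
```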

#### Exercise
- Create the following matrix and assign it to variable `B`.

$$
\begin{bmatrix}
    1 &2 &3 &4 &5 \\
    6 &7 &8 &9 &10 \\
    11 &12 &13 &14 &15 \\
    16 &17 &18 &19 &20 \\
    21 &22 &23 &24 &25 \\
    26 &27 &28 &29 &30
\end{bmatrix}
$$

- Retrieve `23`.

- Retrieve `[2, 8, 14, 20]`

- Retrieve

$$
\begin{bmatrix}
    11 &12 \\
    16 &17
\end{bmatrix}
$$

- Retrieve

$$
\begin{bmatrix}
    4 &5 \\
    24 &25 \\
    29 &30
\end{bmatrix}
$$

- Find all even elements.

- Compute the sum of all odd elements.

In [267]:
# Do the exercise below


### Resizing and reshaping

- Adding or removing elements: `np.append`, `np.insert`, `np.delete`, `np.resize`.

In [209]:
u = np.array([])
u

array([], dtype=float64)

In [210]:
v = np.append(u, 1)
u, v

(array([], dtype=float64), array([1.]))
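The other functions listed above follow the same pattern as `np.append`: they return a new array and leave the input unchanged. A short sketch with an illustrative vector `w`:

```python
import numpy as np

w = np.array([1, 2, 3])

inserted = np.insert(w, 1, 99)   # insert 99 before index 1
deleted = np.delete(w, 0)        # drop the element at index 0
resized = np.resize(w, (2, 3))   # repeat the data to fill the new shape

print(inserted, deleted)
print(resized)
print(w)                         # the original array is untouched
```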

In [211]:
v = np.array([[1, 2, 3],
              [4, 5, 6]])
v

array([[1, 2, 3],
       [4, 5, 6]])

- Flattening: `array.flatten()`

In [212]:
v.flatten()

array([1, 2, 3, 4, 5, 6])

In [213]:
v

array([[1, 2, 3],
       [4, 5, 6]])
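Two related methods are worth knowing, shown here as a sketch: `reshape` changes the shape without changing the data, and `ravel` flattens like `flatten` but returns a view when the memory layout allows it, whereas `flatten` always copies.

```python
import numpy as np

v = np.array([[1, 2, 3],
              [4, 5, 6]])

r = v.reshape(3, 2)        # same six elements in a 3x2 layout
auto = v.reshape(-1, 2)    # -1 lets NumPy infer that dimension

flat_copy = v.flatten()    # always a copy
flat_view = v.ravel()      # a view here, since v is contiguous

print(r)
print(np.shares_memory(v, flat_copy), np.shares_memory(v, flat_view))
```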

- Stacking: `numpy.hstack`, `numpy.vstack`.

In [303]:
f = np.array([1,2,3])
g = np.array([4,5,6])

np.hstack((f, g))

array([1, 2, 3, 4, 5, 6])

In [304]:
np.vstack((f, g))

array([[1, 2, 3],
       [4, 5, 6]])

In [309]:
h1 = np.ones((2,4))
h2 = np.zeros((2,2))

np.hstack((h1,h2))

array([[1., 1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0., 0.]])

In [311]:
np.vstack((h1, h2))  # Error: the numbers of columns do not match

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 4 and the array at index 1 has size 2

In [313]:
v1 = np.ones((4,2))  # use shape (4, 2), the transpose of shape (2, 4)
v2 = np.zeros((2,2))

np.vstack((v1,v2))  # now it is correct to stack vertically.

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [0., 0.],
       [0., 0.]])
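Both stacking functions can be expressed with `np.concatenate` and an explicit `axis`, which makes clear which dimension has to match. A sketch reusing the shapes from above:

```python
import numpy as np

h1 = np.ones((2, 4))
h2 = np.zeros((2, 2))

# axis=1 joins side by side: the row counts (2 and 2) must match.
side = np.concatenate((h1, h2), axis=1)

# axis=0 joins top to bottom: the column counts must match,
# so h1 is transposed from (2, 4) to (4, 2) first.
stacked = np.concatenate((h1.T, h2), axis=0)

print(side.shape, stacked.shape)
```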

## Linear algebra

### Arithmetic and broadcasting

In [214]:
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

In [215]:
# basic operations
u + v, u - v, u * v, u / v

(array([5, 7, 9]),
 array([-3, -3, -3]),
 array([ 4, 10, 18]),
 array([0.25, 0.4 , 0.5 ]))

In [216]:
# Math functions
np.sqrt(u), np.sin(u), np.cos(u), np.log(u), np.exp(u)

(array([1.        , 1.41421356, 1.73205081]),
 array([0.84147098, 0.90929743, 0.14112001]),
 array([ 0.54030231, -0.41614684, -0.9899925 ]),
 array([0.        , 0.69314718, 1.09861229]),
 array([ 2.71828183,  7.3890561 , 20.08553692]))

In [217]:
# basic statistics
np.mean(u), np.median(u), np.std(u)

(2.0, 2.0, 0.816496580927726)

**Broadcasting** is super cool and super useful. When doing elementwise operations, arrays with different shapes are expanded to a common, compatible shape.

_Broadcasting is even cooler in Julia. Check it out!_

In [218]:
# broadcasting a scalar over a vector
u + 4

array([5, 6, 7])

In [219]:
a = np.arange(5)
a

array([0, 1, 2, 3, 4])

In [220]:
b = np.random.rand(2, 5) 
b

array([[0.91328045, 0.52119624, 0.68368728, 0.57745181, 0.23883614],
       [0.36910582, 0.74231474, 0.76104743, 0.54168836, 0.80033989]])

In [221]:
a * b

array([[0.        , 0.52119624, 1.36737457, 1.73235544, 0.95534454],
       [0.        , 0.74231474, 1.52209486, 1.62506509, 3.20135956]])

In [222]:
a + b

array([[0.91328045, 1.52119624, 2.68368728, 3.57745181, 4.23883614],
       [0.36910582, 1.74231474, 2.76104743, 3.54168836, 4.80033989]])
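NumPy compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or one of them is 1. The sketch below, with illustrative arrays, shows a case that works, one that fails, and how adding an axis fixes it.

```python
import numpy as np

b = np.ones((2, 5))
row = np.arange(5)          # shape (5,): matches the last dimension of b
col = np.array([10, 20])    # shape (2,): does not match the last dimension

print((row + b).shape)      # (5,) is stretched over both rows

failed = False
try:
    col + b                 # (2,) against (2, 5): 2 vs 5 is incompatible
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)

# Turning col into a column of shape (2, 1) makes the shapes compatible.
fixed = col[:, np.newaxis] + b
print(fixed)
```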

### Vector product

Given two equal-length column vectors $\mathbf{u}=[u_1, u_2, u_3]^T$ and $\mathbf{v}=[v_1, v_2, v_3]^T$, we can compute their inner, cross, and outer products.

In [223]:
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

- Inner product

$$
    \mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2 + u_3v_3
$$

In [224]:
np.dot(u, v)

32

- Cross product

$$
    \mathbf{u} \times \mathbf{v} = 
    \begin{vmatrix}
        \mathbf{i}\; &\mathbf{j}\; &\mathbf{k} \\
        u_1\; &u_2\; &u_3 \\
        v_1\; &v_2\; &v_3
    \end{vmatrix}
    =
    \begin{bmatrix}
        u_2v_3 - u_3v_2 \\
        u_3v_1 - u_1v_3 \\
        u_1v_2 - u_2v_1
    \end{bmatrix}
$$

In [225]:
np.cross(u, v)

array([-3,  6, -3])

- Outer product

$$
\mathbf{u} \otimes \mathbf{v} = \mathbf{u}\mathbf{v}^T =
    \begin{bmatrix}
        u_1 \\ u_2 \\ u_3
    \end{bmatrix}
    \begin{bmatrix}
        v_1\; &v_2\; &v_3
    \end{bmatrix}
    =
    \begin{bmatrix}
        u_1v_1\; &u_1v_2\; &u_1v_3 \\
        u_2v_1\; &u_2v_2\; &u_2v_3 \\
        u_3v_1\; &u_3v_2\; &u_3v_3
    \end{bmatrix}
$$

In [226]:
u.reshape(3,1) @ v.reshape(1,3)

array([[ 4,  5,  6],
       [ 8, 10, 12],
       [12, 15, 18]])
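The three products can be cross-checked against each other. In this sketch, `np.outer` and a broadcasting expression are shown as alternatives to the reshape approach, and the cross product is verified to be orthogonal to both inputs.

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

inner = u @ v                      # same value as np.dot(u, v)

w = np.cross(u, v)
print(w @ u, w @ v)                # both are 0: w is orthogonal to u and v

outer = np.outer(u, v)             # outer product without reshaping
outer_bc = u[:, np.newaxis] * v    # the same result via broadcasting

print(inner)
print(np.array_equal(outer, outer_bc))
```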

### Matrix

In [227]:
A = np.array([[1, 2, 3],
              [6, 5, 4]])
A  # is a 2x3 matrix

array([[1, 2, 3],
       [6, 5, 4]])

In [228]:
A.shape

(2, 3)

In [229]:
# transpose
A.T  # 2x3 to 3x2

array([[1, 6],
       [2, 5],
       [3, 4]])

In [230]:
A  # A is not modified by the `.T` operation.

array([[1, 2, 3],
       [6, 5, 4]])

In [231]:
# matrix-vector product
A @ np.array([0, 1, 0])

array([2, 5])

In [232]:
# matrix-matrix product
A @ A.T

array([[14, 28],
       [28, 77]])

In [233]:
M = np.array([[1, 2, 3],
              [6, 5, 4],
              [8, 9, 7]])
M  # is a 3x3 matrix

array([[1, 2, 3],
       [6, 5, 4],
       [8, 9, 7]])

In [234]:
# determinant
np.linalg.det(M) # requires a square matrix

20.99999999999999

In [235]:
# matrix inversion
np.linalg.inv(M)

array([[-0.04761905,  0.61904762, -0.33333333],
       [-0.47619048, -0.80952381,  0.66666667],
       [ 0.66666667,  0.33333333, -0.33333333]])
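Results like these carry floating-point rounding error, which is why the determinant prints as 20.99999999999999 instead of the exact value 21. A sketch of how to compare such results with a tolerance instead of `==`:

```python
import numpy as np

M = np.array([[1, 2, 3],
              [6, 5, 4],
              [8, 9, 7]])

M_inv = np.linalg.inv(M)

# M times its inverse should be the identity, up to rounding error.
print(np.allclose(M @ M_inv, np.eye(3)))

# The exact determinant is 21; compare with a tolerance.
print(np.isclose(np.linalg.det(M), 21.0))
```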

In [236]:
# eigenvalues and eigenvectors
np.linalg.eig(M)

(array([14.78674789+0.j        , -0.89337394+0.78871641j,
        -0.89337394-0.78871641j]),
 array([[ 0.2527029 +0.j        , -0.53434387+0.31563923j,
         -0.53434387-0.31563923j],
        [ 0.49476275+0.j        ,  0.67841481+0.j        ,
          0.67841481-0.j        ],
        [ 0.83147523+0.j        , -0.19802223-0.33968963j,
         -0.19802223+0.33968963j]]))
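`np.linalg.eig` returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors. A sketch that verifies the defining relation $\mathbf{M}\mathbf{v} = \lambda\mathbf{v}$ for all pairs at once (the names `vals` and `vecs` are illustrative):

```python
import numpy as np

M = np.array([[1, 2, 3],
              [6, 5, 4],
              [8, 9, 7]])

vals, vecs = np.linalg.eig(M)

# Column k of vecs is the eigenvector for vals[k]; multiplying vecs by
# vals scales each column by its own eigenvalue.
print(np.allclose(M @ vecs, vecs * vals))

# The eigenvalues sum to the trace and multiply to the determinant.
print(np.isclose(vals.sum(), np.trace(M)))
print(np.isclose(np.prod(vals), np.linalg.det(M)))
```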

In [237]:
# QR decomposition
Q, R = np.linalg.qr(M)

In [238]:
Q

array([[-0.09950372,  0.56871112, -0.81649658],
       [-0.59702231, -0.69057779, -0.40824829],
       [-0.79602975,  0.44684446,  0.40824829]])

In [239]:
R

array([[-10.04987562, -10.34838678,  -8.25880868],
       [  0.        ,   1.70613337,   2.07173338],
       [  0.        ,   0.        ,  -1.22474487]])

In [240]:
Q @ R

array([[1., 2., 3.],
       [6., 5., 4.],
       [8., 9., 7.]])

In [241]:
A

array([[1, 2, 3],
       [6, 5, 4]])

In [242]:
np.linalg.qr(A)

(array([[-0.16439899, -0.98639392],
        [-0.98639392,  0.16439899]]),
 array([[-6.08276253, -5.26076759, -4.43877266],
        [ 0.        , -1.15079291, -2.30158582]]))
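The two factors have a defining structure that can be checked numerically: the columns of `Q` are orthonormal and `R` is upper triangular. A sketch for the square matrix `M`:

```python
import numpy as np

M = np.array([[1, 2, 3],
              [6, 5, 4],
              [8, 9, 7]])

Q, R = np.linalg.qr(M)

print(np.allclose(Q.T @ Q, np.eye(3)))   # orthonormal columns
print(np.allclose(R, np.triu(R)))        # nothing below the diagonal
print(np.allclose(Q @ R, M))             # the product recovers M
```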

In [243]:
# singular value decomposition (SVD); the third output is V transposed (Vh)
U, s, Vh = np.linalg.svd(A, full_matrices=False)
U.shape, s.shape, Vh.shape

((2, 2), (2,), (2, 3))

In [244]:
U @ np.diag(s) @ Vh

array([[1., 2., 3.],
       [6., 5., 4.]])
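Two properties of the singular values, sketched for the same matrix: their squares are the eigenvalues of $\mathbf{A}\mathbf{A}^T$, and keeping only the largest one gives a rank-1 approximation whose Frobenius-norm error equals the second singular value. Here `Vh` denotes the third value returned by `np.linalg.svd`, which is already transposed.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [6, 5, 4]])

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# eigvalsh returns eigenvalues in ascending order, so sort s**2 to match.
eig_AAt = np.linalg.eigvalsh(A @ A.T)
print(np.allclose(np.sort(s**2), eig_AAt))

# Rank-1 approximation built from the largest singular value only.
A1 = s[0] * np.outer(U[:, 0], Vh[0])
error = np.linalg.norm(A - A1)           # Frobenius norm of the residual
print(np.isclose(error, s[1]))
```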

### Solving linear systems

Given a set of linear equations:

$$
\begin{cases}
a_{11} x_1 + a_{12} x_2 +\dots + a_{1n} x_n = b_1 \\
a_{21} x_1 + a_{22} x_2  + \dots + a_{2n} x_n = b_2 \\ 
\vdots\\
a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n = b_m,
\end{cases}
$$

where $x_1, x_2,\dots,x_n$ are the unknowns, $a_{11},a_{12},\dots,a_{mn}$ are the coefficients of the system (not all zero), and $b_1,b_2,\dots,b_m$ are the constant terms.

We can rewrite it in matrix form,

$$
\mathbf{A}\mathbf{x} = \mathbf{b}
$$

where $\mathbf{A}$ is an $m\times n$ matrix, $\mathbf{x}$ is a column vector with $n$ entries, and $\mathbf{b}$ is a column vector with $m$ entries.

$$
\mathbf{A} =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},\quad
\mathbf{x}=
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix},\quad
\mathbf{b}=
\begin{bmatrix}
b_1 \\
b_2 \\
\vdots \\
b_m
\end{bmatrix}
$$

If $\mathbf{A}$ is square ($m = n$) and nonsingular, the unique solution is

$$
\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}
$$
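In code, `np.linalg.solve` computes $\mathbf{x}$ directly and is generally preferred over forming $\mathbf{A}^{-1}$ explicitly, since it is faster and more accurate. The sketch below uses a small made-up system, $2x_1 + x_2 = 5$ and $x_1 + 3x_2 = 10$, chosen only for illustration.

```python
import numpy as np

# Hypothetical 2x2 system: 2*x1 + x2 = 5 and x1 + 3*x2 = 10.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)        # solves A x = b without forming the inverse
x_inv = np.linalg.inv(A) @ b     # the textbook formula, for comparison

print(x)
print(np.allclose(A @ x, b))     # the solution satisfies the equations
print(np.allclose(x, x_inv))
```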

#### Example

Can you solve the following linear system?

$$
\begin{cases}
x_1 - x_3 = 1 \\
4x_1 + x_2  + x_3 = 2 \\ 
3x_1 + 2x_2  - 5x_3 = 3
\end{cases}
$$

In [245]:
# A = ?

In [246]:
# b = ?

In [247]:
# x = ?

## Interpolation

## Root finding

## Optimization

## Numerical integration

## Differential equations