## Scientific Python: Computational Fluid Dynamics

### Introduction

This exercise takes an example from one of the most common
applications of HPC resources: Fluid Dynamics. We will look
at how a simple fluid dynamics problem can be run using
Python and numpy, and how Fortran and/or C code can be
called from within Python. The exercise will compare the performance of the different approaches.

We will also use this exercise to demonstrate the use of matplotlib to plot a visualisation of the simulation results.

This exercise aims to use:
* Python lists and functions
* Basic numpy array manipulation
* Plotting using matplotlib
* Calling Fortran/C from Python
* Benchmarking Python performance

### Fluid Dynamics: a brief overview

Fluid Dynamics is the study of the mechanics of fluid flow, that is, of liquids and gases in motion. It encompasses aerodynamics
and hydrodynamics, and has wide-ranging applications, from
vessel and structure design to weather and traffic modelling. Simulating and solving fluid dynamics problems often requires
large computational resources.

Fluid dynamics is an example of a continuous system that can be described by Partial Differential Equations. For a computer to simulate these systems, the equations must be discretised onto a grid. If this grid is regular, then a finite difference approach can be used. Using this method means that the value at any point in the grid is updated using some combination of the neighbouring points.

_Discretisation_ is the process of approximating a continuous (i.e. infinite-dimensional) problem by a finite-dimensional problem suitable for a computer. This is often accomplished by putting the calculations into a grid or similar construct.

### The Problem

In this exercise the finite difference approach is used to determine the flow pattern of a fluid in a cavity. For simplicity, the liquid is assumed to have zero viscosity, which implies that there can be no vortices (i.e. no whirlpools) in the flow. The cavity is a square box with an inlet on one side and an outlet on another as shown below.

<img src="./cavity.svg">

#### Mathematical background

In two dimensions it is easiest to work with the stream function
$\Psi$ (see below for how this relates to the fluid velocity). For zero viscosity, $\Psi$ satisfies the following equation:
\[
\nabla^2 \Psi = \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} = 0.
\]
The finite difference version of this equation is:
\[
\Psi_{i-1,j} + \Psi_{i+1,j} + \Psi_{i,j-1} + \Psi_{i,j+1}
-4 \Psi_{i,j} = 0.
\]
With the boundary values fixed, the stream function can be calculated for each point in the grid by replacing the value
at that point with the average of its four nearest neighbours. The process is repeated until the algorithm converges on a solution that
stays unchanged by the averaging process. This simple approach
to solving a PDE is called the Jacobi algorithm.
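
Rearranging the finite difference equation for $\Psi_{i,j}$ makes the averaging step explicit. As a sketch, writing $\Psi^{(k)}$ for the grid values after $k$ iterations (a superscript notation introduced here purely for illustration), one Jacobi step is:
\[
\Psi^{(k+1)}_{i,j} = \frac{1}{4} \left( \Psi^{(k)}_{i-1,j} + \Psi^{(k)}_{i+1,j} + \Psi^{(k)}_{i,j-1} + \Psi^{(k)}_{i,j+1} \right)
\]
Each new value depends only on values from the previous iteration, which is why an implementation typically needs a second, temporary grid to hold the updated values.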

In order to obtain the flow pattern of the fluid in the cavity
we want to compute the velocity field $\mathbf{u}$. The $x$ and $y$ components of $\mathbf{u}$ are related to the stream function by
\[
u_x =  \frac{\partial \Psi}{\partial y}
\]
\[
u_y = -\frac{\partial \Psi}{\partial x}
\]

This means that the velocity of the fluid at each grid point
can also be calculated from the surrounding grid points. The magnitude of the velocity $\mathbf{u}$ at point $(x, y)$ is
given by:
\[
u = (u_x^2 + u_y^2)^{1/2}
\]
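
On the grid, these derivatives can be approximated by central differences. The following is a sketch under two assumptions that are not stated in the text above: the grid spacing is taken as one unit, and the index $i$ is taken to run along $x$ with $j$ along $y$ (the supplied code may order the indices differently):
\[
u_x \approx \frac{1}{2} \left( \Psi_{i,j+1} - \Psi_{i,j-1} \right), \qquad
u_y \approx -\frac{1}{2} \left( \Psi_{i+1,j} - \Psi_{i-1,j} \right)
\]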

### An algorithm

The outline of the algorithm for calculating the velocities is
as follows:

* Set the boundary values for the stream function
* While convergence is FALSE:
    * For each interior grid point:
        * update the value of the stream function by replacing it with the average of its 4 nearest neighbours
    * Check for convergence
* For each interior grid point:
    * calculate the x component of the velocity
    * calculate the y component of the velocity

For simplicity, we run the calculation for a fixed number of iterations; a real simulation would continue until
some chosen accuracy was achieved.
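
The outline above can be sketched in plain Python using nested lists. This is a minimal illustration and not the supplied code: the function names `jacobi_step` and `jacobi`, their signatures, and the small example grid are all assumptions made for this sketch.

```python
def jacobi_step(psi, tmp, m, n):
    """One Jacobi sweep: write the neighbour average of psi into tmp."""
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            tmp[i][j] = 0.25 * (psi[i - 1][j] + psi[i + 1][j]
                                + psi[i][j - 1] + psi[i][j + 1])


def jacobi(psi, m, n, niter):
    """Run a fixed number of sweeps on an (m+2) x (n+2) grid of lists."""
    tmp = [[0.0] * (n + 2) for _ in range(m + 2)]
    for _ in range(niter):
        jacobi_step(psi, tmp, m, n)
        # Copy interior values back; the boundary values stay fixed.
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                psi[i][j] = tmp[i][j]
    return psi


# Hypothetical example: 4 x 4 interior, one boundary edge held at 1.0
m = n = 4
psi = [[0.0] * (n + 2) for _ in range(m + 2)]
for j in range(n + 2):
    psi[0][j] = 1.0
jacobi(psi, m, n, 200)
```

Writing into `tmp` and copying back afterwards ensures that every update in a sweep uses only values from the previous iteration.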

### Using Python

This calculation is useful to look at in Python for a number of reasons:
* It requires the use of 2-dimensional lists/arrays
* The algorithm can easily be implemented in Python, NumPy, Fortran and C
* Visualising the results demonstrates the use of matplotlib

You are given a basic code that uses Python lists to run the simulation. There are a number of different files:

* cfd.py
* jacobi.py
* util.py

Some additional files are also provided:

* plot_flow.py
* cfdvort.py: version with finite Reynolds number (requires a slightly different formulation)
* jacobivort.py


Look at the structure of the code. In particular, note:

* How the external "jacobi" function is included
* How the lists are declared and initialised to zero
* How the timing works

### First Run and Verification

First, verify that your copy of the code is producing the correct results.

Navigate to the python subdirectory and run the main program with:
```bash
prompt:~/python> ./cfd.py 1 1000
```

This runs the CFD simulation with a scale factor of 1 and 1000 Jacobi iteration steps. As the program is running you should see output that looks something like:
```

2D CFD Simulation
=================
Scale factor = 1
Iterations   = 1000

Grid size = 32 x 32

Starting main Jacobi loop ...

completed iteration 1000

... finished

Calculation took 1.11856s
```
The program will produce two text output files called
'velocity.dat' and 'colourmap.dat' with the computed velocities
at each grid point, and data providing a representation of the velocity magnitude, respectively. A simple verification is to use diff to compare your output with one of the verification datasets. For example:
```bash
prompt:~> diff velocity.dat ../verify/cfd_velocity_1_1000.dat
```

'diff' only produces output if it finds differences between the two files. If you see any differences at this point, please ask a tutor.

### Initial benchmarking

Now produce some baseline figures with which to compare your future versions of the code. You should pick a set of representative problem sizes (defined by scale and number of iterations) that run in a sensible time on your machine but do
not complete instantaneously. (A good place to start is with scale factor 2 and 5000 iterations. You will also need some smaller and larger examples.)

Record the benchmarking calculation times for future reference.
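
If you want to collect timings from within Python rather than reading them from the program output, a small helper along the following lines may be useful. It is only a sketch: `time_call` and `dummy_kernel` are hypothetical names, and `dummy_kernel` is a stand-in workload that you would replace with a call into the real simulation.

```python
import time


def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def dummy_kernel(size, niter):
    """Hypothetical stand-in workload; replace with the real simulation."""
    total = 0.0
    for _ in range(niter):
        for i in range(size):
            total += i * 0.5
    return total


# Time a small, a medium and a larger (hypothetical) problem size.
for size, niter in [(32, 100), (64, 100), (128, 100)]:
    elapsed = time_call(dummy_kernel, size, niter)
    print(f"size={size:4d} iterations={niter:5d} time={elapsed:.6f}s")
```

Taking the best of several repeats reduces the influence of other activity on the machine, which matters most for the shorter runs.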

The directory includes a utility called 'plot_flow.py' that produces a graphical representation of the final state of the simulation. You can use this to produce a PNG image as follows:
```bash
prompt:~> ./plot_flow.py velocity.dat colourmap.dat flow.png
```

<img src="./reference.png">

If the fluid is flowing around the edge of the domain, rather than through the middle of the cavity, then this is an indication that the Jacobi algorithm has not yet converged. Convergence requires more iterations on larger problem sizes.

### Using numpy arrays

We will now refactor the CFD code to use numpy arrays rather than Python lists. This has a number of advantages:

* numpy is closely integrated with matplotlib and using numpy arrays will allow us to produce the visualisation directly from our simulation rather than using a separate utility.

* numpy arrays should allow us to access better performance using more concise code.

* numpy arrays are directly compatible with native code produced by Fortran and C compilers. This will allow us to re-code the key part of our algorithm and achieve better performance while still having the benefits of coding in Python.

Replace the "psi" and "tmp" lists in the code with numpy arrays. (Remember to make a copy of the code in a new directory before you start work so you do not lose the original version.) You will need the statement:
```
import numpy as np
```
at the top of all your source files to ensure you can access the numpy functionality. The arrays will need to be implemented in the main function and all of the other functions where they are used.

Declaring and zeroing numpy arrays can be done in a single statement such as:
```
psi = np.zeros((m+2, n+2))
```

Once you think you have the correct code, run your new program and compare the output with that from the original code. If you are convinced that the output is the same then move on and benchmark your new code.

What do you find? Has using numpy arrays increased the performance of the code? Can you think of an explanation of why the performance has altered in the way that it has?

Can you change the implementation to produce a better performing version of the CFD code?

Hint 1: which method of accessing 2D array elements is faster: "a[i][j]" or "a[i,j]"?

Hint 2: you should use array index syntax (or “slicing”) if you have not already done so to specify blocks of arrays to operate on.
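
As an illustration of Hint 2, one Jacobi sweep can be written with slices so that no explicit Python loops over grid points are needed. This is one possible approach rather than the expected solution; the function name `jacobi_step_sliced` and the small example grid are assumptions made for this sketch.

```python
import numpy as np


def jacobi_step_sliced(psi):
    """Return a new array with interior points set to neighbour averages."""
    new = psi.copy()  # the copy keeps the fixed boundary values
    new[1:-1, 1:-1] = 0.25 * (psi[:-2, 1:-1] + psi[2:, 1:-1]
                              + psi[1:-1, :-2] + psi[1:-1, 2:])
    return new


# Hypothetical example: 4 x 4 interior, one boundary edge held at 1.0
m = n = 4
psi = np.zeros((m + 2, n + 2))
psi[0, :] = 1.0
for _ in range(200):
    psi = jacobi_step_sliced(psi)
```

Each of the four slices is the interior block shifted by one point in one direction, so the whole sweep is evaluated inside numpy's compiled routines rather than in the Python interpreter.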

### Incorporating matplotlib

matplotlib and numpy have a very close relationship: matplotlib can use the typing and structure of numpy arrays to simplify the plotting of data.

We will use matplotlib to add a function to our program that produces an image of the final state of the flow.

### Define the plotting function

Define a function in your main "cfd.py" file called, e.g., plot_flow(). This function should take two arguments: the first is "psi", the numpy array containing the stream function values, and the second is the name of the file in which to save the image. For example:
```
def plot_flow(psi, outfile):
```

This function should use the stream function values to compute the x and y components of the velocities and the magnitude of the velocity and store them in appropriate numpy arrays. Remember, you will need to extract the velocities of the internal part of the matrix and exclude the fixed boundary conditions.
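
The velocity calculation described above might look like the following sketch. It assumes central differences with unit grid spacing, and that the first array index runs along x and the second along y; check both assumptions against the supplied code. The helper name `velocities` is hypothetical.

```python
import numpy as np


def velocities(psi):
    """Return (ux, uy, vmag) for the interior points of psi."""
    # Central differences, assuming unit grid spacing, with the first
    # index along x and the second along y (an assumption).
    ux = 0.5 * (psi[1:-1, 2:] - psi[1:-1, :-2])
    uy = -0.5 * (psi[2:, 1:-1] - psi[:-2, 1:-1])
    vmag = np.sqrt(ux**2 + uy**2)
    return ux, uy, vmag
```

For example, a stream function that increases by one per grid point along the second index, such as `np.tile(np.arange(6.0), (6, 1))`, gives `ux` equal to 1 and `uy` equal to 0 at every interior point. The returned arrays are two elements smaller than `psi` in each dimension because the boundary points are excluded.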

Once you have these arrays you need to initialise the matplotlib module to plot to an image file rather than the console with the following lines:
```
import matplotlib
# Plot to image file without need for X server
matplotlib.use("Agg")
```

Next, you should import the required matplotlib functionality with:
```
import matplotlib.pyplot as plt
from matplotlib import cm
```

The first import provides the core plotting functionality, and the second provides the colour-mapping module.

The simplest plot to produce is a heatmap of the magnitude of the velocity. You can use the imshow function to do this in a single line. Assuming that the velocity magnitudes are in a numpy array called vmag:
```
plt.imshow(vmag)
```
To produce the image file we need to add one further line:
```
plt.savefig(outfile)
```
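
Putting these pieces together, a self-contained version of the heatmap step could look like the sketch below. It differs slightly from the lines above in that it creates an explicit figure object and adds a colour bar; the function name `save_heatmap` is hypothetical, and the function expects the velocity magnitudes to have been computed already.

```python
import matplotlib
matplotlib.use("Agg")  # plot to an image file; no X server needed
import matplotlib.pyplot as plt
import numpy as np


def save_heatmap(vmag, outfile):
    """Save a heatmap of the velocity magnitudes to outfile."""
    fig, ax = plt.subplots()
    image = ax.imshow(vmag)
    fig.colorbar(image, ax=ax)  # show the scale of the magnitudes
    fig.savefig(outfile)
    plt.close(fig)  # release the figure's memory
```

It could be called as `save_heatmap(vmag, "flow.png")`. Closing the figure after saving avoids accumulating open figures if the function is called many times in one run.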