Transport phenomena is the study of the transfer of momentum, energy, and mass. These processes are often coupled, as in boiling water: bubbles of air trapped in the water nucleate because the elevated temperature reduces the solubility of gases, and the density difference between bubble and water then causes the bubbles to rise. These processes are further mediated by convection currents caused by temperature gradients in the boiling water.

In this example, we will look at a far simpler case of energy transport: the change in temperature of a heat source placed in an infinitely large, cooler surrounding region. We will assume periodic boundary conditions to work around the finite system sizes we can compute, and assume that the temperature in the source decreases over the same period.  (INSERT IMAGE HERE)

# Theory

For our problem, we will base our grid on rectilinear (Cartesian) coordinates. First, we will review the relevant transport equations. A more detailed explanation of each expression and its derivation can be found in BSLK. We begin by writing a general energy transport equation through a control volume.

$$ change = accumulation + removal + generation + destruction $$

In our closed system, there is no generation, destruction, or accumulation of energy, meaning that the change in energy over time can be written as

$$ change = removal $$

If we assume that the energy added as heat is lost only through conduction, the removal of energy can be approximated using Fourier's law, defined as

$$ q = -k \nabla T $$

where $k (\frac{W}{mK})$ is the thermal conductivity of the matrix, $\nabla T (\frac{K}{m})$ is the temperature gradient between the heated area and its surroundings, and $q (\frac{W}{m^2})$ is the heat flux through a surface. This is related to the energy change at each grid point over time as

$$ \frac{d E}{dt} = -\nabla \cdot q $$

where $E (\frac{J}{m^3})$ is the energy density of a specific grid point and $t (s)$ is the time. The minus sign reflects that a net outward flux lowers the local energy. Substituting Fourier's law, and taking $k$ to be uniform in space, leads to the expression

$$ \frac{d E}{d t} = k \nabla^2 T $$

The energy density of the heat source can be expressed through its heat capacity as $E = \rho c_p (T - T_{ref})$. If we make the further approximation that the density $\rho (\frac{kg}{m^3})$ and specific heat capacity $c_p (\frac{J}{kg K})$ stay constant with temperature, we can rearrange the above expression to obtain the following PDE written exclusively as a function of temperature,

$$ \frac{dT}{dt} = \frac{k}{\rho c_p} \nabla^2 T$$

Oftentimes, the factor $\frac{k}{\rho c_p}$ is condensed into the term $\alpha (\frac{m^2}{s})$, the thermal diffusivity of the system, which will be used as an input parameter.
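To give a feel for the magnitude of $\alpha$, here is a minimal sketch that evaluates $\frac{k}{\rho c_p}$. The helper name `thermal_diffusivity` is made up for this illustration, and the property values are rough room-temperature figures for liquid water, not values used anywhere else in this notebook.

```python
def thermal_diffusivity(k, rho, c_p):
    """Return alpha = k / (rho * c_p) in m^2/s."""
    return k / (rho * c_p)


# Approximate, illustrative values for liquid water near room temperature
k_water = 0.6       # thermal conductivity, W / (m K)
rho_water = 1000.0  # density, kg / m^3
cp_water = 4180.0   # specific heat capacity, J / (kg K)

alpha_water = thermal_diffusivity(k_water, rho_water, cp_water)
print(f"alpha = {alpha_water:.2e} m^2/s")  # on the order of 1e-7 m^2/s
```

The simulations below use a non-dimensional grid ($\Delta x = 1/L$), so the $\alpha$ values set there are input parameters rather than physical water properties.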

Now that we have laid out the basic theory, we turn to how we can convert the math into something a computer can solve. There are derivatives in time and space that we will need to approximate. I will stick to the simplest schemes here, as the focus is on illustrating how the code differs between implementations in different languages. However, the literature on both is vast (SOURCES PROVIDED).

## Finite difference

The simplest way to discretize a differential equation in space is the finite difference technique. The core idea of this scheme is to convert the continuous differential equations above into discrete sums and differences that we can solve. Commonly, this is done using a central difference scheme, shown below for the first derivative $f^1_x$ and second derivative $f^2_x$ of a function $f(x,y,t)$

\begin{align}
f^1_x &= \frac{f(x + \Delta x, y, t) - f(x - \Delta x, y, t)}{2\Delta x} \\
f^2_x &= \frac{f(x + \Delta x, y, t) - 2f(x,y,t) + f(x - \Delta x, y, t)}{\Delta x^2} 
\end{align}

where $\Delta x$ is the spacing between grid points. Central differences produce second-order error, compared with forward or backward differences, which produce first-order error.
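The second-order behaviour can be checked numerically. The sketch below is an illustration only: it applies the second-derivative stencil to $\sin(x)$ on a periodic grid, using `np.roll` to reach the neighbouring points, and compares the error at two resolutions. The function names are hypothetical and are not part of `analysis_src`.

```python
import numpy as np


def second_derivative_periodic(f, dx):
    """Central-difference second derivative on a periodic 1D grid."""
    # np.roll(f, -1)[i] is f[i+1]; np.roll(f, 1)[i] is f[i-1]
    return (np.roll(f, -1) - 2.0 * f + np.roll(f, 1)) / dx**2


def max_error(n):
    """Largest error of the stencil for f(x) = sin(x) using n grid points."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    approx = second_derivative_periodic(np.sin(x), dx)
    exact = -np.sin(x)
    return np.max(np.abs(approx - exact))


err_coarse = max_error(64)
err_fine = max_error(128)

# Halving dx should reduce the error by roughly a factor of four
ratio = err_coarse / err_fine
print(f"error ratio when halving dx: {ratio:.3f}")
```

A ratio close to 4 when $\Delta x$ is halved is what a second-order scheme is expected to give.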

## Forward Euler

Next, I look at how we can integrate over time. Multiple time integration schemes exist, and they are split into explicit and implicit schemes. Explicit schemes include Forward Euler and the explicit Runge-Kutta time integrators, while implicit schemes include the Backward Euler and Crank-Nicolson schemes. For now, we will stick to the Forward Euler technique.

Forward Euler solves an initial value problem by iterating a value of a variable through time using the initial value and the change in the variable through time. Notationally, this looks like

\begin{align}
g(x, y, t_0) &= b \\
\frac{dg}{dt} &= f(x,y,t) \\
g(x, y , t + \Delta t) &= g(x, y, t) + \Delta t \frac{dg}{dt}
\end{align}

where $g(x, y, t)$ is the value of a function at time $t$ and location $(x,y)$, and $\frac{dg}{dt}$ is the derivative of $g$ with respect to time $t$, which can be any function $f(x,y,t)$. The value of $g(x, y, t + \Delta t)$ is the sum of $g(x, y, t)$ and the timestep $\Delta t$ multiplied by the derivative $\frac{dg}{dt}$ evaluated at time $t$.
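As a minimal sketch of this update rule, the snippet below integrates the scalar test problem $\frac{dg}{dt} = -g$ with $g(0) = 1$, whose exact solution is $e^{-t}$. The function `forward_euler` and the test problem are assumptions made for illustration. Here the right-hand side depends on $g$ itself, as it does in the heat equation, rather than only on $(x, y, t)$.

```python
import math


def forward_euler(rhs, g0, t0, dt, n_steps):
    """Advance g from t0 by n_steps using g_new = g + dt * rhs(g, t)."""
    g, t = g0, t0
    for _ in range(n_steps):
        g = g + dt * rhs(g, t)
        t = t + dt
    return g


def decay(g, t):
    """Right-hand side of dg/dt = -g."""
    return -g


g_numeric = forward_euler(decay, g0=1.0, t0=0.0, dt=0.001, n_steps=1000)
g_exact = math.exp(-1.0)

print(f"numeric: {g_numeric:.5f}, exact: {g_exact:.5f}")
```

The remaining gap between the two values shrinks roughly in proportion to $\Delta t$, since Forward Euler is first-order accurate in time.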

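The Forward Euler technique can be numerically unstable. Care must therefore be taken when selecting the time step and grid spacing so that they satisfy the CFL condition, which for advection is defined as $C = \frac{u \Delta t}{\Delta x} < 1$, where $u$ is the advection velocity. As we do not have advection in this case, we instead use the analogous condition defined from the thermal diffusivity, $C = \frac{\alpha \Delta t}{\Delta x^2}$. For Forward Euler combined with the central difference above, stability requires $C \le \frac{1}{2}$ in 1D and $C \le \frac{1}{4}$ in 2D when $\Delta x = \Delta y$.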
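The effect of this parameter can be seen in a small experiment. The sketch below is illustrative and not the notebook's implementation: it diffuses a single hot cell on a periodic 1D grid, once with $C = 0.4$ and once with $C = 0.6$. For this explicit scheme in 1D the stability limit is $C \le \frac{1}{2}$, so the second run is expected to blow up. The helper names are hypothetical.

```python
import numpy as np


def diffusion_number(alpha, dt, dx):
    """Return C = alpha * dt / dx**2."""
    return alpha * dt / dx**2


def peak_after_run(C, n=64, steps=200):
    """Diffuse a single hot cell for `steps` steps and return max |T|."""
    T = np.zeros(n)
    T[n // 2] = 1.0
    for _ in range(steps):
        # Forward Euler in time, central difference in space, periodic grid
        T = T + C * (np.roll(T, -1) - 2.0 * T + np.roll(T, 1))
    return np.max(np.abs(T))


stable_peak = peak_after_run(0.4)    # stays bounded by the initial maximum
unstable_peak = peak_after_run(0.6)  # grows without bound

# C for the 1D parameters used later in this notebook
C_1d = diffusion_number(alpha=0.002, dt=0.0002, dx=1 / 256)
print(stable_peak, unstable_peak, C_1d)
```

In the unstable run the shortest-wavelength modes are amplified at every step, which shows up as a growing sawtooth pattern in the temperature field.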

## Python implementation

Using the rules established above, we move to generating a prototype in Python. We assume the heat source is a circle placed at the middle of the system, with a radius that can be specified, and that the system has periodic boundary conditions.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import HTML
import glob

import analysis_src.visualization as visualize
import analysis_src.forward_euler_1D as FD_1D
import analysis_src.forward_euler_2D as FD_2D

figheight = 5

We first implement the system in 1D to demonstrate the conceptual structure.
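The helpers in `analysis_src` are not shown in this notebook, so the following is only a minimal sketch of that conceptual structure, assuming one halo cell on each side and periodic boundaries. The names `apply_periodic_halo` and `fd_timestep_1d` are hypothetical and are not the `FD_1D` API.

```python
import numpy as np


def apply_periodic_halo(rod):
    """Copy the opposite interior cell into each halo cell (halo width 1)."""
    rod[0] = rod[-2]   # left halo takes the last interior cell
    rod[-1] = rod[1]   # right halo takes the first interior cell
    return rod


def fd_timestep_1d(rod, dx, dt, alpha):
    """One Forward Euler step on the interior cells; halos are not updated."""
    new = rod.copy()
    laplacian = (rod[2:] - 2.0 * rod[1:-1] + rod[:-2]) / dx**2
    new[1:-1] = rod[1:-1] + alpha * dt * laplacian
    return new


L = 32
dx, dt, alpha = 1.0 / L, 1e-4, 0.002

# Cold rod with a small hot region in the middle, plus two halo cells
rod = np.full(L + 2, 273.0)
rod[L // 2 - 2 : L // 2 + 3] = 280.0

E_start = rod[1:-1].sum()
for _ in range(500):
    rod = apply_periodic_halo(rod)
    rod = fd_timestep_1d(rod, dx, dt, alpha)
E_end = rod[1:-1].sum()

print(f"relative energy drift: {abs(E_end - E_start) / E_start:.2e}")
```

Each step has the same two phases as the loop in the next section: refresh the halo cells from the boundary condition, then update the interior from the old values.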

### 1D version

In [None]:
## EDIT THIS ##
L = 256
dx = 1/L
hot_point = 64
halo = 1

dt = 0.0002
timesteps = 10000
dump_freq = 100

T_cold = 273
T_hot = 280
alpha = 0.002

bc_type = FD_1D.BoundaryType.PERIODIC

T_l = 275 # only used if the boundary type is set to constant_value
T_r = 275 # only used if the boundary type is set to constant_value
## EDIT THIS ##

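# diffusion number C = alpha*dt/dx**2; Forward Euler in 1D is stable for C <= 0.5
C = alpha/(dx**2/dt)
if C > 0.5:
    raise ValueError(f"Stability condition not fulfilled: C = {C:.3g} > 0.5")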

In [None]:
# creating data structures
rod_old = FD_1D.make_system(L, hot_point, T_hot, T_cold)

# inserting halo regions
rod_old = np.insert(rod_old, 0, 0)
rod_old = np.insert(rod_old, L+1, 0)

rod_new = rod_old.copy()
total_E = np.empty(timesteps + 1)

output_temperatures = np.empty((timesteps//dump_freq+1, L))
output_length = np.empty((timesteps//dump_freq+1, L))

for t in range(0, timesteps + 1):
    rod_old = FD_1D.apply_boundary(rod_old, bc_type, T_left=T_l, T_right=T_r, halo=halo)

    if t%dump_freq == 0:
        idx = t//dump_freq
        output_temperatures[idx] = rod_old[1:L+1]
        output_length[idx] = np.arange(0, L)
    
    rod_new = FD_1D.FD_timestep(rod_old, dx, dt, alpha, halo)

    total_E[t] = np.sum(rod_new[1:L+1])  # sum over all L interior cells

    rod_old = rod_new.copy()

In [None]:
ar = 1
fig, axs = plt.subplots(1, 2, figsize = (2*figheight*ar, figheight))

ax = axs[0]
ax.plot(total_E/total_E[0]*100)

ax.set_ylabel("Total energy (% of initial)")
ax.set_xlabel("t"+r"$\Delta t$")

ax = axs[1]
ax.plot(rod_new[1:L+1])  # interior cells only, halo cells excluded
ax.set_ylim([T_cold, T_hot])


ax.set_ylabel("Temperature (K)")
ax.set_xlabel("Rod length"+r"$\Delta x$")

fig.tight_layout()

The left plot shows energy conservation, while the right plot shows the temperature profile at the final timestep. Now that we have shown a successful implementation in 1D, let's up the difficulty somewhat and move to 2D.

### 2D version

In [None]:
## CHANGE THESE ##
nx = 256
ny = 256
# keep square for now (nx == ny)
halo = 1
Lx = nx//4
Ly = ny//4
bc = FD_2D.BoundaryType.PERIODIC

dt = 0.0001
timesteps = 10000
dump_freq = 100

alpha = 0.01
T_h = 373 # temperature of hot region
T_c = 273 # temperature of cold region

T_l = 275 # temperature of left wall
T_r = 275 # temperature of right wall
T_t = 275 # temperature of top wall
T_b = 275 # temperature of bottom wall
## CHANGE THESE ##

dx = 1/nx
dy = 1/ny
C = alpha/(min(dx, dy)**2/dt)
if C > 0.1:
    raise RuntimeWarning("CFL condition not fulfilled")

In [None]:
grid_old = FD_2D.make_system_square(nx+2*halo, ny+2*halo, Lx, Ly, T_h, T_c)
grid_new = np.empty((nx+2*halo, ny+2*halo))

temperature_fields_time = np.empty((timesteps//dump_freq+1, nx, ny))
times = np.empty(timesteps//dump_freq+1)
total_E = np.empty(timesteps//dump_freq+1)

for time in range(0, timesteps+1):
    grid_old = FD_2D.apply_boundary(grid_old, bc, T_left=T_l, T_right=T_r, T_top = T_t, T_bottom = T_b, halo=halo)

    if time%dump_freq == 0:
        temperature_fields_time[time//dump_freq] = grid_old[halo:nx+halo, halo:ny+halo]
        times[time//dump_freq] = time
        total_E[time//dump_freq] = np.sum(grid_old[halo:nx+halo, halo:ny+halo])  # same interior slice as the stored field
    
    grid_new = FD_2D.FD_timestep(grid_old, dx, dy, dt, alpha, halo)

    grid_old = grid_new.copy()

In [None]:
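ar = 1.25
fig, ax = plt.subplots(1, 1, figsize = (ar*figheight, figheight))
# percentage of the initial total energy at each dump
ax.plot(times, total_E/total_E[0]*100, "rx")
ax.set_ylabel("Total energy (% of initial)")
ax.set_xlabel(r"$t \Delta t$")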

Energy is conserved
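Conservation is expected here because, with periodic boundaries, every stencil contribution that leaves one cell enters a neighbouring cell, so the terms cancel when summed over the grid. The sketch below checks this on a small grid. It is an illustration under stated assumptions: it uses `np.roll` for periodicity instead of the halo cells used by `FD_2D`, and `laplacian_periodic` is a hypothetical helper.

```python
import numpy as np


def laplacian_periodic(T, dx, dy):
    """Five-point central-difference Laplacian on a periodic 2D grid."""
    d2x = (np.roll(T, -1, axis=0) - 2.0 * T + np.roll(T, 1, axis=0)) / dx**2
    d2y = (np.roll(T, -1, axis=1) - 2.0 * T + np.roll(T, 1, axis=1)) / dy**2
    return d2x + d2y


n = 32
dx = dy = 1.0 / n
alpha, dt = 0.01, 1e-3  # gives C = alpha*dt/dx**2 of about 0.01

# Cold background with a hot square in the middle
T = np.full((n, n), 273.0)
T[12:20, 12:20] = 373.0

E_start = T.sum()
for _ in range(200):
    T = T + alpha * dt * laplacian_periodic(T, dx, dy)

drift = abs(T.sum() - E_start) / E_start
print(f"relative energy drift: {drift:.2e}")
```

Any drift that remains comes from floating-point round-off rather than from the scheme itself.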

In [None]:
ani = visualize.animate_colormap(temperature_fields_time, times = times)
HTML(ani.to_jshtml())

Animation of temperature over time

# Single core implementation in C++

Now that we have the base in Python, let's implement the FD technique in C++. This section will use various `std` libraries in C++ but will omit MPI for the moment. We will be using pass by reference; pass by value would align more closely with the Python code, but it is slower.

TODO

1. [] input file with parsing
2. [] different BC's
3. [x] speed improvements through pass by reference

In [None]:
path = "src/single_core"
data_paths = sorted(glob.glob(f"{path}/T*.txt"))
total_E = np.empty(len(data_paths))
times = np.empty(len(data_paths), dtype = int)
raw_data = []

for i, path in enumerate(data_paths):
    time = int(path.split("/")[-1].split("_")[-1].split(".")[0])
    data = visualize.read_data(path)

    times[i] = time
    raw_data.append(data)
    total_E[i] = np.sum(data)

raw_data = np.array(raw_data)

In [None]:
ar = 1.25
fig, ax = plt.subplots(1, 1, figsize = (ar*figheight, figheight))

ax.plot(times, total_E/total_E[0], 'rx')
ax.set_ylabel(r"$\frac{E(t)}{E_0}$")
ax.set_xlabel(r"$t \Delta t$")

Energy conservation

In [None]:
ani = visualize.animate_colormap(raw_data, times = times)
HTML(ani.to_jshtml())

Simulation runs as expected

# Multicore (CPU) implementation in C++

There are two main ways to run computations on multiple cores: OpenMP and MPI. They differ in how memory is allocated between processors, which also affects the scalability and performance of each depending on system size and allocated resources. OpenMP uses shared memory, which removes the need for message passing, but it is only supported on systems that share memory. MPI supports parallel computation on systems with distributed or shared memory, as each process executes independently of the others.

For personal computers, OpenMP offers a simple way to add parallelism without the extra preparation and planning needed to write and execute MPI-based code, at the expense of less flexibility when moving to HPC compute clusters such as the DOE's Frontier, Aurora, or Perlmutter.
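To make the distributed-memory idea concrete, the sketch below emulates it in serial Python. No MPI library is used: each "rank" is just a NumPy array holding its slice of a periodic 1D rod plus one halo cell per side, and `exchange_halos` stands in for the messages a real MPI code would send between neighbours. All names here are hypothetical and chosen for illustration.

```python
import numpy as np


def step_with_halo(chunk, C):
    """Forward Euler update of a chunk's interior; chunk has one halo cell per side."""
    new = chunk.copy()
    new[1:-1] = chunk[1:-1] + C * (chunk[2:] - 2.0 * chunk[1:-1] + chunk[:-2])
    return new


def exchange_halos(chunks):
    """Fill each chunk's halos from its neighbours on a periodic ring of ranks."""
    n = len(chunks)
    for rank in range(n):
        left = chunks[(rank - 1) % n]
        right = chunks[(rank + 1) % n]
        chunks[rank][0] = left[-2]   # neighbour's last interior cell
        chunks[rank][-1] = right[1]  # neighbour's first interior cell


n_cells, n_ranks, C = 64, 4, 0.25
T = np.zeros(n_cells)
T[28:36] = 1.0

# Each emulated rank owns an equal slice, padded with two halo cells
chunks = [np.concatenate(([0.0], part, [0.0])) for part in np.split(T, n_ranks)]

reference = T.copy()  # single-domain solution for comparison
for _ in range(100):
    exchange_halos(chunks)
    chunks = [step_with_halo(c, C) for c in chunks]
    reference = reference + C * (
        np.roll(reference, -1) - 2.0 * reference + np.roll(reference, 1)
    )

gathered = np.concatenate([c[1:-1] for c in chunks])
print(np.allclose(gathered, reference))
```

With shared memory, as in OpenMP, the exchange step disappears because every thread can read its neighbours' cells directly. With distributed memory it becomes explicit communication, for example through `MPI_Sendrecv`.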

Please read the source of my knowledge on this topic, the fantastic textbook "Parallel Programming in C with MPI and OpenMP" by Michael J. Quinn. It may be slightly outdated given that it is 20+ years old, but it holds great information on the topic without needing to resort to AI.

## OpenMP

This section is based on chapter 17 in the textbook "Parallel Programming in C with MPI and OpenMP" by Michael J. Quinn.

In [None]:
path = "src/open_MP"
data_paths = sorted(glob.glob(f"{path}/T*.txt"))
total_E = np.empty(len(data_paths))
times = np.empty(len(data_paths), dtype = int)
raw_data = []

for i, path in enumerate(data_paths):
    time = int(path.split("/")[-1].split("_")[-1].split(".")[0])
    data = visualize.read_data(path)

    times[i] = time
    raw_data.append(data)
    total_E[i] = np.sum(data)

raw_data = np.array(raw_data)

In [None]:
ar = 1.25
fig, ax = plt.subplots(1, 1, figsize = (ar*figheight, figheight))

ax.plot(times, total_E/total_E[0], 'rx')
ax.set_ylabel(r"$\frac{E(t)}{E_0}$")
ax.set_xlabel(r"$t \Delta t$")

Energy is conserved

In [None]:
ani = visualize.animate_colormap(raw_data, times = times)
HTML(ani.to_jshtml())

## OpenMPI

In [None]:
path = "src/MPI"
data_paths = sorted(glob.glob(f"{path}/T*.txt"))
total_E = np.empty(len(data_paths))
times = np.empty(len(data_paths), dtype = int)
raw_data = []

for i, path in enumerate(data_paths):
    time = int(path.split("/")[-1].split("_")[-1].split(".")[0])
    data = visualize.read_data(path)

    times[i] = time
    raw_data.append(data)
    total_E[i] = np.sum(data)

raw_data = np.array(raw_data)

In [None]:
ar = 1.25
fig, ax = plt.subplots(1, 1, figsize = (ar*figheight, figheight))

ax.plot(times, total_E/total_E[0], 'rx')
ax.set_ylabel(r"$\frac{E(t)}{E_0}$")
ax.set_xlabel(r"$t \Delta t$")

In [None]:
ani = visualize.animate_colormap(raw_data, times = times)
HTML(ani.to_jshtml())