In [None]:
import numpy as np

# <center>`josiepy`</center>
## <center>A Python framework to simulate Partial Differential Equations focused on usability</center>
### <center>(without sacrificing performance)</center>

<div style="text-align:center;margin-top:1em">
    <i class="fa fa-user"></i> Ruben Di Battista <br /> 
    <i class="fa fa-twitter"></i> <a href="https://twitter.com/rdbisme">@rdbisme</a><br /><br />
</div>

# Background
----------------------
## M. Massot Research Team
### Mathematical modeling
- Mathematical modeling of two-phase injection in rocket engines
- Mathematical modeling of plasma flows for heliophysics and Hall thrusters

### Numerical Calculus for High Performance Computing (HPC)
- Development of numerical schemes and algorithms adapted to HPC



# `canoP`
## A C++ wrapper around `p4est`, a quadtree/octree library for parallel adaptive mesh refinement (AMR), to simulate PDEs with AMR

- C++ is notoriously hard to master
- `p4est` is an advanced library w/ features not immediately needed when exploring numerical schemes

# Motivations
--------------------

An easy-to-use playground, in Python, for implementing numerical schemes on 1D and 2D (and maybe 3D in the future) structured meshes (and maybe unstructured and non-conformal ones... dreaming is still free 😇)

# Design Choices
--------------------------




- Use Python to describe your case configuration, instead of a Domain Specific Language (DSL) like the one OpenFOAM uses, or another scripting language like `lua` in `canoP`.

- Modern Python (Python >= 3.7) features allow static checking of the code (through type hinting), which recovers some of the advantages of compiled languages

- The `numpy` library and its API allow you to write algorithms once and run them on different architectures and with different programming paradigms

   - CPU shared memory ([numpy](https://numpy.org) + OpenMP-enabled BLAS library)
   - CPU distributed memory ([Dask](https://dask.org))
   - Nvidia GPUs with CUDA ([cupy](https://cupy.chainer.org))
   - (Soon) AMD GPUs with HIP ([cupy](https://cupy.chainer.org))
   - ... FPGA? Anyone? It's a cool project too... 
   
   All of this at roughly the same computational speed as C and C++, or faster (on GPUs, for example)
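
The "write once" idea can be sketched by passing the array module as an argument. This is only an illustration: the function `normal_velocity` and its `xp` parameter are hypothetical names, not part of `josiepy`, and the sketch assumes the chosen module exposes a numpy-compatible `einsum`.

```python
from types import ModuleType

import numpy as np


def normal_velocity(xp: ModuleType, velocities, normal):
    """Project each row of `velocities` onto `normal` using the array module `xp`."""
    # Assumption: `xp` provides an `einsum` with the same signature as numpy's
    return xp.einsum("ij,j->i", velocities, normal)


velocities = np.array([[1.0, 2.0], [3.0, 4.0]])
normal = np.array([0.0, 1.0])

# With plain numpy the computation runs on the CPU
result = normal_velocity(np, velocities, normal)
print(result)
```

Running the same function on another backend would mean passing that library's module as `xp` and creating the arrays with it. The type hint on `xp` is the kind of annotation a static checker can verify.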


# Numpy for `josiepy` 101
## Vectorized operations

What you would generally do in a C++-like compiled code is: 

```
for (auto& cell : mesh.cells()) {
    apply_compute_kernel(cell);
}
```

What we generally do in `numpy` is operate on an array containing the values of **all** the cells at the same time.

Let's imagine we have the velocity vector stored in an array of size $N_\text{cells} \times \text{dim}$, where $\text{dim}$ is the dimensionality of the problem (e.g. 2 in 2D).

In [None]:
cell_values = np.random.random((30, 2))
U = cell_values[:, 0]
V = cell_values[:, 1]

Let's now imagine we have to compute the dot product between the velocity in each cell and a single 
normal vector

In [None]:
normal = np.array([1, 0])

What we could do is repeat the `normal` array $N_\text{cells}$ times and multiply it by `cell_values`...
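
This naive approach could be sketched as follows. The names `normals` and `naive_normal_velocities` are introduced here only for illustration.

```python
import numpy as np

cell_values = np.random.random((30, 2))
normal = np.array([1, 0])

# Repeat the normal once per cell: shape (30, 2), which costs an extra copy
normals = np.tile(normal, (cell_values.shape[0], 1))

# Multiply element by element, then sum over the components of each cell
naive_normal_velocities = (cell_values * normals).sum(axis=1)
print(naive_normal_velocities.shape)
```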

... but there's a better way!
The operation we want to do is

$$
    \left\{\mathbf{U}\right\} \cdot \mathbf{n} = U_{ij}n_j
$$

In [None]:
normal_velocities = np.einsum("ij,j->i", cell_values, normal)
print(normal_velocities.shape)

np.array_equal(U, normal_velocities)

`numpy.einsum` is an implementation of the Einstein summation convention on steroids. You specify the indices of the input arrays and the indices of the output you want, and it works out the required operations automatically.

Basically:
$$
U_{ij} n_j \rightarrow u_i
$$
```
np.einsum("ij,j->i", cell_values, normal)
```
* Indices repeated across the inputs and absent from the output (after the `->`) are summed over. Here `j` is summed, so each entry of the result is `cell_values[i][0] * normal[0] + cell_values[i][1] * normal[1]`
    
* Indices that also appear in the output are not summed: here the result keeps one entry per value of `i`
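
To make the index rules concrete, here is a minimal sketch that spells out the same contraction with explicit Python loops and checks it against `np.einsum`. The variable names `looped` and `vectorized` are illustrative only.

```python
import numpy as np

cell_values = np.random.random((30, 2))
normal = np.array([1, 0])

n_cells, dim = cell_values.shape
looped = np.zeros(n_cells)

for i in range(n_cells):  # `i` appears in the output: one entry per cell
    for j in range(dim):  # `j` is absent from the output: it is summed over
        looped[i] += cell_values[i, j] * normal[j]

vectorized = np.einsum("ij,j->i", cell_values, normal)
assert np.allclose(looped, vectorized)
```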

`np.einsum` avoids intermediate memory copies and, if we provide the `optimize=True` keyword argument,

In [None]:
np.einsum("ij,j->i", cell_values, normal, optimize=True)

it exploits the parallelization of operations like `np.dot` and `np.tensordot`, which are natively parallelized with OpenMP if the BLAS library that `numpy` is built against allows it

(hint: the standard pre-compiled wheel you get from `pip install numpy` does not)
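
If you want to check what your own installation does, a minimal sketch could look like this. `np.show_config` prints the libraries `numpy` was built against, and `np.einsum_path` reports the contraction order `einsum` would choose. Whether the BLAS calls actually run in parallel depends on your build, so no timing is claimed here.

```python
import numpy as np

cell_values = np.random.random((30, 2))
normal = np.array([1, 0])

# Print the build configuration, including the BLAS/LAPACK libraries in use
np.show_config()

# Ask einsum which contraction order it would use; `report` is a readable summary
path, report = np.einsum_path("ij,j->i", cell_values, normal, optimize="optimal")
print(report)

# The optimized call must give the same numbers as the plain one
plain = np.einsum("ij,j->i", cell_values, normal)
optimized = np.einsum("ij,j->i", cell_values, normal, optimize=path)
assert np.allclose(plain, optimized)
```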

## Vectorized conditionals

Sometimes, for example to initialize the domain at the beginning of a simulation, or for an upwind scheme, you need to apply some conditionals of the type:

```
for (auto& cell : mesh.cells()) {
    auto [x, y] = cell.centroid();
    
    if (x > 0.5)
        right_state(cell);
    else
        left_state(cell);
}
```

In vectorized form, you would write:

In [None]:
values = np.zeros(30)
x = np.linspace(0, 1, np.size(values))
print(values)

In [None]:
values[np.where(x > 0.5)] = 1

print(values)
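
Two equivalent spellings of the same conditional are sketched below: indexing with a boolean mask directly, and the three-argument form of `np.where`, which picks between a "right" and a "left" value element by element. The constants `left_value` and `right_value` are hypothetical placeholders for real states.

```python
import numpy as np

x = np.linspace(0, 1, 30)

# Hypothetical constant states on each side of the discontinuity
left_value, right_value = 0.0, 1.0

# A boolean mask can index an array directly, without `np.where`
values = np.full(x.shape, left_value)
values[x > 0.5] = right_value

# The three-argument `np.where` selects between two alternatives element-wise
values_where = np.where(x > 0.5, right_value, left_value)

assert np.array_equal(values, values_where)
```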

# Basics of `josiepy`

## Problem statement

\begin{equation}
    \partial_t \mathbf{q} + 
    \nabla \cdot \left(\underline{\underline{\mathbf{F}}}(\mathbf{q}) 
        + \underline{\underline{\underline{\underline{D(\mathbf{q})}}}} \cdot \nabla \mathbf{q} \right) +
    \underline{\underline{\underline{\mathbf{B}(\mathbf{q})}}} \cdot \nabla\mathbf{q} = 
    \mathbf{s}(\mathbf{q})
\end{equation}


\begin{equation}
    \partial_t q_p + 
    \partial_{x_r} \left(F_{pr}(\mathbf{q}) + D_{pqrs}(\mathbf{q})\partial_{x_s}q_q\right) + 
    B_{pqr}(\mathbf{q})\partial_{x_r}q_q = 
    s_p(\mathbf{q})
\end{equation}
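
The index form maps almost literally onto `np.einsum`. As a minimal sketch, here is the non-conservative term $B_{pqr}\partial_{x_r}q_q$ evaluated for a single cell. The array shapes and the names `B`, `grad_q`, and `non_conservative` are assumptions made for this example, not the way `josiepy` stores its data.

```python
import numpy as np

rng = np.random.default_rng(0)

n_fields, dim = 4, 2  # illustrative sizes only

# Hypothetical data for a single cell: B_{pqr} and the gradient d q_q / d x_r
B = rng.random((n_fields, n_fields, dim))
grad_q = rng.random((n_fields, dim))

# `q` and `r` are repeated, hence summed; `p` is the free index of the result
non_conservative = np.einsum("pqr,qr->p", B, grad_q)

# Same contraction with explicit loops, as a check
check = np.zeros(n_fields)
for p in range(n_fields):
    for q in range(n_fields):
        for r in range(dim):
            check[p] += B[p, q, r] * grad_q[q, r]

assert np.allclose(non_conservative, check)
```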


## The `State` object

\begin{equation}
    \partial_t \mathbf{q} + 
    \nabla \cdot \left(\underline{\underline{\mathbf{F}}}(\mathbf{q}) 
        + \underline{\underline{\underline{\underline{D(\mathbf{q})}}}} \cdot \nabla \mathbf{q} \right) +
    \underline{\underline{\underline{\mathbf{B}(\mathbf{q})}}} \cdot \nabla\mathbf{q} = 
    \mathbf{s}(\mathbf{q})
\end{equation}


The `State` object represents the state vector $\mathbf{q}$

In [None]:
from josie.solver.state import State
from enum import IntEnum

In [None]:
class EulerFields(IntEnum):
    rho = 0
    rhoU = 1
    rhoV = 2
    rhoE = 3
    U = 4
    V = 5
    rhoe = 6
    p = 7
    c = 8

In [None]:
class EulerState(State):
    fields = EulerFields
    
rnd_state = np.random.random(len(EulerFields)).view(EulerState)

In [None]:
U1 = rnd_state[EulerFields.U]
U2 = rnd_state[4]
U3 = rnd_state[-5]
U4 = rnd_state[rnd_state.fields.U]

assert np.array_equal(U1, U2) and np.array_equal(U2, U3) and np.array_equal(U3, U4)
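
Why do all four spellings agree? `IntEnum` members are plain integers, and a `numpy` array subclass can carry the enum as a class attribute. The toy classes below are a hypothetical stand-in written for this explanation. They are not the actual implementation of `State`.

```python
from enum import IntEnum

import numpy as np


class ToyFields(IntEnum):
    rho = 0
    U = 1
    V = 2


class ToyState(np.ndarray):
    # Hypothetical stand-in for `State`: an ndarray subclass carrying its field enum
    fields = ToyFields


toy = np.array([1.0, 0.3, -0.2]).view(ToyState)

# IntEnum members are integers, so they index the array like plain ints
assert toy[ToyFields.U] == toy[1] == toy[toy.fields.U] == 0.3
print(type(toy).__name__, toy[ToyFields.U])
```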


`State` can be multidimensional! (e.g. a 2D 100x100 mesh, each cell containing an Euler state)

In [None]:
rnd_state = np.random.random((100, 100, len(EulerFields)))

In [None]:
U = rnd_state[..., EulerFields.U]
print(U.shape)

In [None]:
rnd_state = np.random.random((100, 100, 100, len(EulerFields)))

In [None]:
U = rnd_state[..., EulerFields.U]
print(U.shape)
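
The `...` (Ellipsis) is what makes the same line work in 2D and 3D: it expands to as many `:` as needed. A small sketch on plain arrays, where `U_index` simply mirrors the position of `U` in `EulerFields` and the mesh sizes are arbitrary:

```python
import numpy as np

n_fields = 9
U_index = 4  # position of `U` in `EulerFields`

state_2d = np.random.random((10, 10, n_fields))
state_3d = np.random.random((10, 10, 10, n_fields))

# `...` stands for all the leading axes, whatever their number
assert np.array_equal(state_2d[..., U_index], state_2d[:, :, U_index])
assert np.array_equal(state_3d[..., U_index], state_3d[:, :, :, U_index])

print(state_2d[..., U_index].shape, state_3d[..., U_index].shape)
```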

Let's compute the normal velocity again...

In [None]:
UV = rnd_state[..., EulerFields.U : EulerFields.V + 1]

print(UV.shape)


In [None]:
U_norm = np.einsum("...ij,j->...i", UV, normal)
print(U_norm.shape)

assert np.array_equal(U_norm, U)
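
For this particular contraction there is also a shorter spelling: since only the last axis is summed, `"...j,j->..."` or the `@` operator give the same result. The sketch below uses a smaller random array as an assumption-free stand-in for `UV`.

```python
import numpy as np

UV = np.random.random((20, 20, 20, 2))
normal = np.array([1, 0])

# Only the last axis `j` is contracted; all leading axes are kept
via_einsum = np.einsum("...j,j->...", UV, normal)
via_matmul = UV @ normal  # contracts the last axis of `UV` with `normal`

assert via_einsum.shape == via_matmul.shape == (20, 20, 20)
assert np.allclose(via_einsum, via_matmul)
```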