

## __A Generative Model Predicting Steady Numerical Simulations via Weakly-supervised Learning__

## Abstract

Applying machine learning to computational fluid dynamics (CFD) is a thriving approach to simulation problems. Traditional CFD is extremely time-consuming because the simulation process is mostly iterative and mesh-based, and it relies on various numerical techniques or high-performance computing tools to overcome this difficulty. Machine learning methods are usually data-driven and expedite the simulation process by building a meshless, data-driven toolbox. However, the simulation results needed as training labels are not always available and can be equally expensive to produce. Weakly-supervised learning, as an alternative, has shown potential for solving the Laplace equation. Here we extend its application to more complicated computational scenarios and show that a weakly-supervised model can be trained to solve the Navier-Stokes equations and can be applied to irregular domains by extending the dimension of the data representation. The results show high accuracy once a proper initialization is given as input. We expect that a similar model can be generalized to solve, or to speed up the solution of, fluid-structure interaction problems with minimal information.



# Introduction
Complicated fluid problems are common in many natural processes. A typical fluid problem is governed by a highly nonlinear system of partial differential equations (PDEs), the Navier-Stokes (N-S) equations. Numerical simulations of fluid dynamics problems primarily rely on solving the PDEs in a form discretized in both space and time, using the finite difference (FD), finite volume (FV), lattice Boltzmann, or finite element (FE) methods. However, the cost of such simulations scales with the mesh density, so they are computationally expensive in both time and memory for complicated fluid-structure interaction problems that involve complex geometries. Even before the computation, preprocessing such as mesh design is cumbersome. Although modern commercial software relieves the burden of these preprocessing tasks, background knowledge in numerical simulation and expertise in using such tools are usually necessary for both academic and industrial applications.

Machine learning (ML), and especially deep learning (DL), has been a great success in many areas with the help of large labeled datasets. Supervised ML tasks assemble labeled data and then fit an architecture to predict the expected results. Recently, DL has shown new promise for numerical simulation owing to its capability of handling strong nonlinearity and high dimensionality [*]. However, just as in a typical machine learning task, large labeled datasets must be obtained or created to achieve high performance. State-of-the-art deep learning architectures depend on large amounts of specially labeled training data and are hard to use when the data become sparse.

In typical ML problems, though ML has been successful in solving them, the mechanism behind the system is complicated, high-dimensional, and usually unknown before the model is fitted to the labeled data. In traditional numerical simulation problems, by contrast, the governing equations are known explicitly but are difficult to solve correctly and efficiently. The known governing equations can be used to constrain the learning and so compensate for insufficient data. Recently, several groups have established such connections between the governing equations [RASSIs] and the simulation results via physics-informed loss functions, building physics-informed neural networks (PINNs) [PINN]. The loss functions are built from the residuals of each PDE, evaluated at the nodal points that define the computational domain. However, in these works the network is expected to act as a data-driven solver: it solves general PDEs not through a traditional CFD process but through a deep, dense artificial neural network (ANN). When the same form of PDE must be solved for multiple parameter values, such trained networks can predict solutions for parameters within a certain range, but they must be re-trained when other geometries, boundary conditions, or initial conditions are applied, which defeats the goal of saving computational power.

Recently, a weakly-supervised learning paradigm for solving the heat equation has shown the potential of using a similar loss-function construction to generate solutions [Sharma]. Instead of a traditional ANN, a fully convolutional encoder-decoder network adapted from the U-Net architecture [Unet] was implemented to generate the solved domain. That work combines a physics-informed loss, which learns the physics, with a generative model as in [amir], which preserves the intrinsic relations between neighboring nodal points. Without access to any solved simulation data, the trained model can instantly predict the solution for different boundary conditions. However, that work only introduces the idea and uses the heat equation with Dirichlet boundary conditions as its single example problem. The present work extends this idea: we increase the complexity and dimension of the data representation to solve problems with other types of boundary conditions, problems on irregular geometric domains, and a cavity flow problem governed by the N-S equations.








## Install and Import Packages (Code)
Install and import the PyTorch packages.

In [0]:
!pip install -U scikit_image scikit_learn scipy torch
!pip install pillow==4.2.1

In [0]:
import torch
import argparse
import os
import time


In [5]:
# Set to True when running on Google Colaboratory (Colab)
collaboratory = True

if collaboratory:
    from google.colab import drive
    drive.mount('/content/drive')
else:
    print('Running on a local system; if running on Colaboratory, set collaboratory = True above')

Mounted at /content/drive


In [6]:
cd drive/My\ Drive

/content/drive/My Drive


In [14]:
import os
# Clone the repository only if it is not already present
if not os.path.exists("./generate_CFD"):
    !git clone https://github.com/shenw33/generate_CFD.git

Cloning into 'generate_CFD'...
remote: Enumerating objects: 21, done.
remote: Counting objects: 100% (21/21), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 21 (delta 5), reused 15 (delta 3), pack-reused 0
Unpacking objects: 100% (21/21), done.


In [15]:
cd generate_CFD

/content/drive/My Drive/generate_CFD/generate_CFD


In [16]:
!git pull

Already up to date.


In [18]:
import UNet2 

ModuleNotFoundError: ignored

# Results
The physics-informed loss functions are based on the residuals of the PDEs. For example, to solve the 2D steady-state heat equation using finite differences, we need to discretize $\Delta T = 0$. On an evenly spaced 2D grid, the discretized form of $\Delta T=0$ at the node $(i,j)$ is:
$T_{i,j} = (T_{i+1,j} +T_{i-1,j} +T_{i,j+1} +T_{i,j-1}) / 4$. This nodal relation is solved iteratively by applying the rule at each node (point in the grid) until convergence. If we define a physics-informed kernel, as shown in figure [], the physics-informed loss should be:
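
As a rough illustration of how a residual-based loss of this kind can be evaluated, the sketch below uses NumPy rather than the notebook's PyTorch setup. The kernel values, the mean-squared norm, and the names `LAPLACE_KERNEL`, `apply_kernel`, `physics_loss`, and `jacobi_solve` are assumptions made for this example; they are not taken from the figure or from the `generate_CFD` repository.

```python
import numpy as np

# Assumed kernel: residual form of the nodal rule,
# (T[i+1,j] + T[i-1,j] + T[i,j+1] + T[i,j-1]) / 4 - T[i,j]
LAPLACE_KERNEL = np.array([[0.00, 0.25, 0.00],
                           [0.25, -1.0, 0.25],
                           [0.00, 0.25, 0.00]])


def apply_kernel(T, kernel):
    """3x3 cross-correlation without padding, evaluated at interior nodes only."""
    n, m = T.shape
    out = np.zeros((n - 2, m - 2))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * T[di:di + n - 2, dj:dj + m - 2]
    return out


def physics_loss(T):
    """Mean squared PDE residual (one possible choice of norm)."""
    return float(np.mean(apply_kernel(T, LAPLACE_KERNEL) ** 2))


def jacobi_solve(T, n_iter=5000):
    """Reference solver: apply the nodal averaging rule repeatedly (Jacobi iteration)."""
    T = T.copy()
    for _ in range(n_iter):
        # The right-hand side is fully evaluated before assignment,
        # so every node is updated from the previous iterate.
        T[1:-1, 1:-1] = 0.25 * (T[2:, 1:-1] + T[:-2, 1:-1]
                                + T[1:-1, 2:] + T[1:-1, :-2])
    return T


# Illustrative Dirichlet problem: one edge held at 1, the others at 0
T0 = np.zeros((16, 16))
T0[0, :] = 1.0
print("loss of the initial field:  ", physics_loss(T0))
print("loss after Jacobi iteration:", physics_loss(jacobi_solve(T0)))
```

The loss is clearly nonzero for the unsolved field and drops to roughly zero for the converged one, which is the property a network trained against such a loss would exploit. In a convolutional framework, the same kernel could be supplied as a fixed, non-trainable convolution filter.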
 
## Laplace equation with Neumann boundary conditions
Let the grid spacing be $h$, and suppose a Neumann boundary condition is imposed on the left boundary, $\frac{\partial T}{\partial x}(0,y_j)=a(y_{j})$. Introducing fictitious nodes $T_{-1,j}$ outside the domain, a central difference gives:
<br>
 
$$
\frac{\partial T}{\partial x}\left(0, y_{j}\right) \approx\frac{T_{1, j}-T_{-1, j}}{2 h}=a(y_{j})=a_{j}
$$
So we can write $T_{-1, j}=T_{1, j}-2 h a_{j}$. For the nodes at $i=0$, the interior constraint still holds:
<br>
$$
T_{-1, j}-4 T_{0, j}+T_{1, j}+T_{0, j-1}+T_{0, j+1}=0
$$
Substituting the expression for the fictitious nodes gives
$$
-4 T_{0, j}+2 T_{1, j}+T_{0, j-1}+T_{0, j+1}=2 h a_{j}
$$
<br>
 
This equation is the special condition for the boundary nodes, and it is similar to the equation governing the internal nodes. For the boundary nodes $T_{0,j}$, if we again take $h=1$, the extra terms of the loss function can be written by adding padding layers that carry the boundary condition $a_j$ and defining an extra kernel that represents the equation above. The new kernel, compared with the regular kernel applied to the internal nodes, can be seen in figure [].
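
The boundary relation derived above can be checked numerically with a short NumPy sketch. It assumes the field is stored as `T[i, j]` with `i` along $x$ and `j` along $y$, and that corner nodes are left out. The function names are hypothetical and are not taken from the `generate_CFD` repository.

```python
import numpy as np


def neumann_left_residual(T, a, h=1.0):
    """Residual of -4 T[0,j] + 2 T[1,j] + T[0,j-1] + T[0,j+1] = 2 h a[j]
    at the left-boundary nodes i = 0 (corner nodes excluded)."""
    lhs = -4.0 * T[0, 1:-1] + 2.0 * T[1, 1:-1] + T[0, :-2] + T[0, 2:]
    return lhs - 2.0 * h * a[1:-1]


def pad_with_fictitious_nodes(T, a, h=1.0):
    """Prepend the fictitious row T[-1, j] = T[1, j] - 2 h a[j], so that the
    regular 5-point stencil can be evaluated at i = 0."""
    ghost = T[1, :] - 2.0 * h * a
    return np.vstack([ghost, T])


def regular_stencil_at_boundary(P):
    """Regular 5-point stencil evaluated on row 1 of the padded field,
    which corresponds to i = 0 of the original field."""
    return P[0, 1:-1] - 4.0 * P[1, 1:-1] + P[2, 1:-1] + P[1, :-2] + P[1, 2:]


# Illustrative check: T = 3 x has dT/dx = 3 everywhere, so a_j = 3
T = 3.0 * np.arange(6.0)[:, None] * np.ones((1, 5))
a = np.full(5, 3.0)
print(neumann_left_residual(T, a))  # residual should vanish for this field
print(regular_stencil_at_boundary(pad_with_fictitious_nodes(T, a)))
```

Both routes give the same residual: padding the field with the fictitious row and applying the regular stencil is equivalent to applying the modified boundary kernel directly, which is why a padding layer carrying $a_j$ plus one extra kernel suffices.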
 
## Laplace equations with a hollow square inside the computational domain


## Navier-Stokes equations
 

