# Physics-Informed Neural Networks

In this exercise, you will implement two physics-informed neural networks (PINNs).

For both of the neural networks, you will only need to specify the network architecture, activation functions and the loss function. The physics is going to be encoded via the loss function.

The programming in this exercise is the easy part; following the equations and ideas is the difficult part.

### Forward Problem: Burgers' Equation

PINNs were first proposed in [Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations](https://arxiv.org/abs/1711.10561) and [Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations](https://arxiv.org/abs/1711.10566).


These papers suggest using neural networks to model physical phenomena and give an example PINN for Burgers' equation, which is simple to state but can be tricky to solve numerically. In the first question we are going to solve a forward problem (data-driven solution) with a PINN for Burgers' equation.

Recall the viscous Burgers' equation with viscosity coefficient $\nu$:
	$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}$$

for $x \in [-1,1]$ and $t \in [0,1]$ with initial and boundary conditions:
	$$ u(x,0)=-\sin(\pi x); \qquad u(1, t) = u(-1, t) = 0 $$


#### You have the following tasks

##### Network Architecture and Activation Functions

We want to solve for $u$, which is a function of $x$ and $t$. So we would need a neural network with two nodes in the input layer (one for $x$ and one for $t$) and one node in the output layer, which will predict the value of $u(x, t)$. The number of hidden layers and nodes as well as the choice of activation functions is up to you in this exercise.
		
Implement the neural network architecture in `__init__()` and `forward()` in the `Net` class in `Burgers.py`.
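As a rough starting point, a fully connected network with two inputs and one output could look like the sketch below. The hidden width, the depth and the `tanh` activation are assumptions chosen for illustration, not values prescribed by the exercise. The class name `SketchNet` and the `forward(x, t)` signature are hypothetical; the template's `Net` class may expect its inputs in a different form.

```python
import torch
import torch.nn as nn


class SketchNet(nn.Module):
    """Hypothetical fully connected network mapping (x, t) to u(x, t)."""

    def __init__(self, hidden_width=20, hidden_layers=4):
        super().__init__()
        layers = [nn.Linear(2, hidden_width), nn.Tanh()]  # input layer: x and t
        for _ in range(hidden_layers - 1):
            layers += [nn.Linear(hidden_width, hidden_width), nn.Tanh()]
        layers.append(nn.Linear(hidden_width, 1))  # output layer: u
        self.model = nn.Sequential(*layers)

    def forward(self, x, t):
        # Stack the two coordinate columns into a single (N, 2) input.
        return self.model(torch.cat([x, t], dim=1))


net = SketchNet()
x = torch.rand(8, 1) * 2 - 1  # 8 sample points with x in [-1, 1]
t = torch.rand(8, 1)          # and t in [0, 1]
u_pred = net(x, t)
print(u_pred.shape)
```

A smooth activation such as `tanh` is used here because the loss needs second derivatives of the output with respect to $x$; a piecewise linear activation such as ReLU would make that second derivative zero almost everywhere.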

The test for this task requires a saved model, so it makes sense to proceed to the next task(s) before passing it. Just make sure that you pass it eventually, once you have a saved model.

##### Loss Function

Now it's time to inform the neural network of the governing physics. We do this by providing the governing partial differential equation (PDE) in the loss function. Since we want to minimize the loss, we move the term on the right-hand side of Burgers' equation to the left-hand side and obtain a residual $F(x,t)$, which vanishes for the exact solution and whose magnitude we therefore minimize:

$$ F(x, t) = \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} - \nu \frac{\partial^2 u}{\partial x^2} = 0 $$

We set the viscosity coefficient $\nu = 0.01/\pi$. Since this is a regression problem, we minimize $F(x,t)$ using the mean squared error (MSE):

$$ MSE_{F} = \frac{1}{N_{F}} \sum_{j=1}^{N_F} | F(x_j^F, t_j^F) |^2 $$ 

where $N_F$ is the number of datapoints in the training dataset $D_F = \{{x_j^F, t_j^F}\}_{j=1}^{N_F}$.

You might be wondering how to calculate the partial derivatives needed to formulate $F(x_j^F, t_j^F)$. Recall that PyTorch builds the neural network's computation as a graph data structure. This lets PyTorch calculate the derivative of the output ($u$) with respect to an input ($x$ or $t$) by stepping backward through the graph (from the output towards the input layer) and applying the chain rule.
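The following minimal sketch shows this mechanism with `torch.autograd.grad`. Instead of a network it uses a known function, $u = \sin(\pi x)\, e^{-t}$, so the results can be checked by hand; this test function is an assumption for illustration only and has nothing to do with the solution of the exercise.

```python
import math
import torch

# Inputs must track gradients so PyTorch records the graph from (x, t) to u.
x = torch.linspace(-1.0, 1.0, 5).reshape(-1, 1).requires_grad_(True)
t = torch.full_like(x, 0.5).requires_grad_(True)

# Stand-in for the network output u(x, t).
u = torch.sin(math.pi * x) * torch.exp(-t)

ones = torch.ones_like(u)
# create_graph=True keeps the graph of each derivative so it can be differentiated again.
u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x), create_graph=True)[0]

# Analytic checks: u_t = -u and u_xx = -pi^2 * u for this test function.
print(torch.allclose(u_t, -u, atol=1e-6))
print(torch.allclose(u_xx, -(math.pi ** 2) * u, atol=1e-5))
```

Because each output row depends only on its own input row, passing a tensor of ones as `grad_outputs` yields the per-point derivatives in one call. In a PINN, `u` would be the network's prediction rather than a closed-form expression.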

Since we're solving a forward problem, we are also going to use the information from initial and boundary conditions (IC&BC) to inform the neural network of the governing conditions. This is also done using the MSE: 
		$$ MSE_{IC\&BC} = \frac{1}{N_{IC\&BC}} \sum_{j=1}^{N_{IC\&BC}} |  u(x_{j}^{IC\&BC}, t_{j}^{IC\&BC}) - u_{NN}  (x_{j}^{IC\&BC},  t_{j}^{IC\&BC})   |  ^2
		$$

where $N_{IC\&BC}$ is the number of training points from the initial and boundary conditions in the training dataset $D_{IC\&BC} = \{x_j^{IC\&BC}, t_j^{IC\&BC}, u(x_{j}^{IC\&BC}, t_{j}^{IC\&BC})\}_{j=1}^{N_{IC\&BC}}$, where $u(x_{j}^{IC\&BC}, t_{j}^{IC\&BC})$ is the true solution evaluated at the given datapoints and $u_{NN}(x_{j}^{IC\&BC}, t_{j}^{IC\&BC})$ is the prediction of the neural network for the provided datapoints as inputs.
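To make the structure of $D_{IC\&BC}$ concrete, the sketch below assembles such a dataset as a list of $(x, t, u)$ triples from the initial and boundary conditions stated earlier. The point counts and the uniform random sampling are assumptions; the provided template may generate its data differently.

```python
import math
import random

random.seed(0)


def initial_condition(x):
    """u(x, 0) = -sin(pi x)."""
    return -math.sin(math.pi * x)


n_ic, n_bc = 50, 25  # assumed sizes, not prescribed by the exercise

# Initial condition: random x in [-1, 1] at t = 0.
ic_points = [(x, 0.0, initial_condition(x))
             for x in (random.uniform(-1.0, 1.0) for _ in range(n_ic))]

# Boundary conditions: u = 0 at x = -1 and x = 1 for random t in [0, 1].
bc_points = [(side, random.uniform(0.0, 1.0), 0.0)
             for side in (-1.0, 1.0) for _ in range(n_bc)]

dataset_icbc = ic_points + bc_points  # triples (x, t, u)
print(len(dataset_icbc))
```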


The total loss consists of a sum of the two MSE values: <br>
		$$
		\mathcal{L}_{Burgers} = MSE_{F} + MSE_{IC\&BC}
		$$

Your task is to compute $MSE_{F}$ and $MSE_{IC\&BC}$ in the function `lossfunction()` in `Burgers.py`. Note that $u_{NN}(x_{j}^{IC\&BC}, t_{j}^{IC\&BC})$, $u(x_{j}^{IC\&BC}, t_{j}^{IC\&BC})$, $u$, $\frac{\partial u}{\partial t}$, $\frac{\partial u}{\partial x}$ and $\frac{\partial^2 u}{\partial x^2}$ are provided as parameters in this function.

Look into the training loop and find the calls to `grad`; this is where the partial derivatives are calculated using PyTorch. Try to understand and play around with this function now, because in Exercise 9 your task will include using it to calculate partial derivatives yourself.

Hint: In the implementation of `lossfunction()` we are operating with tensors, not NumPy arrays, so please be careful with the syntax. Helpful methods might be `torch.sum()`, `torch.square()`, `Tensor.shape`.
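As a small illustration of the methods named in the hint, the sketch below computes a mean squared error over a column tensor. The helper name `mse` and the toy numbers are assumptions for illustration; this is not the template's `lossfunction()`.

```python
import torch


def mse(residual):
    """Mean squared value of a column tensor of residuals."""
    n = residual.shape[0]  # number of data points
    return torch.sum(torch.square(residual)) / n


# Toy numbers (assumed for illustration, not taken from the exercise data).
u_true = torch.tensor([[0.0], [0.5], [-1.0]])
u_pred = torch.tensor([[0.1], [0.5], [-0.8]])
loss_icbc = mse(u_true - u_pred)
print(loss_icbc.item())
```

One pitfall worth checking with `Tensor.shape`: subtracting a tensor of shape `(N,)` from one of shape `(N, 1)` broadcasts to `(N, N)` instead of raising an error, which silently changes the value of the loss.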

##### Train and save a model

Finally you can train a model, save it and push it to Artemis once you're happy with the performance. Note that the choice of hyperparameters directly affects the performance of your model. There will be no tests on a hidden test set for this problem. If you wish, you can numerically create a test set for yourself and evaluate your model's performance.

Here's a visualization of nice results: ![Burgers Predictions.jpeg](/api/core/files/markdown/Markdown_2022-06-30T19-29-30-673_ce37ef31.jpeg)



### Inverse Problem: 1D Euler Equation

An exciting application of PINNs is solving inverse problems (data-driven discovery). In this exercise we're going to look at a simple 1D problem from gas dynamics. The motivation comes from the kinds of measurements that can be gathered for such a flow: the Schlieren imaging technique gives us density gradient information, and a pressure probe placed at some point in the domain measures the pressure at that point only (but at multiple points in time). The idea is to use these "measurements" and our knowledge of the PDE to predict the quantities of interest: velocity, density and pressure. Note that in this exercise the density gradient and pressure probe data are numerically computed "measurements" (they are not measured in the real world, but generated on a computer).

Compressible flows are governed by the Euler Equations. In 1D these are:
	$$\frac{\partial \rho}{\partial t} + \frac{\partial \rho u}{\partial x} = 0 $$
	$$\frac{\partial \rho u}{\partial t} + \frac{\partial (\rho u^2 +p)}{\partial x} = 0 $$
	$$\frac{\partial \rho E}{\partial t} + \frac{\partial (u(\rho E + p))}{\partial x} = 0 $$
	
where we're solving for $\rho$, $u$ and $p$, which stand for density, velocity and pressure, respectively.

The relationship between $\rho$, $u$, $p$ and $E$ is given by:
$$ p = (\gamma - 1) (\rho E - \frac{1}{2} \rho \Vert u \Vert ^2) $$

where $\gamma$ is the adiabatic index; in our case $\gamma = 1.4$.
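The relation above can be solved for $E$, giving $E = \frac{p}{(\gamma - 1)\rho} + \frac{1}{2} u^2$ in 1D. The sketch below evaluates both directions for the sample state $\rho = 1$, $u = 1$, $p = 1$; the function names and the sample values are assumptions for illustration.

```python
GAMMA = 1.4  # adiabatic index from the text


def pressure(rho, u, E):
    """Equation of state: p = (gamma - 1) * (rho*E - 0.5*rho*u^2)."""
    return (GAMMA - 1.0) * (rho * E - 0.5 * rho * u ** 2)


def total_energy(rho, u, p):
    """Equation of state solved for the specific total energy E."""
    return p / ((GAMMA - 1.0) * rho) + 0.5 * u ** 2


# Round trip: compute E from (rho, u, p), then recover p from (rho, u, E).
E_sample = total_energy(rho=1.0, u=1.0, p=1.0)
print(E_sample, pressure(1.0, 1.0, E_sample))
```

For this state, $E$ is approximately $3$, and inserting it back into the equation of state recovers $p \approx 1$ up to floating-point rounding.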

The domain and initial conditions are:
	$$ x \in [-1,1] $$ $$ t \in [0, 2] $$ $$ \rho_0 = 1.0 + 0.2 \sin(\pi x)$$ $$u_0 = 1.0$$ $$ p_0=1.0$$
	
and the exact solutions are:
	$$ \rho = 1.0 + 0.2 \sin(\pi (x-t))$$
	$$u = 1.0$$
	$$ p=1.0$$

This problem is described in Appendix B in [Physics-informed neural networks for high-speed flows](https://www.sciencedirect.com/science/article/am/pii/S0045782519306814). We are going to solve it in a similar way as proposed in this paper.
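As a quick sanity check, the exact solution can be inserted into the first Euler equation (mass conservation) numerically. The sketch below approximates the derivatives with central differences; the step size and the evaluation grid are assumptions for illustration.

```python
import math


def rho_exact(x, t):
    return 1.0 + 0.2 * math.sin(math.pi * (x - t))


U_EXACT = 1.0
H = 1e-5  # step for the central differences (an assumed value)


def mass_residual(x, t):
    """d(rho)/dt + d(rho*u)/dx, approximated with central differences."""
    drho_dt = (rho_exact(x, t + H) - rho_exact(x, t - H)) / (2 * H)
    dflux_dx = U_EXACT * (rho_exact(x + H, t) - rho_exact(x - H, t)) / (2 * H)
    return drho_dt + dflux_dx


# Largest residual magnitude on a 21 x 21 grid over x in [-1, 1], t in [0, 2].
worst = max(abs(mass_residual(-1 + 0.1 * i, 0.1 * j))
            for i in range(21) for j in range(21))
print(worst)
```

The residual should be zero up to discretization and rounding error, because the density profile is simply transported to the right with the constant velocity $u = 1$.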

This is what the data looks like (only the coordinates of the points are shown):
![Training Data Euler.jpeg](/api/core/files/markdown/Markdown_2022-06-30T19-28-55-507_981dbd1e.jpeg)


#### You have the following tasks:

##### Network Architecture and Activation Functions
We want to solve for $\rho$, $u$ and $p$, all of which are functions of $x$ and $t$. So we would need a neural network with two nodes in the input layer and three nodes in the output layer. The other hyperparameters are up to you. Be careful with the activation functions at the output layer; some choices will not make sense physically.
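One way to reason about the output layer is to compare the range of a candidate activation with the values the solution must take. The sketch below does this for the maximum of the exact density, $1.2$; the list of candidate activations is an assumption for illustration, not a recommendation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softplus(z):
    return math.log1p(math.exp(z))


RHO_MAX = 1.0 + 0.2  # largest value of the exact density 1 + 0.2 sin(pi (x - t))

# Probe each candidate output activation at a large pre-activation value
# to approximate the largest output it can produce.
z = 50.0
largest_output = {
    "sigmoid": sigmoid(z),
    "tanh": math.tanh(z),
    "softplus": softplus(z),
    "identity": z,
}
can_reach_rho_max = {name: value >= RHO_MAX for name, value in largest_output.items()}
print(can_reach_rho_max)
```

Bounded activations such as the sigmoid or `tanh` never exceed $1$, so a network ending in one of them cannot represent the density peak no matter how long it is trained.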

Implement the neural network architecture in `__init__()` and `forward()` in the `Net` class in `Euler.py`.

The test for this task requires a saved model, so it makes sense to proceed to the next task(s) before passing it. Just make sure that you pass it eventually, once you have a saved model.


##### Loss Function
Again, we start by defining the PDE form which we want to minimize, but this time we have three equations to minimize.
		$$F_1(x, t) =\frac{\partial \rho}{\partial t} + \frac{\partial \rho u}{\partial x} = 0 $$ $$ F_2(x, t) = \frac{\partial \rho u}{\partial t} + \frac{\partial (\rho u^2 +p)}{\partial x} = 0 $$
		$$F_3(x, t) = \frac{\partial \rho E}{\partial t} + \frac{\partial (u(\rho E + p))}{\partial x} = 0$$
		$$ MSE_{F} = \frac{1}{N_{F}} \sum_{j=1}^{N_{F}} | F_1(x_j^{F}, t_j^{F}) |^2 + 
		\frac{1}{N_{F}} \sum_{j=1}^{N_{F}} | F_2(x_j^{F}, t_j^{F}) |^2 + 
		\frac{1}{N_{F}} \sum_{j=1}^{N_{F}} | F_3(x_j^{F}, t_j^{F}) |^2 $$
		
where $N_F$ is the number of datapoints in the training dataset $D_F = \{x_j^F, t_j^F\}_{j=1}^{N_F}$.

Since we're solving an inverse problem, we are *not* going to use information from initial/boundary conditions. We are going to use information from (easily) measurable quantities in our domain and use this to infer other quantities of interest.

Let's assume that the density gradient is easily measurable in the whole domain, and we can place a pressure probe at a fixed point in space $x^*$. According to the paper mentioned above, we would additionally need to enforce mass conservation at $t=0$ in order to get good predictions (you can also try to exclude the mass conservation term and observe the predictions to see if this is really necessary).

Let's start with the information from the density gradient. This is done by computing the finite difference gradient approximation using two density values: $\rho(x, t)$ and $\rho(x+dx,t)$ with known $dx$ and comparing this to the finite difference gradient obtained with the density predictions from the neural network:
		$$MSE_{\nabla \rho} = \frac{1}{N_{\nabla \rho}} \sum_{j=1}^{N_{\nabla \rho}} \left| \frac{\rho(x_j^{\nabla \rho} +dx, t_j^{\nabla \rho}) - \rho(x_j^{\nabla \rho}, t_j^{\nabla \rho})  }{dx} - \frac{\rho_{NN}(x_j^{\nabla \rho} +dx, t_j^{\nabla \rho}) - \rho_{NN}(x_j^{\nabla \rho}, t_j^{\nabla \rho})}{dx} \right| ^2 $$ <br>
		
where $N_{\nabla \rho}$ is the number of training points in the training dataset $D_{\nabla \rho} = \{x_j^{\nabla \rho}, t_j^{\nabla \rho}, \rho(x_j^{\nabla \rho}, t_j^{\nabla \rho}), \rho(x_j^{\nabla \rho} +dx, t_j^{\nabla \rho})\}_{j=1}^{N_{\nabla \rho}}$.
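The sketch below evaluates $MSE_{\nabla \rho}$ in plain Python for a deliberately wrong "prediction": the exact density plus a constant offset. The offset of $0.3$, the helper names and the evaluation points are assumptions for illustration; `DX = 0.0999` is the spacing this exercise mentions for the Artemis test.

```python
import math

DX = 0.0999  # spacing mentioned for the Artemis test in this exercise


def rho_exact(x, t):
    return 1.0 + 0.2 * math.sin(math.pi * (x - t))


def rho_shifted(x, t):
    """Hypothetical stand-in for a network prediction: exact density plus an offset."""
    return rho_exact(x, t) + 0.3


def mse_grad_rho(points, rho_meas, rho_pred, dx):
    """MSE between forward-difference gradients of measured and predicted density."""
    total = 0.0
    for x, t in points:
        grad_meas = (rho_meas(x + dx, t) - rho_meas(x, t)) / dx
        grad_pred = (rho_pred(x + dx, t) - rho_pred(x, t)) / dx
        total += (grad_meas - grad_pred) ** 2
    return total / len(points)


points = [(-1 + 0.2 * i, 0.25 * j) for i in range(10) for j in range(9)]
loss = mse_grad_rho(points, rho_exact, rho_shifted, DX)
print(loss)
```

The loss is zero up to rounding even though every predicted density is off by $0.3$: a constant offset cancels in the difference, so gradient data alone cannot fix the absolute density level. This is one way to see why an additional constraint, such as the mass conservation term mentioned above, can help.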
	
We continue with the information from the pressure probe, which gives us pressure measurements at a fixed point $x^*$: 
		$$MSE_{p^*} = \frac{1}{N_{p^*}} \sum_{j=1}^{N_{p^*}} \left| p(x^*, t_j^{p^*}) - p_{NN}(x^*, t_j^{p^*}) \right|^2$$
		
where $N_{p^*}$ is the number of training points in the training dataset $D_{p^*} = \{t_j^{p^*}, p(x^*, t_j^{p^*})\}_{j=1}^{N_{p^*}}$.

Additionally, we want to enforce mass conservation at $t=0$ with:
		$$
		MSE_{Mass_0} = \left( \int_{x=-1}^{x=1} \rho_{NN}(x, 0) dx - \int_{x=-1}^{x=1} \rho(x, 0) dx \right)^2
		$$
	
where the integral of the true density at $t=0$ can be computed analytically and evaluates to $$\int_{x=-1}^{x=1} \rho(x, 0) dx = 2$$

The training data used for this MSE loss is $D_{Mass_0} = \{x_j^{Mass_0}\}_{j=1}^{N_{Mass_0}}$, since we already know the evaluated integral. Notice that in order to compute an integral from the predictions, the training data $D_{Mass_0}$ must not be shuffled! The provided template already takes care of this, but you should be aware of this peculiarity.
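Both points can be checked numerically. The sketch below integrates the initial density $1 + 0.2 \sin(\pi x)$ with a composite trapezoidal rule, once on ordered points and once on the same points in reversed order. The trapezoidal rule, the number of points and the helper names are assumptions; the template may compute the integral differently.

```python
import math


def rho0(x):
    return 1.0 + 0.2 * math.sin(math.pi * x)


def trapezoid(xs, ys):
    """Composite trapezoidal rule; assumes xs is sorted in increasing order."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


n = 201  # assumed number of points
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
ordered = trapezoid(xs, [rho0(x) for x in xs])

# Same points, but no longer in increasing order.
reordered_xs = xs[::-1]
reordered = trapezoid(reordered_xs, [rho0(x) for x in reordered_xs])

print(ordered, reordered)
```

The ordered points reproduce the analytic value of $2$, since the sine term is odd and integrates to zero over $[-1, 1]$. Reversing the points already flips the sign of the result, and a random shuffle scrambles the spacing terms in the same way, which is why the order of $D_{Mass_0}$ matters.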

Since we're trying to optimize multiple objectives in this problem, we can multiply each MSE loss term by a coefficient (loss weight) in order to distribute the optimization effort as we would like. This makes sense when you think of a case where one of these MSE terms has a very low value: the optimizer barely minimizes it and focuses on the other MSE terms instead, since they have larger values. With loss weights we can rebalance the optimization objectives.

For details on loss weights see the following paper on this topic:
[Optimally weighted loss functions for solving PDEs with Neural Networks](https://arxiv.org/abs/2002.06269).


The total loss consists of a sum of all the MSE values: 
		$$
		\mathcal{L}_{1D_{Euler}} = w_F MSE_{F} + w_{\nabla \rho} MSE_{\nabla \rho} + w_{p^*} MSE_{p^*} + w_{Mass_0} MSE_{Mass_0}
		$$
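To illustrate the effect of the weights, the sketch below computes each term's share of the total loss for made-up MSE values in which the pressure probe term is far smaller than the others. All numbers, the dictionary keys and the helper names are assumptions for illustration.

```python
# Toy MSE values (assumed for illustration): the probe term is much smaller than the rest.
mse_terms = {"F": 2.0e-2, "nabla_rho": 5.0e-2, "probe": 1.0e-6, "mass0": 1.0e-3}


def total_loss(terms, weights):
    """Weighted sum of the individual MSE terms."""
    return sum(weights[name] * value for name, value in terms.items())


def shares(terms, weights):
    """Fraction of the total loss contributed by each weighted term."""
    total = total_loss(terms, weights)
    return {name: weights[name] * value / total for name, value in terms.items()}


unit_weights = {"F": 1.0, "nabla_rho": 1.0, "probe": 1.0, "mass0": 1.0}
reweighted = dict(unit_weights, probe=1.0e4)  # boost only the probe term

print(shares(mse_terms, unit_weights)["probe"])
print(shares(mse_terms, reweighted)["probe"])
```

With unit weights the probe term contributes a negligible fraction of the total, so its gradient hardly influences the parameter updates; after reweighting it accounts for a noticeable share.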

Your task is to compute $MSE_{F}$, $MSE_{\nabla \rho}$, $MSE_{p^*}$ and $MSE_{Mass_0}$ in the `lossfunction()` in `Euler.py`.

Note that $\rho$, $u$, $p$, $E$, $\frac{\partial \rho}{\partial x}$, $\frac{\partial u}{\partial x}$, $\frac{\partial p}{\partial x}$, $\frac{\partial E}{\partial x}$, $\frac{\partial \rho}{\partial t}$, $\frac{\partial u}{\partial t}$, $\frac{\partial p}{\partial t}$, $\frac{\partial E}{\partial t}$, which are obtained from the neural network, are given.

Additionally, $\rho(x_j^{\nabla \rho} +dx, t_j^{\nabla \rho})$, $\rho(x_j^{\nabla \rho}, t_j^{\nabla \rho})$, $\rho_{NN}(x_j^{\nabla \rho} +dx, t_j^{\nabla \rho})$, $\rho_{NN}(x_j^{\nabla \rho}, t_j^{\nabla \rho})$, $dx$, $p(x^*, t_j^{p^*})$, $p_{NN}(x^*, t_j^{p^*})$, $\int_{x=-1}^{x=1} \rho_{NN}(x, 0) dx$ and $\int_{x=-1}^{x=1} \rho(x, 0) dx$ are given as parameters of the `lossfunction()`.
	
IMPORTANT: The Artemis test for this task assumes that `DX=0.0999`, `w_nabla_rho = 1`, `w_probe = 1`, `w_mass0 = 1` and `w_F = 1`. This means that once you start changing the hyperparameters, this test might start failing. Currently the idea is **not** to include it in the calculation of the final result (so it won't matter if you fail it), but to keep it so that you can check your implementation of `lossfunction()` before you continue to the stage of training a model and changing hyperparameters.


##### Train and save a model

Train a model, save it and push it to Artemis once you're happy with the performance. Note that the choice of hyperparameters directly affects the performance of your model. A common mistake in this exercise is a poor choice of activation function at the output layer.

Here's a visualization of nice results: ![Euler Predictions.jpeg](/api/core/files/markdown/Markdown_2022-06-30T19-28-09-132_5718fc06.jpeg)



##### (Optional) Train and save a really good model

There will be a test set, hidden from you until the deadline, which will be used to evaluate the saved models, similar to what we did in previous exercises. Again, the student with the best performance (lowest loss on this test set) will receive a bonus of 10 points.
