# Solving Partial Differential Equations with Neural Networks

##### Authors: Szymon Malec (262276) & Damian Szuster (262229)

## 1. Introduction

Partial differential equations (PDEs) appear in numerous fields of science, including mathematics, physics, and chemistry. They describe how quantities change by relating a function to its derivatives. Unfortunately, most of them have no analytical solution, so many numerical methods have been developed. One of them, perhaps less well known, is solving PDEs with the help of neural networks. That is the main topic of our report.

This assignment consists of five parts. The first one includes some theory about neural networks. The second describes the partial differential equations and model parameters chosen for the simulations. Chapters 4 and 5 contain the implementation of the method in Python and an analysis of three of its properties: consistency, convergence, and stability. Finally, some conclusions are drawn.



## 2. Neural networks

#### 2.1. Introduction to Neural Networks

Neural networks, inspired by the biological neural networks of the human brain, are a subset of machine learning techniques designed to recognize patterns. They consist of interconnected layers of nodes (or neurons) that process data and learn to make predictions or decisions without explicit programming.

#### 2.2. Basic Structure of Neural Networks

A neural network typically consists of an input layer, one or more hidden layers, and an output layer:

- **Input Layer:** This layer receives the raw input data. Each neuron in the input layer represents a feature of the input data.
- **Hidden Layers:** These intermediate layers transform the input into something the output layer can use. They apply weights to the inputs and pass the results through activation functions to capture non-linear relationships.
- **Output Layer:** This layer produces the final output, which can be a single value or a vector of values, depending on the problem.

#### 2.3. Training Neural Networks

The process of training a neural network involves the following steps:

1. **Forward Propagation:** Input data is passed through the network, layer by layer, until an output is generated.
2. **Loss Calculation:** The difference between the predicted output and the actual target is quantified using a loss function.
3. **Backward Propagation:** The gradient of the loss function with respect to each weight is computed by propagating the error backwards through the network. An optimization algorithm such as gradient descent then uses these gradients to adjust the weights and reduce the loss.
4. **Iteration:** The process is repeated for many iterations, gradually improving the accuracy of the network.
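
The four steps above can be sketched on a deliberately tiny model. The snippet below is an illustrative sketch, not the implementation used later in this report: it fits a single linear neuron $y = wx + b$ to synthetic data with hand-derived gradients. The data, the learning rate, and the names `w`, `b`, and `losses` are assumptions chosen for the example.

```python
import numpy as np

# Synthetic data generated from y = 2x + 1 (an assumption for this example).
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0          # initial parameters
learning_rate = 0.1
losses = []

for epoch in range(500):
    # 1. Forward propagation: compute the prediction.
    y_hat = w * x + b
    # 2. Loss calculation: mean squared error.
    error = y_hat - y
    losses.append(np.mean(error ** 2))
    # 3. Backward propagation: gradients of the loss w.r.t. w and b.
    dw = 2.0 * np.mean(error * x)
    db = 2.0 * np.mean(error)
    # 4. Iteration: gradient-descent update, then repeat.
    w -= learning_rate * dw
    b -= learning_rate * db

print(w, b, losses[0], losses[-1])
```

With these settings the parameters should approach the values used to generate the data, and the recorded loss should decrease over the epochs.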

#### 2.4. Activation Functions

Activation functions introduce non-linearity into the network, allowing it to model complex relationships. Common activation functions include:

- **Sigmoid:** Outputs values between 0 and 1; often used in the output layer for binary classification.
- **ReLU (Rectified Linear Unit):** Outputs the input directly if positive; otherwise, it outputs zero. This function helps to mitigate the vanishing gradient problem.
- **Tanh:** Outputs values between -1 and 1; its output is centered around zero.
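
As a quick numerical illustration of these ranges, the sketch below evaluates the three functions on a few sample inputs. The helper names and the sample points are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

sig = sigmoid(z)   # values strictly between 0 and 1
rel = relu(z)      # negative inputs are clipped to 0
tan = np.tanh(z)   # values strictly between -1 and 1, with tanh(0) = 0

for name, values in [("sigmoid", sig), ("relu", rel), ("tanh", tan)]:
    print(name, np.round(values, 4))
```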

#### 2.5. Neural Networks for Solving PDEs

Partial differential equations (PDEs) are equations that involve rates of change with respect to continuous variables. Solving PDEs is crucial in various fields such as physics, engineering, and finance. Traditional methods for solving PDEs, such as finite element methods, can be computationally intensive.

Neural networks provide an alternative approach to solving PDEs through the following methods:

- **Physics-Informed Neural Networks (PINNs):** These networks incorporate physical laws described by PDEs into the loss function. The network is trained not only on data but also on the underlying physical principles, ensuring that the solution respects the PDE.
- **Deep Galerkin Method (DGM):** This method approximates the solution to PDEs using deep neural networks. It leverages the universal approximation capability of neural networks to represent complex solutions.

#### 2.6. Advantages of Neural Networks for PDEs

- **Flexibility:** Neural networks can approximate complex functions and handle high-dimensional problems that are challenging for traditional methods.
- **Data Efficiency:** PINNs and similar approaches can leverage available data more effectively by incorporating physical laws into the learning process.
- **Parallelization:** Neural network training can be parallelized, taking advantage of modern high-performance computing resources.

#### 2.7. Challenges and Future Directions

Despite their advantages, neural networks for PDEs face several challenges:

- **Training Complexity:** Training neural networks to solve PDEs can be computationally expensive and requires careful tuning of hyperparameters.
- **Generalization:** Ensuring that the network generalizes well to unseen data and different boundary conditions is a significant challenge.
- **Interpretability:** Neural network models are often seen as black boxes, making it difficult to interpret the solution process.

Future research is directed towards improving training algorithms, developing more interpretable models, and combining neural networks with traditional numerical methods for hybrid approaches.


## 3. Model parametrization and choice of equations

#### 3.1. Parameters

#### 3.2. Equations

To check the quality of our model we have chosen two partial differential equations: the wave equation and Burgers' equation. The choice is not random: we would like to consider one model that has a well-known analytical solution (wave) and one that, in general, does not (Burgers).

Let us begin with the wave equation. It is a second-order linear PDE used to model wavelike phenomena, e.g. small-amplitude oscillations near equilibrium. In this project we will consider the wave equation in one space dimension:

$$
\frac{\partial^{2}g}{\partial t^{2}} = c^{2}\frac{\partial^{2}g}{\partial x^{2}},
$$

where $c$ is the wave speed. Additionally, some initial conditions are given:

$$
g(x,0)=\phi(x),
$$
$$
g_{t}(x,0)=\psi(x).
$$

Boundary conditions can also be specified in several ways. In our case we will use the Dirichlet condition, which describes how the endpoints of the wave move:

$$
g(0,t) = \mu(t),
$$

$$
g(N,t) = \nu(t).
$$

We say that endpoints are fixed when $\mu = \nu = 0.$
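
As a sanity check on the equation and its boundary conditions, the sketch below verifies numerically that the standing wave $g(x,t) = \sin(\pi x)\cos(c\pi t)$ on $[0, 1]$ satisfies the wave equation with fixed endpoints. This particular solution, the interval $[0, 1]$ (i.e. $N = 1$), the value $c = 2$, and the finite-difference step `h` are assumptions chosen for illustration; the report does not prescribe them here.

```python
import numpy as np

c = 2.0    # wave speed (assumed value)
h = 1e-3   # step for central finite differences

def g(x, t):
    # Standing wave with phi(x) = sin(pi x) and psi(x) = 0.
    return np.sin(np.pi * x) * np.cos(c * np.pi * t)

def second_diff(f, z):
    # Central difference approximation of f''(z).
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / h**2

# Interior sample points in space and time.
x, t = np.meshgrid(np.linspace(0.1, 0.9, 9), np.linspace(0.0, 1.0, 11))

g_tt = second_diff(lambda s: g(x, s), t)
g_xx = second_diff(lambda s: g(s, t), x)
residual = g_tt - c**2 * g_xx

t_line = np.linspace(0.0, 1.0, 11)
print("max |residual|:", np.max(np.abs(residual)))
print("max |g(0,t)|:", np.max(np.abs(g(0.0, t_line))))
print("max |g(1,t)|:", np.max(np.abs(g(1.0, t_line))))
```

The residual is not exactly zero because the derivatives are only approximated, but it should shrink as `h` decreases (until rounding errors dominate).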

The other equation we will use in simulations is Burgers' equation, which is one of the fundamental PDEs. This convection-diffusion equation describes phenomena occurring in traffic flow, fluid dynamics, etc. As in the previous case, we are going to consider Burgers' equation in one space dimension:

$$
\frac{\partial g}{\partial t} + g \frac{\partial g}{\partial x} = \nu \frac{\partial^{2} g}{\partial x^{2}},
$$

where $\nu$ is the diffusion coefficient (not to be confused with the boundary function $\nu(t)$ used for the wave equation). The initial conditions are given in a form similar to that of the wave equation.

## 4. Implementation

For the implementation we used the Python programming language. Some Python packages turned out to be very useful, especially the well-known `numpy` library, but also `autograd`, which provides functions for computing gradients by automatic differentiation. Let us import the necessary packages.

In [1]:
import numpy
import autograd.numpy as np
from autograd import grad, elementwise_grad
from matplotlib import pyplot as plt
from mpl_toolkits import mplot3d
from mpl_toolkits.mplot3d import Axes3D

First we define two activation functions, which we will use in the neural networks: ReLU and sigmoid.

In [2]:
def relu(x):
    return np.maximum(x, 0)

def sigmoid(x):
    return 1/(1 + np.exp(-x))

Next, let's create the function which returns the output of the neural network. The function below calculates the values of all neurons layer by layer, ending with the output layer. The input `x` stores one sample per column, and the weights of the network must be passed as a list of numpy arrays, one per layer, where the first column of each array holds the biases.

In [3]:
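def neural_network(x, weights, activation_function=sigmoid):
    # x has shape (n_inputs, n_samples); each weight matrix has shape
    # (n_out, n_in + 1), with the bias stored in its first column.
    for W in weights[:-1]:
        x = np.vstack([np.ones(x.shape[1]), x])  # prepend a row of ones for the bias
        x = activation_function(W @ x)
    # The output layer is linear (no activation function).
    x = np.vstack([np.ones(x.shape[1]), x])
    x = weights[-1] @ x
    return x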
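
To make the expected array shapes concrete, here is a small usage sketch. It repeats the function above with plain `numpy` so that the snippet is self-contained; the layer sizes `[2, 10, 1]`, the random generator `rng`, and the sample points are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def neural_network(x, weights, activation_function=sigmoid):
    # Same logic as above: prepend a row of ones (bias), apply each layer.
    for W in weights[:-1]:
        x = np.vstack([np.ones(x.shape[1]), x])
        x = activation_function(W @ x)
    x = np.vstack([np.ones(x.shape[1]), x])
    return weights[-1] @ x

rng = np.random.default_rng(0)
layers = [2, 10, 1]   # 2 inputs (e.g. x and t), 10 hidden neurons, 1 output

# One matrix per layer, of shape (n_out, n_in + 1); the +1 is the bias column.
weights = [rng.standard_normal((layers[i + 1], layers[i] + 1))
           for i in range(len(layers) - 1)]

# Five sample points stored column-wise: row 0 holds x, row 1 holds t.
X = np.vstack([np.linspace(0.0, 1.0, 5), np.linspace(0.0, 2.0, 5)])

out = neural_network(X, weights)
print([W.shape for W in weights])   # [(10, 3), (1, 11)]
print(out.shape)                    # (1, 5)
```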

An inherent element of a neural network is the cost (loss) function, which is minimized during training. The right choice of the cost is an important step. In the case of Physics-Informed Neural Networks the cost function is taken to be the square of the residual of the differential equation. Notice that, if we put all the terms on one side of the equation, we can represent the equation as
$$
f\left(g, x_1, \, \dots \, , x_N, \frac{\partial g}{\partial x_1}, \dots , \frac{\partial g}{\partial x_N}, \frac{\partial^2 g}{\partial x_1\partial x_2}, \, \dots \, , \frac{\partial^n g}{\partial x_N^n} \right) = 0
$$
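for the function $g = g(x_1,\dots,x_N)$ of $N$ variables. For an approximate solution the left-hand side is generally nonzero, so we may treat it as an error. Thus the cost function can be expressed as
$$ C(x, W) = f^2, $$
where $W$ denotes the weights of the network. In the implementation, $f^2$ is averaged over all sample points.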
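
For instance, for the two equations from Section 3.2 (with $x_1 = x$ and $x_2 = t$), moving every term to the left-hand side gives the residuals

$$
f_{\text{wave}} = \frac{\partial^{2}g}{\partial t^{2}} - c^{2}\frac{\partial^{2}g}{\partial x^{2}}, \qquad
f_{\text{Burgers}} = \frac{\partial g}{\partial t} + g \frac{\partial g}{\partial x} - \nu \frac{\partial^{2} g}{\partial x^{2}},
$$

both of which vanish identically when $g$ is an exact solution. The subscripted names $f_{\text{wave}}$ and $f_{\text{Burgers}}$ are introduced here only for illustration.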

In [None]:
def cost(X, g_t, equation, weights):
    return np.mean(equation(X, g_t, weights)**2)
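
To see how such a cost behaves, the sketch below evaluates the mean squared residual of Burgers' equation for two candidate functions on a grid of sample points. It is only an illustration under stated assumptions: derivatives are approximated with central finite differences rather than `autograd`, the helper `burgers_cost` and both candidate functions are hypothetical, and $\nu = 0.1$ is an arbitrary choice. The function $g(x,t) = x/(1+t)$ is a simple particular solution of Burgers' equation, so its cost should be close to zero, while an arbitrary function gives a clearly positive cost.

```python
import numpy as np

nu = 0.1   # diffusion coefficient (assumed value)
h = 1e-4   # finite-difference step

def burgers_cost(g, x, t):
    # Central differences stand in for automatic differentiation here.
    g_t = (g(x, t + h) - g(x, t - h)) / (2 * h)
    g_x = (g(x + h, t) - g(x - h, t)) / (2 * h)
    g_xx = (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / h**2
    residual = g_t + g(x, t) * g_x - nu * g_xx   # f = 0 for an exact solution
    return np.mean(residual**2)                  # same form as `cost` above

def exact(x, t):
    # A particular solution: g_t + g * g_x = 0 and g_xx = 0.
    return x / (1.0 + t)

def guess(x, t):
    # An arbitrary smooth function that does not solve the equation.
    return np.sin(np.pi * x) * np.exp(-t)

# Grid of sample (collocation) points.
x, t = np.meshgrid(np.linspace(0.0, 1.0, 11), np.linspace(0.0, 1.0, 11))

cost_exact = burgers_cost(exact, x, t)
cost_guess = burgers_cost(guess, x, t)
print(cost_exact, cost_guess)
```

During training, the network output plays the role of the candidate function, and gradient descent moves the weights towards a lower value of this cost.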

Finally, we define the function which solves the equation. It trains the network with gradient descent and returns the learned weights.

In [None]:
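def solve(equation, g_t, X, layers, epochs, learning_rate=0.001):
    # Gradient of the cost with respect to the weights (argument 3).
    # The cost is a scalar, so this is its ordinary gradient.
    cost_grad = elementwise_grad(cost, 3)
    # Random initial weights; the extra column in each matrix holds the bias.
    weights = [np.random.randn(layers[layer + 1], layers[layer] + 1) for layer in range(len(layers) - 1)]

    # Plain gradient descent: step against the gradient in every epoch.
    for epoch in range(epochs):
        dW = cost_grad(X, g_t, equation, weights)
        for w in range(len(weights)):
            weights[w] -= learning_rate * dW[w]
    return weights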

## 5. Properties

#### 5.1. Consistency

#### 5.2. Convergence

#### 5.3. Stability

## 6. Conclusion

## 7. Bibliography