<div style="background-color: #ccffcc; padding: 10px;">
    <h1> Tutorial 2 </h1> 
    <h2> Physics-Informed Neural Networks Part 2</h2>
    <h2> 1D Heat Equation PINNs Example </h2>
</div>    

# Overview

This notebook is based on two papers: *[Physics-Informed Neural Networks: A Deep Learning Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations](https://www.sciencedirect.com/science/article/pii/S0021999118307125)* and *[Hidden Physics Models: Machine Learning of Nonlinear Partial Differential Equations](https://www.sciencedirect.com/science/article/pii/S0021999117309014)* with the help of  Fergus Shone and Michael Macraild.

These tutorials will go through solving Partial Differential Equations using Physics-Informed Neural Networks, focusing on the 1D heat equation and a more complex example using the Navier-Stokes equations.

**This introduction section is replicated in all PINN tutorial notebooks (please skip if you've already been through).** 


    

</div>

<div style="background-color: #ccffcc; padding: 10px;">

<h1>Physics-Informed Neural Networks</h1>

For a typical neural network using algorithms like gradient descent to look for a hypothesis, the data are the only guide. However, if the data are noisy or sparse, and we already have governing physical models, we can use the knowledge we already have to optimise and inform the algorithms. This can be done via [feature engineering](https://www.ibm.com/think/topics/feature-engineering) or by adding a physical inconsistency term to the loss function.


<a href="https://towardsdatascience.com/physics-guided-neural-networks-pgnns-8fe9dbad9414">
<img src="https://miro.medium.com/max/700/1*uM2Qh4PFQLWLLI_KHbgaVw.png">
</a>   
 
 
## The very basics

If you are new to neural networks, there is a [toy neural network Python code example](https://github.com/cemac/LIFD_ENV_ML_NOTEBOOKS/tree/main/ToyNeuralNetwork) included in the [LIFD ENV ML Notebooks Repository](https://github.com/cemac/LIFD_ENV_ML_NOTEBOOKS). There we cover some of the fundamentals of neural nets and show how to build a two-layer neural network from scratch.

    
## Recommended reading
    
The in-depth theory behind neural networks will not be covered here as this tutorial focuses on the application of machine-learning methods. If you wish to learn more, here are some great starting points:
 

* [All you need to know on neural networks](https://towardsdatascience.com/nns-aynk-c34efe37f15a) 
* [Introduction to neural networks](https://victorzhou.com/blog/intro-to-neural-networks/)
* [Physics-Guided Neural Networks](https://towardsdatascience.com/physics-guided-neural-networks-pgnns-8fe9dbad9414)
* [Maziar Raissi's Physics-Informed Neural Networks GitHub page](https://maziarraissi.github.io/PINNs/)

</div>


<hr>


<div style="background-color: #e6ccff; padding: 10px;">
    
<h1> Machine Learning Theory </h1>
<a href="https://victorzhou.com/series/neural-networks-from-scratch/">
<img src="https://victorzhou.com/media/nn-series/network.svg">
</a>

    
## Physics-Informed Neural Networks

Neural networks work by using lots of data to tune weights and biases, thereby minimising the loss function and enabling them to act as universal function approximators. However, these purely data-driven models lose their robustness when data is limited. By using known physical laws or empirically validated relationships, the solutions from neural networks can be sufficiently constrained by disregarding unrealistic solutions.
    
A Physics-Informed Neural Network considers a parameterised and non-linear partial differential equation in the general form:



    
\begin{align}
     u_t + \mathcal{N}[u; \lambda] &= 0, && x \in \Omega, t \in [0,T],\\
\end{align}
    


where $u(t,x)$ denotes the latent solution, $\mathcal{N}[\cdot; \lambda]$ is a non-linear differential operator acting on $u$ and parameterised by $\lambda$, and $\Omega$ is a subset of $\mathbb{R}^D$ (the prescribed domain). This setup encapsulates a wide range of problems, such as diffusion processes, conservation laws, advection-diffusion-reaction systems, and kinetic equations.

Here we will go through this for the 1D heat equation and the Navier-Stokes equations.
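
As a worked instance, the 1D heat equation used later in this notebook fits the general form if we take $\lambda = k$ and $\mathcal{N}[u; k] = -k\,u_{xx}$ (a linear operator, which is a special case of the general setting):

\begin{equation}
u_t + \mathcal{N}[u; \lambda] = \frac{\partial u}{\partial t} - k \frac{\partial^2 u}{\partial x^2} = 0.
\end{equation}

The left-hand side is the quantity that a PINN drives towards zero at the collocation points.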


</div>    

<div style="background-color: #cce5ff; padding: 10px;">

<h1> Python </h1>

## PyTorch
    
There are many machine-learning libraries available for Python. [PyTorch](https://pytorch.org/) is one such library. If you have a GPU on the machine you are using, PyTorch can use it to run the code in the notebooks even faster, provided the model and tensors are placed on the GPU device. Google Colab also provides GPU runtimes. If using your own machine, please ensure that the GPU-enabled version of PyTorch is installed.


## Further Reading

* [Running Jupyter Notebooks](https://jupyter.readthedocs.io/en/latest/running.html#running)
* [PyTorch optimisers](https://nbviewer.org/github/bentrevett/a-tour-of-pytorch-optimizers/blob/main/a-tour-of-pytorch-optimizers.ipynb)


</div>
    
<hr>

<div style="background-color:  #f4b85d; padding: 10px;">
    
<h1> Requirements </h1>

These notebooks should run with the following requirements satisfied.

<h2> Python Packages: </h2>

* Python 3
* PyTorch
* NumPy 
* Matplotlib
* SciPy

<h2> Data Requirements</h2>
    
This notebook refers to some data included in the GitHub repository that was imported via the `git submodule` command mentioned in the installation instructions.
    
</div>


**Contents:**

1. [1D Heat Equation non-ML Example](PINNs_1DHeatEquations_nonML.ipynb)
2. **[1D Heat Equation PINN Example](PINNs_1DEquationExample.ipynb)**
    * [1D Heat Equation Forwards](#1D-Heat-Equation-Forwards)
    * [1D Heat Equation Inverse](#1D-Heat-Equation-Inverse)
3. [Navier-Stokes PINNs Discovery of PDEs](PINNs_NavierStokes_example.ipynb)
4. [Navier-Stokes PINNs Hidden Fluid Mechanics](PINNs_NavierStokes_HFM.ipynb)

<div style="background-color: #cce5ff; padding: 10px;">
Load in all required modules (including some auxiliary code) and turn off warnings.
</div>

In [None]:
# For readability: disable warnings
import warnings
warnings.filterwarnings('ignore')

In [1]:
import os
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
import scipy.io
from scipy.interpolate import griddata
import time
from itertools import product, combinations
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib.gridspec as gridspec
import scipy.sparse as sp
import scipy.sparse.linalg as la
from pyDOE import lhs

<hr>

# Solving the 1D Heat Equation via Neural Networks

# 1D Heat Equation Forwards

<div style="background-color: #ccffcc; padding: 10px;">

**Model Problem: 1D Heat Equation**

We begin by describing the first model problem - the one-dimensional heat equation. 

The heat equation is the prototypical parabolic partial differential equation and can be applied to modelling the diffusion of heat through a given region, hence its name. Read more about the heat equation here: https://en.wikipedia.org/wiki/Heat_equation.

In 1D, the heat equation can be written as:

\begin{equation}
\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2 },
\end{equation}
    
where $k$ is a material parameter called the coefficient of thermal diffusivity.

This equation can be solved using numerical methods, such as finite differences or finite elements. For this notebook, we have solved the above equation numerically on a domain of $x \in [0,1]$ and $t \in [0, 0.25]$. Solving this equation numerically gives us a spatiotemporal domain $(x,t)$ and corresponding values of the solution $u$.

</div>
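
To make the idea of a numerical reference solution concrete, here is a minimal explicit finite-difference sketch. It is not the solver used to produce the data for this notebook. The initial condition $u(x,0) = \sin(\pi x)$, the zero boundary values and the helper name `solve_heat_ftcs` are assumptions chosen so that the result can be compared with the known solution $u = e^{-k \pi^2 t} \sin(\pi x)$.

```python
import numpy as np

def solve_heat_ftcs(k=1.0, nx=51, n_steps=300, r=0.4):
    """Explicit (forward-time, centred-space) solve of u_t = k u_xx on x in [0, 1].

    Assumed for illustration: u(x, 0) = sin(pi x) and u(0, t) = u(1, t) = 0.
    r = k*dt/dx**2 must stay <= 0.5 for the explicit scheme to be stable.
    """
    x = np.linspace(0.0, 1.0, nx)
    dx = x[1] - x[0]
    dt = r * dx**2 / k
    u = np.sin(np.pi * x)
    for _ in range(n_steps):
        # Update interior points only; the end points keep their initial values
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return x, n_steps * dt, u

x_fd, t_end, u_num = solve_heat_ftcs()
u_ref = np.exp(-np.pi**2 * t_end) * np.sin(np.pi * x_fd)
max_err = np.max(np.abs(u_num - u_ref))
print("t_end =", t_end, " max abs error =", max_err)
```

Repeating this for every time level gives a grid of $(x,t)$ coordinates with a value of $u$ at each point, which is the kind of data the PINN is trained and evaluated on.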



<div style="background-color: #e6ccff; padding: 10px;">

    
Here we will describe the architecture of the PINN we use to solve the 1D heat equation in this notebook. 
![PINNS.png](PINNS.png)
    
    
Net U in the above diagram approximates a function that maps $(x,t) \mapsto u$. $\sigma$ represents the biases and weights for each neuron of the network. These $\sigma$ values are the network parameters that are updated after each iteration. AD means Automatic Differentiation - this is the chain rule-based differentiation procedure that allows for differentiation of network outputs with respect to its inputs, e.g. differentiating $u$ with respect to $x$, or calculating $\frac{\partial u}{\partial x}$. The I node in the AD section represents the identity operation, i.e. keeping $u$ fixed without applying any differentiation.

After the automatic differentiation part of the network, we have two separate loss function components - the data loss and the PDE loss. The data loss term is calculated by finding the difference between the network outputs/predictions $u$ and the ground truth values of $u$, which could come from simulation or experiment. The data loss term enforces the network outputs to match known data points, which are represented by the pink box labelled "Data". The PDE loss term is where we add the "physics-informed" part of the network. Using automatic differentiation, we are able to calculate derivatives of our network outputs, and so we are able to construct a loss function that forces the network to match the PDE that is known to govern the system. In this case, the PDE loss term is defined as:
 
\begin{equation}
f = \frac{\partial u}{\partial t} - k \frac{\partial^2 u}{\partial x^2 },
\end{equation}
    
where $f$ is the residual of the 1D heat equation. By demanding that $f$ is minimised as our network trains, we ensure that the network outputs obey the underlying PDE that governs the system. We then calculate the total loss of the system as a sum of the data loss and the PDE loss.

The loss is calculated after each pass through the network, and the weights and biases are updated using a gradient descent step. Training stops once the loss falls below a chosen tolerance or, as in this notebook, after a fixed number of iterations. In inference mode, we can then input a fine mesh of spatiotemporal coordinates and the network will find the solution at each of these points.
    
</div>
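
The automatic differentiation step can be tried in isolation before any network is involved. The sketch below is illustrative only: the helper `heat_residual` and the analytic test function are assumptions, not part of the notebook. It evaluates $f = u_t - k u_{xx}$ with `torch.autograd.grad` for a function that is known to satisfy the heat equation, so the residual should vanish up to rounding error.

```python
import math
import torch

def heat_residual(u_fn, x, t, k):
    """Return f = u_t - k * u_xx for a callable u_fn(x, t), using autograd."""
    x = x.detach().clone().requires_grad_(True)
    t = t.detach().clone().requires_grad_(True)
    u = u_fn(x, t)
    ones = torch.ones_like(u)
    # create_graph=True keeps the graph so that u_x can be differentiated again
    u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x))[0]
    return u_t - k * u_xx

k_demo = 1.0
# Known solution of u_t = k u_xx, used here in place of a network
u_known = lambda x, t: torch.exp(-k_demo * math.pi**2 * t) * torch.sin(math.pi * x)

x_demo = torch.rand(50, 1, dtype=torch.float64)
t_demo = 0.25 * torch.rand(50, 1, dtype=torch.float64)
f_demo = heat_residual(u_known, x_demo, t_demo, k_demo)
print("largest residual:", f_demo.abs().max().item())
```

Replacing `u_known` with a neural network gives the PDE loss term: the residual is no longer zero, and training pushes it towards zero.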

<hr>
<div style="background-color: #ccffcc; padding: 10px;">
    
###  Variable definitions

Variables to be defined here:

`X_u`: Input coordinates, e.g. spatial and temporal coordinates.

`u`: Output corresponding to each input coordinate. 

`X_f`: Collocation points at which the governing equations are satisfied. These coordinates will have the same format as the X_u coordinates, e.g. $(x,t)$.

`layers`: Specifies the structure of the U network.

`lb`: Vector containing the lower bound of all of the coordinate variables, e.g. $x_{min}$, $t_{min}$.

`ub`: Vector containing the upper bound of all of the coordinate variables, e.g. $x_{max}$, $t_{max}$.

`k`: This is the constant material parameter for this specific problem. For this problem, the heat equation, $k$ represents thermal diffusivity.
    

    
</div>

<div style="background-color: #ccffcc; padding: 10px;">

# Load data and set input parameters 
      
A feedforward neural network of the following structure is assumed:
- the input is scaled elementwise to lie in the interval $[-1, 1]$,
- followed by 8 fully connected layers each containing 20 neurons and each followed by a hyperbolic tangent activation function,
- one fully connected output layer.

This setting results in a network with $3021$ trainable parameters (first hidden layer: $2 \cdot 20 + 20 = 60$; $7$ intermediate layers: each $20 \cdot 20 + 20 = 420$; output layer: $20 \cdot 1 + 1 = 21$).
    

</div>
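
The parameter count can be checked directly from the list of layer sizes. This is a small sketch; the helper name `count_parameters` is hypothetical and the list matches the `layers` setting used in this notebook.

```python
def count_parameters(layers):
    """Weights plus biases for each fully connected layer in a list of layer sizes."""
    return [n_in * n_out + n_out for n_in, n_out in zip(layers[:-1], layers[1:])]

layers_demo = [2, 20, 20, 20, 20, 20, 20, 20, 20, 1]
per_layer = count_parameters(layers_demo)
print(per_layer)       # [60, 420, 420, 420, 420, 420, 420, 420, 21]
print(sum(per_layer))  # 3021
```

Changing the number of layers or neurons in `layers` changes this total, which is one way to gauge how much a change to the architecture affects training cost.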

 <div style="background-color: #cce5ff; padding: 10px;">
    
# Number of collocation points 
    
`2000` collocation points is the default setting for this example; it can be increased to improve results at the cost of computational speed. The original work set this to `N_f=10000`, running on GPUs in a few minutes. 
    
    
The network takes in data in coordinate pairs: $(x,t) \mapsto u$.     
</div>
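
This notebook draws its collocation points with `lhs` from `pyDOE`. As an alternative sketch, not the method used below, the same kind of Latin Hypercube sample can be drawn with SciPy's `scipy.stats.qmc` module. The bounds are the domain limits quoted above, and the variable names are illustrative.

```python
import numpy as np
from scipy.stats import qmc

lb_demo = np.array([0.0, 0.0])    # lower bounds of (x, t)
ub_demo = np.array([1.0, 0.25])   # upper bounds of (x, t)
N_f_demo = 2000

sampler = qmc.LatinHypercube(d=2)               # d = number of coordinates (x, t)
unit_sample = sampler.random(n=N_f_demo)        # points in the unit square
X_f_demo = qmc.scale(unit_sample, lb_demo, ub_demo)  # rescale to the domain
print(X_f_demo.shape, X_f_demo.min(0), X_f_demo.max(0))
```

Latin Hypercube sampling places exactly one point in each of the `N_f_demo` equal-width strata along every coordinate, which spreads the collocation points more evenly than plain uniform random sampling.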


# Initialisation forwards


<div style="background-color: #ffffcc; padding: 10px;">

Once you have run through the notebook once you may wish to alter any of the following: 
    
- number of data training points `N_u`,
- number of collocation training points `N_f`,
- number of layers in the network `layers`,
- number of neurons per layer `layers`.

</div>

In [None]:
k = 1
N_u = 100 # number of data points (default 100)
N_f = 2000 # collocation points (default 2000)
# structure of network: two inputs (x,t) and one output u
# 8 fully connected layers with 20 nodes per layer
layers = [2, 20, 20, 20, 20, 20, 20, 20, 20, 1]

In [None]:
data = scipy.io.loadmat("Data/heatEquation_data.mat") # these data are from the PINNs paper
t = data['t'].flatten()[:,None] # read in t and flatten into column vector
x = data['x'].flatten()[:,None] # read in x and flatten into column vector
# Exact represents the exact solution to the problem, from the data provided
Exact = np.real(data['usol']).T # after the transpose, Exact has structure of nt times nx

print("usol shape = ", Exact.shape)

# We need to find all the x,t coordinate pairs in the domain
X, T = np.meshgrid(x,t)

# Flatten the coordinate grid into pairs of x,t coordinates
X_star = np.hstack((X.flatten()[:,None], T.flatten()[:,None])) # coordinates x,t
u_star = Exact.flatten()[:,None]   # corresponding solution value with each coordinate            

print("X has shape ", X.shape, ", X_star has shape ", X_star.shape)
    
# Domain bounds: minimum and maximum of each coordinate (x, t)
lb = X_star.min(0)
ub = X_star.max(0)  

print("Lower bounds of x,t: ", lb)
print("Upper bounds of x,t: ", ub)

<div style="background-color: #ccffcc; padding: 10px;">

# Select training data
    
The training data `X_u_train` and `u_train` are generated by randomly selecting `N_u` points from the full grid of solution values. The sampling points are plotted below.
    

</div>

In [None]:
## Start from the full set of grid points (downsampled to N_u points below)
X_u_train = X_star
u_train = u_star

## Generate collocation points using Latin Hypercube sampling within the bounds of the spatiotemporal coordinates
# Generate N_f x,t coordinates within range of upper and lower bounds
X_f_train = lb + (ub-lb)*lhs(2, N_f) # the 2 denotes the number of coordinates we have - x,t 

## In addition, we add the X_u_train coordinates (at this point, the full grid) to the X_f coordinate set
X_f_train = np.vstack((X_f_train, X_u_train)) # stack up all training x,t coordinates for u and f 

## We downsample the data to leave N_u randomly distributed points
## This makes the training more difficult - 
## if we used all the points then there is not much for the network to do!
idx = np.random.choice(X_star.shape[0], N_u, replace=False)
X_u_train = X_star[idx,:]
u_train = u_star[idx,:]

In [None]:
## Make a plot to show the distribution of training data
plt.scatter(X_f_train[:,1], X_f_train[:,0], marker='x', color='red',alpha=0.1)
plt.scatter(X_u_train[:,1], X_u_train[:,0], marker='x', color='black')
# Column 1 (t) is on the horizontal axis and column 0 (x) is on the vertical axis
plt.xlabel('t')
plt.ylabel('x')
plt.title('Data points and collocation points (red crosses)')
plt.legend(['Collocation Points', 'Data Points'])
plt.show()

<hr>
<div style="background-color: #ccffcc; padding: 10px;">

**$u(x,t)$** can then be defined below as the function `net_u` and the physics-informed neural network **$f(x,t)$** is outlined in function `net_f`.


`net_u()` constructs a network that takes input $x,t$ and outputs the solution $u(x,t)$.
    
`net_f()` is the $f$ network, where the PDE is encoded.
    
1. We read in the value of $k$ first so that it can be included in the equations.
2. Then we evaluate $u$ for the $X_f$ input coordinates (collocation points).
3. Then we use PyTorch differentiation (autograd) to calculate the derivatives of the solution.
4. Finally, we encode the PDE in residual form, $f = u_t - k u_{xx}$, so that driving $f \to 0$ enforces the governing equation $u_t = k u_{xx}$.
    
</div>

In [None]:
def net_u(x, t, model):
    X = torch.cat([x, t], dim=1)
    u = model(X)
    return u

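def net_f(x, t, model, k):
    # Track gradients with respect to the input coordinates
    x.requires_grad_(True)
    t.requires_grad_(True)
    u = net_u(x, t, model)
    # First derivatives of u with respect to t and x
    u_t = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    # Second derivative of u with respect to x
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x), create_graph=True)[0]
    # Residual of the heat equation: f = u_t - k * u_xx
    f = u_t - k * u_xx
    return f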

# PyTorch module class and initialization

<div style="background-color: #ccffcc; padding: 10px;">

This code sets up a Physics-Informed Neural Network (PINN) for solving the 1D heat equation using PyTorch. Here's a summary of the key components:

1. `XavierInit` Class:

Custom layer initialization using Xavier initialization, which is designed to keep the scale of the gradients roughly the same in all layers. Initializes weights and biases for a layer.

2. `initialize_NN` Function:

Creates a list of layers using the XavierInit class. Takes a list of layer sizes and initializes each layer accordingly.

3. `NeuralNet()` constructs the network U, where X is a matrix containing the input coordinates, i.e. $x,t$. X is normalised so that all values lie between -1 and 1, which improves training. The layers are applied sequentially with the tanh activation function, except for the last layer.

Using the PyTorch module classes allows you to create more complex models controlling exactly how the data flows through the model [overview of PyTorch Modules here](https://www.learnpytorch.io/02_pytorch_classification/).

</div>
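
For comparison, a similar network can be assembled from PyTorch's built-in pieces. This is a sketch under stated assumptions rather than the implementation used in this notebook: `build_mlp` is a hypothetical helper, it relies on `nn.Linear` with `nn.init.xavier_normal_`, and it leaves out the scaling of the inputs to $[-1, 1]$.

```python
import torch
import torch.nn as nn

def build_mlp(layers):
    """Fully connected tanh network with Xavier-normal weights and zero biases."""
    modules = []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        linear = nn.Linear(n_in, n_out)
        nn.init.xavier_normal_(linear.weight)  # std = sqrt(2 / (n_in + n_out))
        nn.init.zeros_(linear.bias)
        modules += [linear, nn.Tanh()]
    return nn.Sequential(*modules[:-1])  # drop the activation after the output layer

net_demo = build_mlp([2, 20, 20, 1])
out_demo = net_demo(torch.rand(5, 2))
n_params_demo = sum(p.numel() for p in net_demo.parameters())
print(out_demo.shape)  # torch.Size([5, 1])
print(n_params_demo)
```

Writing the layers by hand, as done in the next cell, makes it easier to see exactly where the normalisation and the activation functions are applied.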

In [None]:
class XavierInit(nn.Module):
    def __init__(self, size):
        super(XavierInit, self).__init__()
        in_dim = size[0]
        out_dim = size[1]
        xavier_stddev = torch.sqrt(torch.tensor(2.0 / (in_dim + out_dim)))
        self.weight = nn.Parameter(torch.randn(in_dim, out_dim) * xavier_stddev)
        self.bias = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x):
        return torch.matmul(x, self.weight) + self.bias

def initialize_NN(layers):
    weights = nn.ModuleList()
    num_layers = len(layers)
    for l in range(num_layers - 1):
        layer = XavierInit(size=[layers[l], layers[l + 1]])
        weights.append(layer)
    return weights

class NeuralNet(nn.Module):
    def __init__(self, layers, lb, ub):
        super(NeuralNet, self).__init__()
        self.weights = initialize_NN(layers)
        self.lb = torch.tensor(lb)
        self.ub = torch.tensor(ub)

    def forward(self, X):
        H = 2.0 * (X - self.lb) / (self.ub - self.lb) - 1.0
        for l in range(len(self.weights) - 1):
            H = torch.tanh(self.weights[l](H.float()))
        Y = self.weights[-1](H)
        return Y

<div style="background-color: #ccffcc; padding: 10px;">

# Initialize the neural network 
    
Calling our PyTorch model passing in information about the neural network layers (`layers`) and bounds (`lb`, `ub`) initializes our model.


</div>

In [None]:
# Initialize network
model = NeuralNet(layers, lb, ub)

<div style="background-color: #cce5ff; padding: 10px;">

**Training might take a while depending on value of Train_iterations**

If you set Train_iterations too low the end results will be poor. 50000 was used to achieve excellent results. 

* If you are using a machine with GPUs please set `Train_iterations` to 50000 and this will run quickly.
* If you are using a well spec'ed laptop/computer, you can leave this setting `Train_iterations=50000` but it will take up to 10 mins.
* If you are using a low spec'ed laptop/computer or cannot leave the code running, `Train_iterations=20000` is the recommended value (this solution may not be accurate).
    
</div>

# Optimization

<div style="background-color: #ffffcc; padding: 10px;">

# Advanced 
    
    
Once you have run through the notebook once, you may wish to alter the optimizer used in the `train()` function to see how large an effect the choice of optimizer can have.

We've highlighted in the comments a number of possible optimizers to use from the [PyTorch Optimizers](https://pytorch.org/docs/stable/optim.html).

    
You can learn more about different optimization algorithms [here](https://towardsdatascience.com/optimizers-for-training-neural-network-59450d71caf6).
    
</div>
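
One way to experiment with optimizers is to choose the optimizer class by name. The sketch below is an assumption-laden illustration: `make_optimizer` is a hypothetical helper and the toy problem (minimising $(w-3)^2$) stands in for the PINN loss. All four classes are part of `torch.optim`.

```python
import torch

def make_optimizer(name, params, lr=0.001):
    """Return a torch.optim optimizer selected by name (hypothetical helper)."""
    choices = {
        "adam": torch.optim.Adam,
        "sgd": torch.optim.SGD,
        "rmsprop": torch.optim.RMSprop,
        "adagrad": torch.optim.Adagrad,
    }
    return choices[name](params, lr=lr)

# Toy problem: minimise (w - 3)^2 with each optimizer from the same start
results = {}
for name in ["adam", "sgd", "rmsprop", "adagrad"]:
    w = torch.nn.Parameter(torch.tensor(0.0))
    optimizer = make_optimizer(name, [w], lr=0.1)
    for _ in range(200):
        optimizer.zero_grad()
        loss = (w - 3.0) ** 2
        loss.backward()
        optimizer.step()
    results[name] = w.item()
print(results)
```

In `train()`, the equivalent change is to replace the `torch.optim.Adam(...)` line with another optimizer class. Note that `torch.optim.LBFGS` is also available but requires a closure to be passed to `optimizer.step`, so it is not a drop-in replacement.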



<div style="background-color: #ccffcc; padding: 10px;">

### Train our heat equation PINN

1. Initialize loss function and optimizer:

* criterion = `nn.MSELoss()`
* optimizer = `torch.optim.Adam(model.parameters(), lr=learning_rate)`

2. Prepare input data:

* Extract `x` and `t` from `X`.
* Convert `x`, `t`, and `u` to PyTorch tensors with `requires_grad=True`.

3. Training loop:

* Loop over the number of iterations (`nIter`).
* Zero the gradients: `optimizer.zero_grad()`.
* Predict `u` and the residual `f` using the model: `u_pred` and `f_pred`.
* Compute the loss:
    * `loss_PDE`: Mean Squared Error (MSE) between `f_pred` and zero.
    * `loss_data`: MSE between `u_tf` and `u_pred`.
    * Total loss: `loss = loss_PDE + 5*loss_data`.
* Backpropagate the loss: `loss.backward()`.
* Update the model parameters: `optimizer.step()`.

4. Print progress:

* Every 50 iterations, print the current iteration, loss, and elapsed time.

</div>

In [None]:
def train(nIter, X, u, k,  model, learning_rate=0.001):
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    start_time = time.time()
    x = X[:,0:1]
    t = X[:,1:2]
    
    x_tf = torch.tensor(x, requires_grad=True).float()
    t_tf = torch.tensor(t, requires_grad=True).float()
    u_tf = torch.tensor(u, requires_grad=True).float()

    for it in range(nIter):
        optimizer.zero_grad()
        u_pred = net_u(x_tf, t_tf, model )
        f_pred = net_f(x_tf, t_tf, model,  k)
    
        loss_PDE = criterion(f_pred, torch.zeros(f_pred.shape))
        loss_data = criterion(u_tf, u_pred)
        loss = loss_PDE + 5*loss_data
        
        loss.backward()
        optimizer.step()

        # Print
        if it % 50 == 0:
            elapsed = time.time() - start_time
            print('It: %d, Loss: %.3e, Time: %.2f' % 
                          (it, loss.item(), elapsed))
            start_time = time.time()

In [None]:
# Training
Train_iterations=50000

<div style="background-color: #cce5ff; padding: 10px;">

### Setting Up Tensors in PyTorch

In PyTorch, tensors are the primary data structure used for storing and manipulating data. Below is an expanded explanation of setting up tensors, including requiring gradients and setting data types.

**Requiring Gradients**:
- To enable automatic differentiation, set `requires_grad=True` when creating a tensor. This allows PyTorch to track operations on the tensor and compute gradients during backpropagation.

**Data Types**:
  - PyTorch supports various data types, such as `float32`, `float64`, `int32`, `int64`, etc.
  - The data type can be specified using the `dtype` argument when creating a tensor, as your data types must match when performing operations.
 

</div>
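
The short sketch below illustrates both points on a small array. The variable names are illustrative and not used elsewhere in the notebook.

```python
import numpy as np
import torch

x_np = np.linspace(0.0, 1.0, 5)[:, None]        # NumPy arrays default to float64
x64 = torch.tensor(x_np, requires_grad=True)     # dtype is inherited from NumPy
x32 = torch.tensor(x_np, dtype=torch.float32, requires_grad=True)

y = (x32 ** 2).sum()
y.backward()                                     # fills x32.grad with dy/dx = 2x
print(x64.dtype, x32.dtype)
print(x32.grad.flatten())

# Converting after creation returns a new, non-leaf tensor in the graph
x_converted = x64.float()
print(x_converted.is_leaf, x_converted.requires_grad)
```

Setting `dtype` when the tensor is created keeps it a leaf tensor, whereas calling `.float()` afterwards produces a derived tensor that still carries gradient information back to the original.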

In [None]:
x = X_u_train[:,0:1]
t = X_u_train[:,1:2]

x_tf = torch.tensor(x, requires_grad=True)
t_tf = torch.tensor(t, requires_grad=True)
u_tf = torch.tensor(u_train, requires_grad=True)

u_pred = net_u(x_tf, t_tf, model )
f_pred = net_f(x_tf, t_tf, model,k)
model = model.float()
x_tf = x_tf.float()
t_tf = t_tf.float()
u_tf = u_tf.float()


# Now you can call your train function
train(Train_iterations, X_u_train, u_train,k, model, learning_rate=0.001)

<div style="background-color: #ccffcc; padding: 10px;">

# Use trained model to predict from data sample
    
The function `predict` will predict `u` using the trained model.

</div>

In [None]:
def predict(model, x_star, t_star):
    """
    Predicts the solution u and the residual f of the 1D heat equation using the trained PINN model.

    Parameters:
    model (torch.nn.Module): The trained Physics-Informed Neural Network model.
    x_star (numpy.ndarray): Array of spatial points where predictions are to be made.
    t_star (numpy.ndarray): Array of temporal points where predictions are to be made.

    Returns:
    tuple: A tuple containing:
        - u_star (numpy.ndarray): Predicted solution at the given spatial and temporal points.
        - f_star (numpy.ndarray): Predicted residual of the PDE at the given spatial and temporal points.
    """
    # Convert input spatial points to a PyTorch tensor with gradient tracking enabled
    x_star_tf = torch.tensor(x_star, requires_grad=True)
    # Convert input temporal points to a PyTorch tensor with gradient tracking enabled
    t_star_tf = torch.tensor(t_star, requires_grad=True)
    # Predict the solution u using the model
    u_star = net_u(x_star_tf, t_star_tf, model)
    # Predict the residual f of the PDE using the model
    f_star = net_f(x_star_tf, t_star_tf, model,k)
     
    # Detach the predictions from the computation graph and convert to NumPy arrays
    return u_star.detach().numpy(), f_star.detach().numpy()

In [None]:
u_pred, f_pred = predict(model,X_star[:,0:1],X_star[:,1:2])

error_u = np.linalg.norm(u_star-u_pred,2)/np.linalg.norm(u_star,2)

<div style="background-color: #ccffcc; padding: 10px;">

# Calculate errors
    
If you have set the number of training iterations large enough then the errors should be relatively small. 
</div>
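
The error measure used here is the relative $L_2$ error, $\|u - u_{pred}\|_2 / \|u\|_2$. The sketch below shows it on a tiny made-up array, together with a pointwise percent error that avoids dividing by zero where the exact solution vanishes. Both helper names and the `eps` threshold are assumptions for illustration.

```python
import numpy as np

def relative_l2_error(u_true, u_pred):
    """||u_true - u_pred||_2 / ||u_true||_2 over all points."""
    return np.linalg.norm(u_true - u_pred, 2) / np.linalg.norm(u_true, 2)

def pointwise_percent_error(u_true, u_pred, eps=1e-12):
    """Percent error per point; entries where |u_true| <= eps are set to NaN."""
    u_true = np.asarray(u_true, dtype=float)
    err = np.abs(u_true - u_pred)
    out = np.full_like(u_true, np.nan)
    np.divide(100.0 * err, np.abs(u_true), out=out, where=np.abs(u_true) > eps)
    return out

u_true_demo = np.array([0.0, 1.0, 2.0])
u_pred_demo = np.array([0.1, 1.1, 1.8])
rel_demo = relative_l2_error(u_true_demo, u_pred_demo)
pct_demo = pointwise_percent_error(u_true_demo, u_pred_demo)
print(rel_demo, pct_demo)
```

The relative $L_2$ error summarises the whole field in one number, while the pointwise version is what a percent-error colour map displays.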

In [None]:
print("f_pred mean = ", np.mean(f_pred))
print('Error u: %e' % (error_u))
print('Percent error u: ',  100*error_u)

In [None]:
# Set grid values back to full data set size for plotting

t = data['t'].flatten()[:,None]
x = data['x'].flatten()[:,None]
X, T = np.meshgrid(x,t) 

U_pred = griddata(X_star, u_pred.flatten(), (X,T), method='cubic')
Error = np.abs(Exact - U_pred)
percentError = 100*np.divide(Error, Exact)

<div style="background-color: #ccffcc; padding: 10px;">

# Plot exact and predicted $u(x,t)$
    


</div>

In [None]:
fig, ax = plt.subplots(figsize=(15,15))
ax.axis('off')

print("--------- Errors ---------")
print('Percent error u: ',  100*error_u)
print("--------------------------")

####### Row 0: u(t,x) ##################
gs0 = gridspec.GridSpec(3, 2)
gs0.update(top=1-0.06, bottom=1-1/3, left=0.15, right=0.85, wspace=0, hspace=1)

########## Prediction ##################
ax = plt.subplot(gs0[0, :])
h = ax.imshow(U_pred.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(h, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Prediction', fontsize = 10)

########## Exact ##################
ax = plt.subplot(gs0[1, :])
i = ax.imshow(Exact.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(i, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Exact', fontsize = 10)

########## Error ##################
ax = plt.subplot(gs0[2, :])
j = ax.imshow(percentError.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 10)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(j, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Percent Error', fontsize = 10)

####### Row 1: u(t,x) slices ##################
fig, ax = plt.subplots(figsize=(15,15))
ax.axis('off')
gs1 = gridspec.GridSpec(1, 3)
gs1.update(top=1-1/3, bottom=0, left=0.1, right=0.9, wspace=0.5)

ax = plt.subplot(gs1[:, 0])
ax.plot(x,Exact[2,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[2,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.set_title('$t = ' + str(t[2,0]) + '$', fontsize = 10)
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])

ax = plt.subplot(gs1[:, 1])
ax.plot(x,Exact[5,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[5,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])
ax.set_title('$t = ' + str(t[5,0]) + '$', fontsize = 10)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.35), ncol=5, frameon=False)

ax = plt.subplot(gs1[:, 2])
ax.plot(x,Exact[10,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[10,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])
ax.set_title('$t = ' + str(t[10,0]) + '$', fontsize = 10)

<div style="background-color: #ccffcc; padding: 10px;">

**Results**

Above are the results of the PINN. The error for recreating the full solution field is $\approx 10 \%$, despite using only $N_u = 100$ data points. This shows the power of PINNs to learn from sparse measurements by augmenting the available observational data with knowledge of the underlying physics (i.e. governing equations). 

The three colour maps show the PINN prediction, the exact solution from the numerical method and the relative error between these two fields. We can see that the errors are largest near $t=0$ and $x=0$, but that overall the agreement is very good.

On the colour maps, we can see three vertical white lines, which show the location in time of the three profile plots of $u$ against $x$. The three heat profiles at these times are plotted against the exact solution found using numerical methods. The profiles are in very good agreement, although the agreement is worse where the errors noted above are largest (near $t=0$ and $x=0$).

**Further Work**

Congratulations, you have now trained your first physics-informed neural network!

This network contains a number of hyper-parameters that could be tuned to give better results. Various hyper-parameters include:
- number of data training points N_u,
- number of collocation training points N_f,
- number of layers in the network,
- number of neurons per layer,
- weightings for the data and PDE loss terms in the loss function (currently we use loss = loss_PDE + 5*loss_data).

It is also possible to use different sampling techniques for training data points. We randomly select $N_u$ data points, but alternative methods could be choosing only boundary points or choosing more points near the $t=0$ boundary. Choosing boundary points for training could help to reduce the errors seen in these regions.

Feel free to try out some of these changes if you like!

The next part of this notebook looks at the corresponding inverse problem for the 1D heat equation.

</div>

# 1D Heat Equation Inverse

<div style="background-color: #ccffcc; padding: 10px;">

Remembering that in 1D, the heat equation can be written as:

\begin{equation}
\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2 }
\end{equation}

where $k$ is a material parameter called the coefficient of thermal diffusivity. For this notebook, we have solved the above equation numerically on a domain of $x \in [0,1]$ and $t \in [0, 0.25]$. Solving this equation numerically gives us a spatiotemporal domain $(x,t)$ and corresponding values of the solution $u$.
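
As a quick sanity check on the role of $k$, consider a single sine mode. This is an illustrative solution chosen for convenience, not necessarily the one stored in the data file:

\begin{equation}
u(x,t) = e^{-k \pi^2 t} \sin(\pi x)
\quad \Rightarrow \quad
\frac{\partial u}{\partial t} = -k \pi^2 u ,
\qquad
\frac{\partial^2 u}{\partial x^2} = -\pi^2 u
\end{equation}

so $\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2}$ holds, and a larger $k$ makes the profile decay faster. This dependence of the solution on $k$ is what makes it possible to infer $k$ from observations of $u$.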

**Now we will let $k$ be an unknown input parameter in the PINN**. In reality, we know the value of $k$, as we set it when solving the system numerically, but for the sake of this example let's imagine we do not know the value of $k$ when we come to use the PINN. This corresponds to real-world problems where we may have observational data and knowledge of the governing equations, but little information about some of the system's input parameters.

The network architecture for this example is the same as for the previous [example](#1D-Heat-Equation-Forwards). The only difference is that this time we do not know the value of $k$, and so in each training iteration we update not only the network weights and biases, but also the value of $k$. Through training, the network will then optimise the value of $k$ so that it fits the observed data.
    
</div>
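
To make the idea of "updating $k$ alongside the weights" concrete, here is a minimal, self-contained sketch. It does not use the notebook's `NeuralNet`, `net_f` or `train` functions. The small network, the synthetic sine-mode data, the starting value of 0.5 and the short training loop are all assumptions made for illustration, and 200 steps are far too few for $k$ to converge.

```python
import math
import torch

torch.manual_seed(0)

# Synthetic observations from u = exp(-k*pi^2*t) * sin(pi*x) with k_true = 1
k_true = 1.0
x = torch.rand(200, 1)
t = 0.25 * torch.rand(200, 1)
u_obs = torch.exp(-k_true * math.pi**2 * t) * torch.sin(math.pi * x)

# Small fully connected network: inputs (x, t), output u
net = torch.nn.Sequential(
    torch.nn.Linear(2, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 20), torch.nn.Tanh(),
    torch.nn.Linear(20, 1),
)

# k is a trainable parameter, deliberately started away from k_true
k = torch.nn.Parameter(torch.tensor(0.5))

# The optimiser receives both the network parameters and k
optimizer = torch.optim.Adam(list(net.parameters()) + [k], lr=1e-3)

def pde_residual(x, t):
    """Residual f = u_t - k * u_xx, computed with automatic differentiation."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    return u_t - k * u_xx

for step in range(200):
    optimizer.zero_grad()
    loss_data = torch.mean((net(torch.cat([x, t], dim=1)) - u_obs) ** 2)
    loss_pde = torch.mean(pde_residual(x, t) ** 2)
    loss = loss_pde + 5 * loss_data   # same weighting as quoted in this notebook
    loss.backward()                   # gradients flow to the weights and to k
    optimizer.step()

print("k after a few steps:", k.item())
```

The key design choice is that `k` appears inside the PDE residual, so the PDE loss is the only term that provides a gradient for it; the data loss constrains the network output, and the residual then ties that output to a value of `k`.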


In [None]:
## This is the k value used to generate the data.
## We use this to compare to the value found by the PINN.
k_exact = 1

# Initialization (inverse problem)

<div style="background-color: #cce5ff; padding: 10px;">

Once you have run through the notebook once, you may wish to alter any of the following:
    
- number of data training points `N_u`,
- number of collocation training points `N_f`,
- number of layers in the network `layers`,
- number of neurons per layer `layers`.

</div>
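
The collocation points counted by `N_f` are drawn further down with the `lhs` function (Latin Hypercube sampling). As a hedged alternative, the sketch below does roughly the same thing with `scipy.stats.qmc`, which is available in SciPy 1.7 and later. The bounds are typed in by hand from the stated domain rather than read from the data file.

```python
import numpy as np
from scipy.stats import qmc

N_f = 2000
lb = np.array([0.0, 0.0])    # lower bounds of (x, t), taken from the stated domain
ub = np.array([1.0, 0.25])   # upper bounds of (x, t)

sampler = qmc.LatinHypercube(d=2)         # d=2 because we have two coordinates, x and t
unit_sample = sampler.random(n=N_f)       # points in the unit square
X_f_train = qmc.scale(unit_sample, lb, ub)  # rescale to the domain bounds

print(X_f_train.shape)
```

Latin Hypercube sampling spreads the points so that each coordinate is covered evenly, which tends to avoid the clusters and gaps that plain uniform random sampling can produce.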

In [None]:
N_u = 100 # number of data points (default 100)
N_f = 2000 # collocation points (default 2000)
# structure of network: two inputs (x,t) and one output u
# 8 fully connected layers with 20 nodes per layer
layers = [2, 20, 20, 20, 20, 20, 20, 20, 20, 1]

In [None]:
# This code is duplicated from above in case you have been playing with parameters.
data = scipy.io.loadmat("Data/heatEquation_data.mat")
t = data['t'].flatten()[:,None] # read in t and flatten into column vector
x = data['x'].flatten()[:,None] # read in x and flatten into column vector
# Exact is the reference solution to the problem, from the Matlab script provided
Exact = np.real(data['usol']).T # after the transpose, Exact has shape (nt, nx): one row per time step

# print("t = ", t.transpose())
# print("x = ", x.transpose())
print("usol shape = ", Exact.shape)

# We need to find all the x,t coordinate pairs in the domain
X, T = np.meshgrid(x,t)

# Flatten the coordinate grid into pairs of x,t coordinates
X_star = np.hstack((X.flatten()[:,None], T.flatten()[:,None])) # coordinates x,t
u_star = Exact.flatten()[:,None]   # corresponding solution value with each coordinate            

print("X has shape ", X.shape, ", X_star has shape ", X_star.shape)
    
# Domain bounds: minimum and maximum of each coordinate (x, t)
lb = X_star.min(0)
ub = X_star.max(0)  

print("Lower bounds of x,t: ", lb)
print("Upper bounds of x,t: ", ub)

## Use points from the whole domain (interior as well as boundaries) as candidate training data
X_u_train = X_star
u_train = u_star

## Generate collocation points using Latin Hypercube sampling within the bounds of the spatiotemporal coordinates
# Generate N_f x,t coordinates within range of upper and lower bounds
X_f_train = lb + (ub-lb)*lhs(2, N_f) # the 2 denotes the number of coordinates we have - x,t 

## In addition, we add the X_u_train coordinates (at this point, every grid point) to the X_f coordinate set
X_f_train = np.vstack((X_f_train, X_u_train)) # stack up all training x,t coordinates for u and f 

## We downsample the data to leave N_u randomly distributed points
## This makes the training more difficult - 
## if we used all the points then there would not be much for the network to do!
idx = np.random.choice(X_star.shape[0], N_u, replace=False)
X_u_train = X_star[idx,:]
u_train = u_star[idx,:]

<div style="background-color: #ccffcc; padding: 10px;">

Now we will use all the same functions as before, except we will modify `k` and the train function to handle a variable `k` value.
    
</div>

In [None]:
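# Initialise k as a trainable tensor (requires_grad=True).
# Note that it starts at k_exact here; try a different starting value to check that the PINN can recover k.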
k = k_exact
k = torch.tensor(float(k), requires_grad=True)
model = NeuralNet(layers,lb,ub)
x = X_u_train[:,0:1]
t = X_u_train[:,1:2]


x_t = torch.tensor(x, requires_grad=True)
t_t = torch.tensor(t, requires_grad=True)
u_t = torch.tensor(u_train, requires_grad=True)

u_pred = net_u(x_t, t_t, model)
f_pred = net_f(x_t, t_t, model,k)
model = model.float()
x_t = x_t.float()
t_t = t_t.float()
u_t = u_t.float()

<div style="background-color: #cce5ff; padding: 10px;">

**Training might take a while, depending on the value of `Train_iterations`**

If you set `Train_iterations` too low, the end results will be poor. 50000 iterations were used to achieve excellent results. 

* If you are using a machine with GPUs please set `Train_iterations` to 50000 and this will run quickly.
* If you are using a well spec'ed laptop/computer, you can leave this setting `Train_iterations=50000` but it will take up to 10 mins.
* If you are using a low spec'ed laptop/computer or cannot leave the code running, `Train_iterations=20000` is the recommended value (this solution may not be accurate).
    
</div>
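
To see which of these cases applies to you, you can ask PyTorch whether a CUDA GPU is visible. This sketch only reports the device. Whether the notebook's `train` function actually moves the model and tensors to the GPU depends on how it was written above, so treat this as a check rather than a speed-up.

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("PyTorch will use:", device)

if device.type == "cuda":
    print("GPU name:", torch.cuda.get_device_name(0))
```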

In [None]:
# Training
Train_iterations=50000
train(Train_iterations, X_u_train, u_train,k, model, learning_rate=0.001)

In [None]:
u_pred, f_pred = predict(model,X_star[:,0:1],X_star[:,1:2])
error_u = np.linalg.norm(u_star - u_pred, 2)/np.linalg.norm(u_star, 2)

In [None]:
print("f_pred mean = ", np.mean(f_pred))
print('Error u: %e' % (error_u))
print('Percent error u: ',  100*error_u)

In [None]:
# Set grid values back to full data set size for plotting

t = data['t'].flatten()[:,None]
x = data['x'].flatten()[:,None]
X, T = np.meshgrid(x,t) 

U_pred = griddata(X_star, u_pred.flatten(), (X,T), method='cubic')
Error = np.abs(Exact - U_pred)
percentError = 100*np.divide(Error, Exact)

In [None]:
fig, ax = plt.subplots(figsize=(15,15))
ax.axis('off')

print("--------- Errors ---------")
print('Percent error u: ',  100*error_u)
print("--------------------------")

####### Row 0: u(t,x) ##################
gs0 = gridspec.GridSpec(3, 2)
gs0.update(top=1-0.06, bottom=1-1/3, left=0.15, right=0.85, wspace=0, hspace=1)

########## Prediction ##################
ax = plt.subplot(gs0[0, :])
h = ax.imshow(U_pred.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(h, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Prediction', fontsize = 10)

########## Exact ##################
ax = plt.subplot(gs0[1, :])
i = ax.imshow(Exact.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 1)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(i, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Exact', fontsize = 10)

########## Error ##################
ax = plt.subplot(gs0[2, :])
j = ax.imshow(percentError.T, interpolation='nearest', cmap='rainbow',
              extent=[t.min(), t.max(), x.min(), x.max()],
              origin='lower', aspect='auto', vmin = 0, vmax = 10)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
fig.colorbar(j, cax=cax)

ax.plot(X_u_train[:,1], X_u_train[:,0], 'kx', label = 'Data (%d points)' % (u_train.shape[0]), markersize = 4, clip_on = False)

line = np.linspace(x.min(), x.max(), 2)[:,None]
ax.plot(t[2]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[5]*np.ones((2,1)), line, 'w-', linewidth = 1)
ax.plot(t[10]*np.ones((2,1)), line, 'w-', linewidth = 1)

ax.set_xlabel('$t$')
ax.set_ylabel('$x$')
ax.legend(frameon=False, loc = 'best')
ax.set_title('$u(t,x)$ - Percent Error', fontsize = 10)

####### Row 1: u(t,x) slices ##################
fig, ax = plt.subplots(figsize=(15,15))
ax.axis('off')
gs1 = gridspec.GridSpec(1, 3)
gs1.update(top=1-1/3, bottom=0, left=0.1, right=0.9, wspace=0.5)

ax = plt.subplot(gs1[:, 0])
ax.plot(x,Exact[2,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[2,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.set_title('$t = ' + str(t[2,0]) + '$', fontsize = 10)
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])

ax = plt.subplot(gs1[:, 1])
ax.plot(x,Exact[5,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[5,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])
ax.set_title('$t = ' + str(t[5,0]) + '$', fontsize = 10)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.35), ncol=5, frameon=False)

ax = plt.subplot(gs1[:, 2])
ax.plot(x,Exact[10,:], 'b-', linewidth = 2, label = 'Exact')
ax.plot(x,U_pred[10,:], 'r--', linewidth = 2, label = 'Prediction')
ax.set_xlabel('$x$')
ax.set_ylabel('$u(t,x)$')
ax.axis('square')
ax.set_xlim([0,1])
ax.set_ylim([0,1])
ax.set_title('$t = ' + str(t[10,0]) + '$', fontsize = 10)

<div style="background-color: #ccffcc; padding: 10px;">

**Results**

Above are the results of the PINN. The error for recreating the full solution field is $\approx 5 \%$, despite using only $N_u = 100$ data points. This shows the power of PINNs to learn solution fields from sparse measurements, even when some of the input parameters are unknown.

The three colour maps show the PINN prediction, the exact solution from the numerical method and the relative error between these two fields. We can see that the errors are largest near $t=0$ and $x=0$, but that overall the agreement is very good.

On the colour map, we can see three vertical white lines, which mark the times of the three profile plots of $u$ against $x$. The heat profile at each of these times is plotted against the exact solution found using numerical methods. The profiles are in very good agreement overall, with the remaining discrepancies consistent with the error map above.

</div>
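
For an inverse problem, the recovered parameter matters as much as the solution field. Assuming `k` is the trainable tensor passed to `train` and is updated in place (an assumption about `train`, which is defined earlier in the notebook), the comparison with `k_exact` could look like the sketch below. A stand-in tensor with a made-up value is used here so that the snippet runs on its own.

```python
import torch

k_exact = 1.0

# Stand-in for the trained parameter; in the notebook this would be `k` after training.
# The value 0.97 is made up purely for illustration.
k_learned = torch.tensor(0.97, requires_grad=True)

# detach() drops the autograd graph, item() converts to a plain Python float
k_value = k_learned.detach().item()
rel_error_k = abs(k_value - k_exact) / abs(k_exact)

print(f"k learned = {k_value:.4f}, k exact = {k_exact:.4f}")
print(f"Percent error in k: {100 * rel_error_k:.2f}")
```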

<div style="background-color: #ffffcc ; padding: 10px;">

**Further Work**

Congratulations, you have now trained your second physics-informed neural network!

This network contains a number of hyper-parameters that could be tuned to give better results. Various hyper-parameters include:
- number of data training points N_u,
- number of collocation training points N_f,
- number of layers in the network,
- number of neurons per layer,
- weightings for the data and PDE loss terms in the loss function (currently we use loss = loss_PDE + 5*loss_data),
- initialisation value for k,
- optimisation settings (for example the `learning_rate` passed to `train`).
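
The loss weighting in the list above can be exposed as a parameter so that it is easy to experiment with. Below is a minimal sketch: `weighted_loss`, `w_data` and `w_pde` are hypothetical names, mean squared error is assumed for both terms, and the tensors are random placeholders rather than network outputs.

```python
import torch

def weighted_loss(f_pred, u_pred, u_obs, w_data=5.0, w_pde=1.0):
    """Combine the PDE residual and the data misfit with adjustable weights."""
    loss_pde = torch.mean(f_pred ** 2)              # PDE residual should be zero
    loss_data = torch.mean((u_pred - u_obs) ** 2)   # misfit to the observations
    return w_pde * loss_pde + w_data * loss_data

torch.manual_seed(0)
f_pred = torch.randn(10, 1)   # placeholder PDE residuals
u_pred = torch.randn(10, 1)   # placeholder predictions
u_obs = torch.randn(10, 1)    # placeholder observations

default = weighted_loss(f_pred, u_pred, u_obs)               # loss_PDE + 5*loss_data
balanced = weighted_loss(f_pred, u_pred, u_obs, w_data=1.0)  # equal weighting
print(default.item(), balanced.item())
```

A larger `w_data` pulls the network towards the observations, while a larger `w_pde` enforces the governing equation more strongly; for the inverse problem this balance also affects how $k$ is driven during training.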

It is also possible to use different sampling techniques for training data points. We randomly select $N_u$ data points, but alternative methods could be choosing only boundary points or choosing more points near the $t=0$ boundary.

    
 </div>

<hr>


<div style="background-color: #e6ccff; padding: 10px;">
    
# Next Steps
    
In the next notebook we move on to a more complex example using the Navier-Stokes equations.
    
</div>