Copyright (c) 2026 [Vital de Nodrest]

The code (Python, shell and PowerShell cells) is under MIT license. Feel free to share and experiment! See LICENSE-CODE.txt in the project root for more information.

Due to their time-consuming and didactic nature, text & image contents are under CC BY-NC-SA 4.0 license. Removal of the author's name or redistribution without credit is prohibited. See LICENSE-DOCS.txt in the project root for more information.

# Introduction to PINNs for Helmholtz problems

This notebook is a basic introduction to physics-informed neural networks (PINNs), using Helmholtz wave propagation problems as the running example.

## PINNs

### Universal approximation theorem and PDE problems

### Automatic differentiation and physical losses

## Configuration

You can use any recent Python environment to run this notebook (ipykernel will be required for interactive computing).

The following cells are necessary regardless of your device.

In [None]:
pip install matplotlib

In [None]:
import matplotlib.pyplot as plt

The next subsections provide different configuration scripts depending on your device.

You can uncomment and run the ones you need.

### Apple Silicon, GPU

Configuration for Apple Silicon devices running on GPU (mps).

In [None]:
pip install torch torchvision

In [None]:

import torch

if torch.backends.mps.is_available() and torch.backends.mps.is_built():
    device = torch.device("mps")
    print("The MPS configuration worked.")
else:
    device = torch.device("cpu")
    print("MPS device not found. Performance might be impacted.")


### Linux, CUDA 12.6

Configuration for Linux devices equipped with CUDA 12.6.

In [None]:
#pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126

### Linux, ROCm 7.1

Configuration for Linux devices equipped with ROCm 7.1.

In [None]:
#pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm7.1

### Windows, CUDA 12.6

Configuration for Windows devices equipped with CUDA 12.6.

In [None]:
#pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126

### Windows, CPU

Configuration for Windows users choosing to run on CPU. Performance might be impacted.

In [None]:
#pip install torch torchvision

### Other devices

If your situation doesn't fit any of the subsections, see the [PyTorch installation tutorial](https://pytorch.org/get-started/locally/).
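
Whichever installation applies, the device can also be selected at runtime. The sketch below is one possible approach and is not part of the original configuration cells: the helper name `pick_device` is hypothetical, and the MPS check assumes a PyTorch version that exposes `torch.backends.mps`. ROCm builds of PyTorch are reached through the same `torch.cuda` interface.

```python
import torch


def pick_device() -> torch.device:
    """Return the first available accelerator, falling back to the CPU."""
    if torch.cuda.is_available():  # CUDA builds, and ROCm builds as well
        return torch.device("cuda")
    # Guard the attribute lookup in case the installed version predates MPS support
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(f"Selected device: {device}")
```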

## A 1D Helmholtz problem

## Implementation (1D)

This section walks through a PyTorch implementation of a PINN solver for the 1D Helmholtz problem.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np

In [None]:
# !!! Forces CPU execution, overriding any device selected in the Configuration section
device = torch.device('cpu')

### Neural network

Let's initialize our neural network.

A typical PINN takes the physical coordinates of the problem as an input and outputs the solution.
In this example, the input is **x** (**1D** space variable) and the output is **u** (**1D** scalar output).

There are many possible architectures for the neural network. The simplest choice is a fully-connected network with hidden layers of uniform width and tanh activation functions. By default, each neuron has one weight per input and one bias.

We cannot apply the tanh activation function after the last layer: its outputs lie strictly between $-1$ and $1$, whereas we need a solution that can reach $1$ and $-1$.
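
To make the parameter count and the input/output shapes concrete, here is a minimal sketch that builds the same kind of network by hand. The names `net`, `n_params`, `x_demo` and `u_demo` are illustrative and do not appear elsewhere in the notebook; the sketch assumes one input, one output and three hidden layers of width 50.

```python
import torch
import torch.nn as nn

width = 50
net = nn.Sequential(
    nn.Linear(1, width), nn.Tanh(),      # input -> hidden layer 1
    nn.Linear(width, width), nn.Tanh(),  # hidden layer 1 -> hidden layer 2
    nn.Linear(width, width), nn.Tanh(),  # hidden layer 2 -> hidden layer 3
    nn.Linear(width, 1),                 # no activation: the output is not confined to the range of tanh
)

# Each Linear(n_in, n_out) holds n_in * n_out weights and n_out biases:
# 100 + 2550 + 2550 + 51 = 5251 trainable parameters
n_params = sum(p.numel() for p in net.parameters())
print(n_params)

# A batch of 8 points in 1D gives a batch of 8 scalar outputs
x_demo = torch.linspace(0, 1, 8).unsqueeze(1)
u_demo = net(x_demo)
print(x_demo.shape, u_demo.shape)
```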

TODO plot

In [None]:
class FNN(nn.Module):
    def __init__(self, input_dim: int=1, output_dim: int=1, width: int=20, hidden_layers: int=3):
        super().__init__()
        #self.flatten = nn.Flatten()
        layers = [nn.Linear(input_dim, width), nn.Tanh()] # input -> hidden layer 1
        # hidden layer i -> hidden layer i+1
        for _ in range(hidden_layers-1):
            layers.append(nn.Linear(width, width))
            layers.append(nn.Tanh())
        layers.append(nn.Linear(width, output_dim)) # last hidden layer -> output
        self.stack = nn.Sequential(*layers)
    
    def forward(self, x):
        logits = self.stack(x)
        return logits

In [None]:
model = FNN(width=50)
model.to(device)

Let's optimize our model parameters $\theta$ (weights and biases) iteratively using the Adam algorithm with a learning rate of $0.001$ to minimize the mean-squared-error loss function:

$$L_{\text{MSE}} \left(\theta, (x_i)_{i=1}^N, (y_i)_{i=1}^N \right) = \frac{1}{N} \sum_{i=1}^{N} \left( f_{\theta}(x_i) - y_i \right)^2 $$

where $f_{\theta}$ is the model parametrized by the vector $\theta$, evaluated over $N$ training samples:
- Training points $(x_i)_{i=1}^N$
- Respective solutions $(y_i)_{i=1}^N$
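
As a quick check of the formula, the following sketch compares a hand-written mean of squared errors with `torch.nn.MSELoss` on made-up values (the tensors `predictions` and `targets` are purely illustrative).

```python
import torch

predictions = torch.tensor([[1.0], [2.0], [4.0]])  # plays the role of f_theta(x_i)
targets = torch.tensor([[1.0], [0.0], [1.0]])      # plays the role of y_i

# (0^2 + 2^2 + 3^2) / 3 = 13 / 3
manual_mse = ((predictions - targets) ** 2).mean()
builtin_mse = torch.nn.MSELoss()(predictions, targets)

print(manual_mse.item(), builtin_mse.item())
```

In the training loop of this notebook the targets are zero tensors, since each residual should vanish.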

In [None]:
# Optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Loss
loss = torch.nn.MSELoss() # the default reduction is "mean", dividing the loss by the number of samples

### PDE problem

Let's implement our PDE problem in Python.
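
Reading off the residual functions and the reference solution defined in the cells of this section, the problem being solved appears to be the following boundary value problem (this statement is inferred from the code, not quoted from a written formulation):

$$
\begin{cases}
u''(x) + k^2 u(x) = 0, & x \in \, ]0, 1[ \\
u'(0) = 0 \\
u'(1) = -k \sin(k)
\end{cases}
$$

With $u(x) = \cos(kx)$ we get $u'(x) = -k \sin(kx)$ and $u''(x) = -k^2 \cos(kx)$, so that $u'' + k^2 u = 0$, $u'(0) = 0$ and $u'(1) = -k \sin(k)$: the three residuals vanish for this function.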

In [None]:
k = 5

In [None]:
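def residual_pde(X, U):
    # First derivative du/dx. Each output u_i depends only on its own input x_i, so a
    # vector-Jacobian product with a vector of ones returns the pointwise derivatives of the batch.
    du_dx = torch.autograd.grad(outputs=U,
                                inputs=X,
                                grad_outputs=torch.ones_like(U),
                                create_graph=True, # keep the graph so du/dx can be differentiated again
                                retain_graph=True,
                                )[0]
    # Second derivative d2u/dx2, kept differentiable so the loss can be backpropagated to the parameters
    du_dxx = torch.autograd.grad(outputs=du_dx,
                                 inputs=X,
                                 grad_outputs=torch.ones_like(du_dx),
                                 create_graph=True,
    )[0]
    # Helmholtz residual: u'' + k^2 u
    return du_dxx + k**2 * U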

In [None]:
def residual_0(X, U):
    du_dx = torch.autograd.grad(outputs=U,
                                inputs=X,
                                grad_outputs=torch.ones_like(U), # Shape information for batches
                                create_graph=True,
                                retain_graph=True,
                                )[0]
    return du_dx

In [None]:
def residual_1(X, U):
    du_dx = torch.autograd.grad(outputs=U,
                                inputs=X,
                                grad_outputs=torch.ones_like(U), # Shape information for batches
                                create_graph=True,
                                retain_graph=True,
                                )[0]
    return du_dx + k * torch.sin(k * torch.ones_like(X))

In [None]:
def u(x):
    return torch.cos(k*x)
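
A useful sanity check is to confirm that the reference solution makes all three residuals vanish. The sketch below is self-contained and recomputes the derivatives with `torch.autograd.grad`; the helper `derivatives` and the variable names are hypothetical, and double precision is used only to keep round-off small.

```python
import math

import torch

k = 5  # same wavenumber as in the notebook


def derivatives(U, X):
    """Return (du/dx, d2u/dx2) for pointwise outputs U evaluated at inputs X."""
    dU = torch.autograd.grad(U, X, grad_outputs=torch.ones_like(U), create_graph=True)[0]
    d2U = torch.autograd.grad(dU, X, grad_outputs=torch.ones_like(dU))[0]
    return dU, d2U


X = torch.linspace(0, 1, 11, dtype=torch.float64).unsqueeze(1).requires_grad_(True)
U = torch.cos(k * X)  # reference solution
dU, d2U = derivatives(U, X)

pde_residual = d2U + k**2 * U             # interior equation
left_residual = dU[0]                     # condition at x = 0
right_residual = dU[-1] + k * math.sin(k) # condition at x = 1

print(pde_residual.abs().max().item(), left_residual.item(), right_residual.item())
```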

### Data

In [None]:
### Training
X_train_pde = torch.linspace(0, 1, 10 + 2, dtype=torch.float32, device=device)[1: -1].unsqueeze_(1)
print(X_train_pde)
X_train_0 = torch.tensor([0], dtype=torch.float32, device=device).unsqueeze_(1) # left boundary
X_train_1 = torch.tensor([1], dtype=torch.float32, device=device).unsqueeze_(1) # right boundary
### Validation
X_validation_pde = torch.linspace(0.01, 0.99, 100, dtype=torch.float32, device=device).unsqueeze_(1)
X_validation_0 = torch.tensor([0], dtype=torch.float32, device=device).unsqueeze_(1)
X_validation_1 = torch.tensor([1], dtype=torch.float32, device=device).unsqueeze_(1)
### Test
X_test = torch.linspace(0, 1, 1000, dtype=torch.float32, device=device).unsqueeze_(1)
U_test_exact = u(X_test)
U_test_exact_norm = torch.linalg.vector_norm(U_test_exact)

### Training loop
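
The loop in this section calls `backward()` once per loss term and lets the gradients accumulate before a single optimizer step. The minimal sketch below, on a toy linear model with made-up losses (`loss_a` and `loss_b` are hypothetical), suggests why this is equivalent to backpropagating the sum of the losses.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Linear(1, 1)
x = torch.tensor([[0.5]])


def loss_a():
    return (net(x) ** 2).mean()


def loss_b():
    return ((net(x) - 1.0) ** 2).mean()


# Separate backward calls: gradients are summed into .grad
net.zero_grad()
loss_a().backward()
loss_b().backward()
grad_accumulated = net.weight.grad.clone()

# Single backward call on the summed loss
net.zero_grad()
(loss_a() + loss_b()).backward()
grad_of_sum = net.weight.grad.clone()

print(torch.allclose(grad_accumulated, grad_of_sum))
```

Each loss uses its own forward pass here, as in the loop, so every `backward()` call works on its own graph.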

In [None]:
# LOOP

num_epochs = 5000 # !!!
train_losses = torch.zeros((3, num_epochs))
validation_losses = torch.zeros((4, num_epochs))
test_metrics = torch.zeros((num_epochs))

best_model_state = None
best_validation_loss = torch.inf

for epoch in range(num_epochs):
    
    ## Training phase
    
    ### Configuration
    model.train()
    optimizer.zero_grad() # reset previously accumulated gradients
    
    ### Optional resampling
    #X_train_pde = torch.rand(size=(10, 1), dtype=torch.float32, device=device)
    
    ### Enable automatic differentiation with respect to the inputs (needed to compute the residuals)
    X_train_pde.requires_grad_(True)
    X_train_0.requires_grad_(True)
    X_train_1.requires_grad_(True)
    
    ### Evaluate the model
    U_train_pde = model(X_train_pde)
    U_train_0 = model(X_train_0)
    U_train_1 = model(X_train_1)
    
    ### Compute residuals
    RES_train_pde = residual_pde(X_train_pde, U_train_pde)
    RES_train_0 = residual_0(X_train_0, U_train_0)
    RES_train_1 = residual_1(X_train_1, U_train_1)
    
    ### Evaluate losses and accumulate the parameter derivatives in place
    train_loss_pde: torch.Tensor = loss(RES_train_pde, torch.zeros_like(RES_train_pde))
    train_loss_pde.backward()
    train_losses[0, epoch] = train_loss_pde.tolist()
    
    train_loss_0 = loss(RES_train_0, torch.zeros_like(RES_train_0))
    train_loss_0.backward()
    train_losses[1, epoch] = train_loss_0.tolist()
    
    train_loss_1 = loss(RES_train_1, torch.zeros_like(RES_train_1))
    train_loss_1.backward()
    train_losses[2, epoch] = train_loss_1.tolist()
    
    ### Perform optimizer step
    optimizer.step()
        
        
    ## Validation phase
    
    ### Configuration
    model.eval()
    
    # torch.no_grad() cannot be used here because computing the residuals
    # requires automatic differentiation with respect to the inputs.
    
    ### Enable automatic differentiation with respect to the inputs (needed to compute the residuals)
    X_validation_pde.requires_grad_(True)
    X_validation_0.requires_grad_(True)
    X_validation_1.requires_grad_(True)
        
    ### Evaluate the model
    U_validation_pde = model(X_validation_pde)
    U_validation_0 = model(X_validation_0)
    U_validation_1 = model(X_validation_1)
    
    ### Compute residuals
    RES_validation_pde = residual_pde(X_validation_pde, U_validation_pde)
    RES_validation_0 = residual_0(X_validation_0, U_validation_0)
    RES_validation_1 = residual_1(X_validation_1, U_validation_1)
    
    ### Evaluate loss
    validation_loss_total = 0
    
    validation_loss_pde: torch.Tensor = loss(RES_validation_pde, torch.zeros_like(RES_validation_pde))
    validation_losses[0, epoch] = validation_loss_pde.tolist()
    validation_loss_total += validation_loss_pde.tolist()
    
    validation_loss_0: torch.Tensor = loss(RES_validation_0, torch.zeros_like(RES_validation_0))
    validation_losses[1, epoch] = validation_loss_0.tolist()
    validation_loss_total += validation_loss_0.tolist()
    
    validation_loss_1: torch.Tensor = loss(RES_validation_1, torch.zeros_like(RES_validation_1))
    validation_losses[2, epoch] = validation_loss_1.tolist()
    validation_loss_total += validation_loss_1.tolist()
    
    
    validation_losses[3, epoch] = validation_loss_total
        
    ### Save the best model
    if validation_loss_total < best_validation_loss:
        best_validation_loss = validation_loss_total
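        # Clone every tensor: state_dict().copy() is shallow and would keep tracking the live parameters
        best_model_state = {name: tensor.detach().clone() for name, tensor in model.state_dict().items()}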
        

    ## Test phase (optional in the loop)
    
    with torch.no_grad():
        U_test = model(X_test)
        test_metrics[epoch] = (torch.linalg.vector_norm(U_test - U_test_exact) / U_test_exact_norm)
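
When keeping a snapshot of the best parameters, the snapshot has to own its tensors, because the optimizer updates the parameters in place. The sketch below illustrates the difference between a shallow and a deep copy of a `state_dict` on a toy model; the names `shallow`, `deep`, `shallow_tracks` and `deep_tracks` are illustrative.

```python
import copy

import torch
import torch.nn as nn

net = nn.Linear(1, 1)

shallow = net.state_dict().copy()       # new dict, same underlying tensors
deep = copy.deepcopy(net.state_dict())  # new dict, independent tensors

# Mimic an in-place parameter update, as an optimizer step would do
with torch.no_grad():
    net.weight.add_(1.0)

shallow_tracks = torch.equal(shallow["weight"], net.weight)  # follows the update
deep_tracks = torch.equal(deep["weight"], net.weight)        # keeps the old value

print(shallow_tracks, deep_tracks)
```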

### Displaying results

In [None]:
# Training history
fig1, ax1 = plt.subplots()
ax1.set_yscale("log")
ax1.plot(train_losses[0, :], label=']0,1[ residual')
ax1.plot(train_losses[1, :], label='0 residual')
ax1.plot(train_losses[2, :], label='1 residual')
ax1.plot(torch.sum(train_losses, dim=0), label='total residual')
ax1.set_xlabel('epoch')
ax1.set_title('Training losses')
ax1.legend()


In [None]:
fig2, ax2 = plt.subplots()
ax2.set_yscale("log")
ax2.plot(validation_losses[0, :], label=']0,1[ residual')
ax2.plot(validation_losses[1, :], label='0 residual')
ax2.plot(validation_losses[2, :], label='1 residual')
ax2.plot(validation_losses[3, :], label='total residual')
ax2.legend()
ax2.set_title('Validation losses')


In [None]:

fig3, ax3 = plt.subplots()
ax3.set_yscale("log")
ax3.plot(test_metrics)


In [None]:

## Last model
fig4, ax4 = plt.subplots()
ax4.plot(X_test, U_test_exact)
ax4.plot(X_test, U_test)

u_min = torch.min(torch.min(U_test_exact), torch.min(U_test))
u_max = torch.max(torch.max(U_test_exact), torch.max(U_test))
for x in X_train_pde.detach().tolist() + X_train_0.detach().tolist() + X_train_1.detach().tolist():
    ax4.plot([x]*2, [u_min, u_max], color='grey', linewidth=.3)

last_str = f"Last model test metric: {test_metrics[-1].tolist()}"
print(last_str)


In [None]:

## Best model
with torch.no_grad():
    model.load_state_dict(best_model_state)
    U_test = model(X_test)
    best_test_metric = (torch.linalg.vector_norm(U_test - U_test_exact) / U_test_exact_norm)
fig5, ax5 = plt.subplots()
ax5.plot(X_test, U_test_exact)
ax5.plot(X_test, U_test)
best_str = f"Best model test metric: {best_test_metric.tolist()}"
print(best_str)

## Further research...