# Hands-on Tutorial: AI Fundamentals for Research (joint session BP/TUT/DY/AKPIK)
Sunday, 16th March 2025, 16:00–18:15, H2<br>
Presented at the **[2025 DPG Spring Meeting of the Condensed Matter Section](https://www.dpg-verhandlungen.de/year/2025/conference/regensburg/part/tut/session/1)**

## Hands-On Session 1: Function Approximation
#### •Jan Bürger<sup>1</sup>, Janine Graser<sup>2</sup>, Robin Msiska<sup>2,3</sup>, and Arash Rahimi-Iman<sup>4</sup>

<sup>1</sup>ErUM-Data-Hub, RWTH Aachen University, Aachen, Germany<br>
<sup>2</sup>Faculty of Physics and Center for Nanointegration Duisburg-Essen (CENIDE), University of Duisburg-Essen, Duisburg, Germany<br>
<sup>3</sup>Department of Solid State Sciences, Ghent University, Ghent, Belgium<br>
<sup>4</sup>I. Physikalisches Institut and Center for Materials Research, Justus-Liebig-University Gießen, Gießen, Germany

### Abstract
*In the first half of the interactive session, participants will work with Jupyter Notebooks to explore practical applications of machine learning. They will train simple neural networks to predict a mathematical function, gaining hands-on experience in tuning key parameters. Since neural networks can typically be considered universal function approximators, this concept is effectively illustrated using a one-dimensional function, making it easy to visualize and understand.*

### About this notebook
#### **Presenter: Jan Bürger**

The first part of this tutorial is based on material from a tutorial by Dr. Andrea Santamaria Garcia (Karlsruhe Institute of Technology) and Chenran Xu (Karlsruhe Institute of Technology) held at the [Deep Learning School](https://indico.desy.de/event/40559/timetable/) 'Basic Concepts' of the [ErUM-Data-Hub](https://erumdatahub.de).


## Motivation - Why fit a 1D mathematical function?
- Neural networks can be fitted to abstract functions
- To keep things simple, we use a 1D mathematical function to understand the basics of neural networks (NNs)

Imports and modules

In [None]:
# For automatic execution time tracking
try:
    %load_ext autotime
except:
    !pip install ipython-autotime
    %load_ext autotime

In [None]:
%matplotlib inline
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import h5py
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact, IntSlider, FloatSlider

plt.rcParams['figure.figsize'] = 6, 4
plt.rcParams['savefig.dpi'] = 300
plt.rcParams['image.cmap'] = "viridis"
plt.rcParams['image.interpolation'] = "none"
plt.rcParams['savefig.bbox'] = "tight"

## Reproducibility

- We set the random seeds so that the training results are always the same
- Feel free to change the seed number to see the effects of the random initialization of the network weights on the training results

In [None]:
SEED = 26
torch.manual_seed(SEED)
torch.backends.cudnn.deterministic = True  # request deterministic cuDNN kernels (only relevant on a GPU)
np.random.seed(SEED)
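As a quick check of what seeding does, the sketch below creates the same small layer twice with one seed and once with another. The helper `init_weights` and the layer size are illustrative and not part of the notebook.

```python
import torch
import torch.nn as nn

def init_weights(seed):
    """Return the freshly initialized weights of a small layer for a given seed."""
    torch.manual_seed(seed)
    return nn.Linear(1, 5).weight.detach().clone()

# Same seed -> identical initial weights; different seed -> different weights
same_seed_equal = torch.equal(init_weights(26), init_weights(26))
other_seed_equal = torch.equal(init_weights(26), init_weights(27))
print(same_seed_equal, other_seed_equal)
```

If you run this inside the notebook, re-run the seed cell afterwards, since the sketch resets the global seed.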

## Accelerated computing

- **Accelerated computing** = adding extra hardware, such as GPUs, to speed up computation (needed in deep learning).
- **GPU**: many "not-so-intelligent" cores that work in parallel. They can carry out specific operations very efficiently, e.g. tensor cores perform very efficient matrix multiplications.

We will be working with torch tensors in this notebook instead of the usual numpy arrays. This means you can execute this code on a GPU, if you have access to one, by selecting it with `torch.device("cuda")` and moving the model and the data to that device.

In [None]:
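# Set the GPU device
myGPU = 0
device = torch.device(f'cuda:{myGPU}' if torch.cuda.is_available() else 'cpu')

# Set the current device (only meaningful for CUDA; calling this with a CPU device raises an error)
if device.type == 'cuda':
    torch.cuda.set_device(device)

# Note: CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA is first initialized
import os
os.environ["CUDA_VISIBLE_DEVICES"] = str(myGPU)

print(f'Using {device} device')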

## Conventions for this notebook

### Jargon

- Unit = activation = neuron
- Model = neural network
- Feature = dimension of input vector = number of independent variables
- Hypothesis = prediction = output of the model

### Indices

- **Data points:** $i = 1,..., n$ 
- **Parameters of the model:** $k = 1,..., p$ 
- **Layers:** $j = 1,..., l$ 
- **Activation unit label:** $s$ 

### Scalars

- $u^j$ = number of units in layer $j$
- $z_s^j$ is the activation unit $s$ in layer $j$

### Vectors and matrices

- $\pmb{X}$: input vector of dimension $[n \times (p \times 1)]$
- $z^j$: activation vector of layer $j$ of dimension $[(u^j + 1) \times 1]$
- $\pmb{w}^j$: weight matrix from layer $j$ to $j+1$, of dimension $[u^{j+1} \times (u^j + 1)]$


<span style='color:Blue'> where the $+1$ accounts for the bias unit </span>


$$
\pmb{X} =
\begin{bmatrix}
x_0 \\
x_1 \\
\vdots \\
x_p
\end{bmatrix} \ \ ; \ \
\pmb{w}^j =
\begin{bmatrix}
w_{10} & \dots & w_{1 u^j}\\
w_{20} & \ddots\\
\vdots \\
w_{u^{j+1} 0} & & w_{u^{j+1} u^j}\\
\end{bmatrix} 
$$

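## Universal Approximation Theorem (see [here](https://en.wikipedia.org/wiki/Universal_approximation_theorem))
- With a suitable non-linear activation function (such as Tanh or ReLU), a two-layer neural network (one hidden layer plus an output layer) with enough hidden units can be proven to be a **universal function approximator**: it can approximate any continuous function on a bounded interval to arbitrary accuracy.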
- <span style='color:red'> This is where the power of neural networks comes from! </span>
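To make the statement concrete, here is a minimal NumPy sketch in which the weights of a one-hidden-layer ReLU network are set by hand (not trained) so that the network reproduces the piecewise-linear interpolant of $\sin(x)$. The target function, the number of hidden units and all names are illustrative assumptions; adding more hidden units shrinks the error further.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

target = np.sin
knots = np.linspace(0.0, 2 * np.pi, 21)      # 21 knots -> 20 linear segments
values = target(knots)
slopes = np.diff(values) / np.diff(knots)    # slope of each segment

# Hidden unit k switches on at knots[k]; its output weight is the change in slope there
coeffs = np.diff(slopes, prepend=0.0)        # coeffs[0] = slopes[0]

def network(x):
    hidden = relu(x[:, None] - knots[None, :-1])   # hidden layer, shape (n, 20)
    return values[0] + hidden @ coeffs             # linear output layer with bias

x = np.linspace(0.0, 2 * np.pi, 1000)
max_error = np.max(np.abs(network(x) - target(x)))
print(f"max error with 20 hidden units: {max_error:.4f}")
```

With 20 hidden units the maximum error is on the order of $10^{-2}$.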

## Create a function to fit
Let's create a simple non-linear function to fit with our neural network:

In [None]:
sample_points = 1e3
x_lim = 100
x = np.linspace(0, x_lim, int(sample_points))
y = np.sin(x * x_lim * 1e-4) * np.cos(x * x_lim * 1e-3) * 3
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.grid()
plt.title('Function to be fitted')

## Data shape
- Our data is 1D, meaning it has only one feature.
- We want a model that returns the corresponding $y$ value for a given $x$.
- This means that a model with one input neuron and one output neuron suffices:

In [None]:
n_input = 1
n_out = 1

In [None]:
print(len(x))
print(len(y))

### Data reshaping 
In order for the model to take each point of the data one by one we need to do some additional re-shaping, where we introduce an additional dimension for each entry:

In [None]:
x_reshape = x.reshape((int(len(x) / n_input), n_input))
y_reshape = y.reshape((int(len(y) / n_out), n_out))

In [None]:
# Check the shape change
print(x.shape, y.shape)
print(x_reshape.shape, y_reshape.shape)
print(x[10], x_reshape[10])

In [None]:
# print(x_reshape)
# print(x)
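The reshaping step can be written more compactly with `reshape(-1, n_input)`, where `-1` lets NumPy infer the number of rows. The sketch below is a self-contained illustration using the same sizes as above.

```python
import numpy as np

x = np.linspace(0, 100, 1000)        # shape (1000,): 1000 scalars
n_input = 1                          # one feature per data point

x_reshape = x.reshape(-1, n_input)   # shape (1000, 1): 1000 data points with 1 feature each
print(x.shape, x_reshape.shape)
print(x[10], x_reshape[10])          # same value, once as a scalar and once as a length-1 row
```

The leading dimension counts data points and the trailing one counts features, which is the layout `nn.Linear` expects.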

<h2>PyTorch</h2>

<a href="https://pytorch.org/">PyTorch</a> is an optimized tensor library for deep learning using GPUs and CPUs.
- A <span style='color:#b51f2a'> **tensor** </span> is an algebraic object that may map between different objects such as vectors, scalars, and even other tensors. It can be easily understood as a multidimensional matrix/array. 
    - These objects make it easy to carry out machine learning computations in problems with many features, weights, etc.
    - In PyTorch, a <a href="https://pytorch.org/docs/stable/tensors.html#:~:text=A%20torch.,of%20a%20single%20data%20type.">tensor</a> is a multi-dimensional matrix containing elements of a single data type.

<img src="img/tensor_2.jpeg" style="width:40%; margin:auto;" />
<p style="clear:both; font-size: small; text-align: center; margin-top:1em;">Image from <a href="https://towardsai.net/p/deep-learning/working-with-pytorch-tensors">Working with PyTorch tensors</a></p>

## Data type
The data that we will input to the model needs to be of the type `torch.float32`

_Side Remark_: The default dtype of torch tensors (also the layer parameters) is `torch.float32`, which is related to the GPU performance optimization. If one wants to use `torch.float64`/`torch.double` instead, one can set the tensors to double precision via `v = v.double()` or set the global precision via `torch.set_default_dtype(torch.float64)`. Just keep in mind, the NN parameters and the input tensors should have the same precision.

Before starting, let's convert our data numpy arrays to torch tensors:

In [None]:
# convert numpy to torch tensor
x_torch = torch.from_numpy(x_reshape)
y_torch = torch.from_numpy(y_reshape)

In [None]:
# Type checking:
print(x.dtype, y.dtype)
print(x_torch.dtype, y_torch.dtype)

### Convert data type
The type is still not correct, but we can easily convert it:

In [None]:
# convert from float64 to float32
x_torch = x_torch.to(dtype=torch.float32)
y_torch = y_torch.to(dtype=torch.float32)

In [None]:
# Type checking:
print(x.dtype, y.dtype)
print(x_torch.dtype, y_torch.dtype)
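The sketch below illustrates why the conversion matters, assuming the default dtype has not been changed: a layer with `float32` parameters rejects a `float64` input and accepts it after the cast. The tiny layer and input are illustrative only.

```python
import numpy as np
import torch
import torch.nn as nn

# torch.from_numpy keeps NumPy's default float64
x64 = torch.from_numpy(np.linspace(0.0, 1.0, 4).reshape(-1, 1))
layer = nn.Linear(1, 1)              # parameters are float32 by default

try:
    layer(x64)                       # float64 input vs. float32 weights
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

out = layer(x64.to(dtype=torch.float32))   # works after the conversion
print(x64.dtype, layer.weight.dtype, mismatch_raised, out.dtype)
```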

In [None]:
plt.plot(x_torch.numpy(), y_torch.numpy())

## Data normalization
We will also need to normalize the data to make sure we are in the non-linear region of the activation functions:

<img src="img/activation-functions.png" width="60%"/>


### Calculate the data normalization
We are using min-max normalization to normalize the input tensors to [0,1] and output tensors to [-0.5,0.5]

In [None]:
x_norm = (x_torch - x_torch.min()) / (x_torch.max() - x_torch.min())
y_norm = (y_torch - y_torch.min()) / (y_torch.max() - y_torch.min()) - 0.5

In [None]:
plt.plot(x_norm.detach().numpy(), y_norm.detach().numpy())
plt.xlabel('x')
plt.ylabel('y')
plt.grid()
plt.title('Normalized function')

In [None]:
x_norm.shape
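To read model predictions in the original units, the normalization has to be inverted. The following sketch stores the minimum and maximum and maps values back; the helper names `minmax_normalize` and `minmax_denormalize` and the test signal are hypothetical and not part of the notebook.

```python
import numpy as np

def minmax_normalize(v, shift=0.0):
    """Scale v to [0, 1] and subtract an optional shift; also return the bounds."""
    v_min, v_max = v.min(), v.max()
    return (v - v_min) / (v_max - v_min) - shift, (v_min, v_max)

def minmax_denormalize(v_norm, bounds, shift=0.0):
    """Invert minmax_normalize using the stored bounds."""
    v_min, v_max = bounds
    return (v_norm + shift) * (v_max - v_min) + v_min

y = 3 * np.sin(np.linspace(0, 10, 200))
y_norm, y_bounds = minmax_normalize(y, shift=0.5)        # range [-0.5, 0.5]
y_back = minmax_denormalize(y_norm, y_bounds, shift=0.5)
print(y_norm.min(), y_norm.max(), np.allclose(y_back, y))
```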

## Build your model
- In PyTorch [`Sequential`](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html#torch.nn.Sequential) stands for *sequential container*, where modules can be added sequentially and are connected in a cascading way. The output for each module is forwarded sequentially to the next.
- Now we will build a simple model with one hidden layer with `Sequential`
- Remember that each hidden layer in a neural network is followed by an **activation layer** that applies a non-linear function to the output of every neuron.

<img src="./img/sequential.png" alt="drawing" style="width:400px;"/>
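As a small illustration of what "forwarded sequentially" means, the sketch below compares a `Sequential` container with calling the same modules by hand, one after the other. The layer sizes and the random input are arbitrary choices for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear1 = nn.Linear(1, 5)
activation = nn.LeakyReLU()
linear2 = nn.Linear(5, 1)

model = nn.Sequential(linear1, activation, linear2)

x = torch.rand(8, 1)                           # batch of 8 points with 1 feature each
out_sequential = model(x)                      # container applies the modules in order
out_manual = linear2(activation(linear1(x)))   # same chain written out by hand

outputs_match = torch.allclose(out_sequential, out_manual)
print(outputs_match)
```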


### Let's build 3 different models

### Model 0 (1 hidden layer, LeakyReLU)

A small model with a weak non-linearity

In [None]:
n_hidden_01 = 5
model0 = nn.Sequential(
    nn.Linear(n_input, n_hidden_01),
    nn.LeakyReLU(),
    nn.Linear(n_hidden_01, n_out),
).to(device)  # Move model to the correct device
print(model0)

### Model 1 (1 hidden layer, Tanh)

A small model with some non-linearity

In [None]:
n_hidden_11 = 5
model1 = nn.Sequential(nn.Linear(n_input, n_hidden_11),
                      nn.Tanh(),
                      nn.Linear(n_hidden_11, n_out),
                      ).to(device)
print(model1)

### Model 2 (2 hidden layers, Tanh or optionally LeakyReLU)

A larger model with non-linearity.
Activation function: Tanh, or optionally LeakyReLU.

In [None]:
n_hidden_21 = 10
n_hidden_22 = 5
# activation function Tanh
model2 = nn.Sequential(nn.Linear(n_input, n_hidden_21),
                      nn.Tanh(),
                      nn.Linear(n_hidden_21, n_hidden_22),
                      nn.Tanh(),
                      nn.Linear(n_hidden_22, n_out),
                      ).to(device)
print(model2)

# Uncomment to use LeakyReLU instead of Tanh 
# model2 = nn.Sequential(nn.Linear(n_input, n_hidden_21),
#                       nn.LeakyReLU(),
#                       nn.Linear(n_hidden_21, n_hidden_22),
#                       nn.LeakyReLU(),
#                       nn.Linear(n_hidden_22, n_out),
#                       ).to(device)
# print(model2)

<h3 style="color:#145a32;">How much do you think each hyperparameter will affect the quality of the model?</h3>

- <p style="color:#145a32;"> Uncomment and execute the next line to explore the methods of the <code>model</code> object you created.</p>

In [None]:
# dir(model0)

## Understanding the PyTorch model
Try the `parameters` method (it needs to be called, i.e. written with parentheses).

In [None]:
model0.parameters()

The `parameters` method gives back a *generator*, which means it needs to be iterated over to give back an output:

In [None]:
for element in model0.parameters():
    print(element)

### <span style="color:green;">PyTorch stores weights and biases as separate tensors: can you identify the elements of the model by their dimensions?</span>

- <span style="color:green;">The first element is the weight matrix</span> $\pmb{w}^0$ <span style="color:green;">from layer 0 (input) to layer 1 (hidden), of dimensions</span> $u^{j+1} \times u^j = u^1 \times u^0$ <span style="color:green;">(so, without bias)</span>
- <span style="color:green;">The second element is the bias vector of layer 1, with</span> $u^1$ <span style="color:green;">entries</span>
- <span style="color:green;">The third element is the weight matrix</span> $\pmb{w}^1$ <span style="color:green;">from layer 1 to layer 2 (output), of dimensions</span> $u^{j+1} \times u^j = u^2 \times u^1$ <span style="color:green;">(without bias)</span>
- <span style="color:green;">The fourth element is the bias of the output layer, with</span> $u^2$ <span style="color:green;">entries</span>

Let's have a look at the contents of those tensors:

In [None]:
for element in model0.parameters():
    print(element)
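Printing names and shapes instead of raw values makes the structure easier to read. The sketch below rebuilds a model with the same layout as Model 0 (on the CPU, for illustration) and uses `named_parameters()` to list each tensor; the names `0.weight`, `0.bias`, ... refer to the position of the module inside `Sequential`.

```python
import torch.nn as nn

n_input, n_hidden, n_out = 1, 5, 1
model = nn.Sequential(
    nn.Linear(n_input, n_hidden),
    nn.LeakyReLU(),               # activation layers hold no trainable parameters
    nn.Linear(n_hidden, n_out),
)

shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}
n_params = sum(p.numel() for p in model.parameters())

for name, shape in shapes.items():
    print(f"{name:10s} {shape}")
print("trainable parameters:", n_params)
```

The total is 5 + 5 + 5 + 1 = 16 trainable parameters: two weight matrices and two bias vectors.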

<h3 style="color:#145a32;">What are these values?</h3>

## Define the loss function
- Reminder: the **loss function** measures how distant the predictions made by the model are from the actual values
- `torch.nn` provides many different types of [loss functions](https://pytorch.org/docs/stable/nn.html#loss-functions). One of the most popular ones is the [Mean Squared Error (MSE)](https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#torch.nn.MSELoss), since it can be applied to a wide variety of cases.
- In general cost functions are chosen depending on desirable properties, such as convexity.

In [None]:
loss_function = nn.MSELoss()
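For reference, the MSE is the mean of the squared differences between prediction and target. The toy values below are made up for illustration and show that `nn.MSELoss` agrees with the formula written out by hand.

```python
import torch
import torch.nn as nn

pred = torch.tensor([[1.0], [2.0], [3.0]])
target = torch.tensor([[1.5], [2.0], [2.0]])

loss_builtin = nn.MSELoss()(pred, target)
loss_manual = torch.mean((pred - target) ** 2)   # (0.25 + 0 + 1) / 3

print(loss_builtin.item(), loss_manual.item())
```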

## Define the optimizer
[`torch.optim`](https://pytorch.org/docs/stable/optim.html) provides implementations of various optimization algorithms. The optimizer object will hold the current state and will update the parameters of the model based on the computed gradients. It takes as input an iterable containing the model parameters, which we explored before.

In [None]:
learning_rate = 1e-2

### Choose Adam or SGD optimizer

In [None]:
# Adam-Optimizer
optimizer0 = torch.optim.Adam(model0.parameters(), lr=learning_rate)
optimizer1 = torch.optim.Adam(model1.parameters(), lr=learning_rate)
optimizer2 = torch.optim.Adam(model2.parameters(), lr=learning_rate)

In [None]:
# SGD-Optimizer
# optimizer0 = torch.optim.SGD(model0.parameters(), lr=learning_rate)
# optimizer1 = torch.optim.SGD(model1.parameters(), lr=learning_rate)
# optimizer2 = torch.optim.SGD(model2.parameters(), lr=learning_rate)
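To see what an optimizer step does, the sketch below performs one plain SGD update on a single parameter and compares it with the update rule $w \leftarrow w - \text{lr} \cdot \partial L / \partial w$. The toy loss $(w - 5)^2$ and the values are assumptions for illustration; Adam uses a more elaborate rule with running averages of the gradients.

```python
import torch

w = torch.tensor([2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = ((w - 5.0) ** 2).sum()     # toy loss with its minimum at w = 5
optimizer.zero_grad()
loss.backward()                   # dL/dw = 2 * (w - 5) = -6
grad = w.grad.clone()

w_expected = w.detach().clone() - 0.1 * grad   # update rule written by hand
optimizer.step()                               # same update done by the optimizer

print(grad.item(), w.item(), w_expected.item())
```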

## Train the models on a loop
The model learns iteratively in a loop of a given number of epochs. Each loop consists of:
- A **forward propagation**: compute $y$ given the input $x$ and current weights and calculate the loss
- A **backward propagation**: compute the gradient of the loss function (error of the loss at each unit)
- Gradient descent: update model weights

 Forward propagation             |  Backpropagation
:-------------------------:|:-------------------------:
<img src="img/forwordpropagation.png" style="width:50%; margin:auto;" /> |<img src="img/backpropagation.png" style="width:50%; margin:auto;" />


In [None]:
batch_size = 64 # how many points to pass to the model at a time
# batch_size = len(x_norm)  # uncomment to pass all data at once
dataset = TensorDataset(x_norm, y_norm)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, pin_memory=False, drop_last=True)
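The number of batches per epoch follows from the dataset size, the batch size and `drop_last`. The sketch below uses a dummy dataset of 1000 points (an assumption matching the sample count used above) to show the effect.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

data = TensorDataset(torch.arange(1000, dtype=torch.float32).reshape(-1, 1),
                     torch.zeros(1000, 1))

loader_drop = DataLoader(data, batch_size=64, shuffle=True, drop_last=True)
loader_keep = DataLoader(data, batch_size=64, shuffle=True, drop_last=False)

n_drop, n_keep = len(loader_drop), len(loader_keep)
last_batch_size = [xb.shape[0] for xb, _ in loader_keep][-1]

print(n_drop, n_keep, last_batch_size)
```

With `drop_last=True` there are 15 full batches per epoch and the remaining 40 points are skipped in that epoch; because of `shuffle=True`, different points are skipped each time.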

In [None]:
# Define the training loop (one loss value is recorded per batch)
def training_loop(dataloader, model, optimizer, epochs):
    losses = []
    for _ in range(epochs):
        for id_batch, (x_batch, y_batch) in enumerate(dataloader):
            x_batch = x_batch.to(device)
            y_batch = y_batch.to(device)
            pred_y = model(x_batch)
            optimizer.zero_grad()
            loss = loss_function(pred_y, y_batch)
            loss.backward()  # Back-prop
            optimizer.step()
            losses.append(loss.item())
    return losses


<img src="./img/trainingloop.png" alt="drawing" width="900"/>

In [None]:
# Run the training for all the models
#epochs = 2000
epochs = 500

losses0 = training_loop(dataloader, model0, optimizer0, epochs=epochs)
losses1 = training_loop(dataloader, model1, optimizer1, epochs=epochs)
losses2 = training_loop(dataloader, model2, optimizer2, epochs=epochs)

In [None]:
# Plot the loss
plt.plot(losses0, label='Model 0', color='green')
plt.plot(losses1, label='Model 1', color='blue')
plt.plot(losses2, label='Model 2', color='red')
plt.ylabel('Loss')
plt.xlabel('Training step (batch)')  # one loss value is stored per batch
plt.title("Learning rate %f"%(learning_rate))
plt.legend()
plt.show()

In [None]:
# Plot the loss with x-axis epochs (real) and batch-epoch (based on batchsize)
# Plot with the primary x-axis
fig, ax1 = plt.subplots()

ax1.plot(losses0, label='Model 0', color='green')
ax1.plot(losses1, label='Model 1', color='blue')
ax1.plot(losses2, label='Model 2', color='red')

ax1.set_xlabel('Training step (batch)')
ax1.set_ylabel('Loss')
ax1.set_title(f"Learning rate {learning_rate}")

fig.legend()

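# Add a secondary x-axis that counts full epochs
ax1.set_xlim(0, len(losses0))  # primary axis: one entry per batch
ax2 = ax1.twiny()
ax2.set_xlim(0, epochs)  # same range expressed in epochs
ax2.set_xlabel('Epoch (real)')


plt.show()
print(f'Every epoch has {len(dataloader)} batches (len(x)/batch_size = {len(x)/batch_size:.2f}; the last incomplete batch is dropped).')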

### <span style="color:#145a32;">Interpreting the loss curves</span>

- <span style="color:#145a32;">Have the NNs learned?</span>
- <span style="color:#145a32;"> Why is model 0 learning faster than model 1?</span>
- <span style="color:#145a32;"> Why is model 2 better than models 0 and 1?</span>
- <span style="color:#145a32;"> Train for more epochs. How does the loss curve change?</span>
- <span style="color:#145a32;"> Change the batch size to pass all data at once. How does the loss curve change? Which method is more effective?</span>

## Test the trained model
- Let's create some random points on the x-axis within the interval the models were trained on; they will serve as test data.
- We will do the same data manipulations as before.

In [None]:
test_points = 50
x_test = np.random.uniform(0, np.max(x_norm.detach().numpy()), test_points)
x_test_reshape = x_test.reshape((int(len(x_test) / n_input), n_input))

# Convert numpy array to PyTorch tensor and move to the correct device
x_test_torch = torch.from_numpy(x_test_reshape).to(dtype=torch.float32).to(device) 

Now we predict the y-value with our model:

In [None]:
y0_test_torch = model0(x_test_torch)
y1_test_torch = model1(x_test_torch)
y2_test_torch = model2(x_test_torch)

In [None]:
plt.plot(x_norm.detach().cpu().numpy(), y_norm.detach().cpu().numpy())
plt.scatter(x_test_torch.detach().cpu().numpy(), y0_test_torch.detach().cpu().numpy(), color='green', marker='*', label='Model 0')
plt.scatter(x_test_torch.detach().cpu().numpy(), y1_test_torch.detach().cpu().numpy(), color='blue', marker='v', label='Model 1')
plt.scatter(x_test_torch.detach().cpu().numpy(), y2_test_torch.detach().cpu().numpy(), color='red', label='Model 2')
plt.legend()
plt.show()

<h3 style="color:#145a32;">Comment on the NN predictions</h3>

- <span style="color:#145a32;"> Why does the prediction of model 0 have that particular shape?</span>
- <span style="color:#145a32;"> Which activation function would be more appropriate to fit this function, the one from model 0 or model 1?</span>
- <span style="color:#145a32;"> Which NN gets the best prediction and why?</span>

<h3 style="color:#145a32;">Bonus</h3>

- <span style="color:#145a32;">$\implies$ Change the seed at the top of the notebook. How do the predictions change?</span>
- <span style="color:#145a32;">$\implies$ Change the optimizer in the section <code>Define the optimizer</code> from <code>Adam</code> to <code>SGD</code> and re-train the models. What happens? How did the loss curves change? Did the NNs learn? Change the number of epochs and try to make them learn.</span>

## Play with the notebook!
Some ideas:
- Change the number of epochs in the section `Train the models on a loop` to 5000 and re-train the models. What happens?
- Change the random seed in the `Reproducibility` cell at the very top. How do the results change?
- Change the optimizer in the section `Define the optimizer` from `Adam` to `SGD` and re-train the models. What happens?
- [**if time allows, takes several minutes**] Change the epochs in the section `Train the models on a loop` to 1000000. What happens?
- Go back to 1000 epochs and the Adam optimizer. Change the learning rate in the section `Define the optimizer` to 0.05. How do the results change? What does it tell us about our previous value?
- Change the learning rate to 0.5. What happens now?


# Physics-informed neural networks (PINNs)

A PINN is a deep learning model that incorporates **physical laws and constraints directly into its training through the loss function**. PINNs solve differential equations by training neural networks to satisfy both the governing equations and boundary/initial conditions simultaneously.

PINNs are valuable because they:
1. Can handle complex, high-dimensional problems where analytical solutions are intractable
2. Provide continuous, differentiable solutions rather than discrete numerical approximations
3. Can solve both forward and inverse problems within the same framework

<img src="img/PINN.png" alt="PINN" style="width:850px;"/>

## Damped Simple Harmonic Oscillator PINN

As an example of a PINN we solve the damped oscillator ordinary differential equation

$$ u''(t) + 2\gamma u'(t) + u(t) = 0, $$

with initial conditions

$$ u(0)=1, \quad u'(0)=0. $$

The exact (underdamped) solution is given by

$$ u(t)=e^{-\gamma t}\Big(\cos(\omega_d t) + \frac{\gamma}{\omega_d}\sin(\omega_d t)\Big), \quad \text{with}\quad \omega_d=\sqrt{1-\gamma^2}. $$

We will define a neural network that approximates $u(t)$, enforce the governing differential equation and initial conditions through tailored loss functions, and explore the training process with interactive widgets.

#### Set random seeds

In [None]:
# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

## Define the PINN Model

We now define the neural network that represents our PINN. This network is a fully connected network with two hidden layers (each with 20 neurons and a $\tanh$ activation function) that approximates $u(t)$.

In [None]:
class PINN(nn.Module):
    def __init__(self):
        super(PINN, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 20),  # Input: time t, Output: 20 neurons
            nn.Tanh(),         # Tanh activation function
            nn.Linear(20, 20),
            nn.Tanh(),
            nn.Linear(20, 1)   # Output: u(t)
        )

    def forward(self, t):
        return self.net(t)

# Instantiate and display the model structure
model = PINN()
print(model)

## Physics-Informed Loss Functions

To train our PINN, we enforce the governing differential equation and the initial conditions using specialized loss functions:

1. **PDE Loss:** Computes the residual of the differential equation
   $$ u''(t) + 2\gamma u'(t) + u(t) $$
   using PyTorch's automatic differentiation via `torch.autograd.grad`.

2. **Initial Condition Loss:** Enforces the conditions $$u(0)=1, \quad u'(0)=0.$$ 

An additional **Data Loss** is also used by sampling a few known solution points from the exact solution to guide training.

In [None]:
def pde_loss(model, t, gamma):
    # Ensure input t requires gradient
    t = t.clone().detach().requires_grad_(True)
    
    # Compute the network output u(t)
    u = model(t)
    
    # First derivative u'(t)
    u_t = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u), create_graph=True)[0]
    
    # Second derivative u''(t)
    u_tt = torch.autograd.grad(u_t, t, grad_outputs=torch.ones_like(u_t), create_graph=True)[0]
    
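    # Residual of the ODE: u''(t) + 2*gamma*u'(t) + u(t) = 0
    # (the factor 2 matches the exact solution exp(-gamma*t) * (...) with omega_d = sqrt(1 - gamma**2))
    f = u_tt + 2 * gamma * u_t + u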
    loss_f = torch.mean(f**2)
    return loss_f
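The derivative calls can be checked on a function whose derivatives are known. In the sketch below, $\sin(t)$ stands in for the network output (an assumption for illustration), so the first and second derivatives should come out as $\cos(t)$ and $-\sin(t)$.

```python
import torch

t = torch.linspace(0.0, 3.0, 50).reshape(-1, 1).requires_grad_(True)
u = torch.sin(t)    # stand-in for model(t)

# First derivative; create_graph=True keeps the graph so we can differentiate again
u_t = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u), create_graph=True)[0]
# Second derivative
u_tt = torch.autograd.grad(u_t, t, grad_outputs=torch.ones_like(u_t))[0]

first_ok = torch.allclose(u_t, torch.cos(t), atol=1e-5)
second_ok = torch.allclose(u_tt, -torch.sin(t), atol=1e-5)
print(first_ok, second_ok)
```

Passing `grad_outputs=torch.ones_like(u)` works here because each output entry depends only on its own input entry, so the result is the pointwise derivative.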

### Initial Condition Loss Function

This function enforces the initial conditions by calculating the discrepancy between the network prediction and the true values at $t=0$. It also computes the derivative $u'(0)$ and penalizes deviations from the prescribed value.

In [None]:
def ic_loss(model, t0, u0_true, u0_t_true):
    # Set requires_grad for the initial condition point
    t0 = t0.clone().detach().requires_grad_(True)
    
    # Compute u(t0) and enforce u(t0) = u0_true
    u0 = model(t0)
    loss_u0 = (u0 - u0_true)**2
    
    # Compute the first derivative u'(t0) and enforce u'(t0) = u0_t_true
    u0_t = torch.autograd.grad(u0, t0, grad_outputs=torch.ones_like(u0), create_graph=True)[0]
    loss_u0_t = (u0_t - u0_t_true)**2
    
    return torch.mean(loss_u0) + torch.mean(loss_u0_t)

## Define the Exact Solution

For validation purposes, we define the exact solution given by

$$ u(t)=e^{-\gamma t}\Big(\cos(\omega_d t) + \frac{\gamma}{\omega_d}\sin(\omega_d t)\Big), \quad \omega_d=\sqrt{1-\gamma^2}. $$

This function will be used to generate training data as well as to compare against the PINN prediction.

In [None]:
def exact_solution(t, gamma, omega_d):
    return np.exp(-gamma * t) * (np.cos(omega_d * t) + (gamma / omega_d) * np.sin(omega_d * t))
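As a sanity check, the closed-form expression can be tested numerically with finite differences. The sketch below confirms that it satisfies $u'' + 2\gamma u' + u = 0$ with $u(0)=1$ and $u'(0)=0$, i.e. this closed form corresponds to a damping term $2\gamma u'$. The grid spacing and the value $\gamma = 0.12$ are illustrative choices.

```python
import numpy as np

def exact_solution(t, gamma, omega_d):
    return np.exp(-gamma * t) * (np.cos(omega_d * t) + (gamma / omega_d) * np.sin(omega_d * t))

gamma = 0.12
omega_d = np.sqrt(1 - gamma**2)

t = np.linspace(0.0, 10.0, 10001)
h = t[1] - t[0]
u = exact_solution(t, gamma, omega_d)

# Central finite differences on the interior grid points
u_t = (u[2:] - u[:-2]) / (2 * h)
u_tt = (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2

residual = u_tt + 2 * gamma * u_t + u[1:-1]
max_residual = np.max(np.abs(residual))
initial_slope = (u[1] - u[0]) / h    # forward-difference estimate of u'(0)

print(f"u(0) = {u[0]:.3f}, u'(0) ~ {initial_slope:.1e}, max |residual| = {max_residual:.1e}")
```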

## Training Function with Interactive Widget

The following function performs the training of the PINN. It:

- Generates training data by sampling points from the exact solution on the interval $[0, t_{train\_end}]$.
- Creates collocation points over the broader interval $[0, 10]$ to enforce the differential equation.
- Minimizes a total loss composed of the PDE loss, the initial condition loss, and the data loss.
- Plots the network prediction every 500 epochs alongside the exact solution and training data.

The function parameters (number of epochs, learning rate, $\gamma$, and the end of training data) can be adjusted interactively.

In [None]:
def train_pinn(epochs=5000, learning_rate=0.001, gamma=0.12, t_train_end=4.0):
    # Time parameters
    t_train_start = 0.0  # Start of training data
    t_test_end = 10.0    # End of prediction range

    # Define the model and optimizer
    model = PINN()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    
    # Generate training data (10 points from the exact solution within [0, t_train_end])
    num_train_points = 10
    t_train = np.random.uniform(t_train_start, t_train_end, num_train_points)[:, None]
    omega_d = np.sqrt(1 - gamma**2)
    u_train = exact_solution(t_train, gamma, omega_d)
    
    # Convert training data to tensors
    t_train_tensor = torch.tensor(t_train, dtype=torch.float32, requires_grad=True)
    u_train_tensor = torch.tensor(u_train, dtype=torch.float32)
    
    # Initial condition at t = 0
    t0 = torch.tensor([[0.0]], dtype=torch.float32, requires_grad=True)
    u0_true = torch.tensor([[1.0]], dtype=torch.float32)  # u(0) = 1
    u0_t_true = torch.tensor([[0.0]], dtype=torch.float32)  # u'(0) = 0
    
    # Generate collocation points for the entire range [0, 10]
    num_colloc_points = 100
    t_colloc = np.random.uniform(t_train_start, t_test_end, num_colloc_points)[:, None]
    t_colloc_tensor = torch.tensor(t_colloc, dtype=torch.float32, requires_grad=True)
    
    # Training loop
    loss_history = []
    for epoch in range(epochs):
        optimizer.zero_grad()
        
        # Compute the PDE loss
        loss_pde = pde_loss(model, t_colloc_tensor, gamma)
        
        # Compute the initial condition loss
        loss_ic = ic_loss(model, t0, u0_true, u0_t_true)
        
        # Compute data loss from training data
        loss_data = torch.mean((model(t_train_tensor) - u_train_tensor)**2)
        
        total_loss = loss_pde + loss_ic + loss_data
        loss_history.append(total_loss.item())
        
        total_loss.backward()
        optimizer.step()
        
        # Plot every 500 epochs
        if epoch % 500 == 0:
            t_test = np.linspace(t_train_start, t_test_end, 200)[:, None]
            t_test_tensor = torch.tensor(t_test, dtype=torch.float32)
            
            model.eval()
            with torch.no_grad():
                u_pred = model(t_test_tensor).cpu().numpy()
            model.train()  # switch back to training mode for the next epochs
            
            u_exact = exact_solution(t_test, gamma, omega_d)
            
            plt.figure(figsize=(8, 5))
            plt.plot(t_test, u_exact, color='gray', linestyle='-', label='Exact solution', linewidth=2)
            plt.plot(t_test, u_pred, 'b-', label='Neural network prediction', linewidth=2)
            plt.scatter(t_train, u_train, color='orange', label='Training data', zorder=5)
            plt.axvline(x=t_train_end, color='r', linestyle='--', label='End of training data')
            plt.title(f'Training step: {epoch} | Training data ends at t = {t_train_end}')
            plt.xlabel('t')
            plt.ylabel('u(t)')
            plt.legend()
            plt.grid(True)
            plt.show()
    
    # Plot the loss history
    plt.figure(figsize=(8, 5))
    plt.semilogy(loss_history, 'k-')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Training Loss History')
    plt.show()

## Interactive Widget for Training

The interactive widget below lets you adjust the training parameters:

- **epochs:** The number of training iterations.
- **Learning rate:** The step size for the optimizer.
- **gamma, $\gamma$:** The damping factor in the ODE.
- **Train end:** The end time for generating training data.

By adjusting these sliders, you can observe how the network prediction evolves during training.

In [None]:
interact(
    train_pinn,
    epochs=IntSlider(min=1000, max=10000, step=1000, value=5000),
    learning_rate=FloatSlider(min=0.0001, max=0.01, step=0.0001, value=0.001, readout_format='.4f', description='Learning rate'),
    gamma=FloatSlider(min=0.01, max=0.5, step=0.01, value=0.12),
    t_train_end=FloatSlider(min=1.0, max=8.0, step=0.5, value=4.0, description='Train end')
)

## Outlook
5 min break
#### Hands-On Session 2 -- Classification and More
[DPG Regensburg](https://www.dpg-verhandlungen.de/year/2025/conference/regensburg/part/tut/session/1)
##### Abstract: 
The session demonstrates how pre-trained models can simplify tasks such as classification, making them readily applicable to research. Typical examples include recognizing handwritten digits, which showcase the power of pretrained models in solving common challenges. As a preview of advanced topics, the tutorial concludes with brief examples of large language models (LLMs) and generative AI.

Keywords: AI

[ErUM-Data-Hub<img src="./img/erum-data-hub_logo.png" style="width: 20%;" />](https://erumdatahub.de)

#### Learn more about us, our events and schools on Thursday at AKPIK 5.10:
[AKPIK 5.10: Poster, Thursday, 20 March 2025, 15:00–16:30, P2](https://www.dpg-verhandlungen.de/year/2025/conference/regensburg/part/akpik/session/5/contribution/10) Advancing Digital Transformation in Research on Universe and Matter in Germany
