# Expert Tutorial: Introduction to Physics-Informed Neural Networks

In this tutorial, we will learn the basics of Physics-Informed Neural Networks (PINNs). It consists of three parts:
1. Forward Problem - Heat Equation 2D
2. Inverse Problem - Heat Equation 2D
3. Application - Inverse Problem - Transfer Learning

This tutorial is a *Jupyter* notebook hosted on Google Colab. Go through it step by step and try to understand the code. Wherever you see a *TODO*, it is your task to fill in the blanks!

Let's get started! Don't hesitate to ask us any questions!

___

## Part 1: Forward Problem - Heat Equation 2D

In this part, we will solve the following 1D heat equation (one spatial dimension $x$ plus time $t$, so the network input is two-dimensional) with a PINN:

$$
\begin{aligned}
    u_t(x,t) &= \kappa u_{xx}(x,t) \\
    u(x, 0) &= \sin(x) \\
    u(0,t ) &= 0 \\
    u(\pi,t) &= 0
\end{aligned}
$$

This PDE can be solved analytically; the exact solution is $u(x, t) = \sin(x) e^{-\kappa t}$. We will use it as a test case for the PINN.
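
As a quick sanity check, the stated exact solution can be plugged back into the PDE numerically. The sketch below is not part of the exercises: the helper names `exact_solution` and `heat_residual_fd` and the step size `h` are illustrative assumptions, and derivatives are approximated with central finite differences rather than autograd.

```python
import numpy as np

def exact_solution(x, t, kappa=1.0):
    """Closed-form solution u(x, t) = sin(x) * exp(-kappa * t)."""
    return np.sin(x) * np.exp(-kappa * t)

def heat_residual_fd(x, t, kappa=1.0, h=1e-3):
    """Central finite-difference estimate of u_t - kappa * u_xx."""
    u_t = (exact_solution(x, t + h, kappa) - exact_solution(x, t - h, kappa)) / (2 * h)
    u_xx = (exact_solution(x + h, t, kappa) - 2 * exact_solution(x, t, kappa)
            + exact_solution(x - h, t, kappa)) / h**2
    return u_t - kappa * u_xx

# Evaluate the residual on a small interior grid of (x, t) points
x = np.linspace(0.1, np.pi - 0.1, 20)
t = np.linspace(0.1, 0.9, 20)
X, T = np.meshgrid(x, t)
max_residual = np.abs(heat_residual_fd(X, T, kappa=1.0)).max()
print(f"max |u_t - kappa*u_xx| = {max_residual:.2e}")

# Boundary values at x = 0 and x = pi should vanish (up to round-off)
boundary_error = max(abs(exact_solution(0.0, 0.5)), abs(exact_solution(np.pi, 0.5)))
```

The residual is not exactly zero because of the finite-difference truncation error, but it should be far smaller than the solution itself.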

The subsections are as follows:
1) Develop ML model
2) Develop loss function
3) Define computational grid
4) Complete training loop and visualize

- [Optional] - What happens if we change activation functions? (ReLU, GeLU)

In [None]:
# First import the necessary libraries
import numpy as np
import torch
from torch import nn
import matplotlib.pyplot as plt

### 1.1 Develop ML Model

Model description:  
$$
u_\theta(x, t) = \text{MLP}(x, t)
$$

The model is a three-layer fully connected neural network (two hidden layers with 64 neurons each, followed by a linear output layer) with **tanh** activation functions. Writing the input as $z = (x, t)^\top$:
$$
u_\theta(z) = W_3 \, \tanh\!\left(W_2 \, \tanh\!\left(W_1 z + b_1\right) + b_2\right) + b_3
$$
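
To make the shapes in this formula concrete, here is a minimal NumPy sketch of the same forward pass, treating the network input as the pair $(x, t)$. The weights are random stand-ins (assumptions for illustration, not trained values), and `mlp_forward` is a hypothetical helper name; the exercise below still asks for the PyTorch version.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 64

# Randomly initialised weights and biases (illustrative values, not trained)
W1, b1 = rng.normal(size=(hidden, 2)), np.zeros(hidden)
W2, b2 = rng.normal(size=(hidden, hidden)), np.zeros(hidden)
W3, b3 = rng.normal(size=(1, hidden)), np.zeros(1)

def mlp_forward(z):
    """Evaluate the formula for a batch z of shape (N, 2) holding (x, t) pairs."""
    h1 = np.tanh(z @ W1.T + b1)   # first hidden layer, shape (N, 64)
    h2 = np.tanh(h1 @ W2.T + b2)  # second hidden layer, shape (N, 64)
    return h2 @ W3.T + b3         # linear output layer, shape (N, 1)

z = np.array([[0.0, 0.0], [np.pi / 2, 0.5], [np.pi, 1.0]])
u = mlp_forward(z)
n_params = sum(a.size for a in (W1, b1, W2, b2, W3, b3))
print(u.shape, n_params)
# prints: (3, 1) 4417
```

The parameter count follows from $(2 \cdot 64 + 64) + (64 \cdot 64 + 64) + (64 + 1) = 4417$.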

In [None]:
# 🧩  Develop the ML model in Torch
class PINN(nn.Module):
    # MLP: 3 linear layers (2 hidden layers of 64 neurons + output layer); Tanh activation function
    def __init__(self, input_dim, output_dim, hidden_dim=64, num_layers=2):
        super().__init__()
        # TODO: implement the model architecture


    def forward(self, x):
        return self.net(x)

#### Solution

In [None]:
class PINN(nn.Module):
    
    def __init__(self, input_dim, output_dim, hidden_dim=64, num_layers=2):
        super().__init__()
        # num_layers counts hidden layers: 2 hidden layers + the output layer
        # give the three linear layers described in the model formula
        layers = [nn.Linear(input_dim, hidden_dim), nn.Tanh()]
        for _ in range(num_layers - 1):
            layers += [nn.Linear(hidden_dim, hidden_dim), nn.Tanh()]
        layers.append(nn.Linear(hidden_dim, output_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

### 1.2 Develop Loss Function - Residual Loss for the Heat Equation

The total loss combines all three components:

$$
L_{\text{total}} =
L_{\text{res}} +
L_{\text{bc}} +
L_{\text{ic}}
$$

where:

- **Residual loss** enforces the PDE:

$$
L_{\text{res}} = \frac{1}{N_r} \sum_{i=1}^{N_r} \left( u_t(x_i, t_i) - \kappa \, u_{xx}(x_i, t_i) \right)^2
$$

- **Boundary condition loss** enforces the boundary:

$$
L_{\text{bc}} = \frac{1}{N_b} \sum_{i=1}^{N_b} \left( u_\theta(x_b^i, t_b^i) - u_{\text{BC}}^i \right)^2
$$

- **Initial condition loss** enforces the initial state:

$$
L_{\text{ic}} = \frac{1}{N_i} \sum_{i=1}^{N_i} \left( u_\theta(x_i, 0) - u_{\text{IC}}^i \right)^2
$$
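
The derivatives $u_t$ and $u_{xx}$ in the residual are obtained with automatic differentiation. As a hedged illustration of the mechanics, the sketch below applies `torch.autograd.grad` to the known exact solution instead of a network, so the residual should come out close to zero. The tensor names and the choice of 50 random points are assumptions made for this example only.

```python
import math
import torch

torch.manual_seed(0)
kappa = 1.0

# Random (x, t) points; requires_grad lets autograd differentiate w.r.t. them
pts = torch.rand(50, 2) * torch.tensor([math.pi, 1.0])
pts.requires_grad_(True)

# Stand-in for the network output: the known exact solution sin(x) * exp(-kappa * t)
u = torch.sin(pts[:, 0:1]) * torch.exp(-kappa * pts[:, 1:2])

# First derivatives: column 0 is u_x, column 1 is u_t
grads = torch.autograd.grad(u, pts, torch.ones_like(u), create_graph=True)[0]
u_x, u_t = grads[:, 0:1], grads[:, 1:2]

# Second derivative in x: differentiate u_x again and keep the x column
u_xx = torch.autograd.grad(u_x, pts, torch.ones_like(u_x), create_graph=True)[0][:, 0:1]

residual = u_t - kappa * u_xx
res_mse = (residual ** 2).mean().item()
print(f"mean squared residual: {res_mse:.2e}")
```

Passing `create_graph=True` keeps the derivative itself differentiable, which is what allows both the second derivative and, later, backpropagation through the residual loss.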


In [None]:
# 🧩 Implement loss function

def heat_residual_loss(model, collocation_pts, kappa=1):
    # TODO: implement the residual loss for the heat equation
    return residual_loss

def heat_pinn_loss(model, collocation_pts, boundary_pts, initial_pts, kappa=1):

    residual_loss = heat_residual_loss(model, collocation_pts, kappa)

    #TODO: Implement the boundary loss. Use the squared error between the true BC and the model prediction on the BC points.

    #TODO: Implement the initial condition loss. Use the squared error between the true IC and the model prediction on the IC points.
    
    total_loss = residual_loss + boundary_loss + initial_loss
    return total_loss


#### Solution

In [None]:
def heat_residual_loss(model, collocation_pts, kappa=1):
    # Enable gradients w.r.t. the inputs (x, t) so autograd can differentiate u
    collocation_pts.requires_grad_(True)
    u = model(collocation_pts)
    # First derivatives: column 0 is du/dx, column 1 is du/dt
    grads = torch.autograd.grad(u, collocation_pts, torch.ones_like(u), create_graph=True)[0]
    u_x = grads[:, 0:1]
    u_t = grads[:, 1:2]
    # Second derivative d2u/dx2: differentiate u_x again and keep the x column
    u_xx = torch.autograd.grad(u_x, collocation_pts, torch.ones_like(u_x), create_graph=True)[0][:, 0:1]
    # Mean squared PDE residual u_t - kappa * u_xx
    residual_loss = ((u_t - kappa * u_xx) ** 2).mean()
    return residual_loss

def heat_pinn_loss(model, collocation_pts, boundary_pts, initial_pts, kappa=1):

    residual_loss = heat_residual_loss(model, collocation_pts, kappa)

    # Boundary loss: squared error between the model prediction and the true BC (u = 0)
    bc_pred = model(boundary_pts)
    bc_targets = torch.zeros_like(bc_pred)  # same shape [N_b, 1] avoids accidental broadcasting
    boundary_loss = ((bc_pred - bc_targets) ** 2).mean()

    # Initial condition loss: squared error between the model prediction and u(x, 0) = sin(x)
    ic_targets = torch.sin(initial_pts[:, 0:1])  # shape [N_i, 1]
    initial_loss = ((model(initial_pts) - ic_targets) ** 2).mean()

    total_loss = residual_loss + boundary_loss + initial_loss
    return total_loss


### 1.3 Define Computational Grid
We need to sample points from the interior to calculate the residuals (called collocation points) and points on the IC and BC to train the model.

In [None]:
# Domain
x_min, x_max = 0.0, np.pi
t_min, t_max = 0.0, 1.0

# Collocation points (interior)
n_colloc = 1000
x_c = torch.rand(n_colloc, 1) * (x_max - x_min) + x_min
t_c = torch.rand(n_colloc, 1) * (t_max - t_min) + t_min
colloc = torch.cat([x_c, t_c], dim=1)

# Initial condition: t=0, u(x,0)=sin(x)
n_ic = 100
ic_x = torch.linspace(x_min, x_max, n_ic).reshape(-1, 1)
ic_t = torch.zeros_like(ic_x)
ic_pts = torch.cat([ic_x, ic_t], dim=1)

# Boundary conditions: x=0 and x=pi, u(0,t)=u(pi,t)=0
n_bc = 100
t_bc = torch.linspace(t_min, t_max, n_bc).unsqueeze(1)
bc_pts = torch.cat([
    torch.cat([torch.zeros_like(t_bc), t_bc], dim=1),
    torch.cat([torch.full_like(t_bc, np.pi), t_bc], dim=1)
], dim=0)


#### [Optional] - Plot

In [None]:
# Plotting
plt.figure(figsize=(8, 6))

# Collocation points
plt.scatter(colloc[:, 0].numpy(), colloc[:, 1].numpy(), color='blue', s=10, label='Collocation Points', alpha=0.5)

# Initial condition points
plt.scatter(ic_x.numpy(), ic_t.numpy(), color='green', s=30, label='Initial Condition (t=0)')

# Boundary condition points
plt.scatter(bc_pts[:, 0].numpy(), bc_pts[:, 1].numpy(), color='red', s=30, label='Boundary Conditions (x=0, pi)')

plt.xlabel('x')
plt.ylabel('t')
plt.title('PINN Training Points')
plt.legend()
plt.grid(True)
plt.show()

### 1.4 Training Loop
Now let's combine everything and train the model.  

In [None]:
# 🧩 Implement training loop

def train(model, colloc, ic_pts, bc_pts, loss_function, epochs=2000, lr=1e-3):

    #TODO 0. define the optimizer

    history = [] # for logging the loss
    for i in range(epochs):
        #TODO 1. zero the gradients

        #TODO 2. compute the loss

        #TODO 3. backpropagate the loss

        #TODO 4. update the weights
        
        history.append(loss.item())
        if i % 200 == 0:
            print(f"Epoch {i}: Loss={loss.item():.4e}")
    return history

# initialize model
model = PINN(2, 1)
# train model
history = train(model, colloc, ic_pts, bc_pts, heat_pinn_loss)

#### Solution

In [None]:
def train(model, colloc, ic_pts, bc_pts, loss_function, epochs=2000, lr=1e-3):

    # 0. Define Optimizer
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for i in range(epochs):
        # 1.  zero the gradients
        opt.zero_grad()
        # 2. compute the loss
        loss = loss_function(model, colloc, bc_pts, ic_pts)
        # 3. backpropagate the loss
        loss.backward()
        # 4. update the weights
        opt.step()
        history.append(loss.item())
        if i % 200 == 0:
            print(f"Epoch {i}: Loss={loss.item():.4e}")
    return history

model = PINN(2, 1)
history = train(model, colloc, ic_pts, bc_pts, heat_pinn_loss)

### 1.5 Test and Visualize

In [None]:
def exact(x, t):
    return np.sin(x) * np.exp(-t)

def visualize(X, T, u_true, u_pred):
    # Side-by-side contour plots: PINN prediction, exact solution, absolute error
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 3, 1)
    plt.contourf(X, T, u_pred, 50, cmap='viridis')
    plt.title('PINN Prediction')
    plt.xlabel('x'); plt.ylabel('t')
    plt.colorbar()
    plt.subplot(1, 3, 2)
    plt.contourf(X, T, u_true, 50, cmap='viridis')
    plt.title('Exact Solution')
    plt.xlabel('x'); plt.ylabel('t')
    plt.colorbar()
    plt.subplot(1, 3, 3)
    plt.contourf(X, T, np.abs(u_pred - u_true), 50, cmap='Reds')
    plt.title('Absolute Error')
    plt.xlabel('x'); plt.ylabel('t')
    plt.colorbar()
    plt.tight_layout()
    plt.show()

# Testgrid
x_test = np.linspace(x_min, x_max, 101)
t_test = np.linspace(t_min, t_max, 101)
X, T = np.meshgrid(x_test, t_test)
pts = torch.tensor(np.column_stack([X.flatten(), T.flatten()]), dtype=torch.float32)
with torch.no_grad():
    u_pred = model(pts).numpy().reshape(X.shape)
u_true = exact(X, T)

visualize(X, T, u_true, u_pred)

print(f"Max abs error: {np.abs(u_pred - u_true).max():.3e}")

___
## Part 2: Inverse Problem - Heat Equation 2D

In inverse problems, we are given some data from the true process and try to infer the parameters of the process that generated the data. We do, however, know the form of the underlying PDE. We can approach this problem with PINNs by adding the unknown physical PDE parameter as a variable to be optimized. We then minimize the residual PDE loss and the loss on the data samples at the same time.
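
The idea of treating a physical coefficient as a trainable variable can be illustrated on a much smaller problem. The sketch below is a toy example under stated assumptions, not the tutorial's solution: it recovers the decay rate of $e^{-\kappa t}$ from noise-free samples, with no neural network involved, using a learning rate and step count chosen only for illustration.

```python
import torch

torch.manual_seed(0)
kappa_true = 2.4                      # value used to generate the toy data
t = torch.linspace(0.0, 1.0, 51)
data = torch.exp(-kappa_true * t)     # "observations" of exp(-kappa * t)

# The unknown coefficient is a trainable parameter, just like a network weight
kappa = torch.nn.Parameter(torch.tensor(1.0))
optimizer = torch.optim.Adam([kappa], lr=0.05)

for step in range(1000):
    optimizer.zero_grad()
    loss = ((torch.exp(-kappa * t) - data) ** 2).mean()  # data misfit only
    loss.backward()
    optimizer.step()

kappa_hat = kappa.item()
print(f"recovered kappa = {kappa_hat:.3f} (true value {kappa_true})")
```

In the PINN setting the same parameter additionally appears inside the PDE residual, so both the physics and the data pull it toward the true value.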

The steps we will take are:
- 2.1 Generate Synthetic data: Typically you want to have this data from real world observations; here we use synthetic data as an example.
- 2.2 Define loss function
- 2.3 Implement training loop
- 2.4 Visualize

- [Optional] - What happens if our real-world data is noisy? Add noise to the synthetic data and re-run.
- [Optional] - What happens if we reduce the amount of synthetic data? Decrease the value `51` in the generation of `x_c` and `t_c`.


### 2.1 Generate Synthetic Data

In [None]:
### 2.1 Synthetic data generation

kappa_true = 2.4 # what we actually want to predict

def exact(x, t):
    return np.sin(x) * np.exp(-kappa_true * t)

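x_c = torch.linspace(x_min, x_max, 51)
t_c = torch.linspace(t_min, t_max, 51)
X, T = torch.meshgrid(x_c, t_c, indexing="ij")  # explicit indexing avoids the warning about the default
colloc = torch.stack([X.flatten(), T.flatten()], dim=1)

# Observed values at the grid points, shape [51*51, 1]
data = torch.tensor(exact(X.numpy(), T.numpy()), dtype=torch.float32).reshape(-1, 1)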

### 2.2 Modify Loss Function

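The total loss combines two components:

$$
L_{\mathrm{total}} =
L_{\mathrm{res}} +
L_{\mathrm{data}}
$$

where $L_{\mathrm{data}}$ measures the mismatch between the model prediction and the available observational data $u_{\mathrm{data}}^i$:

$$
L_{\mathrm{data}} = \frac{1}{N_d} \sum_{i=1}^{N_d} \left( u_\theta(x_i, t_i) - u_{\mathrm{data}}^i \right)^2
$$

Here, missing information in the model (such as unknown initial and boundary conditions) can be compensated for by incorporating additional data. In this tutorial the initial and boundary conditions are known, so the code below reuses `heat_pinn_loss` from Part 1 for the physics term and adds the data term on top.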


In [None]:
# 🧩 2.2 Implement loss function
def loss_function_inverse(model, kappa, collocation_pts, boundary_pts, initial_pts, data):

    # Use the standard PINN loss
    pde_loss = heat_pinn_loss(model, collocation_pts, boundary_pts, initial_pts, kappa)

    #TODO: Add data loss term

    return pde_loss + data_loss


#### Solution

In [None]:
def loss_function_inverse(model, kappa, collocation_pts, boundary_pts, initial_pts, data):

    # Use the standard PINN loss
    pde_loss = heat_pinn_loss(model, collocation_pts, boundary_pts, initial_pts, kappa)

    #TODO: Add data loss term
    data_loss = (model(collocation_pts) - data).square().mean()
    return pde_loss + data_loss


### 2.3 Implement Training Loop
Here, in addition to the model's parameters, we need to pass the material property $\kappa$ to the optimizer.

In [None]:
# 🧩 Implement Training Loop

model = PINN(2, 1)
kappa = torch.nn.Parameter(torch.tensor(1.0, requires_grad=True)) # (!) - important

#TODO 1. create the optimizer (it must update both the model parameters and kappa)

epochs = 5000
history = []
for i in range(epochs):
    #TODO 2. zero the gradients

    #TODO 3. compute the loss

    #TODO 4. backpropagate the loss

    #TODO 5. update the weights

    history.append(loss.item())
    if i % 200 == 0:
        print(f"Epoch {i}: Loss={loss.item():.4e}, kappa={kappa.item():.3f}")

pts = torch.tensor(np.column_stack([X.flatten(), T.flatten()]), dtype=torch.float32)

visualize(X.numpy(), T.numpy(), exact(X.numpy(), T.numpy()), model(pts).detach().numpy().reshape(X.shape))


#### Solution

In [None]:
model = PINN(2, 1)
kappa = torch.nn.Parameter(torch.tensor(1.0, requires_grad=True))
# Optimize the network weights and kappa jointly
optimizer = torch.optim.Adam(list(model.parameters()) + [kappa], lr=1e-3)
epochs = 5000
history = []
for i in range(epochs):
    optimizer.zero_grad()
    loss = loss_function_inverse(model, kappa, colloc, bc_pts, ic_pts, data)
    loss.backward()
    optimizer.step()
    history.append(loss.item())
    if i % 200 == 0:
        print(f"Epoch {i}: Loss={loss.item():.4e}, kappa={kappa.item():.3f}")

pts = torch.tensor(np.column_stack([X.flatten(), T.flatten()]), dtype=torch.float32)

visualize(X.numpy(), T.numpy(), exact(X.numpy(), T.numpy()), model(pts).detach().numpy().reshape(X.shape))


## Part 3: Application - Inverse Problem - Transfer Learning
Imagine you have a **mechanical system** — like a *vibrating beam*, a *spring–mass–damper*, or even a *sensor-mounted machine part* — that follows this physical law:

$$
m x''(t) + c x'(t) + k x(t) = 0
$$

Where:

- **m**: mass *(known)*
- **c**: damping coefficient *(unknown)*
- **k**: stiffness *(unknown)*
- **x(t)**: displacement *(measured or observed)*
- **x'(t)**, **x''(t)**: velocity and acceleration
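
For intuition about how $m$, $c$ and $k$ shape the motion, the standard derived quantities can be computed directly. The sketch below uses example values assumed purely for illustration; the variable names are likewise illustrative.

```python
import math

m, c, k = 1.0, 0.5, 4.0   # example values (assumed here for illustration)

omega_n = math.sqrt(k / m)                 # undamped natural frequency
zeta = c / (2.0 * math.sqrt(m * k))        # damping ratio
discriminant = c**2 - 4.0 * m * k          # sign decides the damping regime

if discriminant < 0:
    regime = "underdamped"
elif discriminant == 0:
    regime = "critically damped"
else:
    regime = "overdamped"

print(f"omega_n = {omega_n:.3f}, zeta = {zeta:.3f}, regime = {regime}")
# prints: omega_n = 2.000, zeta = 0.125, regime = underdamped
```

An underdamped system oscillates while decaying, which is the regime in which sparse displacement measurements carry information about both the damping and the stiffness.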

In real experiments, you can’t directly measure **c** and **k**, but you can measure **displacement over time** with sensors.

So, you have:

- A known **model equation** (the physics)  
- A few **noisy measurements** of $x(t)$  
- Two **unknown physical parameters** ($c$, $k$) you want to identify

The following tutorial is structured as:
1) Generate ground truth solution
2) From this ground truth solution we sample sparse measurements
3) 🧩 [TODO] Construct a PINN Model
4) 🧩 [TODO] Define the PINN Loss
5) 🧩 [TODO] Train the model to solve the inverse problem (find $c$ and $k$)
6) Explore transfer learning

### 3.1 Generate Ground Truth Solution

In [None]:
# --- 1. Analytical solution of the damped oscillator ---
def damped_oscillator_solution_np(t, m, c, k, x0, v0):
    t = np.asarray(t, dtype=float)
    omega_n = np.sqrt(k / m)
    disc = c**2 - 4*m*k

    if disc < 0:  # underdamped
        wn = omega_n
        zeta = c / (2*np.sqrt(m*k))
        wd = wn * np.sqrt(max(0.0, 1 - zeta**2))
        exp_term = np.exp(-zeta*wn*t)
        C1, C2 = x0, (v0 + zeta*wn*x0)/wd
        x = exp_term * (C1*np.cos(wd*t) + C2*np.sin(wd*t))
        return x

    elif np.isclose(disc, 0.0):  # critically damped
        wn = omega_n
        exp_term = np.exp(-wn*t)
        x = (x0 + (v0 + wn*x0)*t) * exp_term
        return x

    else:  # overdamped
        sqrt_disc = np.sqrt(disc)
        r1 = (-c + sqrt_disc) / (2*m)
        r2 = (-c - sqrt_disc) / (2*m)
        denom = (r1 - r2)
        A = (v0 - r2 * x0) / denom
        B = x0 - A
        x = A * np.exp(r1 * t) + B * np.exp(r2 * t)
        return x


# --- 2. Generate the signal ---
def generate_signal(m=1.0, c=0.5, k=4.0, x0=1.0, v0=0.0, T=10.0, n_points=400):
    t = np.linspace(0, T, n_points)
    x = damped_oscillator_solution_np(t, m, c, k, x0, v0)
    return t, x

#### [Optional] - Visualize

In [None]:
## Visualize
t_signal, x_signal = generate_signal(m=1.0, c=0.5, k=4.0, x0=1.0, v0=0.0)
plt.plot(t_signal, x_signal)
plt.xlabel('Time')
plt.ylabel('Displacement')
plt.title('Damped Oscillator Signal')
plt.show()

### 3.2 Collect Sparse Measurements

In [None]:
# --- 3. Sample synthetic sensor data ---
def get_sensor_data(t_signal, x_signal, n_obs=21, noise_scale=0.005, seed=0):
    np.random.seed(seed)
    t_obs = np.linspace(t_signal.min(), t_signal.max(), n_obs)
    x_true = np.interp(t_obs, t_signal, x_signal)
    x_obs = x_true + np.random.normal(scale=noise_scale, size=x_true.shape)
    return t_obs, x_obs

#### [Optional] - Visualize

In [None]:
# --- 4. Visualize the true signal and data ---
t_signal, x_signal = generate_signal(m=1.0, c=0.5, k=4.0, x0=1.0, v0=0.0)
t_obs, x_obs = get_sensor_data(t_signal, x_signal, n_obs=10, noise_scale=0.01)

plt.figure(figsize=(8,5))
plt.plot(t_signal, x_signal, label='True Signal', linewidth=2)
plt.scatter(t_obs, x_obs, color='r', edgecolor='k', s=60, zorder=5, label='Sensor Data')
plt.xlabel('Time $t$')
plt.ylabel('Displacement $x(t)$')
plt.title('Damped Oscillator Signal and Sensor Measurements')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

### 3.3 Construct Model

1. Create a torch class that models:
$$x_\theta = \text{MLP}(t)$$

2. Create a `compute_derivatives` method that calculates $x'$ and $x''$


In [None]:
# 🧩 - Build a PINN model

class PINN(nn.Module):
    def __init__(self, hidden=64, layers=3):
        super().__init__()
        # [TODO] Define the neural network layers
        # Hint: Use nn.Linear and nn.Tanh activations
        # Example: [input → hidden → ... → output]
        seq = __?__       # Fill in the sequence of layers
        self.net = __?__  # Fill in: wrap the sequence with nn.Sequential

    def forward(self, t):
        # [TODO] Define the forward pass through the network
        # Input: t (time), Output: x(t)
        return __?__      # Fill in: use the defined network

    def compute_derivatives(self, t):
        """
        Compute first and second derivatives of x(t)
        with respect to time t using autograd.
        """
        # Ensure gradients are enabled for t
        t = t.clone().detach().requires_grad_(True)

        # [TODO] Forward pass
        x = __?__         # Fill in: model prediction x(t)

        # [TODO] First derivative: dx/dt
        x_t = torch.autograd.grad(
            x, t,
            grad_outputs=torch.ones_like(x),
            create_graph=True,
            retain_graph=True
        )[0]

        # [TODO] Second derivative: d²x/dt²
        x_tt = __?__      # Fill in: autograd call for second derivative

        return x, x_t, x_tt


#### Solution

In [None]:
#  Solution
class PINN(nn.Module):
    def __init__(self, hidden=64, layers=3):
        super().__init__()
        seq = [nn.Linear(1, hidden), nn.Tanh()]
        for _ in range(layers-1):
            seq += [nn.Linear(hidden, hidden), nn.Tanh()]
        seq += [nn.Linear(hidden, 1)]
        self.net = nn.Sequential(*seq)
    def forward(self, t):
        return self.net(t)

    def compute_derivatives(self, t):
        """
        Compute first and second derivatives of x(t)
        with respect to time t using autograd.
        """
        # Ensure gradients are enabled for t
        t = t.clone().detach().requires_grad_(True)

        # Forward pass
        x = self.forward(t)

        # First derivative: dx/dt
        x_t = torch.autograd.grad(
            x, t,
            grad_outputs=torch.ones_like(x),
            create_graph=True,
            retain_graph=True
        )[0]

        # Second derivative: d²x/dt²
        x_tt = torch.autograd.grad(
            x_t, t,
            grad_outputs=torch.ones_like(x_t),
            create_graph=True,
            retain_graph=True
        )[0]

        return x, x_t, x_tt

### 3.4 Define the PINN Loss

In this section, you will implement the **loss function** for a Physics-Informed Neural Network (PINN) modeling a damped oscillator:

$$
m x''(t) + c x'(t) + k x(t) = 0
$$

#### Guidance

- **Trainable parameters:**  
  `c` (damping) and `k` (stiffness) are the unknown physical coefficients to learn.

- **Inputs:**  
  - `t_coll` → collocation points used to enforce the ODE residual.  
  - `t_obs` & `y_obs` → sensor measurement times and corresponding observed displacements.  
  - `t_ic`, `x0_t`, `v0_t` → initial time `t=0` and the initial displacement and velocity.  
  - `m` → known mass.

- **Loss components:**  
  1. **Residual loss:** Enforce the differential equation at collocation points.  
  2. **Data loss:** Match model predictions to sensor measurements.  
  3. **Initial condition loss:** Ensure the model satisfies the initial displacement and velocity.

- **Derivatives:**  
  Use `torch.autograd.grad` to compute the first and second derivatives of `x(t)` w.r.t. time.

- **Combine losses:**  
  Sum the residual, data, and IC terms. The reference solution uses equal weights; a weighted sum can be used to balance the contributions.

> 💡 Hint: Start by computing the model output at each set of points, then its derivatives, and finally the squared errors for each loss term.
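
Before implementing the loss, it can help to see what the residual term looks like for a function that already satisfies the ODE. The sketch below is a sanity check under stated assumptions: it uses the closed-form underdamped solution with example values and central finite differences in place of autograd, and `underdamped_solution` is a hypothetical helper name. A well-trained network should drive the same quantity toward zero.

```python
import numpy as np

m, c, k = 1.0, 0.5, 4.0   # example values (assumed for illustration)
x0, v0 = 1.0, 0.0

def underdamped_solution(t):
    """Closed-form x(t) for the underdamped case (c**2 < 4*m*k)."""
    wn = np.sqrt(k / m)
    zeta = c / (2 * np.sqrt(m * k))
    wd = wn * np.sqrt(1 - zeta**2)
    amp_sin = (v0 + zeta * wn * x0) / wd
    return np.exp(-zeta * wn * t) * (x0 * np.cos(wd * t) + amp_sin * np.sin(wd * t))

t = np.linspace(0.0, 10.0, 2001)
h = t[1] - t[0]
x = underdamped_solution(t)

# Central differences on interior points approximate x'(t) and x''(t)
x_t = (x[2:] - x[:-2]) / (2 * h)
x_tt = (x[2:] - 2 * x[1:-1] + x[:-2]) / h**2

# Residual of m x'' + c x' + k x = 0 and its mean square (the residual loss)
residual = m * x_tt + c * x_t + k * x[1:-1]
l_res = np.mean(residual**2)
print(f"mean squared ODE residual: {l_res:.2e}")
```

If the wrong `c` or `k` is inserted into the residual, this value grows noticeably, which is the signal the inverse problem exploits.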


In [None]:
# 🧩 Exercise: Implement the PINN Loss Function 
def loss_fn(model, c, k,
            t_coll, t_obs, y_obs,
            t_ic, x0_t, v0_t,
            m):
    """
    Compute the total loss for a damped oscillator PINN.

    The loss combines:
        (1) Physics residual loss:   m x'' + c x' + k x = 0
        (2) Data loss:               match observed data
        (3) Initial condition loss:  enforce x(0) = x0, x'(0) = v0
    """

    # ---[TODO] 1. Physics residual loss ---
    x_pred, x_t, x_tt = __?__   # Fill in: use model.compute_derivatives(...)
    res = __?__                 # Fill in: m*x_tt + c*x_t + k*x_pred
    l_res = __?__               # Fill in: mean squared residual

    # ---[TODO] 2. Data fitting loss ---
    y_pred = __?__              # Fill in: model prediction at t_obs
    l_data = __?__              # Fill in: mean squared data loss

    # ---[TODO] 3. Initial condition loss ---
    x_ic, x_t_ic, _ = __?__     # Fill in: model.compute_derivatives(t_ic)
    l_ic = __?__                # Fill in: IC loss for displacement and velocity

    # ---[TODO] 4. Total loss ---
    loss = __?__                # Fill in: combine all losses

    return loss, l_res, l_data, l_ic, c, k

#### Solution

In [None]:
# solution
def loss_fn(model, c, k, t_coll, t_obs, y_obs, t_ic, x0_t, v0_t, m):
    # --- 1. Physics residual loss ---
    x_pred, x_t, x_tt = model.compute_derivatives(t_coll)
    # Damped oscillator residual: m*x_tt + c*x_t + k*x = 0
    res = m * x_tt + c * x_t + k * x_pred
    l_res = torch.mean(res**2)

    # --- 2. Data fitting loss ---
    y_pred = model(t_obs)
    l_data = torch.mean((y_pred - y_obs) ** 2)

    # --- 3. Initial condition loss ---
    x_ic, x_t_ic, _ = model.compute_derivatives(t_ic)
    l_ic = torch.mean((x_ic - x0_t) ** 2) + torch.mean((x_t_ic - v0_t) ** 2)

    # --- 4. Total loss ---
    loss = l_res + l_data + l_ic

    return loss, l_res, l_data, l_ic, c, k

### 3.5 Train model

In [None]:
# --- 6. Synthetic data for PINN training ---
torch.manual_seed(42)
np.random.seed(42)  # the noise and collocation points below are drawn with NumPy

# System parameters
m, c_true, k_true = 1.0, 0.5, 2.0
x0, v0 = 1.0, 0.0
T = 10.0
noise_scale = 0.005

# Time grids
t_plot = np.linspace(0, T, 400)
t_obs = np.linspace(0, T, 21)

# Ground truth and noisy observations
x_plot = damped_oscillator_solution_np(t_plot, m, c_true, k_true, x0, v0)
y_obs = damped_oscillator_solution_np(t_obs, m, c_true, k_true, x0, v0)
y_obs += np.random.normal(scale=noise_scale, size=y_obs.shape)

# Convert to PyTorch tensors
device = torch.device("cpu")
t_obs_t, y_obs_t = [torch.tensor(arr.reshape(-1,1), dtype=torch.float32, device=device)
                     for arr in (t_obs, y_obs)]

# Collocation and initial condition points
t_coll = torch.tensor(np.random.uniform(0, T, 120).reshape(-1,1), dtype=torch.float32, device=device)
t_ic = torch.tensor([[0.0]], dtype=torch.float32, device=device)
x0_t = torch.tensor([[x0]], dtype=torch.float32, device=device)
v0_t = torch.tensor([[v0]], dtype=torch.float32, device=device)



In [None]:
# --- 7. 🧩 PINN initialization --- 
model = PINN(hidden=64, layers=3).to(device)
c_un = __?__  # 🧩 [TODO] define optimizable parameter for c
k_un = __?__  # 🧩 [TODO] define optimizable parameter for k
params = list(model.parameters()) + [c_un, k_un]

#### Solution

In [None]:
# --- 7. PINN initialization ---
model = PINN(hidden=64, layers=3).to(device)
c_un = nn.Parameter(torch.tensor(0.2, dtype=torch.float32, device=device))
k_un = nn.Parameter(torch.tensor(2.0, dtype=torch.float32, device=device))
params = list(model.parameters()) + [c_un, k_un]


In [None]:
# --- 9. Training loop (provided) ---
t_coll.requires_grad_(True)  # Needed to compute derivatives for PDE residual
t_ic.requires_grad_(True)    # Needed for IC derivative

opt = torch.optim.AdamW(params, lr=1e-2)
n_epochs = 2000

loss_history = [] # for logging
for epoch in range(1, n_epochs + 1):
    opt.zero_grad()  # Reset gradients

    # Compute total loss and components
    loss, lres, ldata, lic, c_val, k_val = loss_fn(
        model, c_un, k_un, t_coll, t_obs_t, y_obs_t, t_ic, x0_t, v0_t, m
    )

    loss.backward()  # Backpropagation
    opt.step()       # Update model and parameters

    loss_history.append(loss.item()) # for logging the loss curve

    # Print progress every 200 epochs
    if epoch == 1 or epoch % 200 == 0 or epoch == n_epochs:
        print(
            f"Epoch {epoch:4d} | "
            f"Total {loss.item():.4e} | "
            f"Res {lres.item():.4e} | "
            f"Data {ldata.item():.4e} | "
            f"IC {lic.item():.4e} | "
            f"c {c_val.item():.4f} | "
            f"k {k_val.item():.4f}"
        )

# plot losses
plt.plot(loss_history)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.yscale('log')
plt.title('Loss Curve')
plt.show()

In [None]:
# --- 10. Evaluation and visualization ---
model.eval()
with torch.no_grad():
    t_plot_t = torch.tensor(t_plot.reshape(-1, 1), dtype=torch.float32, device=device)
    x_pred = model(t_plot_t).cpu().numpy().flatten()
    learned_c = c_un.item()
    learned_k = k_un.item()

print(f"\nTrue parameters:   c = {c_true:.4f}, k = {k_true:.4f}")
print(f"Learned parameters: c = {learned_c:.4f}, k = {learned_k:.4f}")

plt.figure(figsize=(8, 4))
plt.plot(t_plot, x_plot, label='True solution $x(t)$', linewidth=2)
plt.plot(
    t_plot, x_pred, linestyle='--', label='PINN prediction $\\hat{x}(t)$', linewidth=2
)
plt.scatter(
    t_obs, y_obs, label='Sensor Data', color='red', edgecolor='k', s=40, zorder=5
)
plt.xlabel('Time $t$')
plt.ylabel('Displacement $x(t)$')
plt.title('PINN inference of $c$ and $k$')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()

### 3.6 Explore Transfer - Can Transfer Learning Be Used to Find the Solution Faster?

**Idea:** Instead of training a PINN from scratch every time, we can **reuse a previously trained model** on a similar problem and **fine-tune** it for the new parameters.  

- **Fine-tuning** means taking a model that already learned general features of the system (e.g., dynamics of the oscillator) and updating its weights slightly to adapt to a new scenario (e.g., different damping `c` or stiffness `k`).  
- This often **converges faster** and requires **fewer collocation points** or epochs compared to training from scratch.  

**Hands-on opportunity for students:**  
- Load a pre-trained PINN for one set of parameters.  
- Update `c` and `k` (or some layers of the network) and continue training on the new problem.  
- Experiment with how many layers or parameters to freeze, and observe how it affects **speed and accuracy**.
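
One possible way to freeze layers is sketched below. It is a minimal illustration under stated assumptions: the `pretrained` network here is an untrained stand-in with the same layout as the tutorial's model, and the choice to retrain only the last layer, as well as the learning rate, are arbitrary examples.

```python
import copy
import torch
from torch import nn

# Hypothetical stand-in for a network trained on the first parameter set
pretrained = nn.Sequential(
    nn.Linear(1, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, 1),
)

# Copy the model so the original weights stay available for comparison
finetuned = copy.deepcopy(pretrained)

# Freeze everything, then re-enable gradients for the final linear layer only
for param in finetuned.parameters():
    param.requires_grad_(False)
for param in finetuned[-1].parameters():
    param.requires_grad_(True)

# Only parameters that still require gradients go to the optimizer
trainable = [p for p in finetuned.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)

n_trainable = sum(p.numel() for p in trainable)
n_total = sum(p.numel() for p in finetuned.parameters())
print(f"trainable parameters: {n_trainable} of {n_total}")
# prints: trainable parameters: 65 of 4353
```

With the tutorial's `PINN` class the layers live in `model.net`, so the same loops would apply to `model.net[-1]`; the physical parameters `c_un` and `k_un` would still need to be added to the list handed to the optimizer.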

In [None]:
# --- 6. Synthetic data for PINN training ---
torch.manual_seed(42)
np.random.seed(42)  # the noise and collocation points below are drawn with NumPy

# System parameters
m, c_true, k_true = 1.0, 0.5, 3.0 # <- ⚠️ Try changing this
x0, v0 = 1.0, 0.0 # <- ⚠️ Try changing this
T = 10.0
noise_scale = 0.005

# Time grids
t_plot = np.linspace(0, T, 400)
t_obs = np.linspace(0, T, 21)

# Ground truth and noisy observations
x_plot = damped_oscillator_solution_np(t_plot, m, c_true, k_true, x0, v0)
y_obs = damped_oscillator_solution_np(t_obs, m, c_true, k_true, x0, v0)
y_obs += np.random.normal(scale=noise_scale, size=y_obs.shape)

# Convert to PyTorch tensors
device = torch.device("cpu")
t_obs_t, y_obs_t = [torch.tensor(arr.reshape(-1,1), dtype=torch.float32, device=device)
                     for arr in (t_obs, y_obs)]

# Collocation and initial condition points
t_coll = torch.tensor(np.random.uniform(0, T, 120).reshape(-1,1), dtype=torch.float32, device=device)
t_ic = torch.tensor([[0.0]], dtype=torch.float32, device=device)
x0_t = torch.tensor([[x0]], dtype=torch.float32, device=device)
v0_t = torch.tensor([[v0]], dtype=torch.float32, device=device)

In [None]:
# --- 7. PINN initialization ---
# ⚠️ We skip this part to use previously trained model, but we save previous loss curve
# ⚠️ Alternatively, try running it with a model initialized from scratch

#model = PINN(hidden=64, layers=3).to(device)
#c_un = nn.Parameter(torch.tensor(0.2, dtype=torch.float32, device=device))
#k_un = nn.Parameter(torch.tensor(2.0, dtype=torch.float32, device=device))
#params = list(model.parameters()) + [c_un, k_un]

loss_pre_trained = loss_history

In [None]:
# --- 9. Training loop (provided) ---
t_coll.requires_grad_(True)  # Needed to compute derivatives for PDE residual
t_ic.requires_grad_(True)    # Needed for IC derivative

opt = torch.optim.AdamW(params, lr=1e-2)
n_epochs = 2000

loss_history = [] # for logging
for epoch in range(1, n_epochs + 1):
    opt.zero_grad()  # Reset gradients

    # Compute total loss and components
    loss, lres, ldata, lic, c_val, k_val = loss_fn(
        model, c_un, k_un, t_coll, t_obs_t, y_obs_t, t_ic, x0_t, v0_t, m
    )

    loss.backward()  # Backpropagation
    opt.step()       # Update model and parameters

    loss_history.append(loss.item()) # for logging the loss curve

    # Print progress every 200 epochs
    if epoch == 1 or epoch % 200 == 0 or epoch == n_epochs:
        print(
            f"Epoch {epoch:4d} | "
            f"Total {loss.item():.4e} | "
            f"Res {lres.item():.4e} | "
            f"Data {ldata.item():.4e} | "
            f"IC {lic.item():.4e} | "
            f"c {c_val.item():.4f} | "
            f"k {k_val.item():.4f}"
        )

# plot losses
plt.plot(loss_pre_trained, label='Pre-trained')
plt.plot(loss_history, label='Fine-tuned')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.yscale('log')
plt.title('Loss Curve')
plt.legend()
plt.show()


In [None]:
# --- 10. Evaluation and visualization ---
model.eval()
with torch.no_grad():
    t_plot_t = torch.tensor(t_plot.reshape(-1, 1), dtype=torch.float32, device=device)
    x_pred = model(t_plot_t).cpu().numpy().flatten()
    learned_c = c_un.item()
    learned_k = k_un.item()

print(f"\nTrue parameters:   c = {c_true:.4f}, k = {k_true:.4f}")
print(f"Learned parameters: c = {learned_c:.4f}, k = {learned_k:.4f}")

plt.figure(figsize=(8, 4))
plt.plot(t_plot, x_plot, label='True solution $x(t)$', linewidth=2)
plt.plot(
    t_plot, x_pred, linestyle='--', label='PINN prediction $\\hat{x}(t)$', linewidth=2
)
plt.scatter(
    t_obs, y_obs, label='Sensor Data', color='red', edgecolor='k', s=40, zorder=5
)
plt.xlabel('Time $t$')
plt.ylabel('Displacement $x(t)$')
plt.title('PINN inference of $c$ and $k$')
plt.legend()
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()