# Pure Neural Network vs Physics Informed Neural Network

Welcome back! In this notebook we compare two learning strategies for the mass–spring dataset created earlier.

- A **baseline neural network** that only tries to match the noisy data points.
- A **Physics Informed Neural Network (PINN)** that matches the data *and* respects the governing differential equation.

Our goal is to see how the extra physics guidance helps the model produce better predictions, especially when we only supply a few noisy measurements.

## 1. What are these models?

Before coding, here is a quick reminder in plain language:

- A **standard neural network** is a flexible function built from layers of simple mathematical operations. During training we show it input–output pairs (in our case time and displacement) and adjust its weights to reduce the prediction error.
- A **Physics Informed Neural Network (PINN)** uses the same layers, but it also gets told about the physical law. Besides fitting the data, it tries to make the differential equation hold true at many time points. We penalize it whenever the equation is violated.

By training both models side by side we can visualize how physics knowledge improves generalization.
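In symbols, the two objectives described above can be sketched as follows. The notation is ours and not part of the code: $x_\theta$ is the network, $(t_i, x_i)$ are the $N$ noisy training points, and $\tau_j$ are the $M$ time points where the equation is checked. The weights $\lambda_{\text{data}}$ and $\lambda_{\text{phys}}$ correspond to `lambda_data` and `lambda_phys` used later in this notebook, and $m$, $c$, $k$ are the mass, damping, and stiffness parameters.

$$
\mathcal{L}_{\text{baseline}}(\theta) = \frac{1}{N}\sum_{i=1}^{N}\bigl(x_\theta(t_i) - x_i\bigr)^2
$$

$$
\mathcal{L}_{\text{PINN}}(\theta) = \lambda_{\text{data}}\,\mathcal{L}_{\text{baseline}}(\theta) + \lambda_{\text{phys}}\,\frac{1}{M}\sum_{j=1}^{M}\bigl(m\,\ddot{x}_\theta(\tau_j) + c\,\dot{x}_\theta(\tau_j) + k\,x_\theta(\tau_j)\bigr)^2
$$

The second term is zero exactly when the network satisfies $m\ddot{x} + c\dot{x} + kx = 0$ at every checked time point.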

## 2. Imports and helper functions

We import PyTorch for building the neural networks, NumPy for data handling, Matplotlib for plotting, and SciPy to regenerate the dataset from scratch. We also add the project root to `sys.path` so that we can import `ModelConfig` and the builder functions for the `TimeMLP` model defined in `src/models.py`.

In [None]:
import sys
from pathlib import Path

import numpy as np
import torch
from torch import nn
from torch.optim import Adam
from torch.utils.data import DataLoader, TensorDataset
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp


def find_project_root(start: Path) -> Path:
    '''Locate the project directory regardless of where the notebook is executed from.'''
    for candidate in [start, *start.parents]:
        if (candidate / 'figures').exists() and (candidate / 'src').exists():
            return candidate
    raise FileNotFoundError('Could not locate the project root. Run the notebook from inside the repository folder.')

PROJECT_ROOT = find_project_root(Path.cwd())
if str(PROJECT_ROOT) not in sys.path:
    sys.path.append(str(PROJECT_ROOT))
src_path = PROJECT_ROOT / 'src'
if str(src_path) not in sys.path:
    sys.path.append(str(src_path))

from src.models import build_baseline_model, build_pinn_model, ModelConfig

plt.style.use('seaborn-v0_8')
plt.rcParams['figure.figsize'] = (10, 5)

device = torch.device('cpu')
torch.manual_seed(0)
np.random.seed(0)
print(f'Using device: {device}')


## 3. Regenerate the dataset inside this notebook

To keep everything lightweight we rebuild the synthetic dataset right here using the same helper logic as the first notebook. This avoids saving `.npz` files and makes it easy to tweak the parameters on the fly.
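As an optional sanity check on the regenerated data, the numerical solution can be compared with the textbook closed-form solution of an underdamped oscillator. The sketch below assumes the same parameters and initial conditions used in this notebook ($m = 1$, $c = 0.1$, $k = 1$, $x(0) = 1$, $\dot{x}(0) = 0$). The helper names `analytic_displacement` and `rhs` and the tight solver tolerances are our own choices for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

m, c, k = 1.0, 0.1, 1.0  # same values as in this notebook


def analytic_displacement(t, m=m, c=c, k=k):
    """Closed-form x(t) for m*x'' + c*x' + k*x = 0 with x(0)=1, x'(0)=0 (underdamped only)."""
    omega0 = np.sqrt(k / m)                    # undamped natural frequency
    zeta = c / (2.0 * np.sqrt(k * m))          # damping ratio
    if zeta >= 1.0:
        raise ValueError("This formula only covers the underdamped case (zeta < 1).")
    omega_d = omega0 * np.sqrt(1.0 - zeta**2)  # damped frequency
    decay = np.exp(-zeta * omega0 * t)
    return decay * (np.cos(omega_d * t) + (zeta * omega0 / omega_d) * np.sin(omega_d * t))


def rhs(t, y):
    """First-order form of the ODE: y = [x, v]."""
    x, v = y
    return [v, -(c / m) * v - (k / m) * x]


t_check = np.linspace(0.0, 10.0, 1000)
# Tight tolerances (an assumption for this check) so solver error does not dominate the comparison.
numeric = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0], t_eval=t_check, rtol=1e-9, atol=1e-12)
max_abs_diff = float(np.max(np.abs(numeric.y[0] - analytic_displacement(t_check))))
print(f"Max |numeric - analytic| = {max_abs_diff:.2e}")
```

If the two curves agree closely, the ODE right-hand side and initial conditions are set up as intended.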


In [None]:

m = 1.0
c = 0.1
k = 1.0

def generate_dataset(
    m_value: float = m,
    c_value: float = c,
    k_value: float = k,
    noise_level: float = 0.02,
    t_start: float = 0.0,
    t_end: float = 10.0,
    num_points: int = 1000,
    seed: int | None = 0,
):
    """Return clean and noisy displacement data for the mass-spring system."""
    time_eval = np.linspace(t_start, t_end, num_points)

    def system(t, y):
        x, v = y
        dxdt = v
        dvdt = -(c_value / m_value) * v - (k_value / m_value) * x
        return [dxdt, dvdt]

    solution = solve_ivp(system, (t_start, t_end), [1.0, 0.0], t_eval=time_eval)
    if not solution.success:
        raise RuntimeError('ODE solver failed when regenerating data.')

    x_true = solution.y[0]
    velocity = solution.y[1]

    rng = np.random.default_rng(seed)
    signal_amplitude = np.max(np.abs(x_true))
    noise_std = noise_level * signal_amplitude
    x_noisy = x_true + rng.normal(scale=noise_std, size=x_true.shape)

    return time_eval, x_true, x_noisy, velocity


time, x_clean, x_noisy, velocity = generate_dataset()

# Select 40 evenly spaced training points from the first 5 seconds
train_mask = time <= 5.0
train_time = time[train_mask]
train_indices = np.linspace(0, train_time.size - 1, 40, dtype=int)
train_time_subset = train_time[train_indices]
train_observations = x_noisy[train_mask][train_indices]

# Full test set
test_time = time
x_test_true = x_clean

print(f'Training points: {train_time_subset.shape[0]} | Test points: {test_time.shape[0]}')


### Prepare tensors for PyTorch

We convert the NumPy arrays to PyTorch tensors with shape `(N, 1)` so that they play nicely with our multilayer perceptron. Working with column vectors keeps the math clear.
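The column-vector shape matters more than it may seem. The small sketch below uses toy numbers that are not the notebook's data. It shows how subtracting a flat `(N,)` target from an `(N, 1)` prediction broadcasts silently to an `(N, N)` matrix and changes the loss value.

```python
import torch

targets_flat = torch.tensor([1.0, 2.0, 3.0, 4.0])  # shape (4,)
targets_col = targets_flat.view(-1, 1)             # shape (4, 1)
predictions = targets_col.clone()                  # a "perfect" model output, shape (4, 1)

# (4, 1) minus (4,) broadcasts to (4, 4): every prediction is compared with every target.
bad_diff = predictions - targets_flat
# (4, 1) minus (4, 1) stays elementwise, which is what we want.
good_diff = predictions - targets_col

bad_mse = float(bad_diff.pow(2).mean())
good_mse = float(good_diff.pow(2).mean())

print(tuple(bad_diff.shape), bad_mse)    # (4, 4) 2.5
print(tuple(good_diff.shape), good_mse)  # (4, 1) 0.0
```

`nn.MSELoss` emits a warning when the input and target sizes differ, but it still broadcasts, so reshaping both arrays with `.view(-1, 1)` up front avoids the problem.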

In [None]:
def to_tensor(array: np.ndarray) -> torch.Tensor:
    return torch.tensor(array, dtype=torch.float32, device=device).view(-1, 1)


t_train = to_tensor(train_time_subset)
x_train = to_tensor(train_observations)

t_test = to_tensor(test_time)
x_test = to_tensor(x_test_true)

print('Training tensor shape:', t_train.shape)
print('Test tensor shape:', t_test.shape)

## 4. Baseline neural network training

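We build the `TimeMLP` defined in `src/models.py` through `build_baseline_model`, with two hidden layers of 64 units and tanh activations. The network is small and trains quickly on a CPU. We minimize the mean squared error between the predictions and the noisy training data, using Adam with mini-batches of 20 points for 1000 epochs.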
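The contents of `src/models.py` are not shown in this notebook. For readers without the file at hand, here is a minimal stand-in for what such a network might look like. The class name `SketchTimeMLP` and its layout are hypothetical and only mirror the settings `hidden_layers=(64, 64)` and `activation='tanh'`; the project's actual `TimeMLP` may differ.

```python
import torch
from torch import nn


class SketchTimeMLP(nn.Module):
    """Hypothetical stand-in (not the project's TimeMLP): maps time (N, 1) to displacement (N, 1)."""

    def __init__(self, hidden_layers=(64, 64)):
        super().__init__()
        layers = []
        in_features = 1  # the only input is time
        for width in hidden_layers:
            layers += [nn.Linear(in_features, width), nn.Tanh()]
            in_features = width
        layers.append(nn.Linear(in_features, 1))  # single output: displacement
        self.net = nn.Sequential(*layers)

    def forward(self, t):
        return self.net(t)


sketch_model = SketchTimeMLP()
num_parameters = sum(p.numel() for p in sketch_model.parameters())
output = sketch_model(torch.linspace(0.0, 5.0, 40).view(-1, 1))
print(tuple(output.shape), num_parameters)
```

A smooth activation such as tanh is a sensible choice when the same architecture is reused for a PINN. The physics loss needs the second time derivative of the output, and that derivative is zero almost everywhere for a ReLU network.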

In [None]:
config = ModelConfig(hidden_layers=(64, 64), activation='tanh')
baseline_model = build_baseline_model(config).to(device)

optimizer = Adam(baseline_model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# DataLoader helps with batching even though the dataset is small
train_dataset = TensorDataset(t_train, x_train)
train_loader = DataLoader(train_dataset, batch_size=20, shuffle=True)

num_epochs = 1000
loss_history = []

for epoch in range(1, num_epochs + 1):
    epoch_loss = 0.0
    for batch_t, batch_x in train_loader:
        optimizer.zero_grad()
        predictions = baseline_model(batch_t)
        loss = criterion(predictions, batch_x)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * batch_t.size(0)
    epoch_loss /= len(train_dataset)
    loss_history.append(epoch_loss)
    if epoch % 200 == 0 or epoch == 1:
        print(f'Epoch {epoch:4d} | Training MSE: {epoch_loss:.6f}')

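# Evaluate once without tracking gradients and reuse the result for the plot and the metric
with torch.no_grad():
    baseline_test_output = baseline_model(t_test)
baseline_predictions = baseline_test_output.cpu().numpy().flatten()
baseline_mse = float(torch.mean((baseline_test_output - x_test) ** 2).item())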
print(f'Baseline test MSE: {baseline_mse:.6f}')

### Visualize the baseline fit

We plot the baseline network prediction against the true curve and highlight the training points. This reveals how the model behaves outside the region where it saw data.

In [None]:
fig, ax = plt.subplots()
ax.plot(test_time, x_test_true, label='True solution', linewidth=2)
ax.plot(test_time, baseline_predictions, label='Baseline NN prediction', linestyle='--')
ax.scatter(train_time_subset, train_observations, label='Training data (noisy)', color='black', s=40, zorder=5)
ax.axvspan(0, 5, color='gray', alpha=0.1, label='Training region')
ax.set_xlabel('Time [s]')
ax.set_ylabel('Displacement [m]')
ax.set_title('Baseline neural network vs. true motion')
ax.legend()
fig.tight_layout()
fig.savefig(PROJECT_ROOT / 'figures' / 'baseline_prediction.svg', dpi=150, format='svg')
plt.show()


## 5. Train the Physics Informed Neural Network

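The PINN uses the same architecture but receives additional feedback from the differential equation $m\ddot{x} + c\dot{x} + kx = 0$. We evaluate the residual of this ODE (its left-hand side, computed with automatic differentiation) at 200 evenly spaced time points in $[0, 10]$ s and penalize its mean square.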
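The residual relies on nested calls to `torch.autograd.grad` to obtain first and second time derivatives. A quick way to build trust in that pattern is to apply it to a function whose derivatives are known. The sketch below uses $x(t) = \sin(2t)$ as a test function of our own choosing, unrelated to the dataset, and a hypothetical helper `time_derivatives`.

```python
import torch


def time_derivatives(func, t):
    """Return x, dx/dt and d2x/dt2 of an elementwise function of time via autograd."""
    t = t.clone().detach().requires_grad_(True)
    x = func(t)
    # grad_outputs=ones sums the outputs; each output depends only on its own time
    # point, so the result is the pointwise derivative.
    dx_dt = torch.autograd.grad(x, t, grad_outputs=torch.ones_like(x), create_graph=True)[0]
    d2x_dt2 = torch.autograd.grad(dx_dt, t, grad_outputs=torch.ones_like(dx_dt))[0]
    return x.detach(), dx_dt.detach(), d2x_dt2.detach()


t_check = torch.linspace(0.0, 10.0, 200, dtype=torch.float64).view(-1, 1)
x, dx_dt, d2x_dt2 = time_derivatives(lambda t: torch.sin(2.0 * t), t_check)

# Compare with the known derivatives of sin(2t).
first_error = float((dx_dt - 2.0 * torch.cos(2.0 * t_check)).abs().max())
second_error = float((d2x_dt2 + 4.0 * torch.sin(2.0 * t_check)).abs().max())

# With the notebook's parameters, sin(2t) does not solve the ODE, so its residual is nonzero.
m, c, k = 1.0, 0.1, 1.0
residual = m * d2x_dt2 + c * dx_dt + k * x
print(first_error, second_error, float(residual.abs().max()))
```

In the training loop the second `grad` call also needs `create_graph=True`, because the loss must be backpropagated through the second derivative. It is omitted here since nothing is optimized.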

In [None]:
pinn_model = build_pinn_model(config).to(device)
optimizer_pinn = Adam(pinn_model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

lambda_data = 1.0
lambda_phys = 1.0

# Physics points cover the full time range to encourage good behavior everywhere
t_physics = torch.linspace(0.0, 10.0, steps=200, device=device).view(-1, 1)

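def physics_residual(model: nn.Module, t: torch.Tensor) -> torch.Tensor:
    """Return the ODE residual m*x'' + c*x' + k*x of the model evaluated at times t."""
    # Work on a copy of t that tracks gradients so we can differentiate with respect to time
    t = t.clone().detach().requires_grad_(True)
    x_pred = model(t)
    # create_graph=True keeps the derivatives differentiable so the loss can backpropagate through them
    dx_dt = torch.autograd.grad(x_pred, t, grad_outputs=torch.ones_like(x_pred), create_graph=True)[0]
    d2x_dt2 = torch.autograd.grad(dx_dt, t, grad_outputs=torch.ones_like(dx_dt), create_graph=True)[0]
    residual = m * d2x_dt2 + c * dx_dt + k * x_pred
    return residual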

num_epochs_pinn = 1500

for epoch in range(1, num_epochs_pinn + 1):
    optimizer_pinn.zero_grad()
    pred_data = pinn_model(t_train)
    data_loss = criterion(pred_data, x_train)

    residual = physics_residual(pinn_model, t_physics)
    physics_loss = torch.mean(residual ** 2)

    loss = lambda_data * data_loss + lambda_phys * physics_loss
    loss.backward()
    optimizer_pinn.step()

    if epoch % 300 == 0 or epoch == 1:
        print(
            f'Epoch {epoch:4d} | Total loss: {loss.item():.6f} '
            f'| Data: {data_loss.item():.6f} | Physics: {physics_loss.item():.6f}'
        )

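# Evaluate once without tracking gradients and reuse the result for the plot and the metric
with torch.no_grad():
    pinn_test_output = pinn_model(t_test)
pinn_predictions = pinn_test_output.cpu().numpy().flatten()
pinn_mse = float(torch.mean((pinn_test_output - x_test) ** 2).item())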
print(f'PINN test MSE: {pinn_mse:.6f}')

### Compare all curves together

The next plot overlays the true motion, the baseline network, and the PINN. We again highlight where training data were available.

In [None]:
fig, ax = plt.subplots()
ax.plot(test_time, x_test_true, label='True solution', linewidth=2)
ax.plot(test_time, baseline_predictions, label='Baseline NN', linestyle='--')
ax.plot(test_time, pinn_predictions, label='PINN', linestyle='-')
ax.scatter(train_time_subset, train_observations, label='Training data (noisy)', color='black', s=40, zorder=5)
ax.axvspan(0, 5, color='gray', alpha=0.1, label='Training region')
ax.set_xlabel('Time [s]')
ax.set_ylabel('Displacement [m]')
ax.set_title('Baseline NN vs PINN on the mass–spring system')
ax.legend()
fig.tight_layout()
fig.savefig(PROJECT_ROOT / 'figures' / 'model_comparison.svg', dpi=150, format='svg')
plt.show()


## 6. Quantitative comparison

To summarize the experiment we compute the mean squared error (MSE) on the full test set for both models and present it in a small table.

In [None]:
results = [
    ('Baseline NN', baseline_mse),
    ('PINN', pinn_mse),
]

print('Model           | Test MSE')
print('---------------------------')
for name, mse in results:
    print(f'{name:<15} | {mse:.6f}')


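We expect the PINN to achieve a lower test error, because its physics loss is evaluated across the full 10 s range and therefore constrains the prediction even beyond the 5 s training window. The baseline network has no knowledge of the differential equation, so nothing constrains it once it leaves the region with data points, and its prediction typically drifts away from the true motion there. Check the table and the comparison plot above to confirm this for your run.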
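A single MSE over the full test set mixes the region with training data and the region without it. One way to separate the two is sketched below. The helper `region_mse` is our own addition, and the toy arrays are made up purely to exercise it. They are not results from this notebook.

```python
import numpy as np


def region_mse(time, prediction, truth, split_time=5.0):
    """Return (MSE for time <= split_time, MSE for time > split_time)."""
    time = np.asarray(time)
    squared_error = (np.asarray(prediction) - np.asarray(truth)) ** 2
    inside = time <= split_time
    return float(squared_error[inside].mean()), float(squared_error[~inside].mean())


# Toy data: the "prediction" is exact up to t = 5 and stays flat afterwards.
toy_time = np.linspace(0.0, 10.0, 101)
toy_truth = np.cos(toy_time)
toy_prediction = np.where(toy_time <= 5.0, toy_truth, toy_truth[50])

mse_inside, mse_outside = region_mse(toy_time, toy_prediction, toy_truth)
print(f"t <= 5 s: {mse_inside:.6f} | t > 5 s: {mse_outside:.6f}")
```

In this notebook the same helper could be called as `region_mse(test_time, baseline_predictions, x_test_true)` and `region_mse(test_time, pinn_predictions, x_test_true)` to see how much of each model's error comes from extrapolation.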

## 7. Takeaways

- Physics guidance acts like a smart regularizer: the PINN stays close to the real motion even with noisy, sparse data.
- The baseline neural network can memorize the training points but struggles to extrapolate.
- Both models share the same architecture; the difference lies in the training objective.
- When you know the governing equation, incorporating it into the loss can substantially improve predictions outside the range covered by data.

Feel free to tweak the parameters, the number of training points, or the loss weights to see how the behavior changes!