# Physics Informed FNO for Magnetohydrodynamics Equations

In this notebook, we study physics-informed, data-driven modeling with numerical derivatives by applying these techniques to the incompressible magnetohydrodynamics (MHD) equations, which represent an incompressible fluid in the presence of a magnetic field $\mathbf{B}$. Our models are built using a Fourier Neural Operator (FNO) and a Tensor Factorized FNO (tFNO), and are trained in conjunction with the PDEs that represent our system.

#### Learning Outcomes
* Apply physics constraints to neural networks
* Apply the Fourier Neural Operator to physics-based problems
* Define PDEs with PhysicsNeMo
* Train physics-informed neural operators (PINOs) in PhysicsNeMo
* Use data-driven modeling to build computationally efficient surrogates for physics problems

## Problem Overview

To examine the properties of PINOs with multiple complex equations, we examine the ability of the networks to reproduce the incompressible magnetohydrodynamics (MHD) equations, which represent an incompressible fluid in the presence of a magnetic field $\mathbf{B}$. These equations arise in several astrophysical phenomena, including black hole accretion and binary neutron star mergers.

These equations are given by:

$$\begin{align}
\partial_t \mathbf{u}+\mathbf{u} \cdot \nabla \mathbf{u} &=
-\nabla \left( p+\frac{B^2}{2} \right)/\rho_0 +\mathbf{B}
\cdot \nabla \mathbf{B}+\nu \nabla^2 \mathbf{u}, \\
\partial_t \mathbf{B}+\mathbf{u} \cdot \nabla \mathbf{B} &=
\mathbf{B} \cdot \nabla \mathbf{u}+\eta \nabla^2 \mathbf{B}, \\
\nabla \cdot \mathbf{u} &= 0, \\
\nabla \cdot \mathbf{B} &= 0,
\end{align}$$
 
where $\mathbf{u}$ is the velocity field, $p$ is the pressure, $B$ is the magnitude of the magnetic field, $\rho_0=1$ is the density of the fluid, $\nu$ is the kinematic viscosity, and $\eta$ is the magnetic resistivity. The first two equations govern the time evolution, and the last two are constraints.


For the magnetic field divergence equation, we can either include it in the loss function or instead evolve the magnetic vector potential $\mathbf{A}$. This quantity is defined such that

$$\begin{align}
\mathbf{B} = \nabla \times \mathbf{A},
\end{align}$$

which ensures that the divergence of $\mathbf{B}$ is zero. By evolving the magnetic vector potential $\mathbf{A}$ instead of the magnetic field $\mathbf{B}$, we obtain a new evolution equation for $\mathbf{A}$. This equation is given by

$$\begin{align}
\partial_t \mathbf{A} + \mathbf{u} \cdot \nabla \mathbf{A}=\eta \nabla^2 \mathbf{A}.
\end{align}$$
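
In the 2D setting used in this notebook, the vector potential reduces to a single out-of-plane component, $\mathbf{A} = A\,\hat{\mathbf{z}}$. As a short worked check (the component names $B_x$, $B_y$, and $A$ are chosen here to mirror the variables `Bx`, `By`, and `A` in the code), the curl gives the in-plane field, and its divergence vanishes identically:

$$\begin{align}
B_x &= \partial_y A, \qquad B_y = -\partial_x A, \\
\nabla \cdot \mathbf{B} &= \partial_x \partial_y A - \partial_y \partial_x A = 0.
\end{align}$$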

Scripts for evolving either the magnetic field $\mathbf{B}$ or the vector potential $\mathbf{A}$ are included.


## Data Creation
To train our model, a representative dataset is first created that gives enough coverage of the solution space to train a surrogate model to make predictions on new data points. To obtain interesting results without additional computational difficulty, we solve the equations in 2D with periodic boundary conditions. This results in solving a total of 3 evolution PDEs at each time step: two for the velocity evolution and one for the magnetic vector potential.

The solution space to this problem can be obtained numerically by solving the PDEs above with a numerical solver such as `dedalus`. To generate this data, `dedalus` is used to simulate a 2D periodic incompressible MHD flow with a passive tracer field for visualization. The initial flow is in the x-direction and depends only on z. The problem is non-dimensionalized using the shear-layer spacing and velocity jump, so the resulting viscosity, magnetic resistivity, and tracer diffusivity are related to the Reynolds number $\text{Re}$, the magnetic Reynolds number $\text{Re}_M$, and the Schmidt number $\text{Sc}$ as:

$$\begin{align}
\nu &= \frac{1}{\text{Re}} \\
\eta &= \frac{1}{\text{Re}_M} \\
D &= \frac{\nu}{\text{Sc}}
\end{align}$$

The initial data field for running the simulation is produced using the Gaussian Random Field method, in which the radial basis function (RBF) kernel is transformed into Fourier space to obey the desired periodic boundary conditions. Finally, two initial data fields, the vorticity potential and the magnetic potential, are used to guarantee that the initial velocity and magnetic fields are divergence free.
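
The idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the data generation script used for this example: the function names `periodic_grf`, `curl_of_potential`, and `spectral_divergence`, as well as the `length_scale` value, are assumptions made for the sketch. White noise is filtered in Fourier space with a Gaussian spectrum (the Fourier transform of an RBF kernel), and a field is then derived from the resulting potential so that it is divergence free by construction.

```python
import numpy as np


def _wavenumbers(n, L):
    """Angular wavenumber grids (kx, ky) for an n x n periodic box of size L."""
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    return np.meshgrid(k, k, indexing="ij")


def periodic_grf(n, length_scale=0.1, L=1.0, seed=0):
    """Sample a periodic 2D Gaussian random field on an n x n grid."""
    rng = np.random.default_rng(seed)
    kx, ky = _wavenumbers(n, L)
    # The Fourier transform of an RBF kernel is again a Gaussian in wavenumber
    spectrum = np.exp(-0.5 * length_scale**2 * (kx**2 + ky**2))
    # Filtering white noise in Fourier space yields a periodic field
    noise_h = np.fft.fft2(rng.standard_normal((n, n)))
    return np.fft.ifft2(np.sqrt(spectrum) * noise_h).real


def curl_of_potential(psi, L=1.0):
    """Return (d(psi)/dy, -d(psi)/dx), which is divergence free."""
    kx, ky = _wavenumbers(psi.shape[0], L)
    psi_h = np.fft.fft2(psi)
    fx = np.fft.ifft2(1j * ky * psi_h).real
    fy = np.fft.ifft2(-1j * kx * psi_h).real
    return fx, fy


def spectral_divergence(fx, fy, L=1.0):
    """Divergence of a periodic 2D field computed with Fourier derivatives."""
    kx, ky = _wavenumbers(fx.shape[0], L)
    div_h = 1j * kx * np.fft.fft2(fx) + 1j * ky * np.fft.fft2(fy)
    return np.fft.ifft2(div_h).real


# One potential for the velocity and one for the magnetic field
u, v = curl_of_potential(periodic_grf(64, seed=0))
Bx, By = curl_of_potential(periodic_grf(64, seed=1))
print(np.abs(spectral_divergence(u, v)).max(), np.abs(spectral_divergence(Bx, By)).max())
```

Both printed values should be at the level of floating-point round-off, since the divergence of a curl cancels mode by mode in Fourier space.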

## Case Setup
Let's start by importing the required libraries and packages:

```python
import torch
import numpy as np
from sympy import Symbol, Eq, Function, Number

import physicsnemo
from physicsnemo.sym.hydra import instantiate_arch, PhysicsNeMoConfig
from physicsnemo.sym.solver import Solver
from physicsnemo.sym.domain import Domain
from physicsnemo.sym.geometry.primitives_1d import Line1D
from physicsnemo.sym.domain.constraint import (
    PointwiseBoundaryConstraint,
    PointwiseInteriorConstraint,
)
from physicsnemo.sym.domain.validator import PointwiseValidator
from physicsnemo.sym.domain.monitor import PointwiseMonitor
from physicsnemo.sym.key import Key
from physicsnemo.sym.node import Node
from physicsnemo.sym.eq.pde import PDE
```


## Defining our Constraints - Setting up the PDE

Constraints define the objectives for training our model. They house a set of nodes from which a computational graph is built for execution, as well as the loss function. [PhysicsNeMo Sym](https://docs.nvidia.com/deeplearning/physicsnemo/physicsnemo-sym/index.html) provides algorithms and utilities that are used with PhysicsNeMo Core to explicitly physics-inform the model training, and provides the means for intuitively setting up multi-objective problems. The types of constraints used are problem dependent. For this example, we define the following constraints:

Data Loss
* Compare the PINO output to simulation data.

PDE Loss
* Use the known PDEs of the system to penalize violations of its time evolution.

Constraint Loss
* Penalize violations of the constraints from the PDE, specifically the velocity divergence-free condition of equation 3 and the magnetic divergence-free condition of equation 4.

Initial Condition Loss
* Compare the input field to the output at t=0.

Boundary Condition Loss
* Penalize differences in the boundary terms. In our case, we have a periodic boundary constraint.



To begin setting up our constraints, we can start by defining the MHD equations using the `PDE` class from `physicsnemo.sym.eq.pde`. The process of converting our PDEs into a form that is compatible with `PhysicsNeMo` involves defining a class to hold our equations, called `MHD_PDE`, and including each term of the equations. Each variable of the equations is set up as a `Sympy` `Function`, which is then used to create an attribute of our `MHD_PDE` class that holds the final `equations`. 

Because we have elected to solve the equations in two dimensions, the only input variables are x, y, and t, plus a symbol `lap` that stands in for the Laplacian.

In PhysicsNeMo, it is preferable to represent our equations by isolating our target terms on the left and moving the rest of the equation to the right-hand side. To do this, various components of each equation are compartmentalized, and the final set of equations is composed from these parts.

Constant magnitude B: https://physics.stackexchange.com/questions/446375/what-are-the-possible-magnetic-fields-with-constant-magnitude



Once all equations are defined, the terms are arranged such that they are represented with the main terms isolated on the left hand side. 

```python
from physicsnemo.sym.eq.pde import PDE
from sympy import Symbol, Function, Number


class MHD_PDE(PDE):
    """MHD PDEs using PhysicsNeMo Sym"""

    name = "MHD_PDE"

    def __init__(self, nu=1e-4, eta=1e-4, rho0=1.0):

        # x, y, time
        x, y, t, lap = Symbol("x"), Symbol("y"), Symbol("t"), Symbol("lap")

        # make input variables
        input_variables = {"x": x, "y": y, "t": t, "lap": lap}

        # make functions
        u = Function("u")(*input_variables)
        v = Function("v")(*input_variables)
        Bx = Function("Bx")(*input_variables)
        By = Function("By")(*input_variables)
        A = Function("A")(*input_variables)
        # pressure
        ptot = Function("ptot")(*input_variables)

        u_rhs = Function("u_rhs")(*input_variables)
        v_rhs = Function("v_rhs")(*input_variables)
        Bx_rhs = Function("Bx_rhs")(*input_variables)
        By_rhs = Function("By_rhs")(*input_variables)
        A_rhs = Function("A_rhs")(*input_variables)

        # initialize constants
        nu = Number(nu)
        eta = Number(eta)
        rho0 = Number(rho0)

        # set equations
        self.equations = {}

        # u · ∇u
        self.equations["vel_grad_u"] = u * u.diff(x) + v * u.diff(y)
        self.equations["vel_grad_v"] = u * v.diff(x) + v * v.diff(y)
        # B · ∇u
        self.equations["B_grad_u"] = Bx * u.diff(x) + By * u.diff(y)
        self.equations["B_grad_v"] = Bx * v.diff(x) + By * v.diff(y)
        # u · ∇B
        self.equations["vel_grad_Bx"] = u * Bx.diff(x) + v * Bx.diff(y)
        self.equations["vel_grad_By"] = u * By.diff(x) + v * By.diff(y)
        # B · ∇B
        self.equations["B_grad_Bx"] = Bx * Bx.diff(x) + By * Bx.diff(y)
        self.equations["B_grad_By"] = Bx * By.diff(x) + By * By.diff(y)
        # ∇ × (u × B) = u(∇ · B) - B(∇ · u) + B · ∇u − u · ∇B
        self.equations["uBy_x"] = u * By.diff(x) + By * u.diff(x)
        self.equations["uBy_y"] = u * By.diff(y) + By * u.diff(y)
        self.equations["vBx_x"] = v * Bx.diff(x) + Bx * v.diff(x)
        self.equations["vBx_y"] = v * Bx.diff(y) + Bx * v.diff(y)
        # ∇ · B 
        self.equations["div_B"] = Bx.diff(x) + By.diff(y)
        # ∇ · u 
        self.equations["div_vel"] = u.diff(x) + v.diff(y)

        # RHS of MHD equations
        self.equations["u_rhs"] = (
            -self.equations["vel_grad_u"]
            - ptot.diff(x) / rho0
            + self.equations["B_grad_Bx"] / rho0
            + nu * u.diff(lap)
        )
        self.equations["v_rhs"] = (
            -self.equations["vel_grad_v"]
            - ptot.diff(y) / rho0
            + self.equations["B_grad_By"] / rho0
            + nu * v.diff(lap)
        )
        # Bx induction term: ∂_y(u By - v Bx)
        #   = u * By.diff(y) + By * u.diff(y) - v * Bx.diff(y) - Bx * v.diff(y)
        self.equations["Bx_rhs"] = (
            self.equations["uBy_y"] - self.equations["vBx_y"] + eta * Bx.diff(lap)
        )
        self.equations["By_rhs"] = -(
            self.equations["uBy_x"] - self.equations["vBx_x"]
        ) + eta * By.diff(lap)

        self.equations["Du"] = u.diff(t) - u_rhs
        self.equations["Dv"] = v.diff(t) - v_rhs
        self.equations["DBx"] = Bx.diff(t) - Bx_rhs
        self.equations["DBy"] = By.diff(t) - By_rhs

        # Vec potential equations
        self.equations["vel_grad_A"] = u * A.diff(x) + v * A.diff(y)
        self.equations["A_rhs"] = -self.equations["vel_grad_A"] + eta * A.diff(lap)
        self.equations["DA"] = A.diff(t) - A_rhs
```
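
The induction terms `Bx_rhs` and `By_rhs` are built from the 2D curl of $\mathbf{u} \times \mathbf{B}$ rather than from the separate advection terms. As a sanity check, the following standalone SymPy sketch (independent of PhysicsNeMo; the names `induction_x`, `identity_x`, and so on are chosen here for illustration) verifies component by component that this curl equals $\mathbf{u}(\nabla \cdot \mathbf{B}) - \mathbf{B}(\nabla \cdot \mathbf{u}) + \mathbf{B} \cdot \nabla \mathbf{u} - \mathbf{u} \cdot \nabla \mathbf{B}$.

```python
from sympy import Function, symbols

x, y = symbols("x y")
u, v, Bx, By = [Function(name)(x, y) for name in ("u", "v", "Bx", "By")]

# In 2D, u × B has only a z-component: u*By - v*Bx
uxB_z = u * By - v * Bx

# Curl of a z-directed field E: (∂_y E, -∂_x E)
induction_x = uxB_z.diff(y)   # matches uBy_y - vBx_y
induction_y = -uxB_z.diff(x)  # matches -(uBy_x - vBx_x)

div_B = Bx.diff(x) + By.diff(y)
div_u = u.diff(x) + v.diff(y)

# u(∇·B) - B(∇·u) + B·∇u - u·∇B, written out per component
identity_x = (
    u * div_B - Bx * div_u
    + (Bx * u.diff(x) + By * u.diff(y))
    - (u * Bx.diff(x) + v * Bx.diff(y))
)
identity_y = (
    v * div_B - By * div_u
    + (Bx * v.diff(x) + By * v.diff(y))
    - (u * By.diff(x) + v * By.diff(y))
)

residual_x = (induction_x - identity_x).expand()
residual_y = (induction_y - identity_y).expand()
print(residual_x, residual_y)
```

Both residuals expand to zero. When the two divergence constraints hold, the curl form therefore reduces to $\mathbf{B} \cdot \nabla \mathbf{u} - \mathbf{u} \cdot \nabla \mathbf{B}$, which is the form that appears in the magnetic field evolution equation.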

Our model's output can then be used to compute each loss term: the data loss between predictions and true values (using `LpLoss`), the initial condition loss, and the PDE loss.


## Defining our Constraints - Loss Functions 

Now that we have defined our PDE, we can begin to define all of the constraints that make up the loss function for our problem. The loss functions are defined inside a class called `LossMHD_PhysicsNeMo`, which can use a weighted sum of individual losses for training. Additionally, all of the fixed and constant parameters needed are added to the class definition.

```python 
import numpy as np
import torch
import torch.nn.functional as F
import math
from .losses import LpLoss, fourier_derivatives_lap, fourier_derivatives_ptot
from .mhd_pde import MHD_PDE
from physicsnemo.models.layers.spectral_layers import fourier_derivatives


class LossMHD_PhysicsNeMo(object):
    "Calculate loss for MHD equations with magnetic field, using physicsnemo derivatives"

    def __init__(
        self,
        nu=1e-4,
        eta=1e-4,
        rho0=1.0,
        data_weight=1.0,
        ic_weight=1.0,
        pde_weight=1.0,
        constraint_weight=1.0,
        use_data_loss=True,
        use_ic_loss=True,
        use_pde_loss=True,
        use_constraint_loss=True,
        u_weight=1.0,
        v_weight=1.0,
        Bx_weight=1.0,
        By_weight=1.0,
        Du_weight=1.0,
        Dv_weight=1.0,
        DBx_weight=1.0,
        DBy_weight=1.0,
        div_B_weight=1.0,
        div_vel_weight=1.0,
        Lx=1.0,
        Ly=1.0,
        tend=1.0,
        use_weighted_mean=False,
        **kwargs
    ):  # add **kwargs so that unexpected kwargs are ignored when passing a config dict
        self.nu = nu
        self.eta = eta
        self.rho0 = rho0
        self.data_weight = data_weight
        self.ic_weight = ic_weight
        self.pde_weight = pde_weight
        self.constraint_weight = constraint_weight
        self.use_data_loss = use_data_loss
        self.use_ic_loss = use_ic_loss
        self.use_pde_loss = use_pde_loss
        self.use_constraint_loss = use_constraint_loss
        self.u_weight = u_weight
        self.v_weight = v_weight
        self.Bx_weight = Bx_weight
        self.By_weight = By_weight
        self.Du_weight = Du_weight
        self.Dv_weight = Dv_weight
        self.DBx_weight = DBx_weight
        self.DBy_weight = DBy_weight
        self.div_B_weight = div_B_weight
        self.div_vel_weight = div_vel_weight
        self.Lx = Lx
        self.Ly = Ly
        self.tend = tend
        self.use_weighted_mean = use_weighted_mean
        # Define 2D MHD PDEs
        self.mhd_pde_eq = MHD_PDE(self.nu, self.eta, self.rho0)
        self.mhd_pde_node = self.mhd_pde_eq.make_nodes()

        if not self.use_data_loss:
            self.data_weight = 0
        if not self.use_ic_loss:
            self.ic_weight = 0
        if not self.use_pde_loss:
            self.pde_weight = 0
        if not self.use_constraint_loss:
            self.constraint_weight = 0

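    def __call__(self, pred, true, inputs, return_loss_dict=False):
        # compute_losses (defined elsewhere in this class) returns the total
        # loss together with a dictionary of the individual loss terms
        loss, loss_dict = self.compute_losses(pred, true, inputs)
        if return_loss_dict:
            return loss, loss_dict
        return loss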
```

The MHD equations that we defined before are initialized for use within the following loss functions.
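
Assuming the individual terms are combined as a plain weighted sum (the `compute_losses` method is not listed here, so this is a sketch of the intent rather than its exact implementation), the training objective takes the form

$$\begin{align}
L = w_{\text{data}} L_{\text{data}} + w_{\text{IC}} L_{\text{IC}} + w_{\text{PDE}} L_{\text{PDE}} + w_{\text{constraint}} L_{\text{constraint}},
\end{align}$$

where the weights correspond to `data_weight`, `ic_weight`, `pde_weight`, and `constraint_weight`. Setting any of the `use_*_loss` flags to `False` zeroes the matching weight in the constructor.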

### Data Loss
The data loss compares simulation data to the output of our model. The velocity in the x and y directions, as well as the magnetic field in the x and y directions, are compared directly to the ground truth data through `LpLoss`, and the relative mean squared error is returned.


```python
def data_loss(self, pred, true, return_all_losses=False):
    "Compute data loss"
    lploss = LpLoss(size_average=True)
    u_pred = pred[..., 0]
    v_pred = pred[..., 1]
    Bx_pred = pred[..., 2]
    By_pred = pred[..., 3]

    u_true = true[..., 0]
    v_true = true[..., 1]
    Bx_true = true[..., 2]
    By_true = true[..., 3]

    loss_u = lploss(u_pred, u_true)
    loss_v = lploss(v_pred, v_true)
    loss_Bx = lploss(Bx_pred, Bx_true)
    loss_By = lploss(By_pred, By_true)

    if self.use_weighted_mean:
        weight_sum = self.u_weight + self.v_weight + self.Bx_weight + self.By_weight
    else:
        weight_sum = 1.0

    loss_data = (
        self.u_weight * loss_u
        + self.v_weight * loss_v
        + self.Bx_weight * loss_Bx
        + self.By_weight * loss_By
    ) / weight_sum

    if return_all_losses:
        return loss_data, loss_u, loss_v, loss_Bx, loss_By
    else:
        return loss_data
```
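
`LpLoss` is imported from a local `losses` module that is not listed in this notebook. As a rough guide to what such a loss computes, here is a minimal sketch of a relative $L_p$ error, assuming the form commonly used for neural operators (the norm of the error divided by the norm of the target, averaged over the batch). The function name `relative_lp_loss` and the `eps` safeguard are illustrative choices, not the library's API.

```python
import torch


def relative_lp_loss(pred, true, p=2, eps=1e-12):
    """Relative Lp error per sample, averaged over the batch dimension."""
    batch = pred.shape[0]
    # Flatten everything except the batch dimension before taking norms
    diff_norm = torch.linalg.vector_norm(
        (pred - true).reshape(batch, -1), ord=p, dim=1
    )
    true_norm = torch.linalg.vector_norm(true.reshape(batch, -1), ord=p, dim=1)
    # eps guards against division by zero for an all-zero target
    return torch.mean(diff_norm / (true_norm + eps))


# A prediction that is uniformly 10% too large has a relative error near 0.1
true = torch.ones(2, 4, 8, 8)
pred = 1.1 * true
print(relative_lp_loss(pred, true))
```

Normalizing by the target norm keeps fields of different magnitude, such as velocity and magnetic field components, on a comparable scale before the per-field weights are applied.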

### PDE Loss
The PDE loss measures how strongly the PINO outputs violate the time evolution equations. To evaluate it, the spatial and temporal derivatives of the output fields need to be computed: Fourier differentiation is used for the spatial derivatives, and second-order finite differencing is used for the temporal derivatives. The output fields are the velocity in the x direction u, the velocity in the y direction v, the magnetic field in the x direction Bx, and the magnetic field in the y direction By. The PDE loss $L_{PDE}$ is then defined as the
MSE loss between zero and the PDE residual, obtained by putting all the terms on the same side of the equation.
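
To make the Fourier differentiation step concrete, here is a small NumPy sketch of spectral derivatives on a periodic grid. It is an analogue written for illustration, not the PhysicsNeMo `fourier_derivatives` routine; the function name `spectral_derivatives` and its return layout are assumptions. Each derivative is a multiplication by $i k$ in Fourier space, and the Laplacian is a multiplication by $-(k_x^2 + k_y^2)$.

```python
import numpy as np


def spectral_derivatives(f, Lx=1.0, Ly=1.0):
    """Return (df/dx, df/dy, laplacian) of a periodic field f[..., nx, ny]."""
    nx, ny = f.shape[-2], f.shape[-1]
    # Angular wavenumbers, shaped to broadcast over the last two axes
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=Lx / nx)[:, None]
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=Ly / ny)[None, :]
    f_h = np.fft.fft2(f, axes=(-2, -1))
    f_x = np.fft.ifft2(1j * kx * f_h, axes=(-2, -1)).real
    f_y = np.fft.ifft2(1j * ky * f_h, axes=(-2, -1)).real
    f_lap = np.fft.ifft2(-(kx**2 + ky**2) * f_h, axes=(-2, -1)).real
    return f_x, f_y, f_lap


# Check against a field whose derivatives are known analytically
n = 32
grid = np.arange(n) / n
X, Y = np.meshgrid(grid, grid, indexing="ij")
f = np.sin(2 * np.pi * X) * np.cos(4 * np.pi * Y)
f_x, f_y, f_lap = spectral_derivatives(f)
exact_x = 2 * np.pi * np.cos(2 * np.pi * X) * np.cos(4 * np.pi * Y)
print(np.abs(f_x - exact_x).max())
```

For smooth periodic fields the error is at round-off level, which is why this approach pairs naturally with the periodic boundary conditions of this problem.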

```python
def mhd_pde(self, u, v, Bx, By, p=None):
    "Compute PDEs for MHD using magnetic field"
    batchsize = u.size(0)
    nt = u.size(1)
    nx = u.size(2)
    ny = u.size(3)
    device = u.device
    dt = self.tend / (nt - 1)
    dx = self.Lx / nx
    dy = self.Ly / ny

    B2 = Bx**2 + By**2
    B2_h = torch.fft.fftn(B2, dim=[2, 3])

    # compute fourier derivatives
    f_du, _ = fourier_derivatives(u, [self.Lx, self.Ly])
    f_dv, _ = fourier_derivatives(v, [self.Lx, self.Ly])
    f_dBx, _ = fourier_derivatives(Bx, [self.Lx, self.Ly])
    f_dBy, _ = fourier_derivatives(By, [self.Lx, self.Ly])

    u_x = f_du[:, 0:nt, :nx, :ny]
    u_y = f_du[:, nt : 2 * nt, :nx, :ny]
    v_x = f_dv[:, 0:nt, :nx, :ny]
    v_y = f_dv[:, nt : 2 * nt, :nx, :ny]
    Bx_x = f_dBx[:, 0:nt, :nx, :ny]
    Bx_y = f_dBx[:, nt : 2 * nt, :nx, :ny]
    By_x = f_dBy[:, 0:nt, :nx, :ny]
    By_y = f_dBy[:, nt : 2 * nt, :nx, :ny]

    u_lap = fourier_derivatives_lap(u, [self.Lx, self.Ly])
    v_lap = fourier_derivatives_lap(v, [self.Lx, self.Ly])
    Bx_lap = fourier_derivatives_lap(Bx, [self.Lx, self.Ly])
    By_lap = fourier_derivatives_lap(By, [self.Lx, self.Ly])

    # note that for pressure, the zero mode (the mean) cannot be zero for invertibility so it is set to 1
    div_vel_grad_vel = u_x**2 + 2 * u_y * v_x + v_y**2
    div_B_grad_B = Bx_x**2 + 2 * Bx_y * By_x + By_y**2
    f_dptot = fourier_derivatives_ptot(
        p, div_vel_grad_vel, div_B_grad_B, B2_h, self.rho0, [self.Lx, self.Ly]
    )
    ptot_x = f_dptot[:, 0:nt, :nx, :ny]
    ptot_y = f_dptot[:, nt : 2 * nt, :nx, :ny]

    # Plug inputs into dictionary
    all_inputs = {
        "u": u,
        "u__x": u_x,
        "u__y": u_y,
        "v": v,
        "v__x": v_x,
        "v__y": v_y,
        "Bx": Bx,
        "Bx__x": Bx_x,
        "Bx__y": Bx_y,
        "By": By,
        "By__x": By_x,
        "By__y": By_y,
        "ptot__x": ptot_x,
        "ptot__y": ptot_y,
        "u__lap": u_lap,
        "v__lap": v_lap,
        "Bx__lap": Bx_lap,
        "By__lap": By_lap,
    }

    # Substitute values into PDE equations
    u_rhs = self.mhd_pde_node[14].evaluate(all_inputs)["u_rhs"]
    v_rhs = self.mhd_pde_node[15].evaluate(all_inputs)["v_rhs"]
    Bx_rhs = self.mhd_pde_node[16].evaluate(all_inputs)["Bx_rhs"]
    By_rhs = self.mhd_pde_node[17].evaluate(all_inputs)["By_rhs"]

    u_t = self.Du_t(u, dt)
    v_t = self.Du_t(v, dt)
    Bx_t = self.Du_t(Bx, dt)
    By_t = self.Du_t(By, dt)

    # Find difference
    Du = self.mhd_pde_node[18].evaluate({"u__t": u_t, "u_rhs": u_rhs[:, 1:-1]})[
        "Du"
    ]
    Dv = self.mhd_pde_node[19].evaluate({"v__t": v_t, "v_rhs": v_rhs[:, 1:-1]})[
        "Dv"
    ]
    DBx = self.mhd_pde_node[20].evaluate(
        {"Bx__t": Bx_t, "Bx_rhs": Bx_rhs[:, 1:-1]}
    )["DBx"]
    DBy = self.mhd_pde_node[21].evaluate(
        {"By__t": By_t, "By_rhs": By_rhs[:, 1:-1]}
    )["DBy"]

    return Du, Dv, DBx, DBy


def mhd_pde_loss(self, Du, Dv, DBx, DBy, return_all_losses=False):
    "Compute PDE loss"
    Du_val = torch.zeros_like(Du)
    Dv_val = torch.zeros_like(Dv)
    DBx_val = torch.zeros_like(DBx)
    DBy_val = torch.zeros_like(DBy)

    loss_Du = F.mse_loss(Du, Du_val)
    loss_Dv = F.mse_loss(Dv, Dv_val)
    loss_DBx = F.mse_loss(DBx, DBx_val)
    loss_DBy = F.mse_loss(DBy, DBy_val)

    if self.use_weighted_mean:
        weight_sum = (
            self.Du_weight + self.Dv_weight + self.DBx_weight + self.DBy_weight
        )
    else:
        weight_sum = 1.0

    loss_pde = (
        self.Du_weight * loss_Du
        + self.Dv_weight * loss_Dv
        + self.DBx_weight * loss_DBx
        + self.DBy_weight * loss_DBy
    ) / weight_sum

    if return_all_losses:
        return loss_pde, loss_Du, loss_Dv, loss_DBx, loss_DBy
    else:
        return loss_pde
```
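
The time derivative helper `self.Du_t` is called in `mhd_pde` but is not listed. One plausible form, consistent with the second-order finite differencing described in the text and with the `[:, 1:-1]` slicing applied to the right-hand sides, is a central difference along the time axis. The standalone function name `central_time_derivative` is an assumption made for this sketch; it is tested with NumPy here, and the same slicing applies to PyTorch tensors.

```python
import numpy as np


def central_time_derivative(f, dt):
    """Second-order central difference along axis 1 (time).

    The first and last time steps have no two-sided neighbors, so the
    result has nt - 2 entries along the time axis.
    """
    return (f[:, 2:] - f[:, :-2]) / (2.0 * dt)


# f(t) = t**2 on a (batch, nt, nx, ny) grid; the central difference of a
# quadratic reproduces the exact derivative 2*t up to round-off
nt = 11
t = np.linspace(0.0, 1.0, nt)
dt = t[1] - t[0]
f = np.broadcast_to((t**2)[None, :, None, None], (1, nt, 2, 2))
f_t = central_time_derivative(f, dt)
print(f_t.shape)  # (1, 9, 2, 2)
```

Because the result covers only the interior time steps, the right-hand sides must be trimmed to the same range before the residual is formed.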

### Constraint Loss
The constraint loss measures deviations from the velocity divergence-free condition of Eq. 3 and the magnetic divergence-free condition of Eq. 4. These conditions are implemented similarly to the PDE loss, but without time derivative terms. The constraint loss is then the MSE between each of the constraint equations and zero.

```python
def mhd_constraint(self, u, v, Bx, By):
    "Compute constraints"
    batchsize = u.size(0)
    nt = u.size(1)
    nx = u.size(2)
    ny = u.size(3)
    device = u.device
    dt = self.tend / (nt - 1)
    dx = self.Lx / nx
    dy = self.Ly / ny

    f_du, _ = fourier_derivatives(u, [self.Lx, self.Ly])
    f_dv, _ = fourier_derivatives(v, [self.Lx, self.Ly])
    f_dBx, _ = fourier_derivatives(Bx, [self.Lx, self.Ly])
    f_dBy, _ = fourier_derivatives(By, [self.Lx, self.Ly])

    u_x = f_du[:, 0:nt, :nx, :ny]
    v_y = f_dv[:, nt : 2 * nt, :nx, :ny]
    Bx_x = f_dBx[:, 0:nt, :nx, :ny]
    By_y = f_dBy[:, nt : 2 * nt, :nx, :ny]

    div_B = self.mhd_pde_node[12].evaluate({"Bx__x": Bx_x, "By__y": By_y})["div_B"]
    div_vel = self.mhd_pde_node[13].evaluate({"u__x": u_x, "v__y": v_y})["div_vel"]

    return div_vel, div_B

```
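
`mhd_constraint` returns only the divergence fields; the step that turns them into a scalar loss is not listed. A minimal sketch, modeled on the structure of `mhd_pde_loss`, could look like the following. It is written as a standalone function with the hypothetical name `constraint_loss`, and its weights are passed in as arguments rather than read from the class.

```python
import torch
import torch.nn.functional as F


def constraint_loss(
    div_vel, div_B, div_vel_weight=1.0, div_B_weight=1.0, use_weighted_mean=False
):
    """MSE of each divergence field against zero, combined as a weighted sum."""
    loss_div_vel = F.mse_loss(div_vel, torch.zeros_like(div_vel))
    loss_div_B = F.mse_loss(div_B, torch.zeros_like(div_B))

    # Optionally normalize by the sum of the weights, as in the other losses
    weight_sum = div_vel_weight + div_B_weight if use_weighted_mean else 1.0

    loss = (
        div_vel_weight * loss_div_vel + div_B_weight * loss_div_B
    ) / weight_sum
    return loss, loss_div_vel, loss_div_B


# A divergence-free velocity and a magnetic field with constant divergence 2
div_vel = torch.zeros(1, 5, 8, 8)
div_B = torch.full((1, 5, 8, 8), 2.0)
loss, loss_div_vel, loss_div_B = constraint_loss(div_vel, div_B)
print(loss.item(), loss_div_vel.item(), loss_div_B.item())
```

Returning the individual terms alongside the total makes it easier to log which constraint dominates during training.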

### Initial Condition Loss
The initial condition loss encourages the model to associate the input field with the output field specifically at t=0. This constraint can usually be achieved with the data loss; however, treating it as a separate term emphasizes the importance of correct initial condition prediction and enables training in the absence of data. Both points stem from the PDE loss term, which can only evolve the state predicted at t=0 forward in time.

```python
def ic_loss(self, pred, inputs, return_all_losses=False):
    "Compute initial condition loss"
    lploss = LpLoss(size_average=True)
    ic_pred = pred[:, 0]
    ic_true = inputs[:, 0, ..., 3:]
    u_ic_pred = ic_pred[..., 0]
    v_ic_pred = ic_pred[..., 1]
    Bx_ic_pred = ic_pred[..., 2]
    By_ic_pred = ic_pred[..., 3]

    u_ic_true = ic_true[..., 0]
    v_ic_true = ic_true[..., 1]
    Bx_ic_true = ic_true[..., 2]
    By_ic_true = ic_true[..., 3]

    loss_u_ic = lploss(u_ic_pred, u_ic_true)
    loss_v_ic = lploss(v_ic_pred, v_ic_true)
    loss_Bx_ic = lploss(Bx_ic_pred, Bx_ic_true)
    loss_By_ic = lploss(By_ic_pred, By_ic_true)

    if self.use_weighted_mean:
        weight_sum = (
            self.u_weight + self.v_weight + self.Bx_weight + self.By_weight
        )
    else:
        weight_sum = 1.0

    loss_ic = (
        self.u_weight * loss_u_ic
        + self.v_weight * loss_v_ic
        + self.Bx_weight * loss_Bx_ic
        + self.By_weight * loss_By_ic
    ) / weight_sum

    if return_all_losses:
        return loss_ic, loss_u_ic, loss_v_ic, loss_Bx_ic, loss_By_ic
    else:
        return loss_ic
```

### Boundary Condition Loss
This loss describes violations of the boundary terms. In this specific case, the tFNO architecture ensures that the periodic boundary conditions are satisfied, so this term is not used in this example.

In theory, training can be done by correctly predicting the initial conditions, boundary conditions and evolving the PDE correctly forward in time. In practice, having data helps the model converge more quickly. However, an incorrect initial condition results in the PDE evolving the wrong state forward in time, which is why it is emphasized as its own term. The initial condition loss is calculated by taking the input fields and computing the relative MSE with output fields at t=0. 


## Training our Model

PhysicsNeMo has two distinct styles, namely Core and Sym. PhysicsNeMo Sym is a framework providing Pythonic APIs, algorithms, and utilities to be used with PhysicsNeMo Core, while PhysicsNeMo Core interoperates with PyTorch directly. Working with PhysicsNeMo Core looks and feels more like a PyTorch workflow, with key components such as models, utils, and datapipes imported directly from `physicsnemo` itself. While some components of the workflow so far have borrowed from PhysicsNeMo Sym (`MHD_PDE`), the training workflow for this problem will be built using Core. This provides more flexibility over our training loop and allows for further customization of our workflow.

