# A Tutorial on Community Models and Integrations with PhysicsNeMo
Scientific machine learning (SciML) is becoming a fundamental part of research, development, and discovery workflows for scientists across many domains such as computational fluid dynamics, materials characterization, climate and weather modeling, and computer-aided engineering. With the increased use of AI and ML in these domains also comes an increase in the number of resources available for developing models, and a plethora of datasets to use as starting points for model training. While many efforts are segmented and tailor-made for specific use cases, projects such as PhysicsNeMo, The Well, and Proxima Fusion's ConStellaration dataset are great examples of community efforts to bridge the gap between siloed SciML research and collaborative community projects.

In this tutorial, PhysicsNeMo is used to expand the scope of community models and datasets from [The Well](https://github.com/PolymathicAI/the_well) and the [ConStellaration Challenge](https://huggingface.co/blog/cgeorgiaw/constellaration-fusion-challenge) by leveraging pre-trained physics-informed machine learning models, community-accessible datasets, and the robust SciML framework from PhysicsNeMo.


Specifically, this tutorial covers:

* [Loading data from The Well](#loading-models-and-data-from-the-well)
* [Training a PhysicsNeMo model using data from The Well](#using-the-well-data-to-train-physicsnemo-models)
* [Running models from The Well in the PhysicsNeMo framework](#running-models-from-the-well-in-physicsnemo)
* [Fine-tuning models from The Well using PhysicsNeMo](#fine-tuning-models-from-the-well-in-physicsnemo)
* [Exploring a community design challenge for fusion research and engineering](#exploring-the-huggingface-stellerator-design-dataset)

Note that the recommended hardware for this tutorial is at least 100GB of free disk space, and a GPU with at least 24GB VRAM.

## Loading Models and Data from The Well

The Well is a large-scale collection of machine learning datasets containing numerical simulations of a wide variety of spatiotemporal physical systems. Additionally, a variety of state-of-the-art models are included, with a large selection of pre-trained models available for download on HuggingFace. Specifically, The Well includes 16 datasets covering diverse domains such as biological systems, fluid dynamics, acoustic scattering, and magnetohydrodynamics simulations.

In this tutorial, the focus will largely be on magnetohydrodynamics. Magnetohydrodynamics (MHD) is the study of the dynamics of electrically conducting fluids such as plasmas. Its applications range from understanding the flow of plasmas in the Sun to simulating the physics inside magnetically confined fusion devices. The dynamics of these systems are represented by combining the Navier-Stokes equations with Maxwell's equations, capturing both fluid flows and electromagnetic forces.
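
For reference, one common textbook way to write this coupling is the ideal MHD system in conservative form, in units where the magnetic permeability is absorbed into $\mathbf{B}$. The notation here is an assumption made for illustration; the exact equations and pressure closure used to generate the dataset in this tutorial are listed on its data card.

$$
\begin{aligned}
\partial_t \rho + \nabla \cdot (\rho \mathbf{v}) &= 0, \\
\partial_t (\rho \mathbf{v}) + \nabla \cdot \left(\rho \mathbf{v}\mathbf{v} - \mathbf{B}\mathbf{B}\right) + \nabla \left(p + \tfrac{1}{2}|\mathbf{B}|^2\right) &= 0, \\
\partial_t \mathbf{B} - \nabla \times (\mathbf{v} \times \mathbf{B}) &= 0, \qquad \nabla \cdot \mathbf{B} = 0.
\end{aligned}
$$

Here $\rho$ is the density, $\mathbf{v}$ the velocity, $\mathbf{B}$ the magnetic field, and $p$ the gas pressure. The first two equations are the inviscid fluid equations with the Lorentz force folded into the momentum flux, and the third is the induction equation that follows from Faraday's law for a perfectly conducting fluid.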


While the specific domain focus of this tutorial is only on MHD and its applications, the workflows and integrations described are transferable to other domains, datasets, and models.

The tutorial will specifically make use of the [MHD_64 dataset](https://github.com/PolymathicAI/the_well/tree/master/datasets/MHD_64).  

Datasets from The Well are available through HuggingFace, and can either be streamed via the `datasets` API, or saved locally. The command to download all splits is `the-well-download --base-path path/to/base --dataset MHD_64`. This dataset is around 71GB and may take up to an hour to download and save locally.

To quickly examine the dataset, a single sample can be loaded from the streaming API. The [data-card](https://huggingface.co/datasets/polymathic-ai/MHD_64) available on HuggingFace provides additional information about the simulations, equations modeled, and a reference paper for an in-depth explanation of the physics of the problem. A snippet for loading the dataset and details from the data-card are provided for reference.

Note that in the following code, either streaming or loading from a locally saved version of the dataset is supported.

In [None]:
from the_well.data import WellDataset
from the_well.data.normalization import ZScoreNormalization
from torchinfo import summary

use_streaming = True

# Enable streaming the dataset from HuggingFace or loading from local directory
# The following line may take a couple of minutes to instantiate the datamodule
dataset = WellDataset(
    well_base_path="hf://datasets/polymathic-ai/" if use_streaming else "./TheWellMHDData/datasets",
    well_dataset_name="MHD_64",
    well_split_name="train",
    use_normalization=True,
    normalization_type=ZScoreNormalization,
    n_steps_input=4,
    n_steps_output=1,
)

With the dataset on hand, its features, shape, and size can be explored. The Well already provides a great script for this, which is left to the reader to go through if desired. A summary is provided below, and is also available on the [data-card](https://github.com/PolymathicAI/the_well/blob/master/datasets/MHD_64/README.md) online:

* **Dimension of discretized data:** 100 time steps of 64 $\times$ 64 $\times$ 64 cubes.
* **Fields available in the data:** Density (scalar field), velocity (vector field), magnetic field (vector field).
* **Number of trajectories:** 10 initial conditions $\times$ 10 combinations of parameters = 100 trajectories.
* **Estimated size of the ensemble of all simulations:** 71.6 GB.
* **Grid type:** uniform grid, cartesian coordinates.
* **Initial conditions:** uniform IC.
* **Boundary conditions:** periodic boundary conditions.
* **Data are stored separated by ($\Delta t$):** 0.01 (arbitrary units).
* **Total time range ($t_{min}$ to $t_{max}$):** $t_{min} = 0$, $t_{max} = 1$.
* **Spatial domain size ($L_x$, $L_y$, $L_z$):** dimensionless so 64 pixels.
* **Set of coefficients or non-dimensional parameters evaluated:** all combinations of $\mathcal{M}_s=${0.5, 0.7, 1.5, 2.0, 7.0} and $\mathcal{M}_A =${0.7, 2.0}.
* **Approximate time and hardware used to generate the data:** Downsampled from `MHD_256` after applying ideal low-pass filter.
* **What phenomena of physical interest are captured in the data:** MHD fluid flows in the compressible limit (sub and super sonic, sub and super Alfvenic).
* **How to evaluate a new simulator operating in this space:** Check metrics such as Power spectrum, two-points correlation function.

Please cite the associated paper if you use this data in your research:

```
@article{burkhart2020catalogue,
  title={The catalogue for astrophysical turbulence simulations (cats)},
  author={Burkhart, B and Appel, SM and Bialy, S and Cho, J and Christensen, AJ and Collins, D and Federrath, Christoph and Fielding, DB and Finkbeiner, D and Hill, AS and others},
  journal={The Astrophysical Journal},
  volume={905},
  number={1},
  pages={14},
  year={2020},
  publisher={IOP Publishing}
}
```




A single sample can be extracted and inspected. Some notes from The Well are shown below. More info on examining their data can be found in [this example](https://github.com/PolymathicAI/the_well/blob/master/docs/tutorials/dataset.ipynb):

The most important elements are `input_fields` and `output_fields`. They represent the time-varying physical fields of the dynamical system and are generally the input and target output of our models. For a dynamical system that has 3 spatial dimensions $x$, $y$, and $z$, `input_fields` would have a shape $(T_{in}, L_x, L_y, L_z, F)$ and `output_fields` would have a shape $(T_{out}, L_x, L_y, L_z, F)$. The number of input and output time steps $T_{in}$ and $T_{out}$ are specified at the instantiation of the dataset with the arguments `n_steps_input` and `n_steps_output`. $L_x$, $L_y$ and $L_z$ are the lengths of the spatial dimensions. $F$ represents the number of physical fields, where vector fields $v = (v_x, v_y, v_z)$ and tensor fields $t = (t_{xx}, t_{xy}, t_{xz}, t_{yy}, t_{yx}, t_{yz},  t_{zz}, t_{zx}, t_{zy})$ are flattened.

Note that the MHD_64 dataset only contains scalar and vector fields.
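
As a quick sanity check on these shapes, the sketch below works out the flattened field count $F$ and the resulting tensor shapes for MHD_64 with plain Python. The field names used as dictionary keys are illustrative rather than the dataset's actual names, and the window count assumes that samples are sliding windows with a stride of one time step that never cross trajectory boundaries.

```python
# Field layout listed on the MHD_64 data card: one scalar and two 3-vectors.
SPATIAL_DIMS = 3
FIELD_RANKS = {"density": 0, "velocity": 1, "magnetic_field": 1}  # tensor rank


def flattened_field_count(field_ranks, n_dims):
    """Number of channels F after flattening rank-k fields into n_dims**k components."""
    return sum(n_dims**rank for rank in field_ranks.values())


def window_count(n_time_steps, n_steps_input, n_steps_output):
    """Sliding windows per trajectory, assuming a stride of one time step."""
    return max(0, n_time_steps - (n_steps_input + n_steps_output) + 1)


F = flattened_field_count(FIELD_RANKS, SPATIAL_DIMS)
grid = (64, 64, 64)
input_shape = (4, *grid, F)   # (T_in, L_x, L_y, L_z, F)
output_shape = (1, *grid, F)  # (T_out, L_x, L_y, L_z, F)
windows = window_count(100, n_steps_input=4, n_steps_output=1)

print(F, input_shape, output_shape, windows)
```

With one scalar and two vector fields, $F = 1 + 3 + 3 = 7$, which is where the 7 output channels and $4 \times 7 = 28$ input channels used later in this tutorial come from.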

In [None]:
print(f"Total number of samples: {len(dataset)}")
sample = dataset[0]
for k, v in sample.items():
    print(f"Key: {k.ljust(20)} Shape: {v.shape}")

print(f"Field Names: {dataset.metadata.field_names}")

## Using The Well Data to Train PhysicsNeMo Models

With the dataset prepared, a model can be trained to approximate the system dynamics, resulting in a surrogate model for magnetohydrodynamics simulations. To align with models from The Well, the PhysicsNeMo Tensor-Factorized Fourier Neural Operator implementation is adapted to closely match the implementation from The Well. The model utilizes Tucker factorization, and has around 300M parameters. 

For the remainder of this tutorial, a boilerplate `Trainer` class is implemented in `training_utils.py` that contains the core components needed for loading models, running training loops, saving checkpoints, evaluating models, etc. The method `setup_model` needs to be implemented in order to attach a model to the Trainer. For completeness, the TFNO architecture from `PhysicsNeMo` is included in the `tfno` folder, and is initialized with parameters that closely match the defaults from The Well. The config file used to define the parameters for the model, data, and trainer is in the `config` folder.

The PhysicsNeMo framework utilizes the Hydra configuration framework, which enables streamlined tracking of all parameters associated with running experiments and is a common best practice in machine learning workflows. Additionally, `training_utils.py` makes use of other best practices such as metric logging, checkpointing, and distributed workflow utilities.

In [None]:
import hydra
from hydra import compose, initialize
from training_utils import Trainer

hydra.core.global_hydra.GlobalHydra.instance().clear()

initialize(version_base=None, config_path="./config")
cfg = compose(config_name="mhd_config.yaml")


Note that `training_utils.py` can also be run as a script, either directly or through `torchrun`. With this approach, the `setup_model` method must first be implemented in the script.
```bash
python training_utils.py
```
or
```bash
torchrun --standalone --nnodes=1 --nproc_per_node=1 training_utils.py
```

In the following code, the TFNO model from PhysicsNeMo is set up in a child class of the `Trainer`, which instantiates the model through various model parameters set in the Hydra config file. A model summary is printed for convenience and reference. 

In [None]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Set up the TFNO model from PhysicsNeMo. Input parameters are set and controlled by the YAML config."""
        from tfno.tfno import TFNO

        self.model = TFNO(
            in_channels=self.model_params.in_dim,
            coord_features=self.model_params.coord_features,
            out_channels=self.model_params.out_dim,
            decoder_layers=self.model_params.decoder_layers,
            decoder_layer_size=self.model_params.fc_dim,
            dimension=self.model_params.dimension,
            latent_channels=self.model_params.layers,
            num_fno_layers=self.model_params.num_fno_layers,
            num_fno_modes=self.model_params.modes,
            padding=[
                self.model_params.pad_z,
                self.model_params.pad_y,
                self.model_params.pad_x,
            ],
            rank=self.model_params.rank,
            factorization=self.model_params.factorization,
            fixed_rank_modes=self.model_params.fixed_rank_modes,
            decomposition_kwargs=self.model_params.decomposition_kwargs,
        ).to(self.dist.device)

        print(summary(self.model, depth=3))

Finally, the model can be trained. Note that one of the config parameters is overwritten here for convenience; it could optionally have been set directly in the `.yaml` file itself.

In [None]:
cfg.train_params.ckpt_path = "./checkpoints/pnm_model_well_data"
mhd_trainer = MHDTrainer(cfg)
mhd_trainer.train()

During training, checkpoints are saved at the specified `ckpt_freq`, which can be loaded with the same `MHDTrainer` to use for fine-tuning, inference, or evaluation.

## Running Models From The Well in PhysicsNeMo

PhysicsNeMo offers great utility in running community models natively by way of a simple conversion. The documentation for this can be found [here](https://docs.nvidia.com/deeplearning/physicsnemo/physicsnemo-core/api/physicsnemo.models.html#converting-pytorch-models-to-physicsnemo-models), with the general steps outlined below. Before working through the conversion process, the model interface to The Well is explored to understand how these models are used in practice.

Models from The Well can be loaded through their benchmark API. The vanilla PyTorch model is used with the `Module.from_torch` method to pull the model into the PhysicsNeMo framework. The `ModelMetaData` class allows for various optimization strategies to be included in this conversion, including just-in-time compilation, automatic mixed precision training, and CUDA graphs.

```python
from dataclasses import dataclass

from physicsnemo.models.meta import ModelMetaData


@dataclass
class GenericModelMetadata(ModelMetaData):
    name: str = "GenericModel"
    # Optimization
    jit: bool = True
    cuda_graphs: bool = True
    amp_cpu: bool = True
    amp_gpu: bool = True
```

In [None]:
from dataclasses import dataclass

from physicsnemo.models.meta import ModelMetaData
from physicsnemo.models.module import Module
from the_well.benchmark.models import TFNO


@dataclass
class WellMetaData(ModelMetaData):
    name: str = "WellTFNOModel"


well_nemo_model = Module.from_torch(TFNO, meta=WellMetaData())

At this stage, the `well_nemo_model` still needs to be initialized with model parameters. We can simply use:
```python
instantiated_model = well_nemo_model(**parameters)
```

Alternatively, the PhysicsNeMo model registry can be used to load in the newly created model. In the following cell, the model is instantiated through the model registry, however either method is valid. The model registry in PhysicsNeMo provides access to a variety of architectures that can be used for SciML tasks.

In [None]:
from physicsnemo.registry import ModelRegistry

ModelRegistry().list_models()

In [None]:
from physicsnemo.registry import ModelRegistry

well_nemo_model = ModelRegistry().factory("TFNOPhysicsNeMoModel")(
    dim_in=28,
    dim_out=7,
    n_spatial_dims=3,
    spatial_resolution=[64, 64, 64],
    hidden_channels=128,
    modes1=16,
    modes2=16,
    modes3=16,
)

summary(well_nemo_model, depth=5)

The model summary above and the order of operations in the [TFNO forward pass](https://github.com/neuraloperator/neuraloperator/blob/7bb578df787d6ca548b623b83f0601c98dc931fb/neuralop/models/fno.py#L337) show that the data is processed by:

1. Applying optional positional encoding
2. Sending inputs through a lifting layer to a high-dimensional latent space
3. Applying optional domain padding to high-dimensional intermediate function representation
4. Applying `n_layers` Fourier/TFNO layers in sequence (SpectralConvolution + skip connections, nonlinearity) 
5. If domain padding was applied, domain padding is removed
6. Projection of intermediate function representation to the output channels
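
To make step 4 more concrete, the following is a minimal 1D NumPy sketch of a single Fourier layer: transform to frequency space, keep and scale only the lowest modes, transform back, add a skip connection, and apply a nonlinearity. It is not the TFNO implementation. The real layers are multi-channel, operate in 3D, learn their spectral weights (stored in factorized form), and use their own activation; the function names, the scalar skip connection, and the `tanh` activation here are illustrative assumptions.

```python
import numpy as np


def spectral_conv_1d(x, weights):
    """Keep the lowest len(weights) Fourier modes of x, scale them, and transform back."""
    n = x.shape[-1]
    modes = len(weights)
    x_hat = np.fft.rfft(x)  # n // 2 + 1 complex coefficients
    out_hat = np.zeros_like(x_hat)
    out_hat[..., :modes] = x_hat[..., :modes] * weights  # learned in a real layer
    return np.fft.irfft(out_hat, n=n)


def fourier_layer_1d(x, weights, skip_scale):
    """Step 4 in miniature: spectral convolution plus a linear skip, then a nonlinearity."""
    return np.tanh(spectral_conv_1d(x, weights) + skip_scale * x)


n = 64
grid = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
signal = np.sin(grid) + 0.5 * np.sin(10.0 * grid)

# Keeping 4 modes with unit weights acts as a low-pass filter: sin(10x) is removed.
low_pass = spectral_conv_1d(signal, np.ones(4, dtype=complex))
print(np.allclose(low_pass, np.sin(grid)))

layer_out = fourier_layer_1d(signal, np.ones(4, dtype=complex), skip_scale=0.1)
print(layer_out.shape)
```

Truncating to a fixed number of modes is what the `modes1`, `modes2`, and `modes3` arguments control along each spatial axis in the model instantiated earlier.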

The Well benchmark aims to showcase the effectiveness of applying SoTA models to the forward problem: predicting the next step of a simulation from a history of 4 previous time steps. Trained models can then be used in an autoregressive setting by predicting the next timestep, and then concatenating that prediction back into the input.

Concretely, the MHD TFNO model is used to predict the $T_{out} = 1$ next states given the $T_{in} = 4$ previous states. The input steps are concatenated along their channels, such that the model expects $T_{in} \times F$ channels as input and $T_{out} \times F$ channels as output. This builds in the assumption that four timesteps are always available as input, which can limit how these models are applied to real-world use cases. It nevertheless serves as a starting point towards building surrogate models for this dynamical system.
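
The channel stacking and the autoregressive loop can be sketched with NumPy as follows. The `persistence_model` function is a placeholder that simply repeats the most recent state, standing in for a trained surrogate, and the small $8^3$ grid and all helper names are assumptions made to keep the example light.

```python
import numpy as np

T_IN, T_OUT, F = 4, 1, 7  # history length, prediction length, flattened fields
GRID = (8, 8, 8)          # small stand-in for the 64^3 grid


def stack_time_into_channels(x):
    """(B, T, Lx, Ly, Lz, F) -> (B, T*F, Lx, Ly, Lz), time-major like "(Ti F)"."""
    b, t, lx, ly, lz, f = x.shape
    return x.transpose(0, 1, 5, 2, 3, 4).reshape(b, t * f, lx, ly, lz)


def unstack_channels(y, n_steps):
    """(B, T*F, Lx, Ly, Lz) -> (B, T, Lx, Ly, Lz, F)."""
    b, c, lx, ly, lz = y.shape
    return y.reshape(b, n_steps, c // n_steps, lx, ly, lz).transpose(0, 1, 3, 4, 5, 2)


def persistence_model(model_inputs):
    """Placeholder for a trained surrogate: repeats the most recent state."""
    return model_inputs[:, -F:]


def rollout(model, history, n_steps):
    """Predict n_steps states, feeding each prediction back into the input window."""
    window, predictions = history, []
    for _ in range(n_steps):
        next_state = unstack_channels(model(stack_time_into_channels(window)), T_OUT)
        predictions.append(next_state)
        # Drop the oldest T_OUT states and append the new prediction.
        window = np.concatenate([window[:, T_OUT:], next_state], axis=1)
    return np.concatenate(predictions, axis=1)


history = np.random.default_rng(0).normal(size=(2, T_IN, *GRID, F))
model_inputs = stack_time_into_channels(history)
trajectory = rollout(persistence_model, history, n_steps=5)
print(model_inputs.shape, trajectory.shape)
```

Because each prediction becomes part of the next input, errors compound over the rollout, which is why long-horizon stability is usually evaluated separately from one-step accuracy.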

To now train this model in PhysicsNeMo with the same dataset from The Well, the `Trainer` class can be updated to perform the same model loading and conversion that is required to pull the model into the PhysicsNeMo framework.

In [None]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Set up the TFNO model from The Well, converted to a PhysicsNeMo model. Model parameters are set explicitly below."""
        from dataclasses import dataclass

        from physicsnemo.models.meta import ModelMetaData
        from physicsnemo.models.module import Module
        from the_well.benchmark.models import TFNO

        @dataclass
        class WellMetaData(ModelMetaData):
            name: str = "WellTFNOModel"

        well_nemo_model = Module.from_torch(TFNO, meta=WellMetaData())
        well_nemo_model = well_nemo_model(
            dim_in=28,
            dim_out=7,
            n_spatial_dims=3,
            spatial_resolution=[64, 64, 64],
            hidden_channels=128,
            modes1=16,
            modes2=16,
            modes3=16,
        )

        self.model = well_nemo_model.to(self.dist.device)
        print(summary(self.model, depth=5))

Train the model!

In [None]:
cfg.train_params.ckpt_path = "./checkpoints/well_model_well_data"
well_model_trainer = MHDTrainer(cfg)
well_model_trainer.train()

Similar to the previous section, checkpoints are available for the trained model in the specified directory.

## Fine-tuning Models from The Well in PhysicsNeMo
One great feature of The Well is the abundance of pre-trained models that it provides, typically offering [many pretrained model architectures](https://huggingface.co/collections/polymathic-ai/the-well-benchmark-models-67e69bd7cd8e60229b5cd43e) for a single dataset. These pre-trained models can be fine-tuned using PhysicsNeMo by leveraging a similar approach to the above example - converting a model into the PhysicsNeMo format.

In this section, the pre-trained TFNO model will be loaded from The Well, converted to a PhysicsNeMo model, and then used in the context of transfer learning to adapt the model to a new task with yet another community dataset.

The new dataset will come from a HuggingFace and Proxima Fusion collaboration project: [a community challenge focused around Stellarator design](https://huggingface.co/blog/cgeorgiaw/constellaration-fusion-challenge). In the field of magnetically confined fusion, the physics of magnetohydrodynamics influences the system in many ways, and modeling these equations helps researchers and engineers understand reaction ability, equipment design, and plasma shapes, to name a few. In the remainder of this tutorial, we will set up the trainer, inspect the Stellarator design dataset, and use transfer learning to fine-tune a model for a new task.

### Loading The Well Pre-Trained Checkpoint as PhysicsNeMo Model

Models from The Well benchmark are available through HuggingFace, and can be trivially loaded with just a few lines of code:

```python
from the_well.benchmark.models import TFNO

well_model = TFNO.from_pretrained("polymathic-ai/TFNO-MHD_64")
```

In order to use this model with PhysicsNeMo, the base TFNO PyTorch model first needs to be converted to PhysicsNeMo format in the same way as in the previous section. After this, the pre-trained model parameters can be directly transferred to the PhysicsNeMo model. To ensure the model is instantiated correctly, the following steps can be used:
* Collect all of the model hyperparameters from the pre-trained model instance
* Inspect the TFNO class for its required input arguments
* Instantiate the PhysicsNeMo model with the required TFNO arguments and their values from the pre-trained model.

The explicit steps for this procedure are outlined in the `setup_model` method of `MHDTrainer` below. 

Note that this procedure is general and can be applied to other models from within The Well, as well as additional community models.
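
The hyperparameter-collection step relies on the constructor arguments being stored on the instance under the same names. The toy sketch below illustrates the pattern and its main pitfall using a hypothetical `ToyOperator` class that is not part of The Well: any constructor argument that is not recoverable from the instance silently falls back to its default, so it is worth checking which names were missed.

```python
import inspect


class ToyOperator:
    """Hypothetical stand-in for a pretrained model class (not part of The Well)."""

    def __init__(self, dim_in, dim_out, hidden_channels=128, dropout=0.0):
        self.dim_in = dim_in
        self.dim_out = dim_out
        self.hidden_channels = hidden_channels
        self._dropout = dropout  # stored under a different name on purpose
        self.training = True     # runtime state, not a constructor argument


def constructor_kwargs(instance):
    """Return constructor arguments recoverable from the instance, plus the missing names."""
    parameters = inspect.signature(type(instance)).parameters
    state = vars(instance)
    found = {name: state[name] for name in parameters if name in state}
    missing = [name for name in parameters if name not in state]
    return found, missing


pretrained = ToyOperator(dim_in=28, dim_out=7, hidden_channels=64, dropout=0.1)
kwargs, missing = constructor_kwargs(pretrained)
clone = ToyOperator(**kwargs)

print(kwargs)
print(missing)
```

Loading the state dict with `strict=True`, as done in this tutorial, provides a second safeguard: if a missed argument changes the architecture, the parameter names or shapes will no longer match and loading fails instead of passing silently.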

In [None]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Setup the pretrained TFNO model from The Well as a PhysicsNeMo model.
        Model input parameters are set from the well, and the original config is updated to reflect this."""
        import inspect

        from physicsnemo.models.meta import ModelMetaData
        from physicsnemo.models.module import Module
        from the_well.benchmark.models import TFNO

        well_model = TFNO.from_pretrained("polymathic-ai/TFNO-MHD_64")
        model_dict = well_model.__dict__

        signature = inspect.signature(TFNO)
        parameters = signature.parameters
        filtered_params = {k: model_dict[k] for k in parameters if k in model_dict}

        model = Module.from_torch(TFNO, meta=ModelMetaData(name="converted_tfno"))
        well_pretrained_model = model(**filtered_params)
        well_pretrained_model.inner_model.load_state_dict(
            well_model.state_dict(), strict=True
        )
        self.model = well_pretrained_model.to(self.dist.device)

        print(summary(self.model, depth=4))

A community pre-trained model is now available through the PhysicsNeMo framework. It can be applied to downstream tasks such as fine-tuning with a custom dataset, or used as a surrogate in simulation. Note that when fine-tuning, custom dataloaders may be needed for datasets outside of The Well and PhysicsNeMo.

### Exploring the HuggingFace Stellerator Design Dataset
With resources for a simple framework for using community datasets and models with PhysicsNeMo, some deeper problems can be explored. In this section, the pre-trained TFNO model for magnetohydrodynamics will be augmented, and used as a foundation for transfer learning onto a new dataset with a related objective: optimizing the design of a stellarator. Stellarators are a type of magnetically confined fusion device for containing plasma of very high temperature and pressure. The design of these devices is complex due to their geometry. A recent challenge on HuggingFace serves as the basis for this transfer learning problem.

From the challenge page:

*"The ConStellaration dataset contains over 150,000 QI equilibria produced by VMEC++. As a reminder, QI stellarators are a subset of configurations that minimize the internal plasma currents that can lead to disruptive events in a tokamak. The provided equilibria correspond to different 3D plasma boundary surfaces and offer samples across a wide and physically meaningful range of parameters. The dataset includes:*

*Input parameters: the 3D plasma boundary, together with the pressure and current profiles.*
*Equilibrium outputs: full VMEC++ equilibrium solution plus additional metrics of interest for stellarator design (e.g., degree of QI symmetry, turbulent transport geometrical quantities)."*


There are many ways to use the dataset. In particular, the challenge poses three problems of increasing complexity, starting with a low-complexity task and moving up to multimodal optimization problems. Each involves optimizing the stellarator geometry to minimize or maximize related quantities: minimizing elongation under fixed constraints, optimizing plasma shapes for confinement, and balancing compactness and simplicity.


These challenges serve as a great end goal for combining these frameworks. While this example will not directly solve any of the stated challenges, it serves as a starting point for using community models and datasets with PhysicsNeMo to solve real-world problems.

The goal of the model in this case is to serve as a surrogate for approximating the equilibrium dynamics of the plasma, which then maps to quantities of interest. The dataset contains input geometry and a large list of output parameters that can be learned. Because the pre-trained TFNO model already has learned knowledge of magnetohydrodynamics, it is used to translate the input design parameters into predicted properties, by way of mapping input geometry to predicted MHD equilibrium field, and then onto the properties of interest. The model here will utilize the pre-trained TFNO as a frozen intermediate surrogate model, while the input and output projections will be learned during training. 

Specifically, the input geometry of the stellarator plasma is converted from Fourier coefficients defining its boundaries into a 3D representation, and the output is the predicted max elongation of the plasma.

The dataloader class provided in `constellaration_dataloader.py` is a base implementation that can be used to start working on machine learning models for predicting properties of stellarators. The `FullConstellarationDataLoader` transforms the Fourier coefficients into a 3D representation of the stellarator boundary, and the `ConstellarationDataLoader` simply returns a concatenated list of the coefficients. In this example, the mapping of Fourier coefficients to 3D space is utilized to ensure closer alignment with the intermediate surrogate model. An example from the dataset is loaded and visualized below.
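
To give a feel for what mapping Fourier coefficients to 3D space involves, the sketch below evaluates a stellarator-symmetric boundary from cosine coefficients for $R$ and sine coefficients for $Z$. It uses the common convention $R(\theta, \phi) = \sum_{m,n} r_{mn} \cos(m\theta - n N_{fp} \phi)$ and $Z(\theta, \phi) = \sum_{m,n} z_{mn} \sin(m\theta - n N_{fp} \phi)$. The coefficient values, the dictionary layout, and the function names are hypothetical; this is not the code in `constellaration_dataloader.py`, and the way the provided dataloader samples the radial direction is not covered here.

```python
import math


def boundary_point(r_cos, z_sin, n_field_periods, theta, phi):
    """Evaluate a stellarator-symmetric boundary at poloidal angle theta, toroidal angle phi.

    r_cos and z_sin map (m, n) mode numbers to Fourier coefficients.
    """
    r = sum(c * math.cos(m * theta - n * n_field_periods * phi) for (m, n), c in r_cos.items())
    z = sum(c * math.sin(m * theta - n * n_field_periods * phi) for (m, n), c in z_sin.items())
    # Cylindrical (R, phi, Z) to Cartesian (X, Y, Z).
    return r * math.cos(phi), r * math.sin(phi), z


def boundary_grid(r_cos, z_sin, n_field_periods, n_theta, n_phi):
    """Sample the boundary on a uniform (theta, phi) grid; returns rows of (X, Y, Z) points."""
    return [
        [
            boundary_point(r_cos, z_sin, n_field_periods,
                           2.0 * math.pi * i / n_theta, 2.0 * math.pi * j / n_phi)
            for j in range(n_phi)
        ]
        for i in range(n_theta)
    ]


# Hypothetical coefficients: major radius 1.0, minor radius 0.2, one 3-period shaping mode.
r_cos = {(0, 0): 1.0, (1, 0): 0.2, (1, 1): 0.05}
z_sin = {(1, 0): 0.2, (1, 1): -0.05}
surface = boundary_grid(r_cos, z_sin, n_field_periods=3, n_theta=16, n_phi=32)

x0, y0, z0 = surface[0][0]
print(len(surface), len(surface[0]), (x0, y0, z0))
```

With only the $(0, 0)$ and $(1, 0)$ modes this reduces to a circular torus; the additional modes are what give a stellarator boundary its characteristic twisted shape.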

In [None]:
import matplotlib.pyplot as plt
from constellaration_dataloader import ConstellarationDataLoader

# Initialize the dataloader for challenge_1 with full dataset type
dataloader = ConstellarationDataLoader(
    challenge="challenge_1",
    dataset_type="full",
    grid_size=(64, 64, 64),
    batch_size=1,
)

# Prepare the dataset
dataloader.prepare()

# Get a single sample from the training dataset
sample = dataloader.train_dataset[0]

# Extract the 3D volume data and metric value
input_fields = sample["input_fields"]  # Shape: [3, Nt, Np, Nrho]
output_fields = sample["output_fields"]  # Shape: [1] for challenge_1

# Get the metric value (max_elongation for challenge_1)
max_elongation = output_fields.item()

# Extract the surface points (rho=1, which is the last index)
# The input_fields contains [X, Y, Z] coordinates
X_surface = input_fields[0, :, :, -1].cpu().numpy()  # X coordinates at surface
Y_surface = input_fields[1, :, :, -1].cpu().numpy()  # Y coordinates at surface
Z_surface = input_fields[2, :, :, -1].cpu().numpy()  # Z coordinates at surface

# Create the 3D plot
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(111, projection="3d")

# Plot the surface
surface = ax.plot_surface(
    X_surface,
    Y_surface,
    Z_surface,
    cmap="viridis",
    alpha=0.8,
    linewidth=0,
    antialiased=True,
)

# Add colorbar
fig.colorbar(surface, ax=ax, shrink=0.5, aspect=5)

# Set labels and title
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")
ax.set_title(f"3D Stellarator Surface - Max Elongation: {max_elongation:.4f}")

ax.view_init(elev=20, azim=45)
ax.set_box_aspect([1, 1, 1])

plt.tight_layout()
# Save before showing: the figure may be cleared once plt.show() returns
fig.savefig("3D_Stellarator_Surface.png")
plt.show()

# Print some information about the sample
print(f"Sample metric (max_elongation): {max_elongation:.4f}")
print(f"Surface shape: {X_surface.shape}")
print(f"X range: [{X_surface.min():.3f}, {X_surface.max():.3f}]")
print(f"Y range: [{Y_surface.min():.3f}, {Y_surface.max():.3f}]")
print(f"Z range: [{Z_surface.min():.3f}, {Z_surface.max():.3f}]")

This visualization shows the surface of the stellarator plasma boundary along with its associated `max_elongation` property.

The model used to predict `max_elongation` from the input plasma boundary surface is provided below. The pre-trained model from The Well is utilized as an intermediate surrogate model for approximating the equilibrium fields. The model's operation is structured into three distinct stages: input projection, pre-trained surrogate core, and output projection. The model first processes the input through an adapter: the `input_projection` module is a small 3D CNN that transforms the initial 3-channel input tensor into the 28-channel format expected by the core surrogate model. The intermediate surrogate model is the same TFNO from The Well that was previously explored. The weights of this core model are frozen, meaning it is not trained further. Instead, it functions as a fixed feature extractor for processing the adapted input. Finally, the model converts the high-dimensional 3D tensor from the TFNO into a simple vector output. This is done by reducing the surrogate's 7-channel output to a single-channel feature map, average-pooling along the depth axis, flattening the result into a vector, and passing it through a small MLP that maps to the desired output dimension.
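
Before reading the model code, it can help to trace the tensor shapes through the three stages and to estimate how many parameters are actually trained. The sketch below does this with plain arithmetic, using the layer sizes from the `TFNOSurrogate` definition in this section (`hidden_dim=256`, a $64^3$ grid) and `output_dim=1` as used for the `max_elongation` target. The helper names are illustrative.

```python
def conv3d_params(c_in, c_out, kernel_size):
    """Weights plus biases of a 3D convolution with a cubic kernel."""
    return c_in * c_out * kernel_size**3 + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


hidden_dim, image_size, output_dim = 256, 64, 1

input_projection = (
    conv3d_params(3, hidden_dim, 3)
    + conv3d_params(hidden_dim, hidden_dim // 2, 3)
    + conv3d_params(hidden_dim // 2, 28, 1)
)
output_projection = conv3d_params(7, 1, 1)
final_projection = (
    linear_params(image_size * image_size, hidden_dim // 2)
    + linear_params(hidden_dim // 2, output_dim)
)
trainable = input_projection + output_projection + final_projection

# Shape trace for a batch of size B on a 64^3 grid.
shapes = [
    ("input", ("B", 3, 64, 64, 64)),
    ("input_projection", ("B", 28, 64, 64, 64)),
    ("frozen TFNO", ("B", 7, 64, 64, 64)),
    ("1x1x1 conv + depth pooling", ("B", 1, 64, 64, 1)),
    ("flatten", ("B", image_size * image_size)),
    ("final_projection", ("B", output_dim)),
]
for stage, shape in shapes:
    print(f"{stage:<28} {shape}")
print(f"trainable parameters: {trainable:,}")
```

Under these assumptions the adapters add roughly 1.4 million trainable parameters, so only a small part of the combined model is updated during transfer learning while the surrogate core stays fixed.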

In [None]:
import torch
import torch.nn as nn


class TFNOSurrogate(nn.Module):
    def __init__(
        self,
        input_channels=3,
        output_dim=3,
        hidden_dim=256,
        target_input_channels=28,
        target_output_channels=7,
        image_size=64,  # Assuming square HxW dimensions
    ):
        super().__init__()
        self.input_channels = input_channels
        self.output_dim = output_dim
        self.hidden_dim = hidden_dim
        self.target_input_channels = target_input_channels
        self.target_output_channels = target_output_channels

        # Input projection: 3D conv to project from input_channels to target_input_channels
        self.input_projection = nn.Sequential(
            nn.Conv3d(input_channels, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Conv3d(hidden_dim, hidden_dim // 2, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Conv3d(hidden_dim // 2, target_input_channels, kernel_size=1),
        )

        # Output projection: Reduce channels to 1 and pool along the depth axis
        self.output_projection = nn.Sequential(
            # Reduce channels from 7 to 1. Shape: [B, 7, 64, 64, 64] -> [B, 1, 64, 64, 64]
            nn.Conv3d(target_output_channels, 1, kernel_size=1),
            # Pool only the depth dim. Shape: [B, 1, 64, 64, 64] -> [B, 1, 64, 64, 1]
            nn.AdaptiveAvgPool3d((None, None, 1)),
        )

        # Calculate the size of the flattened 2D feature map
        flat_feature_dim = image_size * image_size

        # Final projection to output dimension
        self.final_projection = nn.Sequential(
            nn.Linear(flat_feature_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim // 2, output_dim),
        )

        self.surrogate = self.setup_model()  # pretrained TFNOPhysicsNeMoModel
        self.freeze_surrogate_weights()

    def setup_model(self):
        """Setup the pretrained TFNO model from The Well as a PhysicsNeMo model."""
        import inspect

        from physicsnemo.models.meta import ModelMetaData
        from physicsnemo.models.module import Module
        from the_well.benchmark.models import TFNO

        well_model = TFNO.from_pretrained("polymathic-ai/TFNO-MHD_64")
        model_dict = well_model.__dict__

        signature = inspect.signature(TFNO)
        parameters = signature.parameters
        filtered_params = {k: model_dict[k] for k in parameters if k in model_dict}

        model = Module.from_torch(TFNO, meta=ModelMetaData(name="converted_tfno"))
        well_pretrained_model = model(**filtered_params)
        well_pretrained_model.inner_model.load_state_dict(
            well_model.state_dict(), strict=True
        )
        return well_pretrained_model

    def freeze_surrogate_weights(self):
        """Freeze all parameters in the pretrained TFNO surrogate model."""
        for param in self.surrogate.parameters():
            param.requires_grad = False
        self.surrogate.eval()
        print(
            f"Froze {sum(p.numel() for p in self.surrogate.parameters())} parameters in pretrained TFNO model"
        )

    def forward(self, x):
        # Input projection: [B, 3, 64, 64, 64] -> [B, 28, 64, 64, 64]
        x_projected = self.input_projection(x)

        # Pass through the pretrained TFNO model: [B, 28, 64, 64, 64] -> [B, 7, 64, 64, 64]
        tfno_output = self.surrogate(x_projected)

        # Project output to a 2D feature map: [B, 7, 64, 64, 64] -> [B, 1, 64, 64, 1]
        projected_2d = self.output_projection(tfno_output)

        # Flatten the 2D feature map: [B, 1, 64, 64, 1] -> [B, 4096]
        flattened = torch.flatten(projected_2d, start_dim=1)

        # Final projection to output dimension: [B, 4096] -> [B, output_dim]
        output = self.final_projection(flattened)

        return output
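
The shape comments in the output head can be checked in isolation. The following is a minimal sketch, assuming a random tensor in place of the TFNO output and a smaller grid (16 instead of 64) so it runs quickly on a CPU; the `hidden_dim` and `output_dim` values are hypothetical and chosen only for illustration.

```python
import torch
from torch import nn

# Small stand-in sizes; the tutorial model works on [B, 7, 64, 64, 64].
batch, channels, size = 2, 7, 16
hidden_dim, output_dim = 32, 1  # hypothetical values for illustration

# Same layer pattern as the output head of the surrogate model.
output_projection = nn.Sequential(
    nn.Conv3d(channels, 1, kernel_size=1),   # mix the 7 channels into 1
    nn.AdaptiveAvgPool3d((None, None, 1)),   # None keeps an axis, 1 averages it away
)
final_projection = nn.Sequential(
    nn.Linear(size * size, hidden_dim // 2),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(hidden_dim // 2, output_dim),
)

tfno_output = torch.randn(batch, channels, size, size, size)  # stand-in for the TFNO output
projected = output_projection(tfno_output)
flattened = torch.flatten(projected, start_dim=1)
prediction = final_projection(flattened)

print(projected.shape)   # torch.Size([2, 1, 16, 16, 1])
print(flattened.shape)   # torch.Size([2, 256])
print(prediction.shape)  # torch.Size([2, 1])
```

Because the first `nn.Linear` layer is sized from `image_size * image_size`, the head only accepts inputs whose first two spatial axes match the `image_size` the model was built with.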

The model and dataloader can now be used with the previous training utilities, with a few slight modifications, to train a model for this task. Namely, the `setup_model`, `_setup_data`, and `_forward_pass` methods are overridden to accommodate the new model and dataset structure.

In [None]:
from typing import List
from einops import rearrange


class ConstellarationTrainer(Trainer):
    def setup_model(self):
        """Set up the TFNO-based surrogate model built on PhysicsNeMo. Input channels and output size are set directly here."""
        from constellaration_surrogate import TFNOSurrogate

        self.model = TFNOSurrogate(input_channels=3, output_dim=1)
        self.model.to(self.dist.device)

    def _setup_data(self):
        from constellaration_dataloader import ConstellarationDataLoader

        self.datamodule = ConstellarationDataLoader(
            challenge="challenge_1",
            dataset_type="full",
            grid_size=(64, 64, 64),
            batch_size=2,
        )
        self.datamodule.prepare()

    def _forward_pass(self, batch: dict[str, torch.Tensor]) -> torch.Tensor:
        # The batch is a dict of tensors; add a trailing axis for the single input field
        inputs = batch["input_fields"].unsqueeze(-1)
        # Rearrange for model: (batch, time, x, y, z, fields) -> (batch, (time fields), x, y, z)
        model_inputs = rearrange(inputs, "B Ti Lx Ly Lz F -> B (Ti F) Lx Ly Lz")

        # Forward pass through model
        model_outputs = self.model(model_inputs)
        return model_outputs
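
The reshaping in `_forward_pass` can be sanity-checked on a dummy tensor. This is a sketch that assumes `batch["input_fields"]` has shape `(batch, time, x, y, z)`, which is inferred from the rearrange pattern rather than confirmed by the dataloader. It uses plain PyTorch `permute` and `reshape` calls that should be equivalent to the `einops` pattern above.

```python
import torch

B, Ti, L, F = 2, 3, 4, 1  # small illustrative sizes; F is the single input field

fields = torch.randn(B, Ti, L, L, L)  # assumed layout: (batch, time, x, y, z)
inputs = fields.unsqueeze(-1)         # (B, Ti, L, L, L, F)

# Plain-PyTorch equivalent of
# rearrange(inputs, "B Ti Lx Ly Lz F -> B (Ti F) Lx Ly Lz"):
# move F next to Ti, then merge the two axes with Ti as the outer index.
model_inputs = inputs.permute(0, 1, 5, 2, 3, 4).reshape(B, Ti * F, L, L, L)

print(model_inputs.shape)                 # torch.Size([2, 3, 4, 4, 4])
print(torch.equal(model_inputs, fields))  # True
```

With a single field the operation leaves the data unchanged. The explicit field axis matters once more than one field is stacked into the channel dimension.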

In [None]:
cfg.train_params.ckpt_path = "./checkpoints/well_model_constellar_data"
constellaration_model_trainer = ConstellarationTrainer(cfg)
constellaration_model_trainer.train()
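
Since `freeze_surrogate_weights` disables gradients on the pretrained surrogate, only the projection layers should change during training. The sketch below shows one way to check this on a hypothetical stand-in model (`FrozenBackboneModel` is not part of the tutorial code). It also illustrates a general PyTorch detail: calling `model.train()` switches every submodule back to training mode, so a frozen backbone only stays in eval mode if `train` is overridden. Whether this matters for the TFNO surrogate depends on whether it contains layers such as dropout or batch normalization.

```python
import torch
from torch import nn


class FrozenBackboneModel(nn.Module):
    """Hypothetical stand-in: a trainable head on top of a frozen backbone."""

    def __init__(self):
        super().__init__()
        self.surrogate = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.5))
        self.head = nn.Linear(8, 1)
        for param in self.surrogate.parameters():
            param.requires_grad = False
        self.surrogate.eval()

    def train(self, mode: bool = True):
        # nn.Module.train() recurses into all children, so put the
        # frozen backbone back into eval mode afterwards.
        super().train(mode)
        self.surrogate.eval()
        return self

    def forward(self, x):
        return self.head(self.surrogate(x))


model = FrozenBackboneModel()
model.train()

# Only hand trainable parameters to the optimizer.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

# One optimization step on random data; the backbone weights must not move.
before = model.surrogate[0].weight.clone()
loss = model(torch.randn(4, 8)).pow(2).mean()
loss.backward()
optimizer.step()

print(trainable)                                        # ['head.weight', 'head.bias']
print(model.surrogate.training)                         # False
print(torch.equal(before, model.surrogate[0].weight))   # True
```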

At this point, three distinct SciML projects have been combined into one unified interface built on top of the PhysicsNeMo framework. This combination opens up research and modeling opportunities that leverage surrogate models for stellarator design, and it provides a working example of using pre-trained models for a variety of downstream tasks. Each of the outlined recipes for loading datasets, models, and pre-trained models is transferable across models and datasets from The Well.

## Expanding The Scope - More Models


Another great feature of PhysicsNeMo is its catalog of state-of-the-art models prebuilt into the framework. This catalog expands on the available models from The Well and provides reference implementations for a variety of physics-inspired problems. The same steps outlined in this tutorial can be used to apply these models to the dataset used here, or to try out other combinations of datasets and models that have not yet been explored in The Well or PhysicsNeMo. Surrogate modeling is _not_ an easy task, and a unified framework for rapid prototyping and research will lead to quicker exploration and adoption of these techniques to accelerate scientific workloads. To this end, readers are encouraged to use this framework to explore the models in both The Well and PhysicsNeMo and to train a variety of surrogate models for downstream SciML tasks.

## Conclusion: Using PhysicsNeMo as a Catalyst for Collaborative SciML Research

PhysicsNeMo serves as an integration hub for scientific machine learning research, bridging previously siloed community efforts. Through its interoperability with major SciML initiatives like The Well and the ConStellaration Challenge, the framework lets researchers build on shared datasets and pre-trained models to shorten research and development cycles. In this tutorial, that interoperability was demonstrated through dataset and model integrations under a single, cohesive framework.

The stellarator design example illustrates PhysicsNeMo's ability to orchestrate complex, multi-stage scientific workflows. It combines The Well's pre-trained MHD models with the ConStellaration dataset to create surrogate models that map stellarator geometry directly to performance metrics, a task that would be prohibitively expensive through traditional numerical simulation alone.

This collaborative approach makes individual research efforts more valuable through integration, lets new researchers build upon existing work more easily, and makes the collective knowledge of the SciML community more accessible and actionable. By providing a unified interface that respects and integrates existing community efforts, PhysicsNeMo helps realize the potential of collaborative scientific research in the AI/ML era, where scientific discovery is accelerated not just by individual breakthroughs, but by the integration and amplification of community knowledge.