# A Tutorial on Community Models and Integrations - PhysicsNeMo

In this tutorial, PhysicsNeMo is used to expand the scope of community models and datasets, such as [The Well](https://github.com/PolymathicAI/the_well), by leveraging physics-informed utilities, optimized model layers, and MLOps best practices. Specifically, the tutorial covers:
1. Loading and evaluating a checkpoint from `The Well`
2. Running a pretrained checkpoint from `The Well` as a PhysicsNeMo user
3. Training community models with PhysicsNeMo
4. Fine-tuning community models with PhysicsNeMo
5. Experimenting with different architectures, both from the community and internal to PhysicsNeMo

## Loading Models and Data from `The Well`

To begin, a model and a dataset from `the_well` are selected for use throughout this example. Here, the [Magnetohydrodynamics dataset](https://github.com/PolymathicAI/the_well/tree/master/datasets/MHD_64) is used. Magnetohydrodynamics (MHD) is the study of the dynamics of electrically conducting fluids such as plasmas.

Note that any other model and dataset combination may be selected instead.

In addition to the data, a pretrained model with a Tucker-Factorized Fourier Neural Operator (TFNO) architecture is used; it is later converted to a PhysicsNeMo model.

The dataset streaming functionality from `the_well` is used so that the dataset does not need to be downloaded locally. This requires access to Hugging Face.

In [1]:
from the_well.benchmark.models import TFNO
from torchinfo import summary

# Load The Well model (1.21 GB)
well_model = TFNO.from_pretrained("polymathic-ai/TFNO-MHD_64")
well_model = well_model.to("cuda")

# Have a look at the model summary
# Note that the order of layers does not represent order of execution
summary(well_model, depth=5)



Layer (type:depth-idx)                                  Param #
TFNO                                                    --
├─NeuralOpsCheckpointWrapper: 1-1                       --
│    └─FNOBlocks: 2-1                                   --
│    │    └─SpectralConv: 3-1                           512
│    │    │    └─ModuleList: 4-1                        --
│    │    │    │    └─ComplexTuckerTensor: 5-1          75,564,194
│    │    │    │    └─ComplexTuckerTensor: 5-2          75,564,194
│    │    │    │    └─ComplexTuckerTensor: 5-3          75,564,194
│    │    │    │    └─ComplexTuckerTensor: 5-4          75,564,194
│    │    └─ModuleList: 3-2                             --
│    │    │    └─Conv3d: 4-2                            16,384
│    │    │    └─Conv3d: 4-3                            16,384
│    │    │    └─Conv3d: 4-4                            16,384
│    │    │    └─Conv3d: 4-5                            16,384
│    └─MLP: 2-2                                         --
│ 

In [2]:
from the_well.data import WellDataset
from the_well.data.normalization import ZScoreNormalization

# Enable streaming the dataset from HuggingFace
# The following line may take a couple of minutes to instantiate the datamodule
dataset = WellDataset(
    well_base_path="hf://datasets/polymathic-ai/",  # access from HF hub
    well_dataset_name="MHD_64",
    well_split_name="test",
    use_normalization=True,
    normalization_type=ZScoreNormalization,
    n_steps_input=4,
    n_steps_output=1,
)
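
The `use_normalization=True` and `ZScoreNormalization` arguments mean the fields are standardized before they reach the model. The following is a minimal NumPy sketch of that idea, assuming one mean and one standard deviation per field. The helper names `zscore_normalize` and `zscore_denormalize` are hypothetical and are not part of `the_well`, and the statistics here are computed from random data purely for illustration.

```python
import numpy as np


def zscore_normalize(x, mean, std, eps=1e-8):
    # Shift and scale each field to roughly zero mean and unit variance
    return (x - mean) / (std + eps)


def zscore_denormalize(z, mean, std, eps=1e-8):
    # Exact inverse of zscore_normalize
    return z * (std + eps) + mean


rng = np.random.default_rng(0)
# Toy stand-in for input_fields: (T_in, Lx, Ly, Lz, F) with F = 7 fields
fields = rng.normal(loc=3.0, scale=2.0, size=(4, 8, 8, 8, 7))

# Hypothetical per-field statistics, reduced over time and space
mean = fields.mean(axis=(0, 1, 2, 3))
std = fields.std(axis=(0, 1, 2, 3))

normalized = zscore_normalize(fields, mean, std)
restored = zscore_denormalize(normalized, mean, std)
```

Keeping the inverse transform next to the forward one matters later in the tutorial, where errors are reported in both normalized and physical units.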

With the dataset on hand, its features, shape, and size can be explored. `The Well` already provides a script for this, which is left to the reader to go through if desired. A summary is provided below, also available on the [dataset card](https://github.com/PolymathicAI/the_well/blob/master/datasets/MHD_64/README.md) online:

**Dimension of discretized data:** 100 timesteps of 64 $\times$ 64 $\times$ 64 cubes.

**Fields available in the data:** Density (scalar field), velocity (vector field), magnetic field (vector field).

**Number of trajectories:** 10 initial conditions $\times$ 10 combinations of parameters = 100 trajectories.

**Estimated size of the ensemble of all simulations:** 71.6 GB.

**Grid type:** uniform grid, cartesian coordinates.

**Initial conditions:** uniform IC.

**Boundary conditions:** periodic boundary conditions.

**Data are stored separated by ($\Delta t$):** 0.01 (arbitrary units).

**Total time range ($t_{min}$ to $t_{max}$):** $t_{min} = 0$, $t_{max} = 1$.

**Spatial domain size ($L_x$, $L_y$, $L_z$):** dimensionless so 64 pixels.

**Set of coefficients or non-dimensional parameters evaluated:** all combinations of $\mathcal{M}_s = \{0.5, 0.7, 1.5, 2.0, 7.0\}$ and $\mathcal{M}_A = \{0.7, 2.0\}$.

**Approximate time and hardware used to generate the data:** Downsampled from `MHD_256` after applying ideal low-pass filter.

**What phenomena of physical interest are captured in the data:** MHD fluid flows in the compressible limit (sub and super sonic, sub and super Alfvenic).

**How to evaluate a new simulator operating in this space:** Check metrics such as Power spectrum, two-points correlation function.

Please cite the associated paper if you use this data in your research:

```
@article{burkhart2020catalogue,
  title={The catalogue for astrophysical turbulence simulations (cats)},
  author={Burkhart, B and Appel, SM and Bialy, S and Cho, J and Christensen, AJ and Collins, D and Federrath, Christoph and Fielding, DB and Finkbeiner, D and Hill, AS and others},
  journal={The Astrophysical Journal},
  volume={905},
  number={1},
  pages={14},
  year={2020},
  publisher={IOP Publishing}
}
```
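
The trajectory count quoted in the summary above follows directly from the parameter grid. As a quick check, the sketch below enumerates every pairing of the listed sonic and Alfvénic Mach numbers; the variable names are illustrative only.

```python
from itertools import product

sonic_mach = [0.5, 0.7, 1.5, 2.0, 7.0]  # values of M_s from the dataset card
alfven_mach = [0.7, 2.0]  # values of M_A from the dataset card
n_initial_conditions = 10

# Every (M_s, M_A) pairing is one parameter combination
parameter_sets = list(product(sonic_mach, alfven_mach))
n_trajectories = n_initial_conditions * len(parameter_sets)

print(len(parameter_sets), n_trajectories)
```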




A single sample can be extracted and inspected. Some notes from `The Well` are shown below. More info on examining their data can be found in [this example](https://github.com/PolymathicAI/the_well/blob/master/docs/tutorials/dataset.ipynb):

The most important elements are `input_fields` and `output_fields`. They represent the time-varying physical fields of the dynamical system and are generally the input and target of our models. For a dynamical system that has 2 spatial dimensions $x$ and $y$, `input_fields` would have a shape $(T_{in}, L_x, L_y, F)$ and `output_fields` would have a shape $(T_{out}, L_x, L_y, F)$. The number of input and output timesteps $T_{in}$ and $T_{out}$ are specified at the instantiation of the dataset with the arguments `n_steps_input` and `n_steps_output`. $L_x$ and $L_y$ are the lengths of the spatial dimensions. $F$ represents the number of physical fields, where vector fields $v = (v_x, v_y)$ and tensor fields $t = (t_{xx}, t_{xy}, t_{yx}, t_{yy})$ are flattened.

Note that the dataset for MHD has three spatial dimensions, so an extra $z$ term will be included.

In [3]:
sample = dataset[0]
for k, v in sample.items():
    print(f"Key: {k.ljust(20)} Shape: {v.shape}")

print(f"Field Names: {dataset.metadata.field_names}")

Key: input_fields         Shape: torch.Size([4, 64, 64, 64, 7])
Key: output_fields        Shape: torch.Size([1, 64, 64, 64, 7])
Key: constant_scalars     Shape: torch.Size([2])
Key: boundary_conditions  Shape: torch.Size([3, 2])
Key: space_grid           Shape: torch.Size([64, 64, 64, 3])
Key: input_time_grid      Shape: torch.Size([4])
Key: output_time_grid     Shape: torch.Size([1])
Field Names: {0: ['density'], 1: ['magnetic_field_x', 'magnetic_field_y', 'magnetic_field_z', 'velocity_x', 'velocity_y', 'velocity_z'], 2: []}


Using the model summary from above and the order of operations in the [TFNO forward pass](https://github.com/neuraloperator/neuraloperator/blob/7bb578df787d6ca548b623b83f0601c98dc931fb/neuralop/models/fno.py#L337), the data is processed by:

1. Applying optional positional encoding
2. Sending inputs through a lifting layer to a high-dimensional latent space
3. Applying optional domain padding to high-dimensional intermediate function representation
4. Applying `n_layers` Fourier/TFNO layers in sequence (SpectralConvolution + skip connections, nonlinearity) 
5. If domain padding was applied, domain padding is removed
6. Projection of intermediate function representation to the output channels
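
The steps above can be sketched in a few lines of NumPy. This is a simplified 1D illustration, not the `neuralop` implementation: it uses dense (unfactorized) spectral weights, skips the optional positional encoding and domain padding, applies the nonlinearity after every layer, and all names and sizes are chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def gelu(x):
    # tanh approximation of the GELU nonlinearity
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def spectral_conv_1d(x, weights, n_modes):
    """Mix channels in Fourier space, keeping only the lowest n_modes frequencies.

    x: (channels_in, n_points) real array
    weights: (channels_in, channels_out, n_modes) complex array
    """
    x_hat = np.fft.rfft(x, axis=-1)
    out_hat = np.zeros((weights.shape[1], x_hat.shape[-1]), dtype=complex)
    out_hat[:, :n_modes] = np.einsum("im,iom->om", x_hat[:, :n_modes], weights)
    return np.fft.irfft(out_hat, n=x.shape[-1], axis=-1)


def fno_forward(x, params, n_modes):
    h = params["lift"] @ x  # step 2: lift input channels to the latent width
    for w_spectral, w_skip in params["layers"]:  # step 4: stacked Fourier layers
        h = gelu(spectral_conv_1d(h, w_spectral, n_modes) + w_skip @ h)
    return params["proj"] @ h  # step 6: project to the output channels


in_ch, hidden, out_ch, n_points, n_modes, n_layers = 4, 8, 2, 32, 6, 4
params = {
    "lift": rng.normal(size=(hidden, in_ch)) / np.sqrt(in_ch),
    "layers": [
        (
            (rng.normal(size=(hidden, hidden, n_modes))
             + 1j * rng.normal(size=(hidden, hidden, n_modes))) / hidden,
            rng.normal(size=(hidden, hidden)) / np.sqrt(hidden),
        )
        for _ in range(n_layers)
    ],
    "proj": rng.normal(size=(out_ch, hidden)) / np.sqrt(hidden),
}

x = rng.normal(size=(in_ch, n_points))
y = fno_forward(x, params, n_modes)
print(y.shape)  # (2, 32)
```

The spectral weights dominate the parameter count because they hold one complex channel-mixing matrix per retained mode, which is why the TFNO stores them in factorized form.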


This pretrained model is trained to predict the $T_{out} = 1$ next states given the $T_{in} = 4$ previous states. The input steps are concatenated along their channels, such that the model expects $T_{in} \times F$ channels as input and $T_{out} \times F$ channels as output. Because `WellDataset` is a PyTorch dataset, we can use it conveniently with PyTorch data-loaders.
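
To make the channel layout concrete, here is a small NumPy sketch of folding the time axis into the channel axis and unfolding it again. It is intended to mirror the `einops` pattern `"B Ti Lx Ly Lz F -> B (Ti F) Lx Ly Lz"`, using a reduced spatial size so that it runs quickly; the array contents are arbitrary.

```python
import numpy as np

B, T_in, L, F = 2, 4, 8, 7
x = np.arange(B * T_in * L * L * L * F, dtype=np.float32).reshape(B, T_in, L, L, L, F)

# Move the field axis next to the time axis, then merge the two.
# Channel index c corresponds to (t, f) with c = t * F + f.
x_channels = x.transpose(0, 1, 5, 2, 3, 4).reshape(B, T_in * F, L, L, L)
print(x_channels.shape)  # (2, 28, 8, 8, 8)

# Inverse operation: split the channels back into (time, field)
x_restored = x_channels.reshape(B, T_in, F, L, L, L).transpose(0, 1, 3, 4, 5, 2)
```

With $T_{in} = 4$ and $F = 7$ this gives the 28 input channels the pretrained model expects.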

We can now evaluate and verify the performance of the pretrained MHD model from `the_well`. To do this, we use the _streaming_ data mode from `the_well` together with its `VRMSE` metric.

In [None]:
import numpy as np
import torch
from einops import rearrange
from the_well.benchmark.metrics import VRMSE
from torch.utils.data import DataLoader
from tqdm import tqdm

device = torch.device("cuda")

well_model.eval()

all_batch_mean_normalized_errors = []
all_batch_mean_denormalized_errors = []

dataloader = DataLoader(dataset, batch_size=2)

for idx, batch in enumerate(
    tqdm(dataloader, total=len(dataloader), desc="Evaluating Batches")
):
    input_batch = batch["input_fields"].to(device)  # Shape: (B, Ti, Lx, Ly, Lz, F)
    output_batch = batch["output_fields"].to(device)  # Shape: (B, To, Lx, Ly, Lz, F)

    # Fold the input timesteps into the channel dimension expected by the model
    input_batch = rearrange(input_batch, "B Ti Lx Ly Lz F -> B (Ti F) Lx Ly Lz")

    with torch.no_grad():
        pred_batch = well_model(input_batch)  # Shape: (B, (To*F), Lx, Ly, Lz)

        pred_batch = rearrange(
            pred_batch, "B (Tp F) Lx Ly Lz -> B Tp Lx Ly Lz F", Tp=dataset.n_steps_output
        )  # Shape: (B, Tp, Lx, Ly, Lz, F)
        normalized_errors_per_sample = VRMSE.eval(
            pred_batch, output_batch, dataset.metadata
        )
        mean_normalized_err = normalized_errors_per_sample.mean().item()
        all_batch_mean_normalized_errors.append(mean_normalized_err)

        denormalized_pred_batch = dataset.norm.denormalize_flattened(
            pred_batch, mode="variable"
        )
        denormalized_output_batch = dataset.norm.denormalize_flattened(
            output_batch, mode="variable"
        )
        denormalized_errors_per_sample = VRMSE.eval(
            denormalized_pred_batch, denormalized_output_batch, dataset.metadata
        )
        mean_denormalized_err = denormalized_errors_per_sample.mean().item()
        all_batch_mean_denormalized_errors.append(mean_denormalized_err)


final_mean_normalized = np.mean(all_batch_mean_normalized_errors)
final_std_normalized = np.std(all_batch_mean_normalized_errors)
final_mean_denormalized = np.mean(all_batch_mean_denormalized_errors)
final_std_denormalized = np.std(all_batch_mean_denormalized_errors)

print("\n--- Overall Evaluation Results ---")
print(
    f"Total samples evaluated: {len(dataset)} ({len(dataloader)} batches of up to {dataloader.batch_size} samples)"
)
print(
    f"Mean Normalized VRMSE: {final_mean_normalized:.6f} (Std Dev: {final_std_normalized:.6f})"
)
print(
    f"Mean Denormalized VRMSE: {final_mean_denormalized:.6f} (Std Dev: {final_std_denormalized:.6f})"
)
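
For readers unfamiliar with the metric, the sketch below shows the idea behind a variance-scaled RMSE in NumPy. It assumes the definition is the RMSE divided by the standard deviation of the target over the spatial axes, with a small epsilon for stability. The exact reduction and epsilon used by `the_well.benchmark.metrics.VRMSE` may differ, so treat this as an illustration rather than a reimplementation.

```python
import numpy as np


def vrmse(pred, target, spatial_axes, eps=1e-7):
    # Mean squared error and target variance, both reduced over space only
    mse = np.mean((pred - target) ** 2, axis=spatial_axes)
    var = np.var(target, axis=spatial_axes)
    return np.sqrt(mse / (var + eps))


rng = np.random.default_rng(0)
target = rng.normal(size=(2, 1, 8, 8, 8, 7))  # (B, T, Lx, Ly, Lz, F)
spatial_axes = (2, 3, 4)

# A perfect prediction scores 0
perfect = vrmse(target, target, spatial_axes)

# Predicting the spatial mean of each field scores approximately 1
mean_pred = np.broadcast_to(
    target.mean(axis=spatial_axes, keepdims=True), target.shape
)
baseline = vrmse(mean_pred, target, spatial_axes)
```

Under this definition, a score below 1 means the model does better than predicting the spatial mean of each field.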



## Using The Well Data to Train PhysicsNeMo Models
Now that the baseline model from `the_well` has been evaluated, let's look at training a similar architecture from scratch in `PhysicsNeMo`. The `PhysicsNeMo` examples include a Tensor-Factorized Fourier Neural Operator implementation, used there for a related case of incompressible magnetohydrodynamics in 2D. This model can be adapted to 3D for use with the dataset from `The Well`. It also shows how `PhysicsNeMo` can serve as a framework for building and testing model architectures that are not found in its base model collection.

For the remainder of this notebook, a boilerplate `Trainer` class is used. It is implemented in `training_utils.py` and contains the core components needed for loading models, running training loops, saving checkpoints, and evaluating models. To use a model from PhysicsNeMo, the `setup_model` method must be implemented. For completeness, the TFNO architecture from `PhysicsNeMo` is included in the `tfno` folder, and a config file in the `config` folder initializes some parameters for the pipeline.
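
The `setup_model` hook follows a common pattern: the base class owns the pipeline, and each subclass only decides which model to build. The sketch below illustrates that pattern with entirely hypothetical classes (`SketchTrainer`, `DummyModel`, `DummyTrainer`). It is not the tutorial's `Trainer`, which also handles data loading, checkpointing, and evaluation.

```python
from abc import ABC, abstractmethod


class SketchTrainer(ABC):
    """Hypothetical stand-in for a trainer base class with a model hook."""

    def __init__(self, cfg):
        self.cfg = cfg
        self.model_params = cfg.get("model_params", {})
        self.model = None
        self.setup_model()  # the subclass decides which model to build
        if self.model is None:
            raise RuntimeError("setup_model() must assign self.model")

    @abstractmethod
    def setup_model(self):
        """Create the model and store it in self.model."""

    def describe(self):
        return f"{type(self).__name__} wraps {type(self.model).__name__}"


class DummyModel:
    # Placeholder for a real network; only records its configuration
    def __init__(self, in_dim, out_dim):
        self.in_dim = in_dim
        self.out_dim = out_dim


class DummyTrainer(SketchTrainer):
    def setup_model(self):
        self.model = DummyModel(**self.model_params)


trainer = DummyTrainer({"model_params": {"in_dim": 28, "out_dim": 7}})
print(trainer.describe())  # DummyTrainer wraps DummyModel
```

Swapping architectures then only requires a new subclass with a different `setup_model`, which is how the trainers in the rest of this notebook are written.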


In [54]:
%reload_ext autoreload

In [55]:
import hydra
from hydra import compose, initialize
from omegaconf import OmegaConf
from training_utils import Trainer

hydra.core.global_hydra.GlobalHydra.instance().clear()

initialize(version_base=None, config_path="./config")
cfg = compose(config_name="mhd_config.yaml")

In [98]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Set up the TFNO model from PhysicsNeMo. Input parameters are set and controlled by the YAML config."""
        from tfno.tfno import TFNO

        self.model = TFNO(
            in_channels=self.model_params.in_dim,
            coord_features=self.model_params.coord_features,
            out_channels=self.model_params.out_dim,
            decoder_layers=self.model_params.decoder_layers,
            decoder_layer_size=self.model_params.fc_dim,
            dimension=self.model_params.dimension,
            latent_channels=self.model_params.layers,
            num_fno_layers=self.model_params.num_fno_layers,
            num_fno_modes=self.model_params.modes,
            padding=[
                self.model_params.pad_z,
                self.model_params.pad_y,
                self.model_params.pad_x,
            ],
            rank=self.model_params.rank,
            factorization=self.model_params.factorization,
            fixed_rank_modes=self.model_params.fixed_rank_modes,
            decomposition_kwargs=self.model_params.decomposition_kwargs,
        ).to(self.dist.device)

        print(summary(self.model, depth=3))

In [99]:
mhd_trainer = MHDTrainer(cfg)
# mhd_trainer.train()

Layer (type:depth-idx)                        Param #
TFNO                                          --
├─GELU: 1-1                                   --
├─FullyConnected: 1-2                         --
│    └─ModuleList: 2-1                        --
│    │    └─FCLayer: 3-1                      16,512
│    └─FCLayer: 2-2                           --
│    │    └─Identity: 3-2                     --
│    │    └─Linear: 3-3                       903
├─TFNO3DEncoder: 1-3                          --
│    └─GELU: 2-3                              --
│    └─Sequential: 2-4                        --
│    │    └─Conv3dFCLayer: 3-4                1,856
│    │    └─GELU: 3-5                         --
│    │    └─Conv3dFCLayer: 3-6                8,320
│    └─ModuleList: 2-5                        --
│    │    └─FactorizedSpectralConv3d: 3-7     302,123,348
│    └─ModuleList: 2-6                        --
│    │    └─Conv3d: 3-8                       16,512
Total params: 302,167,451
Trainable para

## Running Models From The Well in PhysicsNeMo

PhysicsNeMo can run community models natively after a simple conversion. The documentation for this can be found [here](https://docs.nvidia.com/deeplearning/physicsnemo/physicsnemo-core/api/physicsnemo.models.html#converting-pytorch-models-to-physicsnemo-models). The first step of the conversion is to set up the base PyTorch model as a PhysicsNeMo model.


In [120]:
from physicsnemo.registry import ModelRegistry
model_registry = ModelRegistry()
model_registry.__clear_registry__()
model_registry.__restore_registry__()


from dataclasses import dataclass

from physicsnemo.models.meta import ModelMetaData
from physicsnemo.models.module import Module
from the_well.benchmark.models import TFNO


@dataclass
class WellMetaData(ModelMetaData):
    name: str = "WellTFNOModel"


well_nemo_model = Module.from_torch(TFNO, meta=WellMetaData())
print(well_nemo_model)

<class 'physicsnemo.models.module.Module.from_torch.<locals>.PhysicsNeMoModel'>


In [125]:
from physicsnemo.registry import ModelRegistry

print(well_model.__dict__)
well_nemo_model = ModelRegistry().factory("TFNOPhysicsNeMoModel")(
    dim_in=28,
    dim_out=7,
    n_spatial_dims=3,
    spatial_resolution=[64, 64, 64],
    hidden_channels=128,
    modes1=16,
    modes2=16,
    modes3=16,
)

summary(well_nemo_model, depth=5)

{'_hub_mixin_config': {'modes3': 16, 'hidden_channels': 128, 'gradient_checkpointing': False, 'dim_in': 28, 'dim_out': 7, 'n_spatial_dims': 3, 'spatial_resolution': [64, 64, 64], 'modes1': 16, 'modes2': 16}, 'training': True, '_parameters': {}, '_buffers': {}, '_non_persistent_buffers_set': set(), '_backward_pre_hooks': OrderedDict(), '_backward_hooks': OrderedDict(), '_is_full_backward_hook': None, '_forward_hooks': OrderedDict(), '_forward_hooks_with_kwargs': OrderedDict(), '_forward_hooks_always_called': OrderedDict(), '_forward_pre_hooks': OrderedDict(), '_forward_pre_hooks_with_kwargs': OrderedDict(), '_state_dict_hooks': OrderedDict(), '_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_pre_hooks': OrderedDict(), '_load_state_dict_post_hooks': OrderedDict(), '_modules': {'model': NeuralOpsCheckpointWrapper(
  (fno_blocks): FNOBlocks(
    (convs): SpectralConv(
      (weight): ModuleList(
        (0-3): 4 x ComplexTuckerTensor(shape=(128, 128, 16, 16, 9), rank=(128, 128, 16,

Layer (type:depth-idx)                                       Param #
TFNOPhysicsNeMoModel                                         --
├─TFNO: 1-1                                                  --
│    └─NeuralOpsCheckpointWrapper: 2-1                       --
│    │    └─FNOBlocks: 3-1                                   --
│    │    │    └─SpectralConv: 4-1                           512
│    │    │    │    └─ModuleList: 5-1                        302,256,776
│    │    │    └─ModuleList: 4-2                             --
│    │    │    │    └─Conv3d: 5-2                            16,384
│    │    │    │    └─Conv3d: 5-3                            16,384
│    │    │    │    └─Conv3d: 5-4                            16,384
│    │    │    │    └─Conv3d: 5-5                            16,384
│    │    └─MLP: 3-2                                         --
│    │    │    └─ModuleList: 4-3                             --
│    │    │    │    └─Conv3d: 5-6                            7,424
│    │

The resulting `well_nemo_model` is the underlying PyTorch model from The Well, wrapped so that it can be used with PhysicsNeMo and its utilities. Note that this is a newly instantiated model, so it does not carry the pretrained weights. To train this model in PhysicsNeMo with the same dataset from The Well, we can update our `Trainer` to use it.

In [138]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Set up The Well's TFNO wrapped as a PhysicsNeMo model. Input parameters are set manually to match The Well config."""
        from dataclasses import dataclass
        
        from physicsnemo.models.meta import ModelMetaData
        from physicsnemo.models.module import Module
        from the_well.benchmark.models import TFNO
        
        
        @dataclass
        class WellMetaData(ModelMetaData):
            name: str = "WellTFNOModel"
        
        
        well_nemo_model = Module.from_torch(TFNO, meta=WellMetaData())
        well_nemo_model = well_nemo_model(
            dim_in=28,
            dim_out=7,
            n_spatial_dims=3,
            spatial_resolution=[64, 64, 64],
            hidden_channels=128,
            modes1=16,
            modes2=16,
            modes3=16,
        )

        self.model = well_nemo_model
        print(summary(self.model, depth=5))

cfg.model_params = "Set manually from The Well config."
print(cfg.model_params)
        
well_model_trainer = MHDTrainer(cfg)
# well_model_trainer.train()

Set manually from The Well config.
Layer (type:depth-idx)                                       Param #
TFNOPhysicsNeMoModel                                         --
├─TFNO: 1-1                                                  --
│    └─NeuralOpsCheckpointWrapper: 2-1                       --
│    │    └─FNOBlocks: 3-1                                   --
│    │    │    └─SpectralConv: 4-1                           512
│    │    │    │    └─ModuleList: 5-1                        302,256,776
│    │    │    └─ModuleList: 4-2                             --
│    │    │    │    └─Conv3d: 5-2                            16,384
│    │    │    │    └─Conv3d: 5-3                            16,384
│    │    │    │    └─Conv3d: 5-4                            16,384
│    │    │    │    └─Conv3d: 5-5                            16,384
│    │    └─MLP: 3-2                                         --
│    │    │    └─ModuleList: 4-3                             --
│    │    │    │    └─Conv3d: 5-6     

## Fine-tuning Models from The Well in PhysicsNeMo
Converting models to PhysicsNeMo models lets users take advantage of the features inside PhysicsNeMo itself. One great feature of The Well is the abundance of pretrained models it provides, often with [many pretrained model architectures](https://huggingface.co/collections/polymathic-ai/the-well-benchmark-models-67e69bd7cd8e60229b5cd43e) for a single dataset. These pretrained models can be fine-tuned using PhysicsNeMo with an approach similar to the conversion example above.


In [139]:
class MHDTrainer(Trainer):
    def setup_model(self):
        """Setup the pretrained TFNO model from The Well as a PhysicsNeMo model.
        Model input parameters are set from the well, and the original config is updated to reflect this."""
        import inspect

        from physicsnemo.models.meta import ModelMetaData
        from physicsnemo.models.module import Module
        from the_well.benchmark.models import TFNO

        well_model = TFNO.from_pretrained("polymathic-ai/TFNO-MHD_64")
        model_dict = well_model.__dict__

        signature = inspect.signature(TFNO)
        parameters = signature.parameters
        filtered_params = {k: model_dict[k] for k in parameters if k in model_dict}

        model = Module.from_torch(TFNO, meta=ModelMetaData(name="converted_tfno"))
        well_pretrained_model = model(**filtered_params)
        well_pretrained_model.inner_model.load_state_dict(
            well_model.state_dict(), strict=True
        )
        self.model = well_pretrained_model.to(self.dist.device)

        print(summary(self.model, depth=4))


cfg.model_params = "set manually from The Well config."
print(f"Model config (cfg.model_params) are: {cfg.model_params}")

well_model_trainer = MHDTrainer(cfg)
# well_model_trainer.train()


Set manually from The Well config.
Layer (type:depth-idx)                                       Param #
TFNOPhysicsNeMoModel                                         --
├─TFNO: 1-1                                                  --
│    └─NeuralOpsCheckpointWrapper: 2-1                       --
│    │    └─FNOBlocks: 3-1                                   --
│    │    │    └─SpectralConv: 4-1                           302,257,288
│    │    │    └─ModuleList: 4-2                             65,536
│    │    └─MLP: 3-2                                         --
│    │    │    └─ModuleList: 4-3                             40,320
│    │    └─MLP: 3-3                                         --
│    │    │    └─ModuleList: 4-4                             34,823
Total params: 302,397,967
Trainable params: 302,397,967
Non-trainable params: 0
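
The cell above recovers the constructor arguments by matching the parameter names of the model class against the attributes stored on the loaded instance. The following standalone sketch shows that technique on a toy class. `TinyModel` and `constructor_kwargs` are hypothetical names used only for illustration.

```python
import inspect


class TinyModel:
    def __init__(self, dim_in, dim_out, hidden_channels=128):
        self.dim_in = dim_in
        self.dim_out = dim_out
        self.hidden_channels = hidden_channels
        self.training = True  # extra state that is not a constructor argument


def constructor_kwargs(cls, instance):
    # Names accepted by cls.__init__ (excluding self)
    names = inspect.signature(cls).parameters
    # Attributes stored on the instance
    state = vars(instance)
    # Keep only attributes that are also constructor arguments
    return {name: state[name] for name in names if name in state}


original = TinyModel(dim_in=28, dim_out=7)
kwargs = constructor_kwargs(TinyModel, original)
clone = TinyModel(**kwargs)
print(kwargs)
```

This only works when the model stores each constructor argument under an attribute of the same name. Arguments that are not stored that way are silently dropped and fall back to their defaults, so it is worth printing the recovered dictionary before relying on it.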


## Expanding The Scope - More Models


# General Notes

This example illustrates the points made in the PhysicsNeMo documentation on converting PyTorch models to PhysicsNeMo models:


[Converting PyTorch Models to PhysicsNeMo Models](https://docs.nvidia.com/deeplearning/physicsnemo/physicsnemo-core/api/physicsnemo.models.html)


In this example, we show:
1. How to bring a pretrained checkpoint from the community (The Well) and run it as a PhysicsNeMo user
2. How to train the same model architecture in PhysicsNeMo, showcasing how easy it is to work with any community PyTorch model
3. How to experiment with other architectures, which is the true value of PhysicsNeMo

In [None]:
export PYTHONPATH=$PYTHONPATH:/workspace/PhysicsNeMo-git/physicsnemo

In [None]:
### Loading a Pre-trained Model from The Well
```python
from the_well.benchmark.models import FNO

# Load a pre-trained FNO model
model = FNO.from_pretrained("polymathic-ai/FNO-active_matter")
type(model)
```