Bug ext ensemble metrics by dallasfoster · Pull Request #63 · NVIDIA/physicsnemo

dallasfoster · 2023-06-30T00:07:11Z

Fixes a bug in the ensemble metrics: the ensemble mean and variance calculations failed when the input tensors were not located on the GPU or were not dense.

Addresses issues #57 and #59
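For readers unfamiliar with the problem class, here is a minimal sketch of how per-rank ensemble statistics can be merged into a global mean and variance. It is not the code in this PR. It uses plain Python floats instead of tensors, and the names `PartialStats`, `local_stats`, `merge`, and `global_mean_var` are hypothetical. The point it illustrates is that the global count `n` must be the sum of the per-rank counts, which is presumably what issue #57 is about.

```python
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Summary of one rank's members: count, mean, sum of squared deviations."""
    n: int
    mean: float
    m2: float


def local_stats(values):
    """Summarise one rank's ensemble members with Welford's running update."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return PartialStats(n, mean, m2)


def merge(a, b):
    """Combine two partial summaries with the pairwise (Chan et al.) formula."""
    n = a.n + b.n  # global count is the SUM of the local counts
    if n == 0:
        return PartialStats(0, 0.0, 0.0)
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return PartialStats(n, mean, m2)


def global_mean_var(per_rank_values):
    """Return (mean, unbiased variance) over all members on all ranks."""
    total = PartialStats(0, 0.0, 0.0)
    for values in per_rank_values:
        total = merge(total, local_stats(values))
    var = total.m2 / (total.n - 1) if total.n > 1 else 0.0
    return total.mean, var
```

In a real distributed run the merge would happen through a collective such as an all-reduce rather than a Python loop; the loop here only stands in for that step.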

dallasfoster · 2023-06-30T00:14:49Z

/blossom-ci

dallasfoster · 2023-06-30T03:09:06Z

/blossom-ci

* Fix distributed ensemble metrics bug and crps_gaussian bug
* fix tests
* linting
* Update docs and typing
* black
* Fix ensemble mean bug
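As background for the `crps_gaussian` item, the continuous ranked probability score of a Gaussian forecast has a well-known closed form: with z = (y - mu) / sigma, CRPS = sigma * (z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi)). The sketch below evaluates that formula for scalars using only the standard library. It is an illustration, not the library's implementation, and the function name `gaussian_crps` is hypothetical.

```python
import math


def gaussian_crps(mean, std, obs):
    """Closed-form CRPS of the forecast N(mean, std**2) against one observation."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (obs - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return std * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

The score is non-negative, symmetric in the sign of the forecast error, and scales linearly with `std` when the standardized error is held fixed.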

Addresses Peter Sharpe's CHANGES_REQUESTED review on PR NVIDIA#1576 in full, and subsumes the 3D-UFNO portion of the planned xFNO PR into this one. Net: -139 lines despite gaining trunkless mode, time-axis-extend, multi-channel output, coord features, multi-layer lift, and 2D/3D genericity.

Theme 1 - Dimensional unification (Peter NVIDIA#37, NVIDIA#47, NVIDIA#52, NVIDIA#53):

- ``DeepONet`` (formerly DeepONet + DeepONet3D) takes ``dimension: int`` (2 or 3) and dispatches via ``_DIM_DEFAULTS`` and per-dim conv/spectral primitives, mirroring the ``FNO`` pattern.
- ``SpatialBranch`` (formerly SpatialBranch + SpatialBranch3D) takes ``dimension`` and uses an ``_DIM_LAYERS`` lookup for ``SpectralConv``/``Conv``/``BatchNorm``/``AdaptiveAvgPool``/``UNetAdapter`` and the permute helpers.
- ``Conv{2,3}dFCLayer`` is selected via a one-line lookup (Peter NVIDIA#45).

Theme 2 - Wrappers folded into DeepONet (Peter NVIDIA#54, NVIDIA#64, NVIDIA#65):

- ``wrappers.py`` deleted (``DeepONetWrapper`` and ``DeepONet3DWrapper`` removed). Padding behaviour is now a constructor flag, ``auto_pad: bool = False``, and the model dispatches to ``_forward_packed`` / ``_forward_packed_trunkless`` accordingly.
- 6-cell call matrix (trunked/trunkless × packed/core × spatial/mlp) is documented in the class docstring.
- The previous private ``_temporal_projection`` attribute is exposed as a public ``has_temporal_projection`` property (Peter NVIDIA#55).

Theme 3 - Deduplication (Peter NVIDIA#43, NVIDIA#44, NVIDIA#50, NVIDIA#51, NVIDIA#40, related Greptile):

- ``TrunkNet`` and ``MLPBranch`` deleted (both duplicated ``physicsnemo.models.mlp.FullyConnected``); users now pass any ``nn.Module`` for the trunk / branches (DI-first API).
- ``_SinActivation`` deleted; the activation is registered as ``"sin"`` in ``physicsnemo.nn.module.activations.ACT2FN`` (previous commit). All ``if activation_fn.lower() == "sin"`` special-cases removed.
- ``DeepONet.from_config`` and the dict-config schema removed entirely; Hydra-style ``_target_`` instantiation supersedes it.
- ``count_params`` collapsed from 4 duplicate copies to 1.

Theme 4 - xFNO fold-in:

- ``trunk: nn.Module | None = None`` enables trunkless mode (the 3D-UFNO use case from the planned xFNO PR).
- ``out_channels: int = 1`` adds multi-channel output to every path.
- ``time_modes: int | None`` enables xFNO-style time-axis-extend in trunkless packed mode: replicate-pads the last spatial axis to fit ``2 * time_modes`` and crops to the requested ``target_times``.
- ``coord_features`` and ``lift_layers``/``lift_hidden_width`` parameters on ``SpatialBranch`` replace the deleted dict-driven "conv encoder" option.

Theme 5 - Housekeeping (Peter NVIDIA#33, NVIDIA#34, NVIDIA#38, NVIDIA#41, NVIDIA#48, NVIDIA#57, NVIDIA#58, NVIDIA#59, Charlelie NVIDIA#26, Greptile NVIDIA#5, NVIDIA#6):

- ``padding.py`` renamed to private ``_padding.py``; all functions carry ``jaxtyping.Shaped`` annotations.
- All public forward methods carry ``jaxtyping.Float`` annotations and ``torch.compiler.is_compiling`` shape-validation guards.
- ``Literal`` type aliases for ``decoder_type`` and other enums; case-insensitive validation against ``get_args`` (Greptile NVIDIA#15).
- Modern type hints throughout (``dict[str, Any] | None``, no ``Dict``/``Optional``).
- All public docstrings use ``r"""`` raw-string prefix, LaTeX math for tensor shapes, double backticks for inline code, and Examples sections.
- ``Notes`` block in ``branches.py`` documents the ``num_unet_layers`` 8x memory/compute penalty (Peter NVIDIA#49).

Theme 6 - Tests (Peter NVIDIA#60, NVIDIA#61, NVIDIA#62, NVIDIA#63, Charlelie NVIDIA#29):

- ``_FIXTURE_REGISTRY`` drives all non-regression tests across 9 scenarios: u_deeponet 2D/3D, fourier_deeponet, mionet, temporal_projection, multi-channel packed 2D, xfno trunkless 3D (with and without time-axis-extend), and core 2D MLP-branch.
- New 3D gradient-flow test and 3D ``torch.compile`` test.
- ``fullgraph=True`` probe tests for 2D and 3D marked ``@pytest.mark.xfail(strict=False)`` to empirically answer Peter NVIDIA#63.
- ``_load_golden`` uses ``pytest.skip`` for missing fixtures so CI passes pending cluster-side golden regeneration.
- Test class structure mirrors MOD-008a/b/c: ``TestDeepONetConstructor``, ``TestDeepONetNonRegression``, ``TestDeepONetCheckpoint``, ``TestDeepONetGradientFlow``, ``TestDeepONetCompile``, ``TestDeepONetTimeAxisExtend``.

CHANGELOG bullet rewritten to describe the actual shipped API (was stale, still described the old config-driven 8-variant model).
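The housekeeping item about ``Literal`` aliases with case-insensitive validation against ``get_args`` can be sketched roughly as follows. This is an illustration under assumptions, not the shipped code. The option names ``"mlp"`` and ``"conv"`` and the helper ``normalize_decoder_type`` are hypothetical, since the commit message does not list the real values.

```python
from typing import Literal, get_args

# Hypothetical option names; the commit message does not list the real ones.
DecoderType = Literal["mlp", "conv"]


def normalize_decoder_type(value: str) -> str:
    """Lower-case the input and check it against the Literal's allowed values."""
    allowed = get_args(DecoderType)  # -> ("mlp", "conv")
    lowered = value.lower()
    if lowered not in allowed:
        raise ValueError(f"decoder_type must be one of {allowed}, got {value!r}")
    return lowered
```

Deriving the allowed set from the alias keeps the type annotation and the runtime check in one place, so adding an option only requires editing the ``Literal``.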

dallasfoster added 6 commits June 28, 2023 12:40

Fix distributed ensemble metrics bug and crps_gaussian bug

99fb9aa

fix tests

bf3fb74

linting

38a1c54

Update docs and typing

6de04c5

black

ac4a44a

Fix ensemble mean bug

4f360bb

dallasfoster requested a review from NickGeneva June 30, 2023 00:07

This was referenced Jun 30, 2023

Error with setting value of n in ensemble mean calculation in distributed environment #57

Closed

Error with Ensemble Mean Calculation #59

Closed

NickGeneva approved these changes Jun 30, 2023


dallasfoster added 2 commits June 29, 2023 19:13

Merge branch 'main' into bug-ext-ensemble-metrics

c486c02

Merge branch 'main' into bug-ext-ensemble-metrics

08542bc

dallasfoster merged commit e991258 into NVIDIA:main Jun 30, 2023

kk98kk mentioned this pull request Oct 19, 2025

🐛[BUG]: DOMINO issues #1116

Closed

wdyab pushed a commit to wdyab/physicsnemo that referenced this pull request Mar 17, 2026

Bug ext ensemble metrics (NVIDIA#63)

9124d2e

* Fix distributed ensemble metrics bug and crps_gaussian bug
* fix tests
* linting
* Update docs and typing
* black
* Fix ensemble mean bug

nbren12 pushed a commit to nbren12/modulus that referenced this pull request Mar 24, 2026

Bug ext ensemble metrics (NVIDIA#63)

885ded2

* Fix distributed ensemble metrics bug and crps_gaussian bug
* fix tests
* linting
* Update docs and typing
* black
* Fix ensemble mean bug

Kaushikreddym pushed a commit to Kaushikreddym/physicsnemo that referenced this pull request Mar 26, 2026

Bug ext ensemble metrics (NVIDIA#63)

a3b5e12

* Fix distributed ensemble metrics bug and crps_gaussian bug
* fix tests
* linting
* Update docs and typing
* black
* Fix ensemble mean bug

dallasfoster merged 8 commits into NVIDIA:main from dallasfoster:bug-ext-ensemble-metrics
