Revert "Update geotransolver for 2d and 3d use cases"#1619
Conversation
This reverts commit b193e03.
/blossom-ci
Greptile Summary

This PR reverts #1502 ("Update geotransolver for 2d and 3d use cases"), removing structured 2D/3D mesh support, the Muon optimizer, multi-model Hydra dispatch, JSONL metrics logging, and the shared free-function refactor (…).
Reviews (1): Last reviewed commit: "Revert "Update geotransolver for 2d and ..."
```python
if use_te:
    raise ValueError(
        "GALE_FA does not support Transformer Engine backend. "
        "Use use_te=False; TE disables FlashAttention for differing q/k sizes in FLARE attention."
    )
super().__init__()
if state_mixing_mode not in ("weighted", "concat_project"):
    raise ValueError(
        f"Invalid state_mixing_mode: {state_mixing_mode!r}. "
        f"Expected 'weighted' or 'concat_project'."
    )
self.state_mixing_mode = state_mixing_mode
self.use_te = use_te
self.heads = heads
self.dim_head = dim_head
self.scale = 1.0
# It is recommended by the FLARE authors to use self.scale = 1 if self.dim_head <= 8 else (self.dim_head ** -0.5)
# but we use self.scale = 1.0 because the recommended scaling is not tested yet.
inner_dim = dim_head * heads

linear_layer = te.Linear if self.use_te else nn.Linear

# Global queries for FLARE self-attention
self.q_global = nn.Parameter(torch.randn(1, heads, n_global_queries, dim_head))

# Linear projections for self-attention
self.in_project_x = linear_layer(dim, inner_dim)
self.self_k = linear_layer(dim_head, dim_head)
self.self_v = linear_layer(dim_head, dim_head)

if context_dim > 0:
    # Linear projections for cross-attention
    self.cross_q = linear_layer(dim_head, dim_head)
    self.cross_k = linear_layer(context_dim, dim_head)
    self.cross_v = linear_layer(context_dim, dim_head)

# Mixing layers for blending self-attention and cross-attention
if state_mixing_mode == "weighted":
    # Learnable mixing weight between self and cross attention
    self.state_mixing = nn.Parameter(torch.tensor(0.0))
else:
    # Concatenate self and cross attention and project back to dim_head
    self.concat_project = nn.Sequential(
        linear_layer(2 * dim_head, dim_head),
        nn.GELU(),
    )

# te attention
if self.use_te:
    self.attn_fn = te.DotProductAttention(
        num_attention_heads=self.heads,
        kv_channels=self.dim_head,
        attention_dropout=dropout,
        qkv_format="bshd",
        softmax_scale=self.scale
    )
```
Unreachable TE code paths throughout constructor and `forward`

`use_te=True` raises `ValueError` at line 129 before `super().__init__()` is ever called, so `self.use_te` is always `False`. Every subsequent `if self.use_te:` branch, including `self.attn_fn = te.DotProductAttention(...)` in `__init__` (lines 177–184) and the entire TE blocks inside `forward()` (lines 237–249, 261–268), is permanently dead code. This also means the `te.Linear` option in `linear_layer` (line 149) can never be selected. Consider removing the dead branches or adding a TODO comment explaining that TE support is planned for the future.
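If the reverted code is reintroduced later, one way to act on this comment would be to keep the guard and drop the dead branches entirely. A minimal sketch under that assumption (the class name `FlareAttentionSketch`, the trimmed argument list, and the defaults are hypothetical, not the GALE_FA implementation):

```python
import torch
import torch.nn as nn


class FlareAttentionSketch(nn.Module):
    """Sketch of the constructor with the unreachable TE branches removed.

    Hypothetical simplification: argument names mirror the snippet above,
    but this is not the GALE_FA implementation from the reverted PR.
    """

    def __init__(self, dim, heads=8, dim_head=64, n_global_queries=16, use_te=False):
        if use_te:
            # TODO: Transformer Engine support is planned; until it lands, it is
            # rejected up front, so no `if self.use_te:` branch could ever run and
            # the te.Linear / te.DotProductAttention paths are omitted entirely.
            raise ValueError("Transformer Engine backend is not supported yet.")
        super().__init__()
        self.heads = heads
        self.dim_head = dim_head
        self.scale = 1.0
        inner_dim = dim_head * heads

        # Plain nn.Linear only: te.Linear could never be selected while the guard exists.
        self.q_global = nn.Parameter(torch.randn(1, heads, n_global_queries, dim_head))
        self.in_project_x = nn.Linear(dim, inner_dim)
        self.self_k = nn.Linear(dim_head, dim_head)
        self.self_v = nn.Linear(dim_head, dim_head)
```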
```
    Whether to use learned concrete dropout instead of standard dropout.
    Default is ``False``.
state_mixing_mode : str, optional
    How to blend self-attention and cross-attention outputs.  ``"weighted"`` uses
```
Same extra-space docstring issue as in `gale.py` for `state_mixing_mode`: extraneous spaces between the period and the first backtick-quoted option.
Suggested change:

```diff
-    How to blend self-attention and cross-attention outputs.  ``"weighted"`` uses
+    How to blend self-attention and cross-attention outputs. ``"weighted"`` uses
```
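For context on what the two documented options do at the blend point, here is a minimal sketch (the helper name `mix_states`, the sigmoid gate, and the tensor shapes are assumptions, not taken from the reverted code):

```python
import torch
import torch.nn as nn


def mix_states(self_out, cross_out, mode="weighted", state_mixing=None, concat_project=None):
    """Illustrative blend of self-attention and cross-attention outputs.

    Assumes both inputs are (batch, heads, tokens, dim_head); whether the
    learnable weight is squashed through a sigmoid is an assumption here.
    """
    if mode == "weighted":
        alpha = torch.sigmoid(state_mixing)  # scalar gate in (0, 1)
        return alpha * self_out + (1.0 - alpha) * cross_out
    # "concat_project": concatenate on the feature dim, project back to dim_head
    return concat_project(torch.cat([self_out, cross_out], dim=-1))


dim_head = 64
self_out = torch.randn(2, 8, 128, dim_head)
cross_out = torch.randn(2, 8, 128, dim_head)
gate = nn.Parameter(torch.tensor(0.0))  # mirrors self.state_mixing in the constructor above
proj = nn.Sequential(nn.Linear(2 * dim_head, dim_head), nn.GELU())

mixed_weighted = mix_states(self_out, cross_out, "weighted", state_mixing=gate)
mixed_concat = mix_states(self_out, cross_out, "concat_project", concat_project=proj)
```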
None of the greptile comments will be resolved because we are reverting here.
This PR fixes the issue with the CFD checkpoints.
Reverts #1502