Skip to content

Revert "Update geotransolver for 2d and 3d use cases"#1619

Merged
coreyjadams merged 1 commit intomainfrom
revert-1502-geotransolver-2d-3d
May 5, 2026
Merged

Revert "Update geotransolver for 2d and 3d use cases"#1619
coreyjadams merged 1 commit intomainfrom
revert-1502-geotransolver-2d-3d

Conversation

@coreyjadams
Copy link
Copy Markdown
Collaborator

Reverts #1502

@coreyjadams
Copy link
Copy Markdown
Collaborator Author

/blossom-ci

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 5, 2026

Greptile Summary

This PR reverts #1502 ("Update geotransolver for 2d and 3d use cases"), removing structured 2D/3D mesh support, the Muon optimizer, multi-model Hydra dispatch, JSONL metrics logging, and the shared free-function refactor (_project_input, _compute_slices_from_projections, _flare_self_attention). It is not a pure revert: GALE_FA is extracted into a new dedicated gale_fa.py file rather than being deleted, and the test import path is updated accordingly.

  • Core models (gale.py, geotransolver.py, context_projector.py, physics_attention.py): GALEStructuredMesh2D/3D, StructuredContextProjector, structured_shape parameter, and the structured-grid flatten/unflatten helpers are all removed; shared free functions are inlined back into their respective class methods.
  • gale_fa.py (new file): GALE_FA is moved here from gale.py; the constructor hard-rejects use_te=True immediately, leaving all subsequent TE code branches (linear_layer = te.Linear, self.attn_fn = te.DotProductAttention(...), and TE blocks in forward()) permanently unreachable.
  • Training example: train_transolver_darcy_fix.py is simplified back to Transolver-only with AdamW; config_fix.yaml retains developer-local dataset paths (/user_data/datasets/...).

Important Files Changed

Filename Overview
physicsnemo/experimental/models/geotransolver/gale_fa.py New file extracting GALE_FA into its own module; contains unreachable TE code paths (constructor raises ValueError for use_te=True but TE branches remain), docstring formatting issues, and a commented-out profiling line.
physicsnemo/experimental/models/geotransolver/gale.py Reverts structured-mesh (GALEStructuredMesh2D/3D) variants and shared free functions; GALE_FA moved to gale_fa.py; minor docstring whitespace artifact in two locations.
physicsnemo/experimental/models/geotransolver/context_projector.py Reverts StructuredContextProjector, _SliceToContextMixin, and shared _compute_slices_from_projections; ContextProjector now operates on (B, H, N, S) layout with inlined slice computation.
physicsnemo/experimental/models/geotransolver/geotransolver.py Removes structured_shape parameter and all 2D/3D flatten/unflatten helpers; GeoTransolver is now unstructured-mesh only.
physicsnemo/nn/module/physics_attention.py Removes shared free functions _project_input and _compute_slices_from_projections; inlines logic back into PhysicsAttentionBase and structured-mesh subclasses; switches from OptionalImport to direct importlib for TE.
physicsnemo/experimental/nn/flare_attention.py Removes the _flare_self_attention free function; FLARE.forward now contains inlined FLARE attention logic with explicit TE/non-TE branches.
examples/cfd/darcy_transolver/train_transolver_darcy_fix.py Reverts multi-model dispatch, Muon optimizer, JSONL logging, and separate train/val TensorBoard writers; now hardcoded to Transolver with AdamW; validation plots generated unconditionally on rank 0.
examples/cfd/darcy_transolver/config_fix.yaml Reverts Hydra multi-model defaults; hardcoded Transolver hyperparameters; data paths point to developer-local /user_data/ location.
physicsnemo/experimental/models/geotransolver/init.py Removes GALE_FA, GALEStructuredMesh2D/3D, MultiScaleFeatureExtractor, StructuredContextProjector, and GeometricFeatureProcessor from public exports; GALE_FA is now accessible only via gale_fa.py.
test/models/geotransolver/test_geotransolver.py Removes structured 2D/3D forward tests and the local-features-with-structured-shape validation test, matching the revert of structured_shape support.
test/models/geotransolver/test_gale.py Trivial import path update: GALE_FA now imported from gale_fa module instead of gale.

Reviews (1): Last reviewed commit: "Revert "Update geotransolver for 2d and ..." | Re-trigger Greptile

Comment on lines +129 to +184
if use_te:
raise ValueError(
"GALE_FA does not support Transformer Engine backend. "
"Use use_te=False; TE disables FlashAttention for differing q/k sizes in FLARE attention."
)
super().__init__()
if state_mixing_mode not in ("weighted", "concat_project"):
raise ValueError(
f"Invalid state_mixing_mode: {state_mixing_mode!r}. "
f"Expected 'weighted' or 'concat_project'."
)
self.state_mixing_mode = state_mixing_mode
self.use_te = use_te
self.heads = heads
self.dim_head = dim_head
self.scale = 1.0
# It is recommended by the FLARE authors to use self.scale = 1 if self.dim_head <= 8 else (self.dim_head ** -0.5)
# but we use self.scale = 1.0 because the recommended scaling is not tested yet.
inner_dim = dim_head * heads

linear_layer = te.Linear if self.use_te else nn.Linear

# Global queries for FLARE self-attention
self.q_global = nn.Parameter(torch.randn(1, heads, n_global_queries, dim_head))

# Linear projections for self-attention
self.in_project_x = linear_layer(dim, inner_dim)
self.self_k = linear_layer(dim_head, dim_head)
self.self_v = linear_layer(dim_head, dim_head)

if context_dim > 0:
# Linear projections for cross-attention
self.cross_q = linear_layer(dim_head, dim_head)
self.cross_k = linear_layer(context_dim, dim_head)
self.cross_v = linear_layer(context_dim, dim_head)

# Mixing layers for blending self-attention and cross-attention
if state_mixing_mode == "weighted":
# Learnable mixing weight between self and cross attention
self.state_mixing = nn.Parameter(torch.tensor(0.0))
else:
# Concatenate self and cross attention and project back to dim_head
self.concat_project = nn.Sequential(
linear_layer(2 * dim_head, dim_head),
nn.GELU(),
)

# te attention
if self.use_te:
self.attn_fn = te.DotProductAttention(
num_attention_heads=self.heads,
kv_channels=self.dim_head,
attention_dropout=dropout,
qkv_format="bshd",
softmax_scale=self.scale
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unreachable TE code paths throughout constructor and forward

use_te=True raises ValueError at line 129 before super().__init__() is ever called, so self.use_te is always False. Every subsequent if self.use_te: branch — including self.attn_fn = te.DotProductAttention(...) in __init__ (lines 177–184) and the entire TE blocks inside forward() (lines 237–249, 261–268) — is permanently dead code. This also means the te.Linear option in linear_layer (line 149) can never be selected. Consider removing the dead branches or adding a TODO comment explaining TE support is planned for the future.

Whether to use learned concrete dropout instead of standard dropout.
Default is ``False``.
state_mixing_mode : str, optional
How to blend self-attention and cross-attention outputs. ``"weighted"`` uses
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Same extra-space docstring issue as in gale.py for state_mixing_mode — extraneous spaces between the period and the first backtick-quoted option.

Suggested change
How to blend self-attention and cross-attention outputs. ``"weighted"`` uses
How to blend self-attention and cross-attention outputs. ``"weighted"`` uses

Comment thread physicsnemo/experimental/models/geotransolver/gale_fa.py
Comment thread physicsnemo/experimental/models/geotransolver/gale.py
Comment thread examples/cfd/darcy_transolver/config_fix.yaml
@coreyjadams
Copy link
Copy Markdown
Collaborator Author

None of the greptile comments will be resolved because we are reverting here.

@ktangsali ktangsali self-requested a review May 5, 2026 18:29
@ktangsali
Copy link
Copy Markdown
Collaborator

This PR fixes the issue with the CFD checkpoints

@coreyjadams coreyjadams merged commit 70701e1 into main May 5, 2026
6 checks passed
@coreyjadams coreyjadams deleted the revert-1502-geotransolver-2d-3d branch May 5, 2026 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants