
Adds fix for sporadic CI bug in Barnes-Hut test#1580

Merged
peterdsharpe merged 7 commits into NVIDIA:main from peterdsharpe:psharpe/globe_bh_test_hotfix
Apr 23, 2026

Conversation

@peterdsharpe
Collaborator

@peterdsharpe peterdsharpe commented Apr 22, 2026

PhysicsNeMo Pull Request

Description

In CI, seeing a sporadic bug reported by @ktangsali (example failing run):

CI trace report
______________________ test_bh_nested_source_data_keys[2] ______________________

n_dims = 2

    @dims_params
    def test_bh_nested_source_data_keys(n_dims: int):
        """Convergence with nested TensorDict keys matching GLOBE's production format.
    
        GLOBE passes source_data structured like:
            {"physical": {"velocity": ...}, "latent": {"scalars": {"0": ...},
             "vectors": {"0": ...}}, "normals": ...}
    
        The aggregation, split_by_leaf_rank, and TensorDict.cat operations must
        handle this nesting correctly.
        """
        torch.manual_seed(DEFAULT_SEED)
        n_src, n_tgt = 30, 15
    
        source_data_ranks = {
            "physical": {"pressure": 0},
            "latent": {"scalars": {"0": 0, "1": 0}, "vectors": {"0": 1}},
            "normals": 1,
        }
        output_field_ranks = {"p": 0, "u": 1}
    
        common_kwargs = dict(
            n_spatial_dims=n_dims,
            output_field_ranks={
                k: (0 if v == "scalar" else 1) for k, v in output_field_ranks.items()
            },
            source_data_ranks=source_data_ranks,
            hidden_layer_sizes=[16],
        )
    
        bh_kernel = BarnesHutKernel(**common_kwargs, leaf_size=DEFAULT_LEAF_SIZE)
        exact_kernel = Kernel(**common_kwargs)
        exact_kernel.load_state_dict(bh_kernel.state_dict(), strict=False)
        bh_kernel.eval()
        exact_kernel.eval()
    
        torch.manual_seed(DEFAULT_SEED + 1)
        source_data = TensorDict(
            {
                "physical": TensorDict(
                    {"pressure": torch.randn(n_src)},
                    batch_size=[n_src],
                ),
                "latent": TensorDict(
                    {
                        "scalars": TensorDict(
                            {"0": torch.randn(n_src), "1": torch.randn(n_src)},
                            batch_size=[n_src],
                        ),
                        "vectors": TensorDict(
                            {"0": F.normalize(torch.randn(n_src, n_dims), dim=-1)},
                            batch_size=[n_src],
                        ),
                    },
                    batch_size=[n_src],
                ),
                "normals": F.normalize(torch.randn(n_src, n_dims), dim=-1),
            },
            batch_size=[n_src],
        )
    
        data = {
            "source_points": torch.randn(n_src, n_dims),
            "target_points": torch.randn(n_tgt, n_dims) * 5,
            "source_strengths": torch.rand(n_src) + 0.1,
            "reference_length": torch.ones(()),
            "source_data": source_data,
            "global_data": TensorDict({}, batch_size=[]),
        }
    
        exact_result = exact_kernel(**data)
        bh_result = bh_kernel(**data, theta=0.01)
    
        for field_name in output_field_ranks:
>           torch.testing.assert_close(
                bh_result[field_name],
                exact_result[field_name],
                atol=1e-3,
                rtol=1e-2,
                msg=f"Nested keys: {field_name!r} not close to exact at theta=0.01",
            )
E           AssertionError: Nested keys: 'p' not close to exact at theta=0.01

test/models/globe/test_barnes_hut_kernel.py:960: AssertionError

This was possibly introduced by #1494, which recently modified these files. However, CI passed on that PR, and this bug seems to occur only sporadically.

On local machines, these tests seem to pass consistently.

For now, this PR adds diagnostic code to try to reproduce the failure on CI and gather more debugging details.
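A minimal sketch of the kind of diagnostic pattern later described in the review summary (monkey-patch an internal method to capture pre-aggregation intermediates); the class and method names here are placeholders, not the actual GLOBE API:

```python
# Hypothetical sketch: wrap an internal method, record its intermediate
# inputs for CI-only diagnostics, and restore the original afterwards.
captured = []


class Kernel:  # stand-in for the real kernel class
    def _aggregate(self, values):
        return sum(values)


original = Kernel._aggregate


def capturing_aggregate(self, values):
    captured.append(list(values))  # record pre-aggregation inputs
    return original(self, values)


Kernel._aggregate = capturing_aggregate  # apply the patch
try:
    result = Kernel()._aggregate([1, 2, 3])
finally:
    Kernel._aggregate = original  # always restore, even on failure
```

This lets a test assert invariants on `captured` without modifying the code under test.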

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@peterdsharpe peterdsharpe added the 5 - DO NOT MERGE Hold off on merging; see PR for details label Apr 22, 2026
… consistent TensorDict structure. Update test tolerances for output comparisons to enhance robustness against numerical discrepancies.
@peterdsharpe
Collaborator Author

peterdsharpe commented Apr 23, 2026

Root cause: tensordict shipped a new release on April 20 that changed the ordering of keys when flattening TensorDicts, and we didn't guard against this. Our CI picked up the new version, which caused a previously-passing test to fail. Testing a fix.
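The failure mode can be illustrated without tensordict itself: if column layout depends on the library's leaf iteration order, a version bump that reorders keys silently permutes the columns. Sorting by the stringified key path gives a canonical order. This is a simplified sketch with plain nested dicts, not the actual `concatenate_leaves` implementation:

```python
def flatten_leaves(d, prefix=()):
    """Yield (key_path, value) pairs for every leaf of a nested dict."""
    for k, v in d.items():
        path = prefix + (k,)
        if isinstance(v, dict):
            yield from flatten_leaves(v, path)
        else:
            yield path, v


def canonical_leaf_order(d):
    """Return leaves sorted by str(key_path): stable across orderings."""
    return sorted(flatten_leaves(d), key=lambda kv: str(kv[0]))


# Same content, different insertion order:
a = {"physical": {"pressure": 1.0}, "latent": {"scalars": {"0": 2.0}}, "normals": 3.0}
b = {"normals": 3.0, "latent": {"scalars": {"0": 2.0}}, "physical": {"pressure": 1.0}}
```

With a naive `list(flatten_leaves(...))`, `a` and `b` would yield different leaf orders; after sorting, both produce the same canonical layout.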

@peterdsharpe peterdsharpe marked this pull request as ready for review April 23, 2026 17:13
@greptile-apps
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR fixes a sporadic CI failure in the Barnes-Hut kernel caused by TensorDict returning leaves in a non-deterministic iteration order across library versions. concatenate_leaves now sorts items by str(key) for a stable column layout, and all split-back enumerations plus the vectors.keys() loop in _evaluate_interactions are aligned to that same order. The test gains per-pair pre-aggregation invariant checks via a monkey-patch capture pattern, tighter tolerances, and a correction to the output_field_ranks setup that was silently misconfiguring the test.

Important Files Changed

- `physicsnemo/experimental/models/globe/utilities/tensordict_utils.py` — Core fix: `concatenate_leaves` now sorts items by `str(key)` before concatenation, producing a canonical column ordering independent of TensorDict construction order and library version.
- `physicsnemo/experimental/models/globe/field_kernel.py` — Two aligned fixes: scalar split-back now reconstructs a nested TensorDict matching the exact kernel's leaf structure; both scalar and vector key enumerations are sorted by `str` to match `concatenate_leaves`'s canonical ordering. The `vectors.keys()` loop is also sorted for consistency.
- `test/models/globe/test_barnes_hut_kernel.py` — Adds diagnostic invariants: state_dict completeness, per-pair pre-aggregation bit-identity via monkey-patch + `inv_perm` reindex, and tighter final tolerances. Also fixes a silent test bug where the dict-comprehension always returned rank=1 for integer-valued `output_field_ranks`.
- `CHANGELOG.md` — Adds a changelog entry accurately describing the TensorDict key-ordering fix.
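The silent test bug mentioned above is visible in the CI trace: `output_field_ranks` already holds integer ranks, yet the comprehension compares each value against the string `"scalar"`, which an integer never equals. A minimal demonstration (values taken from the trace):

```python
# output_field_ranks already holds integer ranks, as in the failing test:
output_field_ranks = {"p": 0, "u": 1}

# The original comprehension compared against the string "scalar", which an
# integer never equals, so every field was silently mapped to rank 1:
buggy = {k: (0 if v == "scalar" else 1) for k, v in output_field_ranks.items()}

# Since the values are already integer ranks, they can be passed through:
fixed = dict(output_field_ranks)
```

So `"p"` was being tested as a rank-1 (vector) field instead of a rank-0 (scalar) field, which is the misconfiguration the PR corrects.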

Reviews (1): Last reviewed commit: "Revises fix"

@peterdsharpe
Collaborator Author

/blossom-ci

@peterdsharpe peterdsharpe removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Apr 23, 2026
@peterdsharpe peterdsharpe self-assigned this Apr 23, 2026
@peterdsharpe peterdsharpe enabled auto-merge April 23, 2026 17:50
@peterdsharpe peterdsharpe added this pull request to the merge queue Apr 23, 2026
Merged via the queue into NVIDIA:main with commit def51ba Apr 23, 2026
4 checks passed
@peterdsharpe peterdsharpe deleted the psharpe/globe_bh_test_hotfix branch April 23, 2026 19:19
