Assert fp32 for rope embeddings, misc test fixes by pstjohn · Pull Request #1496 · NVIDIA/bionemo-framework

pstjohn · 2026-03-05T15:22:24Z

This wouldn't have caught @savitha-eng's cast_forward_inputs=True bug (which casts the rope embeddings right as they enter the TransformerLayer), but it turns out our test suite was actually casting them to bfloat16 via model.to(bfloat16) calls 😬.

This also fixes a few other miscellaneous test failures I saw locally while making sure the esm2 and llama3 recipe and model tests pass.

Requires #1495 for tests to pass.
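
As a rough illustration of why a bfloat16 cast of rope inputs is worth flagging (this is not code from the PR, and the PR itself does not spell out the reasoning): bfloat16 keeps only 8 significant bits, so integer positions above 256 are no longer exactly representable. The sketch below simulates bfloat16 rounding with the standard library; `to_bfloat16` and `rope_angle_error` are hypothetical helper names, and `inv_freq = 1.0` is assumed to stand in for the highest-frequency rotary channel.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (round-to-nearest-even).

    bfloat16 is the upper 16 bits of an IEEE float32, so we round the
    float32 bit pattern to 16 bits. NaN/inf handling is omitted here.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Half of the dropped range, plus the kept LSB to break ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + rounding_bias) >> 16) << 16) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def rope_angle_error(position: int, inv_freq: float = 1.0) -> float:
    """Absolute error (radians) of position * inv_freq when done in bfloat16."""
    exact = position * inv_freq
    low_precision = to_bfloat16(to_bfloat16(float(position)) * to_bfloat16(inv_freq))
    return abs(exact - low_precision)


if __name__ == "__main__":
    for pos in (100, 257, 1001):
        print(pos, to_bfloat16(float(pos)), rope_angle_error(pos))
```

In this simulation, position 1001 rounds to 1000.0, so the rotary angle for that token is off by a full radian in the fastest channel, while positions below 256 are unaffected.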

coderabbitai · 2026-03-05T15:22:43Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

savitha-eng

lgtm

Self-contained FSDP2 + TransformerEngine recipe for OpenGenome2 training, extracted from the generic llama3_native_te recipe with OG2-specific defaults:

- FP32 master weights with MixedPrecisionPolicy (cast_forward_inputs=False)
- Megatron-style scaled init for proj/fc2 layers
- Spike-No-More embedding initialization (std=1.0)
- Genomic masking for degenerate bases
- Weight decay grouping (skip bias/1D params)
- THD sequence packing with GQA
- FP8 training with first/last layer BF16 override
- RoPE fp32 assertion (from PR #1496)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
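
For readers unfamiliar with the "Megatron-style scaled init" item in the commit message above, here is a minimal sketch of the arithmetic. It assumes the recipe follows Megatron-LM's convention of dividing the base standard deviation by sqrt(2 * num_layers) for output projections; the commit message does not confirm the exact formula, and `scaled_init_std`, the base std of 0.02, and the layer count of 32 are illustrative values only.

```python
import math


def scaled_init_std(base_std: float, num_layers: int) -> float:
    """Std for output-projection weights under Megatron-style scaled init.

    Assumption: std = base_std / sqrt(2 * num_layers), which shrinks the
    residual-branch outputs (e.g. proj and fc2) as the model gets deeper.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    return base_std / math.sqrt(2.0 * num_layers)


if __name__ == "__main__":
    # Hypothetical values: base std 0.02 and a 32-layer model.
    print(scaled_init_std(0.02, 32))
```

Only the layers that write back into the residual stream would get the reduced std under this scheme; other weights keep the base value.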

pstjohn requested review from cspades, dorotat-nv, jomitchellnv, jstjohn, jwilber, savitha-eng and trvachov as code owners March 5, 2026 15:22

savitha-eng approved these changes Mar 5, 2026

pstjohn enabled auto-merge March 5, 2026 19:05

savitha-eng mentioned this pull request Mar 5, 2026

Add opengenome2 llama3 recipe #1499

Open

9 tasks

pstjohn added 2 commits March 5, 2026 14:55

assert fp32 for rope embeddings, misc test fixes

179ce89

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

fix failing tests and convert to warning

a5deb7c

Signed-off-by: Peter St. John <pstjohn@nvidia.com>

pstjohn force-pushed the pstjohn/assert-fp32-for-rope-embeddings branch from e609c74 to a5deb7c Compare March 5, 2026 22:22

pstjohn added this pull request to the merge queue Mar 5, 2026

Merged via the queue into NVIDIA:main with commit b2ddae1 Mar 5, 2026
21 checks passed

pstjohn deleted the pstjohn/assert-fp32-for-rope-embeddings branch March 5, 2026 23:38

pstjohn merged 2 commits into NVIDIA:main from pstjohn:pstjohn/assert-fp32-for-rope-embeddings

2 participants