Refactor MLMDataCollatorWithFlattening #1382
Merged
pstjohn merged 2 commits into NVIDIA:main on Dec 22, 2025
Conversation
jomitchellnv (Collaborator) approved these changes on Dec 17, 2025:
LGTM, but let's build a container and run the CP tests to make sure this passes before submission.
Signed-off-by: Peter St. John <pstjohn@nvidia.com>
Force-pushed be53581 to 526791b

Signed-off-by: Peter St. John <pstjohn@nvidia.com>
Force-pushed 526791b to 61710b0
Updates this class to be more general for llama3 usage: it now accepts a base collator and performs the flattening after the fact, so we always make the bshd-compatible forward call first. This will make it easier to implement context parallelism (CP) in the llama3 recipe.
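A minimal sketch of the pattern the description implies, assuming a HuggingFace-style base collator that returns `input_ids`, `labels`, and `attention_mask`; the class name and output keys below are illustrative, not the exact API in this PR:

```python
from dataclasses import dataclass
from typing import Any, Callable

import torch
import torch.nn.functional as F


@dataclass
class CollatorWithFlattening:
    """Wraps a base MLM collator and flattens its padded (batch, seq) output
    into a single packed sequence with cumulative-sequence-length metadata."""

    base_collator: Callable[[list[dict[str, Any]]], dict[str, torch.Tensor]]

    def __call__(self, features: list[dict[str, Any]]) -> dict[str, torch.Tensor]:
        # 1. Run the normal bshd-style collation first (padding + MLM masking).
        batch = self.base_collator(features)
        mask = batch["attention_mask"].bool()

        # 2. Drop the padding and concatenate every sequence into one row.
        flat_ids = batch["input_ids"][mask].unsqueeze(0)
        flat_labels = batch["labels"][mask].unsqueeze(0)

        # 3. Record sequence boundaries so a varlen attention kernel can keep
        #    the packed sequences from attending to each other.
        seq_lens = mask.sum(dim=1, dtype=torch.int32)
        cu_seqlens = F.pad(torch.cumsum(seq_lens, 0, dtype=torch.int32), (1, 0))

        return {
            "input_ids": flat_ids,
            "labels": flat_labels,
            "cu_seq_lens_q": cu_seqlens,
            "cu_seq_lens_k": cu_seqlens,
            "max_length_q": int(seq_lens.max()),
            "max_length_k": int(seq_lens.max()),
        }
```

Usage would look something like `CollatorWithFlattening(base_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer))`; because the base collator is injected rather than hard-coded, the same flattening wrapper can serve both the MLM and llama3 recipes.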