-
-
Notifications
You must be signed in to change notification settings - Fork 11.5k
[Model] Use merge_by_field_config for MM models (M-N) #26710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Model] Use merge_by_field_config for MM models (M-N) #26710
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors multi-modal data handling across several models to use a new, centralized framework enabled by merge_by_field_config = True. This is a significant improvement that simplifies model-specific code by removing manual data validation, batching, and token ID handling, leading to cleaner and more maintainable code. The changes are mostly consistent. My review focuses on a couple of areas in nano_nemotron_vl.py where the refactoring appears incomplete, resulting in unused code that should be removed to fully align with the PR's goals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 1994 <1994@users.noreply.github.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: bbartels <benjamin@bartels.dev>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…6710) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Purpose
Part of #26149
Also remove unnecessary token IDs from the field config, since we have removed V0 specific code in #25331 already
Test Plan
Tensor schema tests should pass.
I have also run
MiDashengLMModel,MiniCPMO,MiniCPMVandMolmoForCausalLMusing the example scripts and confirmed they still work.Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.