[Bugfix] Update multimodal models mapping to fit new checkpoint after Transformers v4.52 #19151
Conversation
Signed-off-by: Isotr0py <2037008807@qq.com>
Hello @Isotr0py, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here to provide a summary of this pull request. This PR addresses an issue where checkpoints saved with Transformers v4.52.2 or later use a new naming convention for weights in certain multimodal models. Specifically, it fixes the loading of these models by introducing and applying weight-mapping logic for the Fuyu, Gemma3, and PaliGemma models to align the new checkpoint names with vLLM's expected structure.
Highlights
- Bugfix for Checkpoint Loading: The primary goal is to fix the loading of multimodal model checkpoints that were saved using Transformers v4.52.2 or later, which introduced changes to the internal weight naming structure.
- Affected Models: The fix is applied to the Fuyu, Gemma3, and PaliGemma multimodal models.
- Weight Mapping Implementation: A `WeightsMapper` utility is introduced and configured for the affected models to translate the new checkpoint weight names to the names expected by vLLM's model implementations.
Changelog
- vllm/model_executor/models/fuyu.py
  - Imported the `WeightsMapper` utility.
  - Added a class attribute `hf_to_vllm_mapper` to `FuyuForCausalLM` with a mapping for `language_model.layers.` to `language_model.model.layers.`.
- vllm/model_executor/models/gemma3_mm.py
  - Imported the `WeightsMapper` utility.
  - Added a class attribute `hf_to_vllm_mapper` to `Gemma3ForConditionalGeneration` with mappings for the `model.language_model.`, `model.vision_tower.`, `model.multi_modal_projector.`, and `lm_head.` prefixes.
  - Updated the `load_weights` method to use the newly defined `hf_to_vllm_mapper`.
- vllm/model_executor/models/paligemma.py
  - Imported the `WeightsMapper` utility.
  - Added a class attribute `hf_to_vllm_mapper` to `PaliGemmaForConditionalGeneration` with mappings for the `model.language_model.`, `model.vision_tower.`, `model.multi_modal_projector.`, and `lm_head.` prefixes (same as Gemma3).
  - Updated the `load_weights` method to use the newly defined `hf_to_vllm_mapper`.
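For readers skimming the diff, here is a minimal sketch of what such a mapping looks like for `Gemma3ForConditionalGeneration`. It assumes `WeightsMapper` exposes an `orig_to_new_prefix` dict (as in `vllm/model_executor/models/utils.py`); the left-hand source prefixes come from the changelog above, while the right-hand target prefixes are illustrative assumptions and may not match the merged diff exactly.

```python
# Sketch only: source prefixes are from the changelog above; the target
# prefixes are assumptions about vLLM's internal module names.
from vllm.model_executor.models.utils import WeightsMapper

hf_to_vllm_mapper = WeightsMapper(
    orig_to_new_prefix={
        # Checkpoints saved with Transformers >= v4.52 nest submodules under a
        # top-level "model." prefix; remap them to the names vLLM expects.
        "model.language_model.": "language_model.model.",
        "model.vision_tower.": "vision_tower.",
        "model.multi_modal_projector.": "multi_modal_projector.",
        "lm_head.": "language_model.lm_head.",
    })
```

The Fuyu mapping is analogous but only needs the single `language_model.layers.` to `language_model.model.layers.` rename noted above.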
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Weights shift and names change,
Transformers update's range.
Map the old to new,
Make the models true,
Loading now feels less strange.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request aims to update the weight mappings for several multimodal models to ensure compatibility with checkpoints saved after Transformers v4.52.2. The introduction of `WeightsMapper` is a good approach to handle these changes systematically.
Key observations:
- The `WeightsMapper` has been correctly added to `FuyuForCausalLM`, `Gemma3ForConditionalGeneration`, and `PaliGemmaForConditionalGeneration`.
- The `load_weights` method has been updated to use this mapper for `Gemma3ForConditionalGeneration` and `PaliGemmaForConditionalGeneration`.
- There's a critical oversight in `fuyu.py` where the `load_weights` method does not utilize the newly added `hf_to_vllm_mapper` (see the sketch after this list).
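For context on what the reviewer is pointing out, the pattern used in the Gemma3 and PaliGemma implementations passes the class-level mapper into the weight loader. The snippet below is a hedged sketch of that pattern; it assumes `AutoWeightsLoader.load_weights` accepts a `mapper` keyword (as in vLLM's model utilities) and is not the literal code from this PR.

```python
# Sketch of a load_weights method that applies the class's hf_to_vllm_mapper.
# Assumes AutoWeightsLoader.load_weights takes a `mapper` keyword argument.
from typing import Iterable, Set, Tuple

import torch

from vllm.model_executor.models.utils import AutoWeightsLoader


def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]) -> Set[str]:
    # Method body as it would appear on the model class (e.g. FuyuForCausalLM).
    loader = AutoWeightsLoader(self)
    # Without mapper=..., the "model."-prefixed names in new checkpoints would
    # not match vLLM's module names, so weights could fail to load.
    return loader.load_weights(weights, mapper=self.hf_to_vllm_mapper)
```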
Regarding the PR description:
- It's great that you've linked the related issue and the Transformers PR that introduced these changes.
- The test plan is partially filled. It would be beneficial to include the specific test commands used for verification.
- Test results are currently missing. Adding a comparison of results before and after the changes, or e2e test results, would greatly help in validating the fix.
Could you also clarify if other multimodal models listed in the test plan (e.g., llava, qwen2_audio) require similar mapping updates, or if they were checked and found not to need changes with Transformers v4.52.2+ checkpoints?
Summary of Findings
- Missing Mapper Usage in FuyuForCausalLM: The `load_weights` method in `vllm/model_executor/models/fuyu.py` for `FuyuForCausalLM` does not use the `hf_to_vllm_mapper` that was added to the class. This means the intended weight name mapping for Fuyu models is not being applied, potentially leading to errors when loading newer checkpoints.
- PR Description Completeness: The pull request description is missing a detailed test plan (commands) and test results, which are important for verifying the bugfix across all affected models.
Merge Readiness
This pull request addresses an important compatibility issue with newer Transformers checkpoints. However, there is a critical issue in `fuyu.py` where the new weight mapper is not being utilized. This must be addressed before the PR can be considered for merging. Additionally, completing the test plan and providing test results in the PR description would improve confidence in the changes. I am unable to approve this PR in its current state.
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only fastcheck CI runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
### What does this PR do?
Fixes #1710

1. vLLM 0.9.0 does not support `limit_mm_per_prompt=None`; this parameter must be a `dict`.
2. Transformers 4.52.* changes the weight keys in the model state dict, causing mismatches with vLLM's weight loader.

See also: huggingface/transformers#38385, vllm-project/vllm#19054, vllm-project/vllm#19151

### Test
Run `bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh`

### Checklist Before Submitting
- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that covers the code path.
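On point 1 above, here is a minimal hedged sketch of a valid `limit_mm_per_prompt` value when constructing vLLM's `LLM`; the model name and limit are placeholders, not taken from the linked PR.

```python
# Sketch: on vLLM 0.9.0, limit_mm_per_prompt must be a dict, not None.
# The model name and limit value below are placeholders.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    limit_mm_per_prompt={"image": 1},  # dict mapping modality -> max items per prompt
)
```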
LGTM as long as tests pass
Edit: I forgot that I ran the tests in fastcheck already. Should be good to go then!
… Transformers v4.52 (vllm-project#19151) Signed-off-by: Isotr0py <2037008807@qq.com>
… Transformers v4.52 (vllm-project#19151) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: minpeter <kali2005611@gmail.com>
… Transformers v4.52 (vllm-project#19151) Signed-off-by: Isotr0py <2037008807@qq.com> Signed-off-by: Yang Wang <elainewy@meta.com>
Essential Elements of an Effective PR Description Checklist
Purpose
Fix multimodal model weight mappings to load checkpoints saved after [VLM] Add base model without head (huggingface/transformers#37033), included in Transformers v4.52.2.
Test Plan
vLLM supported models touched in that PR:
- qwen2_audio: That PR doesn't touch the audio model; it only edited comments for qwen2-audio.
- qwen2_5_omni: Unchanged.
Test Result