
[Bugfix] Update multimodal models mapping to fit new checkpoint after Transformers v4.52 #19151


Merged: 10 commits into vllm-project:main from fix-mapping on Jun 17, 2025

Conversation

@Isotr0py (Collaborator) commented on Jun 4, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as a before/after comparison or e2e results.

Purpose

Fix weight loading for multimodal model checkpoints saved with Transformers v4.52+, whose weight names changed (see huggingface/transformers#38385), by adding hf_to_vllm_mapper weight mappings.

Test Plan

vLLM-supported models touched in this PR:

  • aria
  • aya_vision
  • fuyu
  • gemma3
  • llava
  • llava-next
  • llava-next-video
  • llava-ov
  • mistral-3
  • mllama
  • paligemma
  • qwen2_audio (This PR doesn't touch the audio model; it only edits comments for qwen2-audio.)
  • qwen2_vl
  • qwen2_5_vl
  • qwen2_5_omni (unchanged)
pytest -s -v tests/models/multimodal/test_mapping.py -m core_model

Test Result

Signed-off-by: Isotr0py <2037008807@qq.com>
@gemini-code-assist bot (Contributor) left a comment

Hello @Isotr0py, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello team, gemini-code-assist here to provide a summary of this pull request. This PR addresses an issue where checkpoints saved with Transformers v4.52.2 or later use a changed weight naming convention for certain multimodal models. Specifically, it fixes the loading of these models by introducing and applying weight mapping logic for the Fuyu, Gemma3, and PaliGemma models to align the new checkpoint names with vLLM's expected structure.

Highlights

  • Bugfix for Checkpoint Loading: The primary goal is to fix the loading of multimodal model checkpoints that were saved using Transformers versions 4.52.2 or later, which introduced changes to the internal weight naming structure.
  • Affected Models: The fix is applied to the Fuyu, Gemma3, and PaliGemma multimodal models.
  • Weight Mapping Implementation: A WeightsMapper utility is introduced and configured for the affected models to translate the new checkpoint weight names to the names expected by vLLM's model implementations (a minimal illustration follows below).

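To make the mapping idea concrete, here is a minimal, self-contained sketch of the prefix rewrite such a mapper performs. It is plain Python rather than vLLM's actual WeightsMapper API, and the example rule is the Fuyu rename described in the changelog below.

```python
# Minimal illustration (not vLLM's actual WeightsMapper API): rewrite a
# checkpoint weight name using the first matching prefix rule.
def remap_prefix(name: str, prefix_map: dict[str, str]) -> str:
    for old, new in prefix_map.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name

# The Fuyu rule from the changelog below: checkpoint keys of the form
# language_model.layers.* are rewritten to the language_model.model.layers.*
# names that vLLM's implementation expects.
fuyu_rules = {"language_model.layers.": "language_model.model.layers."}

mapped = remap_prefix("language_model.layers.0.self_attn.q_proj.weight", fuyu_rules)
assert mapped == "language_model.model.layers.0.self_attn.q_proj.weight"
```
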
Changelog

  • vllm/model_executor/models/fuyu.py
    • Imported the WeightsMapper utility.
    • Added a class attribute hf_to_vllm_mapper to FuyuForCausalLM mapping the prefix `language_model.layers.` to `language_model.model.layers.`.
  • vllm/model_executor/models/gemma3_mm.py
    • Imported the WeightsMapper utility.
    • Added a class attribute hf_to_vllm_mapper to Gemma3ForConditionalGeneration with mappings for the `model.language_model.`, `model.vision_tower.`, `model.multi_modal_projector.`, and `lm_head.` prefixes (a sketch follows this changelog).
    • Updated the load_weights method to use the newly defined hf_to_vllm_mapper.
  • vllm/model_executor/models/paligemma.py
    • Imported the WeightsMapper utility.
    • Added a class attribute hf_to_vllm_mapper to PaliGemmaForConditionalGeneration with mappings for the `model.language_model.`, `model.vision_tower.`, `model.multi_modal_projector.`, and `lm_head.` prefixes (same as Gemma3).
    • Updated the load_weights method to use the newly defined hf_to_vllm_mapper.
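For reference, a sketch of what such a class-level mapper can look like for Gemma3/PaliGemma. The left-hand prefixes are the new checkpoint names listed above; the right-hand targets are an assumption about vLLM's module layout (the decoder living under language_model.model and the LM head under language_model.lm_head), so treat this as illustrative rather than the exact diff:

```python
# Illustrative sketch only -- the exact target prefixes depend on vLLM's
# module layout and may differ from the actual change in this PR.
from vllm.model_executor.models.utils import WeightsMapper

hf_to_vllm_mapper = WeightsMapper(
    orig_to_new_prefix={
        # new HF checkpoint prefix       -> prefix expected by vLLM modules
        "model.language_model.": "language_model.model.",
        "model.vision_tower.": "vision_tower.",
        "model.multi_modal_projector.": "multi_modal_projector.",
        "lm_head.": "language_model.lm_head.",
    })
```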
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for your GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Weights shift and names change,
Transformers update's range.
Map the old to new,
Make the models true,
Loading now feels less strange.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request aims to update the weight mappings for several multimodal models to ensure compatibility with checkpoints saved after Transformers v4.52.2. The introduction of WeightsMapper is a good approach to handle these changes systematically.

Key observations:

  • The WeightsMapper has been correctly added to FuyuForCausalLM, Gemma3ForConditionalGeneration, and PaliGemmaForConditionalGeneration.
  • The load_weights method has been updated to use this mapper for Gemma3ForConditionalGeneration and PaliGemmaForConditionalGeneration.
  • There's a critical oversight in fuyu.py where the load_weights method does not utilize the newly added hf_to_vllm_mapper.

Regarding the PR description:

  • It's great that you've linked the related issue and the Transformers PR that introduced these changes.
  • The test plan is partially filled. It would be beneficial to include the specific test commands used for verification.
  • Test results are currently missing. Adding a comparison of results before and after the changes, or e2e test results, would greatly help in validating the fix.

Could you also clarify if other multimodal models listed in the test plan (e.g., llava, qwen2_audio) require similar mapping updates, or if they were checked and found not to need changes with Transformers v4.52.2+ checkpoints?

Summary of Findings

  • Missing Mapper Usage in FuyuForCausalLM: The load_weights method in vllm/model_executor/models/fuyu.py for FuyuForCausalLM does not use the hf_to_vllm_mapper that was added to the class. This means the intended weight name mapping for Fuyu models is not being applied, potentially leading to errors when loading newer checkpoints (a sketch of the expected wiring follows this list).
  • PR Description Completeness: The pull request description is missing a detailed test plan (commands) and test results, which are important for verifying the bugfix across all affected models.
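As a rough illustration of the missing piece, here is a sketch of how the mapper is typically wired into load_weights in vLLM model classes (assuming the AutoWeightsLoader helper and its mapper keyword; the actual fix in this PR may differ):

```python
# Sketch of the expected wiring in FuyuForCausalLM (an assumption, not the
# verbatim fix): delegate loading and pass the class-level mapper along.
from typing import Iterable

import torch
from vllm.model_executor.models.utils import AutoWeightsLoader


def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]):
    loader = AutoWeightsLoader(self)
    # Without mapper=..., the renamed >=4.52 checkpoint keys would not match
    # vLLM's module names and the corresponding weights would fail to load.
    return loader.load_weights(weights, mapper=self.hf_to_vllm_mapper)
```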

Merge Readiness

This pull request addresses an important compatibility issue with newer Transformers checkpoints. However, there is a critical issue in fuyu.py where the new weight mapper is not being utilized. This must be addressed before the PR can be considered for merging. Additionally, completing the test plan and providing test results in the PR description would improve confidence in the changes. I am unable to approve this PR in its current state.

github-actions bot commented on Jun 4, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

vermouth1992 pushed a commit to volcengine/verl that referenced this pull request Jun 7, 2025
### What does this PR do?

Fixes #1710


![image](https://github.com/user-attachments/assets/185d37b6-a4fe-4e89-8eed-72f4477937e8)

1. vLLM 0.9.0 does not support `limit_mm_per_prompt=None`; this parameter must be a `dict` (a minimal example follows the links below).
2. Transformers 4.52.* changes the weight keys in the model state dict,
causing mismatches with vLLM's weight loader.

See also:
huggingface/transformers#38385
vllm-project/vllm#19054
vllm-project/vllm#19151
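As a rough illustration of the first point, a minimal sketch using the standard vllm.LLM constructor (the model name and limit values here are illustrative placeholders, not taken from this PR):

```python
# Minimal sketch: with vLLM 0.9.0, pass limit_mm_per_prompt as a dict rather
# than None. The model name and the limits below are placeholders.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    limit_mm_per_prompt={"image": 1},  # dict, not None
)
```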

### Test

run `bash examples/grpo_trainer/run_qwen2_5_vl-7b.sh`


![image](https://github.com/user-attachments/assets/b8137c87-f250-40d0-b9c3-c3f44f1a40a1)

### Checklist Before Submitting

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the
[docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that covers the code path.
Isotr0py added 5 commits June 8, 2025 21:25
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
mergify bot added the "multi-modality" label (Related to multi-modality (#4194)) on Jun 8, 2025
Isotr0py added 2 commits June 9, 2025 00:38
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 10, 2025
Signed-off-by: Isotr0py <2037008807@qq.com>
mergify bot added the "llama" label (Related to Llama models) on Jun 12, 2025
Signed-off-by: Isotr0py <2037008807@qq.com>
@Isotr0py marked this pull request as ready for review on June 12, 2025 07:55
GitMonkey0 pushed a commit to GitMonkey0/verl that referenced this pull request Jun 14, 2025
@DarkLight1337 (Member) left a comment

LGTM as long as tests pass

Edit: I forgot that I ran the tests in fastcheck already. Should be good to go then!

@DarkLight1337 enabled auto-merge (squash) on June 17, 2025 14:07
github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) on Jun 17, 2025
@DarkLight1337 merged commit ca94d7f into vllm-project:main on Jun 17, 2025
85 of 86 checks passed
@Isotr0py deleted the fix-mapping branch on June 17, 2025 16:23
yeqcharlotte pushed a commit to yeqcharlotte/vllm that referenced this pull request Jun 22, 2025
… Transformers v4.52 (vllm-project#19151)

Signed-off-by: Isotr0py <2037008807@qq.com>
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 23, 2025
minpeter pushed a commit to minpeter/vllm that referenced this pull request Jun 24, 2025
… Transformers v4.52 (vllm-project#19151)

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: minpeter <kali2005611@gmail.com>
yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Jun 24, 2025
… Transformers v4.52 (vllm-project#19151)

Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: Yang Wang <elainewy@meta.com>
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 30, 2025
… Transformers v4.52 (vllm-project#19151)

Signed-off-by: Isotr0py <2037008807@qq.com>
wseaton pushed a commit to wseaton/vllm that referenced this pull request Jun 30, 2025
… Transformers v4.52 (vllm-project#19151)

Signed-off-by: Isotr0py <2037008807@qq.com>
Labels
llama (Related to Llama models) · multi-modality (Related to multi-modality (#4194)) · ready (ONLY add when PR is ready to merge/full CI is needed)
Projects
None yet
2 participants