Let's break Qwen-VL 🚨 #42420
Conversation
run-slow: qwen2_5_vl, qwen2_vl
This comment contains models: ["models/qwen2_5_vl", "models/qwen2_vl"]
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
CI Results: ✅ No failing test specific to this PR 🎉!
…Reported by TRL in the past
CI Results: Model CI Report ❌ Failed tests
```python
# Hub configs are saved as flat dicts, so we pop some of the kwargs to init `TextConfig`
text_params = inspect.signature(self.sub_configs["text_config"].__init__).parameters.keys()
text_params = list(text_params) + ["rope_scaling", "rope_theta"]
text_config = {key: kwargs.pop(key) for key in text_params if key in kwargs}
text_config["dtype"] = kwargs.get("torch_dtype", kwargs.get("dtype"))  # don't pop the dtype
```
Not the best way, I should admit. The problem is that the config will assign all kwargs as attributes, and if we don't pop them we end up with the same set of kwargs in the text config and in the general config.
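The point above can be sketched with a minimal, self-contained example (the class names here are hypothetical, not the actual `transformers` classes): a composite config pops the sub-config's parameters out of the flat kwargs dict before assigning whatever remains as top-level attributes, so each key ends up in exactly one place.

```python
import inspect

# Hypothetical minimal configs illustrating the issue described above.
class TextConfig:
    def __init__(self, vocab_size=32000, hidden_size=1024, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size

class CompositeConfig:
    def __init__(self, **kwargs):
        # Hub configs arrive flattened, so pull out the keys that belong
        # to TextConfig before assigning the rest to the top-level config.
        text_params = inspect.signature(TextConfig.__init__).parameters.keys()
        text_kwargs = {k: kwargs.pop(k) for k in list(text_params) if k in kwargs}
        self.text_config = TextConfig(**text_kwargs)
        # Whatever was not popped becomes a top-level attribute.
        for key, value in kwargs.items():
            setattr(self, key, value)

cfg = CompositeConfig(vocab_size=50000, my_custom_flag=True)
# vocab_size lives only on the sub-config; it is not duplicated at the top level.
assert cfg.text_config.vocab_size == 50000
assert not hasattr(cfg, "vocab_size")
assert cfg.my_custom_flag is True
```

Without the `kwargs.pop(k)` call, `vocab_size` would be set both on `cfg.text_config` and on `cfg` itself, which is exactly the duplication described above.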
Thanks for addressing this issue, @zucchini-nlp. 🤗 For completeness, here is the original issue we opened in TRL that tracked this problem:
ArthurZucker left a comment:
Thanks! Let's add a note in the migration guide, please!
[For maintainers] Suggested jobs to run (before merge): run-slow: qwen2_5_vl, qwen2_vl
What does this PR do?
As per the title, break the cycle. Qwen will no longer expose text attributes via `config.vocab_size`, which solves the duplicated-attribute issue we had in TRL. A config should not have two different values for the same key.
Tested with the TRL issue that there is no regression, and ran the slow tests locally.
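To illustrate the "two different values for the same key" failure mode this PR removes, here is a hedged sketch with hypothetical class names (not the actual library API): when an attribute is mirrored from a sub-config onto the parent config, updating one copy silently leaves the other stale.

```python
# Sketch (hypothetical classes) of why mirrored config attributes are dangerous.
class TextConfig:
    def __init__(self, vocab_size=32000):
        self.vocab_size = vocab_size

class MirroredConfig:
    def __init__(self, text_config):
        self.text_config = text_config
        # Mirrored copy of the sub-config attribute on the parent.
        self.vocab_size = text_config.vocab_size

cfg = MirroredConfig(TextConfig())
cfg.vocab_size = 151936  # e.g. updated by a trainer after resizing embeddings

# The two copies now disagree -- the duplicated-attribute bug described above.
assert cfg.vocab_size == 151936
assert cfg.text_config.vocab_size == 32000
assert cfg.vocab_size != cfg.text_config.vocab_size
```

After the change in this PR, the value has a single source of truth on the sub-config, so callers must read it from there and this divergence cannot occur.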