Gemma3 conversion script maintenance #41704
Conversation
Please hold off on merging this. Going to add one more flag.

Okay, flag added. Ready for review and merge at your leisure. Thanks for the patience 🤗

[For maintainers] Suggested jobs to run (before merge): run-slow: gemma3

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
zucchini-nlp left a comment
Thanks, LGTM! Just curious if the new vocab size is an intended change?
Yes, it is. Since we've had a few recent text-only releases (270M, EmbeddingGemma), we're normalizing on the original vocab size (262144) in the main configs and then adding 64 to it during conversion if the include-vision-encoder flag is true.
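Here is a minimal sketch of the vocab-size arithmetic described in that reply; all names in it are assumptions for illustration, not the conversion script's actual API:

```python
# Sketch only: mirrors the vocab-size handling described in the reply above.
# BASE_VOCAB_SIZE, VISION_EXTRA_TOKENS, and resolve_vocab_size are hypothetical
# names; the real conversion script may organize this differently.
BASE_VOCAB_SIZE = 262_144      # text-only Gemma 3 vocab size used in the main configs
VISION_EXTRA_TOKENS = 64       # extra tokens added when the vision encoder is included

def resolve_vocab_size(include_vision_encoder: bool) -> int:
    """Return the vocab size to write into the converted config."""
    if include_vision_encoder:
        return BASE_VOCAB_SIZE + VISION_EXTRA_TOKENS
    return BASE_VOCAB_SIZE

assert resolve_vocab_size(False) == 262_144
assert resolve_vocab_size(True) == 262_208
```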
Okay, thanks for clarifying.
* conversion: add include_vision_encoder flag (default true)
* conversion: update for inverted model.language_model weight path
* conversion: revert include_vision_encoder to True by default
* conversion: add chat template path flag
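As a rough sketch, the flags named in the commit list above could look like this on the script's command line; argument names and defaults here are inferred from the commit messages, not taken from the actual script:

```python
# Sketch only: a hypothetical CLI surface for the flags referenced in the commits above.
import argparse

parser = argparse.ArgumentParser(
    description="Illustrative Gemma 3 checkpoint conversion options."
)
parser.add_argument(
    "--include_vision_encoder",
    action=argparse.BooleanOptionalAction,
    default=True,
    help="Also convert the vision encoder; adds 64 tokens on top of the 262144 base vocab.",
)
parser.add_argument(
    "--chat_template_path",
    type=str,
    default=None,
    help="Optional path to a chat template to bundle with the converted checkpoint.",
)

if __name__ == "__main__":
    args = parser.parse_args()
    print(args)
```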
What does this PR do?
Maintenance on the Gemma 3 weights conversion script.
Before submitting
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
cc @ArthurZucker @Cyrilvallez @zucchini-nlp