-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing #36859
Conversation
run slow: gemma3 |
run_slow: gemma3 |
1 similar comment
run_slow: gemma3 |
This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs: models: ['models/gemma3'] |
@molbap BTW, this is not the only place where this shift occurs, it also happens in |
Nice catch, thanks a lot @oceanusxiv 🤗 |
run_slow: gemma3 |
This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs: models: ['models/gemma3'] |
run_slow: gemma3 |
This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs: models: ['models/gemma3'] |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
don't merge yet- the 4b crops test with pan_and_scan is broken with this change. Unsure yet if related to #36864 , checking |
Ok, I think unrelated as well, https://huggingface.co/google/gemma-3-4b-pt/commit/2166bd1fba050eab3ed4fdd0b591ad1e11c96802 added another eos_token which causes the generation to stop earlier. Confirming, fixing in #36882 then we can merge |
Not sure what is causing the difference, but it's not big AFAIK and other integration tests pass. I'll merge now. |
What does this PR do?
Should fix #36856
This is because, in the modular file,
Gemma3ForConditionalGeneration
inherits fromPaliGemmaForConditionalGeneration
where this specific fix happens. But it is specific to PaliGemma input ordering and should not have been propagated to Gemma3.