Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing #36859

Merged
merged 7 commits into from
Mar 21, 2025

Conversation

molbap
Copy link
Contributor

@molbap molbap commented Mar 20, 2025

What does this PR do?

Should fix #36856

This is because, in the modular file, Gemma3ForConditionalGeneration inherits from PaliGemmaForConditionalGeneration where this specific fix happens. But it is specific to PaliGemma input ordering and should not have been propagated to Gemma3.

@molbap molbap marked this pull request as ready for review March 20, 2025 15:35
@molbap molbap changed the title supersede paligemma forward to shift pos id indexing 🔴 🔴 🔴 supersede paligemma forward to shift pos id indexing Mar 20, 2025
@molbap
Copy link
Contributor Author

molbap commented Mar 20, 2025

run slow: gemma3

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@molbap
Copy link
Contributor Author

molbap commented Mar 20, 2025

run_slow: gemma3

1 similar comment
@ArthurZucker
Copy link
Collaborator

run_slow: gemma3

Copy link

This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs:

models: ['models/gemma3']
quantizations: [] ...

@oceanusxiv
Copy link

@molbap BTW, this is not the only place where this shift occurs, it also happens in prepare_inputs_for_generation, so you probably need to fix that too.

@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

Nice catch, thanks a lot @oceanusxiv 🤗
I repushed the slow tests, seems there's no change. To note, beyond_sliding_window tests seem broken though.

@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

run_slow: gemma3

Copy link

This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs:

models: ['models/gemma3']
quantizations: [] ...

@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

run_slow: gemma3

Copy link

This comment contains run-slow, running the specified jobs: This comment contains run-slow, running the specified jobs:

models: ['models/gemma3']
quantizations: [] ...

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

don't merge yet- the 4b crops test with pan_and_scan is broken with this change. Unsure yet if related to #36864 , checking

@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

Ok, I think unrelated as well, https://huggingface.co/google/gemma-3-4b-pt/commit/2166bd1fba050eab3ed4fdd0b591ad1e11c96802 added another eos_token which causes the generation to stop earlier. Confirming, fixing in #36882 then we can merge

@molbap
Copy link
Contributor Author

molbap commented Mar 21, 2025

Not sure what is causing the difference, but it's not big AFAIK and other integration tests pass. I'll merge now.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@molbap molbap merged commit b8aadc3 into main Mar 21, 2025
15 checks passed
@molbap molbap deleted the position_ids_gemma3 branch March 21, 2025 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gemma3 (and Paligemma) position_ids 1-indexed?
4 participants