
Do not erase a cache_position passed explicitly to generate(), if there is one #37986


Merged: 1 commit merged into huggingface:main from keep_cache_position on May 9, 2025

Conversation

@FremyCompany (Contributor) commented May 6, 2025

What does this PR do?

Currently, any cache_position passed to model.generate() is silently replaced by the default initialization. This seems wrong: there is no need to reinitialize an argument that has already been passed, and some models require a custom, non-zero-initialized cache_position for correct behavior.

This PR adds a check to the initialization code for model_kwargs['cache_position'] so that it returns early if that argument was already provided.
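In sketch form, the change amounts to a guard at the top of the initialization helper. The snippet below is a minimal illustration of the idea, assuming the logic lives in a GenerationMixin method along the lines of `_get_initial_cache_position`; the default-initialization branch is simplified relative to the real code:

```python
import torch

def _get_initial_cache_position(self, input_ids, model_kwargs):
    """Initialize `cache_position` for the pre-fill stage (sketch)."""
    # The fix: if the caller already passed a (non-None) cache_position,
    # keep it instead of overwriting it with a fresh 0-based range.
    if model_kwargs.get("cache_position") is not None:
        return model_kwargs

    # Default behavior (simplified): start counting from the number of
    # tokens already in the cache, which is 0 without past key/values.
    past_length = 0
    if model_kwargs.get("past_key_values") is not None:
        past_length = model_kwargs["past_key_values"].get_seq_length()
    model_kwargs["cache_position"] = torch.arange(
        past_length, input_ids.shape[1], device=input_ids.device
    )
    return model_kwargs
```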

There is no linked issue: I fixed the bug before filing one, so opening a PR directly was easier.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

I don't think this warrants any documentation change, since the current behavior is not described anywhere. I could add a test, but it would be very artificial. I think merging as-is is fine, but I welcome feedback.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker @zucchini-nlp @gante

@github-actions github-actions bot marked this pull request as draft May 6, 2025 21:18
github-actions bot commented May 6, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@zucchini-nlp (Member) left a comment

Thanks for the PR! I am curious, is the logic that builds cache positions from scratch not working in some cases?

Not many users know about cache_position or pass a custom position, so I guess this is for a custom usage with a compressed cache

@FremyCompany (Contributor, Author)

The use case is related to Mamba models. The default logic for initializing the cache position doesn't work there, because there is no way to know how many tokens have already been processed, and starting at index 0 disables the cache: index 0 signals that the state is not supposed to be initialized yet, as for the first token.

@FremyCompany (Contributor, Author)

Calling generate is the easiest way to continue generating after the first batch of tokens. Calling forward directly would also work, but then you have to reimplement all the sampling logic that generate conveniently handles for you.
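To make the pattern concrete, here is a hedged sketch of the kind of continuation this enables. The names saved_state, n_processed, and next_input_ids are illustrative placeholders from a hypothetical earlier generate() call, not code from the PR:

```python
import torch

# Suppose an earlier generate() call primed the model's recurrent state
# (e.g. for a Mamba-style model) and we kept:
#   saved_state      - the cache object captured from that first pass
#   n_processed      - how many tokens that pass consumed
#   next_input_ids   - the tokens to continue from
#
# To resume, pass the state back together with a non-zero cache_position.
# Before this PR, the explicit cache_position was silently reset to a
# 0-based range, which told the model its state was still uninitialized
# and effectively disabled the cache.
continued = model.generate(
    next_input_ids,
    max_new_tokens=64,
    cache_params=saved_state,  # illustrative kwarg for Mamba-style caches
    cache_position=torch.tensor([n_processed], device=model.device),
)
```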

@zucchini-nlp (Member) left a comment

Thanks for explaining, makes sense. For special models we don't infer the cache position correctly. Can you mark the PR ready for review and rebase on main, so it can be merged?

@FremyCompany (Contributor, Author)

Yes, will do tomorrow.

@gante (Member) left a comment

👌

Thank you for noticing it!

@gante (Member) commented May 8, 2025

@FremyCompany to make our CI happy:

  • Tag the PR as Ready for review (button at the bottom of the PR page)
  • run make fixup in your terminal, inside the transformers root folder (this runs automated code formatting)
  • commit the changes

Do not erase a cache_position initialization passed explicitly to generate(), if there is one.

But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.
@FremyCompany FremyCompany force-pushed the keep_cache_position branch from aa4e0e5 to 21c3373 on May 9, 2025 09:19
@FremyCompany FremyCompany marked this pull request as ready for review May 9, 2025 09:19
@zucchini-nlp zucchini-nlp enabled auto-merge (squash) May 9, 2025 10:43
@zucchini-nlp zucchini-nlp merged commit 774dc27 into huggingface:main May 9, 2025
21 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
Do not erase a cache_position initialization passed explicitly to generate(), if there is one (huggingface#37986)

Do not erase a cache_position initialization passed explicitly to generate(), if there is one.

But: Let initialization replace cache_position if it's set to None. I assume that if the value is explicitly passed but None, we should initialize anyway.