[generate] Always use decoder config to init cache #40772
Conversation
Force-pushed from 0d88c0d to dd57459
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
 if requires_cross_attention_cache:
     cross_attention_cache_kwargs = {
-        "config": self.config.get_text_config(encoder=True),
+        "config": self.config.get_text_config(decoder=True),
```
🙈 🙈 🙈 past self is derp
Sorry? This one should not be changed, no?
No, we want to use the decoder config! I had the exact same thought [use the encoder config] in a recent PR :D
In a nutshell, the config is used here to:
- determine which type of layers are used for cross-attention
- determine how many cross-attention layers we have

Cross-attention is a layer in the 👉 decoder 👈 model -- it's the attention between the encoder outputs and the data (the tokens, in an LLM) being decoded. A toy sketch of the layer-count point follows below.
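Toy sketch of that point (illustrative code, not the transformers implementation; `SubConfig` and the layer counts are made up):

```python
# Toy sketch: why the cross-attention cache must be sized by the decoder.
from dataclasses import dataclass

@dataclass
class SubConfig:
    num_hidden_layers: int

encoder = SubConfig(num_hidden_layers=12)  # e.g. a deep encoder...
decoder = SubConfig(num_hidden_layers=6)   # ...paired with a shallow decoder

# Cross-attention lives inside each *decoder* block, so the cache needs
# exactly one slot per decoder layer.
cross_attention_cache = [None] * decoder.num_hidden_layers
assert len(cross_attention_cache) == 6  # sizing by the encoder's 12 would over-allocate
```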
Oh indeed! Easy to get confused with the name of the cache (EncoderDecoderCache) haha!
```diff
 # If a config is passed, use it to infer the layer types and initialize accordingly
 if config is not None:
-    config = config.get_text_config()
+    config = config.get_text_config(decoder=True)
```
Could that somehow clash if we pass an encoder config to it elsewhere, i.e. for the encoder/decoder cache?
We always want the decoder config for KV cache purposes.
The encoder is only run once to produce the encoder outputs, which are then used autoregressively in the decoder through cross-attention. Both self-attention and cross-attention are decoder layers, and are thus parameterized by the decoder config.
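As a hedged sketch of what the fixed call returns (assuming a transformers version with the `decoder=` flag on `get_text_config`, which this PR relies on; `BartConfig` is just a convenient example):

```python
# Sketch: `get_text_config(decoder=True)` hands back the config whose
# decoder attributes drive KV cache sizing. For single-config models like
# BART it returns the config object itself.
from transformers import BartConfig

config = BartConfig(encoder_layers=12, decoder_layers=6)
decoder_config = config.get_text_config(decoder=True)

# Cache sizing should follow the decoder side, not the encoder's 12 layers.
print(decoder_config.decoder_layers)  # 6
```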
LGTM!! Indeed, thanks for the fix and the added explanation!
Merging it, as the only failing test is unrelated and this is ready to go!
* mega derp
* fix
* always use the decoder
What does this PR do?
(see title)
Fixes #40644