fix: honor static cache max len config #46436

Open

he-yufeng wants to merge 1 commit into huggingface:main from he-yufeng:fix/static-cache-config-max-len

Contributor

he-yufeng commented Jun 5, 2026

Summary

honor generation_config.cache_config["max_cache_len"] when auto-preparing static caches
keep the cache length at least as large as the current generation requires
add a tiny GPT-2 regression test that checks the prepared StaticCache size
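
The sizing rule described in the first two points can be sketched roughly as follows. This is an illustrative stand-alone helper, not the code in this PR: the function name `resolve_static_cache_len` and the `required_len` parameter are hypothetical, and the fallback to the required length when no `max_cache_len` is configured is an assumption.

```python
from typing import Any, Mapping, Optional


def resolve_static_cache_len(
    cache_config: Optional[Mapping[str, Any]],
    required_len: int,
) -> int:
    """Pick a static cache length (hypothetical helper, not the PR's code).

    cache_config: mapping that may carry a "max_cache_len" entry.
    required_len: number of positions the current generation call needs.
    """
    configured = None if cache_config is None else cache_config.get("max_cache_len")
    if configured is None:
        # Assumption: with nothing configured, size the cache to what is needed.
        return required_len
    # Honor the configured length, but never go below what generation requires.
    return max(int(configured), required_len)
```

With this rule, a configured `max_cache_len` larger than the current request wins, while a configured value that is too small is raised to the required length.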

Fixes #46424.

To verify

PYTHONPATH=src python -m pytest tests/generation/test_utils.py::GenerationIntegrationTests::test_static_cache_uses_max_cache_len_from_cache_config -q
python -m py_compile src/transformers/generation/utils.py tests/generation/test_utils.py
git diff --check


fix: honor static cache max len config

f50f45e

