bug in new prefill_chunk_size implementation #38028

Open
2 of 4 tasks
SmerkyG opened this issue May 8, 2025 · 0 comments

System Info

Hi, I noticed a bug in the new chunked prefill code: the implementation does not check whether the forward method can be compiled, as is done elsewhere in the generation code.

Specifically,

model_forward = self.get_compiled_call(generation_config.compile_config)

should be guarded by a compilability check like the one found at

compile_forward = self._valid_auto_compile_criteria(model_kwargs, generation_config)

This error was encountered on version 4.51.3, but the code links above reference the main branch. Thanks!
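A minimal sketch of the guard being requested, under stated assumptions: `StubModel`, `StubGenerationConfig`, and `select_prefill_forward` are hypothetical stand-ins written for illustration, not transformers code, and the criteria check here is reduced to the `disable_compile` flag only (the real `_valid_auto_compile_criteria` may consider more). Only the two method names and their call signatures are taken from the lines quoted above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class StubGenerationConfig:
    """Hypothetical stand-in holding only the fields this sketch needs."""
    prefill_chunk_size: Optional[int] = None
    disable_compile: bool = False
    compile_config: Any = None


class StubModel:
    """Hypothetical stand-in for a model with chunked prefill."""

    def __init__(self) -> None:
        self.compile_calls = 0  # counts how often a compiled call was requested

    def __call__(self, *args: Any, **kwargs: Any) -> str:
        return "eager"

    def get_compiled_call(self, compile_config: Any) -> Callable[..., str]:
        # Stand-in: pretends to compile and returns a marker callable.
        self.compile_calls += 1
        return lambda *args, **kwargs: "compiled"

    def _valid_auto_compile_criteria(
        self, model_kwargs: dict, generation_config: StubGenerationConfig
    ) -> bool:
        # Assumption: simplified to the disable_compile flag only.
        return not generation_config.disable_compile

    def select_prefill_forward(
        self, model_kwargs: dict, generation_config: StubGenerationConfig
    ) -> Callable[..., str]:
        # Default to the eager forward; compile only if the check passes.
        model_forward = self.__call__
        if self._valid_auto_compile_criteria(model_kwargs, generation_config):
            model_forward = self.get_compiled_call(generation_config.compile_config)
        return model_forward


model = StubModel()
config = StubGenerationConfig(prefill_chunk_size=128, disable_compile=True)
forward = model.select_prefill_forward({}, config)
print(forward())            # eager
print(model.compile_calls)  # 0
```

With the guard in place, the stub never requests a compiled call when `disable_compile=True`, which is the behavior the chunked prefill path is expected to share with the rest of generation.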

Who can help?

@gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Any model that should not be compiled is compiled anyway when generating with a GenerationConfig that sets prefill_chunk_size > 0, even with disable_compile=True.

Expected behavior

disable_compile=True should prevent the model from being compiled, even when prefill_chunk_size is set.

@SmerkyG SmerkyG added the bug label May 8, 2025