Description
System Info
Hi, I noticed a bug in the new chunked prefill code: the implementation does not check whether or not the forward method is compilable, as is done elsewhere in the code.
Specifically, the call at
transformers/src/transformers/generation/utils.py
Line 4910 in d231f5a
    model_forward = self.get_compiled_call(generation_config.compile_config)
should be guarded by a compilability check like the one at
transformers/src/transformers/generation/utils.py
Line 3437 in d231f5a
    compile_forward = self._valid_auto_compile_criteria(model_kwargs, generation_config)
I encountered this error on version 4.51.3, but the code links above point to the main branch. Thanks!
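For illustration, here is a minimal sketch of the guard I have in mind. It does not use the real transformers classes: ToyGenerationConfig, ToyModel, and prefill_chunking are hypothetical stand-ins, and the compilability check is reduced to the disable_compile flag only (the real _valid_auto_compile_criteria presumably looks at more conditions). The point is just the pattern: default to the eager forward and swap in the compiled call only when the check passes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyGenerationConfig:
    # Hypothetical stand-in for GenerationConfig; only the fields relevant here.
    prefill_chunk_size: Optional[int] = None
    disable_compile: bool = False
    compile_config: Optional[dict] = None


class ToyModel:
    """Hypothetical stand-in for a model that supports chunked prefill."""

    def forward(self, chunk):
        # Eager path: tag the output so the path taken is visible.
        return ("eager", chunk)

    def get_compiled_call(self, compile_config) -> Callable:
        # Stand-in for the compiled forward; no real compilation happens here.
        return lambda chunk: ("compiled", chunk)

    def _valid_auto_compile_criteria(self, model_kwargs, generation_config) -> bool:
        # Simplified assumption: only disable_compile is considered.
        return not generation_config.disable_compile

    def prefill_chunking(self, input_ids, generation_config, **model_kwargs):
        # Default to the eager forward; compile only if the check passes.
        model_forward = self.forward
        if self._valid_auto_compile_criteria(model_kwargs, generation_config):
            model_forward = self.get_compiled_call(generation_config.compile_config)

        size = generation_config.prefill_chunk_size
        chunks = [input_ids[i:i + size] for i in range(0, len(input_ids), size)]
        return [model_forward(chunk) for chunk in chunks]
```

With this guard, calling prefill_chunking with disable_compile=True runs every chunk through the eager forward, while the default config still takes the compiled path.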
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Any model that should not be compiled is still compiled when generating with a GenerationConfig that sets prefill_chunk_size > 0, even if disable_compile=True is also set.
Expected behavior
disable_compile=True should prevent the model from being compiled, even when prefill_chunk_size is set.