bug in new prefill_chunk_size implementation #38028

@SmerkyG

System Info

Hi, I noticed a bug in the new chunked prefill code: the implementation does not check whether or not the forward method is compilable, as is done elsewhere in the code.

Specifically,

model_forward = self.get_compiled_call(generation_config.compile_config)

should be guarded by a compilability check like the one at

compile_forward = self._valid_auto_compile_criteria(model_kwargs, generation_config)

This error was encountered on version 4.51.3, but the code links above reference the main branch. Thanks!
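To make the suggestion concrete, here is a minimal, self-contained sketch of the kind of guard I mean. It does not use the real library classes: `ToyGenerationConfig`, `ToyModel`, and `select_prefill_forward` are hypothetical stand-ins, and the toy `_valid_auto_compile_criteria` only looks at `disable_compile`, whereas the real method also takes `model_kwargs` and may apply further criteria.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyGenerationConfig:
    # Hypothetical stand-in for GenerationConfig; only the fields named in this issue.
    prefill_chunk_size: Optional[int] = None
    disable_compile: bool = False


class ToyModel:
    """Hypothetical stand-in for a model; not the real transformers classes."""

    def __call__(self, x):
        # Plain (uncompiled) forward pass.
        return ("eager", x)

    def get_compiled_call(self) -> Callable:
        # Stand-in for the compiled forward returned by the real get_compiled_call.
        return lambda x: ("compiled", x)

    def _valid_auto_compile_criteria(self, generation_config) -> bool:
        # Simplified assumption: only honor disable_compile here.
        return not generation_config.disable_compile

    def select_prefill_forward(self, generation_config) -> Callable:
        # Default to the plain forward; switch to the compiled call only
        # when the compilability check passes.
        model_forward = self.__call__
        if self._valid_auto_compile_criteria(generation_config):
            model_forward = self.get_compiled_call()
        return model_forward


model = ToyModel()
config = ToyGenerationConfig(prefill_chunk_size=8, disable_compile=True)
forward = model.select_prefill_forward(config)
print(forward(1)[0])
```

With the guard in place, the chunked prefill path falls back to the plain forward whenever the check fails, which is the behavior I would expect from the real code.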

Who can help?

@gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Any model that should not be compiled will still be compiled when generating with a GenerationConfig that sets prefill_chunk_size > 0, even if disable_compile=True is also set.

Expected behavior

disable_compile=True should cause the model NOT to be compiled, even when prefill_chunk_size is set.
