get_parameter(model_config, "max_attention_window_size", int) does not support a list #554

@Alireza3242

Description

System Info

a100

Who can help?

@ncomly-nvidia

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

In all_models/inflight_batcher_llm/tensorrt_llm/1/model.py line 422 we have:

get_parameter(model_config, "max_attention_window_size", int),

However, I want to set max_attention_window_size as a list, so that each layer has its own max_attention_window_size.
I also want to set this list in:
all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt:

parameters: {
  key: "max_attention_window_size"
  value: {
    string_value: "${max_attention_window_size}"
  }
}

I need this feature for Gemma 2, whose layers alternate between sliding-window and full attention:

max_attention_window_size = [8192, 4096]*21
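A minimal sketch of how the parameter could be parsed to accept either a single int or a per-layer list. The helper name `get_int_or_int_list` and the `model_config` dict layout are assumptions for illustration; they mirror how Triton exposes `string_value` parameters, not the actual `get_parameter` implementation in model.py:

```python
import json

def get_int_or_int_list(model_config, name):
    """Return None, an int, or a list of ints for the given parameter.

    Hypothetical helper: the real model.py converts the parameter with a
    single type (int); this sketch shows one way to also accept a list.
    """
    value = (
        model_config.get("parameters", {})
        .get(name, {})
        .get("string_value", "")
        .strip()
    )
    # Treat an empty or unsubstituted "${...}" template placeholder as unset.
    if not value or value.startswith("${"):
        return None
    try:
        # Handles both "8192" (single int) and "[8192, 4096]" (JSON list).
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # Fall back to a comma-separated form such as "8192,4096".
        parsed = [int(v) for v in value.split(",")]
    if isinstance(parsed, int):
        return parsed
    return [int(v) for v in parsed]

# Example: per-layer windows for a model alternating sliding/full attention.
config = {
    "parameters": {
        "max_attention_window_size": {"string_value": "[8192, 4096]"}
    }
}
print(get_int_or_int_list(config, "max_attention_window_size"))  # [8192, 4096]
```

With this shape, the existing config.pbtxt entry could carry `"[8192, 4096]"` in its `string_value` without any change to the protobuf schema, since the value is already a plain string.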

Expected behavior

A list should be accepted in addition to a single int.

actual behavior

A list is not supported; only a single int is accepted.

additional notes

None.

Labels: bug (Something isn't working)