I would like to use deepspeed-inference with the flan-t5 model and I have the following code:
```python
import os

import deepspeed
from deepspeed.inference.config import DeepSpeedInferenceConfig, DeepSpeedTPConfig
from transformers import T5ForConditionalGeneration, T5Tokenizer, pipeline
from transformers.models.t5.modeling_t5 import T5Block


def get_model():
    model_name = "google/flan-t5-small"
    tensor_parallel = int(os.getenv("TENSOR_PARALLEL_DEGREE", "2"))
    local_rank = int(os.getenv("LOCAL_RANK", "0"))
    model = T5ForConditionalGeneration.from_pretrained(model_name, device_map="auto")
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    # create the inference config
    config = DeepSpeedInferenceConfig(
        replace_with_kernel_inject=True,
        dtype=model.dtype,
        tensor_parallel=DeepSpeedTPConfig(
            enabled=True, tp_size=tensor_parallel, mpu=None, tp_group=None
        ),
        injection_policy={
            T5Block: ("SelfAttention.o", "EncDecAttention.o", "DenseReluDense.wo")
        },
    )
    model = deepspeed.init_inference(model, config=config)
    generator = pipeline(
        task="text2text-generation", model=model, tokenizer=tokenizer, device=local_rank
    )
    return generator
```
Basically, I'm wondering whether I can use the T5Block class in the injection_policy for the flan-t5 model, since it belongs to the same model family as T5. How can I determine whether this will work without more or less blindly trying it out?
More generally, where can I find information on the requirements an injection_policy must satisfy for a given model, and how can I verify that a given injection_policy actually makes sense?
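For what it's worth, one way I've found to sanity-check the policy before involving DeepSpeed at all is to inspect the model's module tree and confirm that T5Block really is the repeated transformer layer in flan-t5, and that submodules matching the policy's projection names ("SelfAttention.o", etc.) actually exist. The sketch below builds a tiny T5 from a local config (random weights, no download; the small dimensions are arbitrary) since flan-t5 checkpoints use the same T5 architecture:

```python
from transformers import T5Config, T5ForConditionalGeneration
from transformers.models.t5.modeling_t5 import T5Block

# Tiny T5 built from a local config -- same architecture/module tree as flan-t5,
# just with small, arbitrary dimensions so it constructs instantly.
config = T5Config(
    d_model=64, d_kv=16, d_ff=128, num_layers=2, num_heads=4, vocab_size=128
)
model = T5ForConditionalGeneration(config)

# 1) Is T5Block the repeated layer class? (2 encoder + 2 decoder blocks here)
num_blocks = sum(isinstance(m, T5Block) for m in model.modules())
print(f"T5Block instances: {num_blocks}")

# 2) Do the output projections named in the injection_policy exist in the tree?
policy_suffixes = ("SelfAttention.o", "EncDecAttention.o", "DenseReluDense.wo")
module_names = [name for name, _ in model.named_modules()]
for suffix in policy_suffixes:
    matches = [n for n in module_names if n.endswith(suffix)]
    print(f"{suffix}: {len(matches)} match(es), e.g. {matches[0]}")
```

This doesn't prove the DeepSpeed kernels are compatible, but if T5Block were absent or a policy path didn't resolve, that would rule the policy out immediately.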
I have read:
- https://deepspeed.readthedocs.io/en/latest/inference-init.html
- https://www.deepspeed.ai/tutorials/inference-tutorial/#initializing-for-inference
but wasn't able to find an answer to my question.