truncation is not working #189

@seopbo

Description

In my case (serving a model whose architecture is the same as BLOOM), it looks like truncation isn't applied. I think this started happening after 9987960. How can I fix this?

```shell
# the checkpoint has the same architecture as BLOOM
text-generation-launcher \
    --model-id /mount/lm_storage/checkpoints/alibi_2048_1.3b_v2 \
    --num-shard 4 \
    --port 6006 \
    --max-input-length 2048 \
    --max-total-tokens 2560
```
```python
from text_generation import Client

client = Client(base_url=f"{my_url}")
text = "how are you?" * 1000
a = client.generate(text, max_new_tokens=1, truncate=2048)
```

```
text_generation.errors.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2560. Given: 5000 `inputs` tokens and 1 `max_new_tokens`
```
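Until the server honors `truncate` for this architecture, one workaround is to clip the prompt client-side before calling `generate`. The sketch below (the helper name `truncate_tokens` is hypothetical, not part of the `text_generation` client) shows the keep-the-last-N-tokens behavior the `truncate` parameter is expected to apply; in practice you would tokenize with the model's own tokenizer (e.g. `transformers.AutoTokenizer`) and decode the clipped IDs back to text:

```python
def truncate_tokens(token_ids, truncate):
    """Keep only the last `truncate` tokens of a tokenized prompt.

    Hypothetical helper mirroring what the server-side `truncate`
    parameter is expected to do: drop tokens from the left so the
    prompt fits within the model's input window.
    """
    if truncate is not None and len(token_ids) > truncate:
        return token_ids[-truncate:]
    return list(token_ids)
```

With a 5000-token prompt and `truncate=2048`, this keeps the final 2048 tokens, so `inputs` tokens + `max_new_tokens` stays within the 2560 limit from the launcher flags above.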
