Closed
Description
In my case (serving a model whose architecture is the same as BLOOM), it looks like truncation is not applied. I think this behavior started after 9987960. How can I fix this?
The server is launched with (the model at `--model-id` uses the same architecture as BLOOM):

```shell
text-generation-launcher \
    --model-id /mount/lm_storage/checkpoints/alibi_2048_1.3b_v2 \
    --num-shard 4 \
    --port 6006 \
    --max-input-length 2048 \
    --max-total-tokens 2560
```

The client request:

```python
from text_generation import Client

client = Client(base_url=f"{my_url}")
text = "how are you?" * 1000
a = client.generate(text, max_new_tokens=1, truncate=2048)
```

This raises:

```
text_generation.errors.ValidationError: Input validation error: `inputs` tokens + `max_new_tokens` must be <= 2560. Given: 5000 `inputs` tokens and 1 `max_new_tokens`
```
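To illustrate the mismatch, here is a minimal sketch of the validation arithmetic implied by the error message. This is not the actual server code; `validate_request` is a hypothetical helper, and the assumption being tested is that `truncate` should cap the input token count *before* the `inputs + max_new_tokens <= max-total-tokens` check runs (the behavior the error suggests is missing):

```python
def validate_request(n_input_tokens, max_new_tokens,
                     max_total_tokens=2560, truncate=None):
    """Sketch of the length check implied by the ValidationError.

    If `truncate` were applied first, the input would be capped before
    the total-token budget is checked. Hypothetical helper, not TGI code.
    """
    if truncate is not None:
        # Cap the effective input length, as `truncate=2048` is expected to do.
        n_input_tokens = min(n_input_tokens, truncate)
    return n_input_tokens + max_new_tokens <= max_total_tokens

# With truncation applied first, the request in this issue would pass:
print(validate_request(5000, 1, truncate=2048))  # True  (2048 + 1 <= 2560)
# Without truncation, it fails exactly as reported:
print(validate_request(5000, 1))                 # False (5000 + 1 > 2560)
```

This matches the reported error: the server counted all 5000 input tokens, so `truncate=2048` evidently never took effect before validation.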