fix max seq len #489

LiuXiaoxuanPKU · 2023-07-18T06:14:43Z

This PR removes the definition of max sequence length. A response is capped only when it is longer than `max_model_len`. A request is ignored if its input prompt is longer than `min(max_model_len, max_num_batched_tokens)`. This should fix #446.
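For readers skimming the thread, the two rules in the description can be sketched roughly as below. This is an illustration only, not the code from this PR: the `Limits` class and the helper names `prompt_limit`, `should_ignore_prompt`, and `should_cap_response` are hypothetical, and the strict `>` comparisons simply follow the wording "longer than" in the description.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical container for the two limits named in the PR description."""
    max_model_len: int
    max_num_batched_tokens: int


def prompt_limit(limits: Limits) -> int:
    # The prompt is bounded by whichever of the two limits is smaller.
    return min(limits.max_model_len, limits.max_num_batched_tokens)


def should_ignore_prompt(num_prompt_tokens: int, limits: Limits) -> bool:
    # A request is ignored if its prompt is longer than the prompt limit.
    return num_prompt_tokens > prompt_limit(limits)


def should_cap_response(seq_len: int, limits: Limits) -> bool:
    # A response is capped only when it is longer than max_model_len
    # (assumption: seq_len is the length being compared against the limit).
    return seq_len > limits.max_model_len
```

With illustrative values `Limits(max_model_len=2048, max_num_batched_tokens=2560)`, the prompt limit is 2048, so a 2664-token prompt would be ignored while a 2048-token prompt would be accepted.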

zhuohan123

LGTM! Thanks for the quick fix!

LiuXiaoxuanPKU added 2 commits July 17, 2023 23:09

fix max seq len

eb9de78

fix style

ba8e0ac

LiuXiaoxuanPKU requested a review from zhuohan123 July 18, 2023 06:17

zhuohan123 approved these changes Jul 18, 2023

zhuohan123 merged commit b4b195b into vllm-project:main Jul 18, 2023
2 checks passed

zhuohan123 mentioned this pull request Jul 18, 2023

Max prompt tokens/sequence length limit in vllm core scheduler #453

Closed

LiuXiaoxuanPKU mentioned this pull request Jul 20, 2023

LlaMA 2: Input prompt (2664 tokens) is too long and exceeds limit of 2048/2560 #525

Closed

LiuXiaoxuanPKU deleted the fix_seq branch August 10, 2023 17:49

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

fix max seq len (vllm-project#489)

9df95fd

sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024

fix max seq len (vllm-project#489)

4d45439
