
[BugFix] Raise error when max_model_len is larger than KV cache size #2163

Merged · 1 commit · Dec 18, 2023

Conversation

WoosukKwon (Collaborator)

Currently, vLLM hangs when the length of a single sequence is larger than the system's KV cache size.
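Below is a minimal sketch of the kind of capacity check this fix introduces: it fails fast with a clear error instead of letting a sequence that can never fit in the paged KV cache hang the engine. The function name, parameter names, and error wording are assumptions for illustration, not vLLM's actual API; commit 8041b73 contains the real change.

```python
def check_kv_cache_capacity(max_model_len: int, block_size: int, num_gpu_blocks: int) -> None:
    """Raise instead of hanging when a single sequence cannot fit in the KV cache.

    Hypothetical helper for illustration; names and signature are assumptions,
    not vLLM's actual implementation.
    """
    # Total number of tokens the paged KV cache can hold across all GPU blocks.
    max_kv_cache_tokens = block_size * num_gpu_blocks
    if max_model_len > max_kv_cache_tokens:
        raise ValueError(
            f"The model's maximum sequence length ({max_model_len}) exceeds the "
            f"number of tokens the KV cache can hold ({max_kv_cache_tokens}). "
            f"Allocate more KV cache memory or reduce max_model_len."
        )


# Example: a 4096-token max_model_len with only 128 blocks of 16 tokens
# (2048 tokens of KV cache) raises immediately rather than hanging.
check_kv_cache_capacity(max_model_len=4096, block_size=16, num_gpu_blocks=128)
```

The key design point is that the check runs once at engine initialization, so a misconfiguration surfaces as an immediate, actionable error rather than a stalled generation request.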

@WoosukKwon WoosukKwon changed the title [BugFix] Raise error when max_model_len larger than KV cache size [BugFix] Raise error when max_model_len is larger than KV cache size Dec 18, 2023
@WoosukKwon WoosukKwon merged commit 8041b73 into main Dec 18, 2023
2 checks passed
@WoosukKwon WoosukKwon deleted the too-long-sequence branch December 18, 2023 01:08