About watermarks of physical memory allocation #263

Answered by WoosukKwon
LinPoly asked this question in Q&A
@LinPoly Thanks for the question! This is not a bug. The watermark exists to prevent frequent preemptions (i.e., swapping or recomputation), which can happen when too many new requests are admitted into the batch. Requests that are already in the batch are not held back by it: we want them to be able to use every slot in the KV cache.
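
To illustrate the distinction, here is a minimal sketch of a block pool that applies a watermark only when admitting new requests. It is not the actual vLLM code: the class `BlockPool`, its methods `can_admit`, `can_grow`, and `take`, and the watermark value are all hypothetical names and numbers chosen for the example.

```python
class BlockPool:
    """Toy KV-cache block pool. All names here are hypothetical, not vLLM's API."""

    def __init__(self, num_blocks: int, watermark: float = 0.01) -> None:
        if num_blocks <= 0:
            raise ValueError("num_blocks must be positive")
        if not 0.0 <= watermark < 1.0:
            raise ValueError("watermark must be in [0, 1)")
        self.num_blocks = num_blocks
        self.num_free = num_blocks
        # Number of blocks kept in reserve when deciding on new requests.
        self.watermark_blocks = int(watermark * num_blocks)

    def can_admit(self, blocks_needed: int) -> bool:
        """New request: must leave at least the watermark reserve free."""
        return self.num_free - blocks_needed >= self.watermark_blocks

    def can_grow(self, blocks_needed: int = 1) -> bool:
        """Existing request: may use every free block, reserve included."""
        return self.num_free >= blocks_needed

    def take(self, blocks: int) -> None:
        """Mark blocks as used; the caller checks can_admit/can_grow first."""
        if blocks > self.num_free:
            raise RuntimeError("not enough free blocks")
        self.num_free -= blocks


pool = BlockPool(num_blocks=8, watermark=0.25)  # reserve of 2 blocks
pool.take(5)                                    # 3 blocks remain free
print(pool.can_admit(2))  # a new request needing 2 blocks would dip into the reserve
print(pool.can_grow(3))   # a running request may still take all 3 free blocks
```

In this sketch the reserve acts as headroom: a new request is turned away while blocks are still free, so that running requests can keep growing without immediately forcing a preemption.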

Answer selected by zhuohan123
Category
Q&A
2 participants
This discussion was converted from issue #255 on June 26, 2023 18:15.