KV Cache simulation for calculation of request prefill time #178

@irar2

Description

For each request, check whether part of it is already in the local KV cache, and use only the remaining (uncached) tokens when calculating prefill time.
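
A minimal sketch of what this could look like, not the project's actual implementation. It assumes the cache is tracked in fixed-size token blocks identified by chained hashes, and that prefill time is a fixed overhead plus a per-token cost. All names here (`LocalKVCache`, `prefill_time_ms`, `BLOCK_SIZE`, `ms_per_token`, `overhead_ms`) are hypothetical, and eviction is left out.

```python
BLOCK_SIZE = 16  # hypothetical number of tokens per KV cache block


def block_hashes(tokens, block_size=BLOCK_SIZE):
    """Hash each full block, chaining in the previous hash so that a
    block's hash identifies the whole prefix up to and including it."""
    hashes = []
    prev = 0
    for start in range(0, len(tokens) - block_size + 1, block_size):
        block = tuple(tokens[start:start + block_size])
        prev = hash((prev, block))
        hashes.append(prev)
    return hashes


class LocalKVCache:
    """Simulated local KV cache: a set of block hashes, no eviction."""

    def __init__(self):
        self._blocks = set()

    def cached_prefix_tokens(self, tokens, block_size=BLOCK_SIZE):
        """Return how many leading tokens are covered by cached blocks."""
        count = 0
        for h in block_hashes(tokens, block_size):
            if h not in self._blocks:
                break  # the prefix match ends at the first missing block
            count += block_size
        return count

    def store(self, tokens, block_size=BLOCK_SIZE):
        """Record all full blocks of this request as cached."""
        self._blocks.update(block_hashes(tokens, block_size))


def prefill_time_ms(tokens, cache, ms_per_token=0.5, overhead_ms=10.0):
    """Simulated prefill time: only tokens not already cached are charged.
    The linear cost model and its constants are assumptions."""
    cached = cache.cached_prefix_tokens(tokens)
    remaining = len(tokens) - cached
    cache.store(tokens)
    return overhead_ms + remaining * ms_per_token
```

With these assumed constants, a 40-token request on an empty cache is charged for all 40 tokens. Repeating it is charged only for the 8 tokens that do not fill a complete block, because the first two blocks (32 tokens) are found in the cache.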
