Open
Labels
feature request (New feature or request)
Description
🚀 The feature, motivation and pitch
Currently, when an HTTP request is split into multiple subrequests, the response includes kv_transfer_params for only one of them: the first entry of final_res_batch. The KV transfer information for the remaining subrequests cannot be accessed from the response.
Related PR: #17751
Reference code:
vllm/vllm/entrypoints/openai/serving_completion.py
Lines 514 to 520 in e384f2f
return CompletionResponse(
    id=request_id,
    created=created_time,
    model=model_name,
    choices=choices,
    usage=usage,
    kv_transfer_params=final_res_batch[0].kv_transfer_params)
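One possible direction, shown only as a rough sketch and not as vLLM's actual API: gather kv_transfer_params from every entry of final_res_batch instead of only index 0. The `SubrequestOutput` class, the `collect_kv_transfer_params` helper, and the dictionary keys below are hypothetical stand-ins for illustration; how the response schema should expose the resulting list is left open.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SubrequestOutput:
    """Hypothetical stand-in for one entry of final_res_batch."""
    kv_transfer_params: Optional[dict[str, Any]] = None


def collect_kv_transfer_params(
    final_res_batch: list[SubrequestOutput],
) -> list[Optional[dict[str, Any]]]:
    """Return one kv_transfer_params entry per subrequest, in batch order.

    Entries stay None for subrequests that carry no KV transfer info,
    so index i in the result always corresponds to subrequest i.
    """
    return [res.kv_transfer_params for res in final_res_batch]


# Placeholder payloads; the real keys depend on the KV connector in use.
batch = [
    SubrequestOutput({"example_key": "a"}),
    SubrequestOutput(None),
    SubrequestOutput({"example_key": "c"}),
]
all_params = collect_kv_transfer_params(batch)
```

Keeping the list aligned with the subrequest order (including None entries) would let a client match each choice back to its own KV transfer information.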
Alternatives
No response
Additional context
No response