[Frontend] Expose custom args in OpenAI APIs #16862
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Thanks @afeldman-nm! It would be good to include a test that shows how these can be passed via the OpenAI client SDK. I'm unsure whether we want these new custom args to be in a nested JSON object (as you've done here) or just extra top-level args.
Thanks @njhill. Agreed regarding the unit test; I need to think a bit about the right way to do it.
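As a rough sketch of the kind of request the discussion above is about: the OpenAI Python SDK accepts a per-request `extra_body` kwarg whose keys are merged into the outgoing JSON body, which is one way out-of-schema fields like `vllm_xargs` could reach a vLLM server. The server URL and model name below are placeholders, not details from this thread.

```python
# Sketch, not from this PR: passing custom args through the OpenAI
# Python SDK. extra_body keys are merged into the request JSON, so
# fields unknown to the OpenAI schema (like vllm_xargs) still reach
# the server. URL and model name are placeholders.

def send_with_custom_args():
    from openai import OpenAI  # requires the `openai` package
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    return client.completions.create(
        model="my-model",
        prompt="Hello",
        extra_body={"vllm_xargs": {"my_custom_arg": 1}},
    )

# Equivalent wire payload: the extra_body keys land top-level.
payload = {
    "model": "my-model",
    "prompt": "Hello",
    "vllm_xargs": {"my_custom_arg": 1},
}
```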
Thanks for your review @comaniac. After chatting with Cody, I think this interface change is sufficiently impactful to merit an RFC, which I will write and share shortly.
Signed-off-by: Andrew Feldman <afeldman@redhat.com>
Hi @helloworld1, working on getting this PR landed as-is.
Thanks @afeldman-nm LGTM, just a couple of minor comments.
Co-authored-by: Nick Hill <nhill@redhat.com>
Thanks @afeldman-nm
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Signed-off-by: Andrew Feldman <afeldman@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Signed-off-by: Andrew Feldman <afeldman@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: minpeter <kali2005611@gmail.com>
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com> Signed-off-by: Andrew Feldman <afeldman@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: Yang Wang <elainewy@meta.com>
Add a `vllm_xargs: Optional[dict[str, Union[str, int, float]]]` field to `CompletionRequest`, `ChatCompletionRequest` and `TranscriptionRequest` (these are the only `OpenAIBaseModel` subclasses which had a `logits_processors` field in v0). This field is injected into `SamplingParams.extra_args` via `SamplingParams.from_optional()`; each dict key/value pair in `extra_args` becomes an assignment to an attribute of `sampling_params`.

Purpose
Enable extensible features such as logits processors and plugins to receive arbitrary custom arguments via the REST API. Mirror `SamplingParams.extra_args` in the REST API.

Test plan
Does not require additional unit tests (when logits-processor extensibility is introduced later, it will implicitly test custom args). Pre-existing unit tests must pass, so we know pre-existing features are not broken.
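As a rough illustration of what such a future test could check, the sketch below uses a stand-in class rather than vLLM's actual `SamplingParams`, so all names here are hypothetical:

```python
from typing import Any, Optional, Union

# Stand-in for the plumbing described above: request-level vllm_xargs
# entries end up in the sampling params' extra_args dict. This mimics
# (but is not) vLLM's SamplingParams.from_optional().
class FakeSamplingParams:
    def __init__(self, extra_args: Optional[dict[str, Any]] = None):
        self.extra_args = extra_args

def to_sampling_params(
    vllm_xargs: Optional[dict[str, Union[str, int, float]]],
) -> FakeSamplingParams:
    # None stays None; otherwise copy the request dict into extra_args.
    return FakeSamplingParams(
        extra_args=dict(vllm_xargs) if vllm_xargs else None
    )

params = to_sampling_params({"custom_scale": 0.5, "mode": "fast"})
assert params.extra_args == {"custom_scale": 0.5, "mode": "fast"}
assert to_sampling_params(None).extra_args is None
```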
Test results
N/A
Documentation changes
SamplingParams
docstring clarifies thatextra_args
may plumb custom args to logitsprocs, plugins, etc. (previous it just said logitsprocs)vllm/entrypoints/openai/protocol.py
: forCompletionRequest
,ChatCompletionRequest
, andTranscriptionRequest
, movevllm_xargs
definition inside the# --8<-- [start:completion-extra-params]
section and add more detail to the description string.Final note
The pre-existing behavior of `protocol.py`'s `ChatCompletionRequest` and `CompletionRequest` is that `kv_transfer_params` is passed into the engine via `SamplingParams.extra_args`; this PR simply merges `vllm_xargs` into `SamplingParams.extra_args` alongside `kv_transfer_params`. In the future it may be worth considering whether `SamplingParams.extra_args` is the best pathway for plumbing `kv_transfer_params` into the engine; it would seem to break the convention that `SamplingParams.extra_args` is not intended for "in-tree" functionality.

RFC: #17191
Fixes #16802
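The merge described in the final note can be sketched as follows. `build_extra_args` is an illustrative helper, not a function in vLLM; only the field names (`kv_transfer_params`, `vllm_xargs`) come from the description above.

```python
# Illustration of the final note: kv_transfer_params and vllm_xargs
# both land in SamplingParams.extra_args. build_extra_args is a
# hypothetical helper condensing the behavior described in this PR.
from typing import Any, Optional

def build_extra_args(
    kv_transfer_params: Optional[dict[str, Any]],
    vllm_xargs: Optional[dict[str, Any]],
) -> Optional[dict[str, Any]]:
    extra_args: dict[str, Any] = {}
    if kv_transfer_params is not None:
        # pre-existing behavior: kv_transfer_params rides in extra_args
        extra_args["kv_transfer_params"] = kv_transfer_params
    if vllm_xargs is not None:
        # this PR: custom request args are merged in alongside it
        extra_args.update(vllm_xargs)
    return extra_args or None

merged = build_extra_args({"remote_host": "nodeA"}, {"my_arg": 7})
assert merged == {"kv_transfer_params": {"remote_host": "nodeA"}, "my_arg": 7}
```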