Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenAI schema #3477

Merged
merged 34 commits into from
Apr 5, 2024
Merged

OpenAI schema #3477

merged 34 commits into from
Apr 5, 2024

Conversation

tessapham
Copy link
Contributor

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3419.

Type of changes
Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:


docker-compose.yml Outdated Show resolved Hide resolved
Comment on lines 269 to 271
body._attributes["content-type"] == "application/cloudevents+json"
or body._attributes["content-type"] == "application/json"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use JSON_HEADERS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be harder to read since JSON_HEADERS is just a list, so it'll just be JSON_HEADERS[0], JSON_HEADERS[1].

"""Retrieve all models in the repository that implement the OpenAI interface"""
from .openai_model import OpenAIModel

return [model for _, model in repository.get_models().items() if isinstance(model, OpenAIModel)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this load ALL models in-memory or only the model that will be served?

pkg/agent/storage/https.go Outdated Show resolved Hide resolved
@tessapham tessapham force-pushed the openai branch 5 times, most recently from 3ba860b to a6c7510 Compare April 1, 2024 22:08

Args:
model_name (str): Model name.
request (bytes|GenerateRequest|ChatCompletionRequest): Generate Request / ChatCompletion Request body data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be CompletionCreateParams


Args:
model_name (str): Model name.
request (bytes|GenerateRequest|ChatCompletionRequest): Generate Request / ChatCompletion Request body data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be ChatCompletionCreateParams

python/kserve/kserve/protocol/rest/openai/dataplane.py Outdated Show resolved Hide resolved
async def create_chat_completion(
self, params: ChatCompletionCreateParams
) -> Union[ChatCompletion, AsyncIterator[ChatCompletionChunk]]:
if params.get("n", 1) != 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about 0 which I guess it is the default if not set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n = 1 by default as n indicates the number of chat completions to generate for an input message.

python/lgbserver/poetry.lock Outdated Show resolved Hide resolved
@tessapham tessapham force-pushed the openai branch 2 times, most recently from a2bc2d8 to d6f95ef Compare April 2, 2024 00:06
pkg/agent/storage/https.go Fixed Show fixed Hide fixed
@tessapham tessapham force-pushed the openai branch 2 times, most recently from 78e1190 to 270e0ed Compare April 2, 2024 00:47
@@ -0,0 +1,3 @@
from .openai_model import ChatPrompt, OpenAIChatAdapterModel, OpenAIModel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still missing the license from this file

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
@yuzisun yuzisun marked this pull request as ready for review April 3, 2024 07:26
@yuzisun
Copy link
Member

yuzisun commented Apr 5, 2024

Awesome work, thanks @tessapham @cmaddalozzo !

/lgtm
/approve

Copy link

oss-prow-bot bot commented Apr 5, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tessapham, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@oss-prow-bot oss-prow-bot bot added the approved label Apr 5, 2024
@oss-prow-bot oss-prow-bot bot merged commit 80bf4a0 into kserve:master Apr 5, 2024
56 checks passed
@yuzisun yuzisun mentioned this pull request Apr 7, 2024
9 tasks

def register_openai_endpoints(app: FastAPI, dataplane: OpenAIDataPlane):
endpoints = OpenAIEndpoints(dataplane)
openai_router = APIRouter(prefix=OPENAI_ROUTE_PREFIX, tags=["OpenAI"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a late comment. I think/v1 should be in the OPENAI_ROUTE_PREFIX from openai openapi file and the SDK instead of route, so user can overwrite the base_url/prefix.

any comment? @yuzisun

tjandy98 pushed a commit to tjandy98/kserve that referenced this pull request Apr 10, 2024
* OpenAI data models and endpoints from vLLM

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* more components for OpenAI endpoints

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add OpenAI endpoints to router

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* modify generate() in data plane

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* class OpenAIModel

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* delete and rename files

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add create_chat_completion() to OpenAIModel

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* update routers and lint

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* Implement streaming

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* Add tests for OpenAI data conversion

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* Register OpenAI endpoints when appropriate

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* Add comments

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* Add tests for create_completion and create_chat_completion

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* Remove completion types from dataplane methods

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* WIP

Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>

* fix lint errors

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* update poetry.lock

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* update poetry.lock files

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add dependency

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix test

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* revert poetry.lock files

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add .itermconfig to .gitignore

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add docker-compose.yml to .gitignore

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix build error

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix function descriptions

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* increase limit for model decompression size

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* add license & autoformat

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* make openai dependency mandatory

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* openai dependency back to optional

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix openai module import error

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix JSON unmarshalling of headers

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* drop formatting changes in unrelated files

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* fix openai_is_available()

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

* black reformat

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>

---------

Signed-off-by: Tessa Pham <hpham111@bloomberg.net>
Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Co-authored-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support OpenAI Schema for KServe LLM runtime
9 participants