-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor the ModelServer to let uvicorn handle multiple workers #3757
base: master
Are you sure you want to change the base?
Refactor the ModelServer to let uvicorn handle multiple workers #3757
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sivanantha321 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
app.add_middleware( | ||
TimingMiddleware, | ||
client=PrintTimings(), | ||
metric_namer=StarletteScopeToName(prefix="kserve.io", starlette_app=app), | ||
) | ||
self.cfg = uvicorn.Config( | ||
app=app, | ||
app="kserve.model_server:app", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import string style is required for using --workers
and --reload
options. https://www.uvicorn.org/deployment/#running-programmatically
- Refactored the ModelServer to let uvicorn handle multiple workers. This will remove the bottleneck of using 'fork' for multiprocessing - Make FastAPI app instance easily accessible across the project so that users can easily add middlewares and custom exception handlers for custom models. - Use uvloop eventpolicy Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
be0f628
to
53dd335
Compare
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
e50015e
to
6f04620
Compare
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
d4510e2
to
00fdc0d
Compare
This can be called by a custom exception handler that wants to defer to the default handler behavior. | ||
""" | ||
# gracefully shutdown the server | ||
loop.run_until_complete(self.stop()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need to shut down the server?
if "future" in context: | ||
future = context["future"] | ||
if future.done(): | ||
future_exception = future.exception() | ||
if future_exception: | ||
logger.error(f"Future exception: {future_exception}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the difference between exception and future_exception?
|
||
@pytest.fixture(scope="class") | ||
def app(self, server): # pylint: disable=no-self-use | ||
mp = pytest.MonkeyPatch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need monkey patch ?
await server.model_repository_extension.unload("TestModel") | ||
await server.model_repository_extension.unload("NotReadyModel") | ||
kserve_app.routes.clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are executed after app is terminated?
What this PR does / why we need it:
Refactored the ModelServer to let uvicorn handle multiple workers. This will remove the bottleneck of using 'fork' for multiprocessing
Make FastAPI app instance easily accessible across the project so that users can easily add middlewares and custom exception handlers for custom models.
Use uvloop eventpolicyWhich issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
Special notes for your reviewer:
Checklist:
Release note:
Re-running failed tests
/rerun-all
- rerun all failed workflows./rerun-workflow <workflow name>
- rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.