Refactor the ModelServer to let uvicorn handle multiple workers #3757

Open
wants to merge 8 commits into
base: master

sivanantha321
Member

@sivanantha321 sivanantha321 commented Jun 23, 2024

What this PR does / why we need it:

  • Refactor the ModelServer to let uvicorn manage multiple workers. This removes the bottleneck of using 'fork' for multiprocessing.

  • Make the FastAPI app instance easily accessible across the project so that users can add middleware and custom exception handlers for custom models.

  • Use the uvloop event loop policy.
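
As an illustration of the last bullet, here is a minimal sketch of installing the uvloop event loop policy with a fallback to the default asyncio loop. It is not the PR's code: the helper name `install_uvloop_policy` is hypothetical, and the sketch assumes uvloop may or may not be installed.

```python
import asyncio


def install_uvloop_policy() -> bool:
    """Install uvloop's event loop policy if uvloop is importable.

    Returns True when uvloop was installed, False when the default
    asyncio policy is left in place. (Hypothetical helper, not from the PR.)
    """
    try:
        import uvloop  # optional third-party dependency
    except ImportError:
        return False
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    return True


async def loop_module() -> str:
    # Report which package provides the running loop implementation.
    return type(asyncio.get_running_loop()).__module__


if __name__ == "__main__":
    installed = install_uvloop_policy()
    print(f"uvloop installed: {installed}, loop from: {asyncio.run(loop_module())}")
```

The policy has to be set before the event loop is created, which is why the call comes before `asyncio.run`.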

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Type of changes
Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Feature/Issue validation/testing:

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Test A

  • Test B

  • Logs

Special notes for your reviewer:

  1. Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

Checklist:

  • Have you added unit/e2e tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Release note:


Re-running failed tests

  • /rerun-all - rerun all failed workflows.
  • /rerun-workflow <workflow name> - rerun a specific failed workflow. Only one workflow name can be specified. Multiple /rerun-workflow commands are allowed per comment.

oss-prow-bot bot commented Jun 23, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sivanantha321
Once this PR has been reviewed and has the lgtm label, please assign njhill for approval by writing /assign @njhill in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sivanantha321 sivanantha321 changed the title Refactore the ModelServer to let uvicorn handle multiple workers Refactor the ModelServer to let uvicorn handle multiple workers Jun 23, 2024
         app.add_middleware(
             TimingMiddleware,
             client=PrintTimings(),
             metric_namer=StarletteScopeToName(prefix="kserve.io", starlette_app=app),
         )
         self.cfg = uvicorn.Config(
-            app=app,
+            app="kserve.model_server:app",
Member Author

This import string style is required for using --workers and --reload options. https://www.uvicorn.org/deployment/#running-programmatically
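
To illustrate the comment above: a rough sketch of what an import string such as "kserve.model_server:app" means. As far as I understand it, each uvicorn worker is a separate process that has to import the application itself, which an in-memory object passed from the parent cannot provide. The helper `resolve_import_string` below is a hypothetical stand-in written for this example and is not uvicorn's own loader; `run_with_workers`, the host, and the port are likewise illustrative.

```python
import importlib


def resolve_import_string(import_str: str):
    """Resolve "package.module:attribute" to the object it names.

    Hypothetical helper that mimics, in simplified form, what a worker
    process must do to obtain the app from a string.
    """
    module_name, _, attr_path = import_str.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f'expected "module:attribute", got {import_str!r}')
    obj = importlib.import_module(module_name)
    for name in attr_path.split("."):
        obj = getattr(obj, name)
    return obj


def run_with_workers(import_str: str, workers: int = 2) -> None:
    """Start uvicorn with several workers (requires uvicorn to be installed)."""
    import uvicorn  # imported lazily so the resolver above works without it

    # The app is passed as a string, not as an object, so every worker
    # can import it on its own. Host and port are illustrative values.
    uvicorn.run(import_str, host="127.0.0.1", port=8080, workers=workers)


if __name__ == "__main__":
    # Demonstrate the string format with a standard-library target.
    print(resolve_import_string("json:dumps"))
```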

Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
@sivanantha321 sivanantha321 force-pushed the refactor-multiprocessing-modelserver branch from be0f628 to 53dd335 Compare July 8, 2024 09:59
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
@sivanantha321 sivanantha321 marked this pull request as ready for review July 8, 2024 10:59
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
@sivanantha321 sivanantha321 force-pushed the refactor-multiprocessing-modelserver branch from e50015e to 6f04620 Compare July 9, 2024 13:27
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
@sivanantha321 sivanantha321 force-pushed the refactor-multiprocessing-modelserver branch from d4510e2 to 00fdc0d Compare July 9, 2024 16:32
This can be called by a custom exception handler that wants to defer to the default handler behavior.
"""
# gracefully shutdown the server
loop.run_until_complete(self.stop())
Member

Do we still need to shut down the server?

Comment on lines +349 to +354
if "future" in context:
    future = context["future"]
    if future.done():
        future_exception = future.exception()
        if future_exception:
            logger.error(f"Future exception: {future_exception}")
Member

What's the difference between exception and future_exception?
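
For context on this question, a minimal sketch (not the PR's code) of an asyncio loop exception handler that reads both sources. The handler context is a dict that may carry the error under the "exception" key and the failed future under the "future" key; in this hand-built demo both refer to the same exception object. The names `describe_context` and `run_demo` are hypothetical.

```python
import asyncio
import logging

logger = logging.getLogger("loop_exception_demo")


def describe_context(context: dict) -> dict:
    """Pull both exception sources out of an asyncio handler context."""
    exception = context.get("exception")
    future = context.get("future")
    future_exception = None
    # Future.exception() raises if the future is pending or cancelled,
    # so check both states before calling it.
    if future is not None and future.done() and not future.cancelled():
        future_exception = future.exception()
    return {
        "message": context.get("message"),
        "exception": exception,
        "future_exception": future_exception,
    }


def run_demo() -> dict:
    """Invoke a custom loop exception handler with a hand-built context."""
    records = []

    def handler(loop, context):
        info = describe_context(context)
        records.append(info)
        logger.error("Loop error: %s", info["message"])

    async def main():
        loop = asyncio.get_running_loop()
        loop.set_exception_handler(handler)
        future = loop.create_future()
        future.set_exception(RuntimeError("boom"))
        # Hand-built context using the same keys the handler reads.
        loop.call_exception_handler(
            {
                "message": "demo: exception was never retrieved",
                "exception": future.exception(),
                "future": future,
            }
        )

    asyncio.run(main())
    return records[0]
```

When a context carries only one of the two keys, the other value in the returned dict is simply None, so a handler written this way does not need to assume both are present.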


@pytest.fixture(scope="class")
def app(self, server):  # pylint: disable=no-self-use
    mp = pytest.MonkeyPatch()
Member

Why do we need the monkey patch?
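
One possible reason, sketched here under assumptions rather than taken from the PR: pytest's built-in `monkeypatch` fixture is function-scoped, so a class-scoped fixture cannot request it and has to create a `pytest.MonkeyPatch()` object directly and undo it by hand. The environment variable `EXAMPLE_WORKERS` and the names `apply_patches` and `patched_env` are hypothetical.

```python
import os

import pytest


def apply_patches() -> pytest.MonkeyPatch:
    """Create a MonkeyPatch by hand and apply an illustrative patch."""
    mp = pytest.MonkeyPatch()
    mp.setenv("EXAMPLE_WORKERS", "2")  # hypothetical variable name
    return mp


@pytest.fixture(scope="class")
def patched_env():
    # A class-scoped fixture cannot depend on the function-scoped
    # `monkeypatch` fixture, so the patch object is built manually.
    mp = apply_patches()
    yield mp
    # Teardown: runs once, after the last test in the class has finished.
    mp.undo()


class TestExample:
    def test_env_is_patched(self, patched_env):
        assert os.environ["EXAMPLE_WORKERS"] == "2"
```

Code placed after the `yield` in such a fixture is teardown: it runs when the fixture's scope ends, not when the fixture function is first called.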

Comment on lines +46 to +48
await server.model_repository_extension.unload("TestModel")
await server.model_repository_extension.unload("NotReadyModel")
kserve_app.routes.clear()
Member

Are these executed after the app is terminated?

2 participants