Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-throw certain specific OOM errors as 507 for SageMaker MME + repo path enhancement #4392

Merged
merged 7 commits into from
May 23, 2022

Conversation

nskool
Copy link
Contributor

@nskool nskool commented May 17, 2022

  1. This change is an addition to the MME changes in this PR: Enable Triton for SageMaker MME mode #4181.
  2. This change adds the ability for the SM endpoint to throw a 507 to the SM platform as per - https://docs.aws.amazon.com/sagemaker/latest/dg/mms-container-apis.html#multi-model-api-load-model.
  3. Can be updated to throw capture more specific errors for different backends.
  4. It also adds the ability for Triton on SM to handle the model artifact within two locations i.e. both /opt/ml/models/<hash>/model as well as /opt/ml/models/<hash>/model/<model_subdir>. The second one is useful if the customer is re-using a model after trying it in SME mode.

src/sagemaker_server.cc Outdated Show resolved Hide resolved
@nskool nskool changed the title Re-throw certain specific OOM errors as 507 for SageMaker MME Re-throw certain specific OOM errors as 507 for SageMaker MME + repo path enhancement May 19, 2022
return;
} else {
/* Return a 400*/
evhtp_send_reply(req, EVHTP_RES_BADREQ);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this should be done outside the for loop, so it checks if any messages are matched before returning 400 early

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nskool let me know when it is ready for a CI run

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, fixed - request you to run the CI

@GuanLuo GuanLuo merged commit 2965840 into triton-inference-server:main May 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants