fix: Fixing issue #82 #83

Merged: 2 commits into aws:master on Jul 9, 2021
Conversation

@amaharek (Contributor) commented Mar 15, 2021

Issue #, if available:
Fixing issue #82

Description of changes:
The change adds the JVM option `-XX:-UseContainerSupport` to the model server's `vmargs`, as suggested in the documentation.

Testing done:
I have built a custom container using the patched version, and the reported CPU count matches the CPU count available in the container.
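
As a rough illustration, a check along these lines could confirm the fix inside a container (this is a sketch, not the exact test run for this PR):

    # Sketch: compare what Python sees with what the model server uses.
    # With -XX:-UseContainerSupport in vmargs, the MMS frontend JVM should
    # report the same CPU count, so the default worker count matches it.
    import multiprocessing

    print("CPUs visible in the container:", multiprocessing.cpu_count())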

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-inference-toolkit-pr
  • Commit ID: 236a49c
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@@ -150,6 +150,7 @@ def _generate_mms_config_properties():
         "default_workers_per_model": env.model_server_workers,
         "inference_address": "http://0.0.0.0:{}".format(env.inference_http_port),
         "management_address": "http://0.0.0.0:{}".format(env.management_http_port),
+        "vmargs": "-XX:-UseContainerSupport",
Contributor:
Wondering if we should make this configurable via an SM env variable? Not sure if it would require additional changes anywhere else. cc @dhanainme
"vmargs": env.vmargs if env.vmargs else "-XX:-UseContainerSupport"

Contributor:
+1. TS_VM_ARGS could be the env variable we pick this up from.
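
Combining the two suggestions above, the override might look roughly like this (TS_VM_ARGS is the reviewers' proposal; it is not part of the merged change):

    import os

    # Sketch of the suggested override: let users supply their own vmargs
    # via an environment variable, falling back to the flag this PR adds.
    def _vmargs():
        return os.environ.get("TS_VM_ARGS") or "-XX:-UseContainerSupport"

    # Inside _generate_mms_config_properties() this would become:
    #     "vmargs": _vmargs(),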

Contributor:
This change needs to happen here in this file:

self._module_name = os.environ.get(parameters.USER_PROGRAM_ENV, DEFAULT_MODULE_NAME)
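
If the value were surfaced on the environment object as suggested, the addition might look roughly like this (attribute, property, and env-variable names are assumptions):

    import os

    class Environment(object):
        # Hypothetical sketch: expose vmargs next to the other os.environ
        # lookups in this file, such as the one quoted above.
        def __init__(self):
            self._vmargs = os.environ.get("TS_VM_ARGS", "-XX:-UseContainerSupport")

        @property
        def vmargs(self):
            return self._vmargs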

Comment:
We have tried the fix in this PR, which should solve #82, but we see no difference in the number of CPUs logged in CloudWatch. So not sure if more changes are involved, but this fix as a separate change does not seem to solve the issue. The container we use (PyTorch 1.7.1, TorchServe 0.4.0) uses JDK 11, where UseContainerSupport is enabled by default.

Contributor (author):
I think it won't work for PyTorch >= 1.6 containers, since the TorchServe model server is used there.

Comment:
The reason this doesn't fix PT >= 1.6 is that the PyTorch inference toolkit needs a similar fix.
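
For illustration, the analogous change in the PyTorch toolkit might look like the sketch below; the function is hypothetical and the real sagemaker-pytorch-inference-toolkit code may differ:

    # Hypothetical analogue for the PyTorch inference toolkit: its
    # TorchServe config.properties generator would need the same entry.
    def _generate_ts_config_properties():
        configuration = {
            "vmargs": "-XX:-UseContainerSupport",
        }
        return "\n".join("{}={}".format(k, v) for k, v in configuration.items())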

@vdantu commented Jul 7, 2021

Curious to know why this PR is still pending.

@ahsan-z-khan ahsan-z-khan changed the title Fixing issue #82 fix: Fixing issue #82 Jul 9, 2021
@ahsan-z-khan ahsan-z-khan merged commit 7fcb805 into aws:master Jul 9, 2021
@henryhu666 commented:
Hi, we are facing the same issue and would like to use this fix in SageMaker. Is there a plan to cut a release anytime soon?
