fix: Fixing issue #82 #83
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
@@ -150,6 +150,7 @@ def _generate_mms_config_properties():
     "default_workers_per_model": env.model_server_workers,
     "inference_address": "http://0.0.0.0:{}".format(env.inference_http_port),
     "management_address": "http://0.0.0.0:{}".format(env.management_http_port),
+    "vmargs": "-XX:-UseContainerSupport",
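For context, the hunk above adds a `vmargs` entry to the dict that `_generate_mms_config_properties()` serializes into the MMS `config.properties` file. A minimal sketch of that pattern (function signature and argument names here are illustrative, not the toolkit's actual code, which reads these values from its `env` object):

```python
def generate_mms_config_properties(inference_port=8080, management_port=8081, workers=1):
    """Build the key/value lines written to the MMS config.properties file.

    Simplified sketch: the real toolkit derives these values from its
    environment object rather than from function arguments.
    """
    user_defined = {
        "default_workers_per_model": workers,
        "inference_address": "http://0.0.0.0:{}".format(inference_port),
        "management_address": "http://0.0.0.0:{}".format(management_port),
        # Disable the JVM's container detection so availableProcessors()
        # reports the host CPU count rather than a mis-detected limit.
        "vmargs": "-XX:-UseContainerSupport",
    }
    return "\n".join("{}={}".format(k, v) for k, v in user_defined.items())
```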
Wondering if we should make this configurable via a SageMaker env variable? Not sure if it would require additional changes anywhere else. @dhanainme
"vmargs": env.vmargs if env.vmargs else "-XX:-UseContainerSupport"
+1. TS_VM_ARGS could be the env variable where we can pick up from
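The override suggested in this thread could be sketched as below. Note that `TS_VM_ARGS` is only the variable name proposed here, not an existing toolkit setting, and `resolve_vmargs` is a hypothetical helper name:

```python
import os

def resolve_vmargs(default="-XX:-UseContainerSupport"):
    """Prefer user-supplied JVM args from the environment, else the default.

    TS_VM_ARGS is the env-variable name proposed in this review thread;
    it is not guaranteed to exist in any released version of the toolkit.
    """
    return os.environ.get("TS_VM_ARGS") or default
```

With this shape, the generated config line becomes `"vmargs": resolve_vmargs()`, so users who need different JVM flags can set them without rebuilding the container.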
This change needs to happen here in this file
self._module_name = os.environ.get(parameters.USER_PROGRAM_ENV, DEFAULT_MODULE_NAME)
We have tried the fix in this PR, which should solve #82, but we see no difference in the number of CPUs logged in CloudWatch. So we are not sure whether more changes are involved, but this fix as a separate change does not seem to solve the issue. The container we use (PyTorch 1.7.1, TorchServe 0.4.0) runs JDK 11, in which the `UseContainerSupport` flag is enabled by default.
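One way to cross-check what the container itself reports, independently of the JVM, is to compare the raw CPU count with the scheduler affinity mask. This is a Linux-only diagnostic sketch, not part of the toolkit:

```python
import os

def report_cpu_visibility():
    """Compare the kernel-exposed CPU count with the affinity-restricted count.

    Inside a container restricted via a cpuset, these two numbers can differ.
    A container-aware JVM (the JDK 11 default, -XX:+UseContainerSupport)
    sizes availableProcessors() from such container limits, which is what
    disabling the flag is meant to bypass.
    """
    total = os.cpu_count()                 # CPUs the kernel exposes
    usable = len(os.sched_getaffinity(0))  # CPUs this process may run on
    return total, usable
```

Running this inside the container alongside the value the model server logs can help narrow down whether the JVM flag is actually taking effect.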
I think it won't work for PyTorch >= 1.6 containers, since the TorchServe model server is used there rather than MMS.
The reason this doesn't fix PT >= 1.6 is that the PyTorch inference toolkit needs a similar fix.
Curious to know why this PR is still pending?
Hi, we are facing the same issue and would like to use this fix in SageMaker. Is there a plan to cut a release anytime soon?
Issue #, if available:
Fixing issue #82
Description of changes:
The change includes using the option `-XX:-UseContainerSupport` as suggested in the documentation.
Testing done:
I have built a custom container using the patched version, and the reported CPU count matches the CPU count available in the container.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.