JVM detects the CPU count as 1 when more CPUs are available to the container. #82

Closed
amaharek opened this issue Mar 15, 2021 · 2 comments

Comments

@amaharek
Contributor

Describe the bug
This issue is related to aws/sagemaker-python-sdk#1275.

The JVM detects the CPU count as 1 even when more CPUs are available to the container.

To reproduce

  1. Clone the SageMaker example.
  2. Deploy the model using the same endpoint.
  3. Check the CloudWatch logs. The number of CPU cores detected is reported as "Number of CPUs: 1".

Expected behavior
The CPU count reported in CloudWatch should match the CPU count of the instance in use, for example 4 for an ml.m4.xlarge instance.
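As a rough cross-check, the CPU count the operating system exposes inside the container can be compared with the value the JVM logs. The sketch below is only an illustration: the helper name `visible_cpu_counts` is hypothetical, and it reports what Python sees, not what the JVM computes.

```python
import os


def visible_cpu_counts():
    """Return the CPU counts that the OS exposes to this process."""
    counts = {"os_cpu_count": os.cpu_count()}
    # sched_getaffinity is only available on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        counts["affinity_cpu_count"] = len(os.sched_getaffinity(0))
    return counts


if __name__ == "__main__":
    print(visible_cpu_counts())
```

If these values are larger than the "Number of CPUs" line in the logs, the difference likely comes from how the JVM interprets the container limits rather than from the instance itself.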

System information
Container: pytorch-inference:1.5-gpu-py3
SageMaker inference v1.1.2

@daniel-hanmoi-choi

daniel-hanmoi-choi commented Mar 18, 2021

@amaharek We had the same issue and fixed it as follows.

TOOLKIT_PATH=$(python -c "import sagemaker_inference; print(sagemaker_inference.__path__[0])")

Add the following to the Dockerfile, using the path resolved above, and build a new image.

Single-model

RUN echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/default-mms.properties

Multi-model

RUN echo "vmargs=-XX:-UseContainerSupport" >> $TOOLKIT_PATH/etc/mme-mms.properties
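The same change can be sketched in Python for cases where the properties file is patched from a build script instead of a `RUN echo` line. This is a minimal sketch, not part of the toolkit: `append_vmargs` is a hypothetical helper, and it assumes that appending the line is sufficient, as in the commands above.

```python
from pathlib import Path

VMARGS_LINE = "vmargs=-XX:-UseContainerSupport"


def append_vmargs(properties_path, line=VMARGS_LINE):
    """Append `line` to the properties file unless it is already present.

    Returns True if the file was modified, False otherwise.
    """
    path = Path(properties_path)
    text = path.read_text() if path.exists() else ""
    if line in (entry.strip() for entry in text.splitlines()):
        return False
    with path.open("a") as handle:
        # Start on a fresh line if the file does not end with a newline.
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True
```

Calling it with the path to `etc/default-mms.properties` (single-model) or `etc/mme-mms.properties` (multi-model) under the toolkit directory mirrors the two Dockerfile lines, and running it twice leaves the file unchanged the second time.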
Correspondence

About UseContainerSupport

https://www.eclipse.org/openj9/docs/xxusecontainersupport/
https://blog.softwaremill.com/docker-support-in-new-java-8-finally-fd595df0ca54

@amaharek
Contributor Author

amaharek commented Jul 9, 2021

PR 83 has been merged
