
Conversation

nikhil-sk (Contributor) commented Sep 27, 2021

Issue #, if available:
NA

Description of changes:

  1. Add a feature to specify the following per-model properties that help with batch inference:
     • batchSize
     • maxBatchDelay
     • minWorkers
     • maxWorkers
     • responseTimeout
  2. These model properties are exposed as the following environment variables in the toolkit:
     • SAGEMAKER_TS_BATCH_SIZE
     • SAGEMAKER_TS_MAX_BATCH_DELAY
     • SAGEMAKER_TS_MIN_WORKERS
     • SAGEMAKER_TS_MAX_WORKERS
     • SAGEMAKER_TS_RESPONSE_TIMEOUT
     These variables need to be supplied as a dictionary to the 'env' option when configuring a model with the SageMaker Python SDK.
  3. Note: These properties apply only to single-model inference on SageMaker. For a multi-model endpoint, a user still needs to bake the config.properties file into the container and list the models in that file; see the illustrative snippet below.
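For reference, a baked-in config.properties for the multi-model case might look like the following sketch. This is illustrative only: the model names, .mar files, property values, and the model_store path are assumptions, and only the JSON structure of the models entry mirrors the "Model config" line in the logs below.

# Hypothetical config.properties baked into a multi-model container
model_store=/opt/ml/model
load_models=modelA.mar,modelB.mar
models={"modelA": {"1.0": {"defaultVersion": true, "marName": "modelA.mar", "batchSize": 4, "maxBatchDelay": 100}}, "modelB": {"1.0": {"defaultVersion": true, "marName": "modelB.mar", "batchSize": 1, "maxBatchDelay": 50}}}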

Logs

When run in SageMaker, the model config is correctly picked up from the environment when specified as follows:

Input:

from sagemaker.pytorch.model import PyTorchModel

# TorchServe model properties, passed to the container as environment variables
env_variables_dict = {
    "SAGEMAKER_TS_BATCH_SIZE": "3",
    "SAGEMAKER_TS_MAX_BATCH_DELAY": "100000"
}

# model_artifact, role, and image_uri are defined earlier in the notebook
pytorch_model = PyTorchModel(
    model_data=model_artifact,
    role=role,
    image_uri=image_uri,
    source_dir="code",
    framework_version='1.9',
    entry_point="inference.py",
    env=env_variables_dict
)
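
For completeness, deploying this model would follow the usual SDK pattern (a hedged sketch; the instance type and count below are assumptions, not taken from this PR):

predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)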

Output:

2n5r6rur8a-algo-1-33bni | WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
2n5r6rur8a-algo-1-33bni | ['torchserve', '--start', '--model-store', '/.sagemaker/ts/models', '--ts-config', '/etc/sagemaker-ts.properties', '--log-config', '/sagemaker-pytorch-inference-toolkit/src/sagemaker_pytorch_serving_container/etc/log4j.properties', '--models', 'model.mar']
2n5r6rur8a-algo-1-33bni | 2021-09-27 19:06:42,737 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2n5r6rur8a-algo-1-33bni | 2021-09-27 19:06:42,927 [INFO ] main org.pytorch.serve.ModelServer - 
2n5r6rur8a-algo-1-33bni | Torchserve version: 0.4.2
2n5r6rur8a-algo-1-33bni | TS Home: /usr/local/lib/python3.6/dist-packages
2n5r6rur8a-algo-1-33bni | Current directory: /
2n5r6rur8a-algo-1-33bni | Temp directory: /tmp
2n5r6rur8a-algo-1-33bni | Number of GPUs: 0
2n5r6rur8a-algo-1-33bni | Number of CPUs: 32
2n5r6rur8a-algo-1-33bni | Max heap size: 30688 M
2n5r6rur8a-algo-1-33bni | Python executable: /usr/bin/python3
2n5r6rur8a-algo-1-33bni | Config file: /etc/sagemaker-ts.properties
2n5r6rur8a-algo-1-33bni | Inference address: http://0.0.0.0:8080
2n5r6rur8a-algo-1-33bni | Management address: http://0.0.0.0:8080
2n5r6rur8a-algo-1-33bni | Metrics address: http://127.0.0.1:8082
2n5r6rur8a-algo-1-33bni | Model Store: /.sagemaker/ts/models
2n5r6rur8a-algo-1-33bni | Initial Models: model.mar
2n5r6rur8a-algo-1-33bni | Log dir: /logs
2n5r6rur8a-algo-1-33bni | Metrics dir: /logs
2n5r6rur8a-algo-1-33bni | Netty threads: 0
2n5r6rur8a-algo-1-33bni | Netty client threads: 0
2n5r6rur8a-algo-1-33bni | Default workers per model: 32
2n5r6rur8a-algo-1-33bni | Blacklist Regex: N/A
2n5r6rur8a-algo-1-33bni | Maximum Response Size: 6553500
2n5r6rur8a-algo-1-33bni | Maximum Request Size: 6553500
2n5r6rur8a-algo-1-33bni | Prefer direct buffer: false
2n5r6rur8a-algo-1-33bni | Allowed Urls: [file://.*|http(s)?://.*]
2n5r6rur8a-algo-1-33bni | Custom python dependency for model allowed: false
2n5r6rur8a-algo-1-33bni | Metrics report format: prometheus
2n5r6rur8a-algo-1-33bni | Enable metrics API: true
2n5r6rur8a-algo-1-33bni | Workflow Store: /.sagemaker/ts/models
2n5r6rur8a-algo-1-33bni | Model config: {"model": {"1.0": {"defaultVersion": true, "marName": "model.mar", "minWorkers": 1, "maxWorkers": 4, "batchSize": 3, "maxBatchDelay": 100000, "responseTimeout": 120}}}
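
The "Model config" line above shows the SAGEMAKER_TS_BATCH_SIZE and SAGEMAKER_TS_MAX_BATCH_DELAY values (3 and 100000) landing in TorchServe's per-model config. As a minimal sketch of the translation this PR performs, assuming hypothetical names (this is not the toolkit's actual code), the env vars could be folded into the 'models' JSON written to /etc/sagemaker-ts.properties like so:

import json
import os

# Map of toolkit env vars to TorchServe per-model properties (names from this PR).
ENV_TO_PROPERTY = {
    "SAGEMAKER_TS_BATCH_SIZE": "batchSize",
    "SAGEMAKER_TS_MAX_BATCH_DELAY": "maxBatchDelay",
    "SAGEMAKER_TS_MIN_WORKERS": "minWorkers",
    "SAGEMAKER_TS_MAX_WORKERS": "maxWorkers",
    "SAGEMAKER_TS_RESPONSE_TIMEOUT": "responseTimeout",
}

def build_models_property(mar_name="model.mar"):
    # build_models_property is a hypothetical helper, not the toolkit's API.
    version_config = {"defaultVersion": True, "marName": mar_name}
    for env_var, prop in ENV_TO_PROPERTY.items():
        if env_var in os.environ:
            version_config[prop] = int(os.environ[env_var])
    model_name = mar_name.rsplit(".", 1)[0]
    # Produces a line like: models={"model": {"1.0": {...}}}
    return "models=" + json.dumps({model_name: {"1.0": version_config}})

With only the two variables from the Input set, the remaining values (minWorkers, maxWorkers, responseTimeout) would come from defaults, which matches the log line above.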

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: 82c0919
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

lxning previously approved these changes Sep 27, 2021
sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: bfe053d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: 9b36af5
  • Result: FAILED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: 28a1b6b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: f093170
  • Result: FAILED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: 61a7d0b
  • Result: FAILED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: 2ccf3b6
  • Result: FAILED
  • Build Logs (available for 30 days)


sagemaker-bot (Collaborator)

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-pytorch-inference-toolkit-pr
  • Commit ID: a14faf1
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)


maaquib merged commit 27b667f into aws:master on Oct 2, 2021