Describe the bug
PyTorchModel doesn't deploy in serverless mode.
To reproduce
import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.serverless import ServerlessInferenceConfig

image_uri = sagemaker.image_uris.retrieve(
    framework="pytorch",
    region="eu-central-1",
    py_version="py38",
    version="1.10",
    instance_type="ml.t2.xlarge",
    image_scope="inference",
)

# not being used at the moment due to AWS bug
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

pytorch_model = PyTorchModel(
    model_data=path_to_s3,
    role=role,
    image_uri=image_uri,
    source_dir="src",
    entry_point="inference.py",
    framework_version="1.10",
    py_version="py38",
)

predictor = pytorch_model.deploy(
    initial_instance_count=1,
    serverless_inference_config=serverless_config,
    endpoint_name=SERVERLESS_ENDPOINT_NAME,
)
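As a sanity check on the serverless parameters themselves, here is a minimal validator based on the limits in the SageMaker Serverless Inference documentation at the time of writing (memory between 1024 and 6144 MB in 1 GB steps, max concurrency between 1 and 200). The helper name is illustrative, not part of the SDK:

```python
# Documented SageMaker Serverless Inference limits (per AWS docs at the time
# of writing); the helper name is illustrative and not part of the SDK.
VALID_MEMORY_MB = {1024, 2048, 3072, 4096, 5120, 6144}

def serverless_params_ok(memory_size_in_mb: int, max_concurrency: int) -> bool:
    """Return True if the parameters fall within the documented limits."""
    return memory_size_in_mb in VALID_MEMORY_MB and 1 <= max_concurrency <= 200

# The values used in the reproduction above (4096 MB, concurrency 10) pass
# this check, so the failure is unlikely to be a simple parameter problem.
print(serverless_params_ok(4096, 10))
```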
Expected behavior
The model deploys successfully as a serverless endpoint.
Screenshots or logs
ClientError Traceback (most recent call last)
<ipython-input-54-593adb15b1aa> in <module>
38 framework_version='1.10',
39 py_version='py38')
---> 40 predictor = pytorch_model.deploy(initial_instance_count=1, serverless_inference_config=serverless_config, endpoint_name=SERVERLESS_ENDPOINT_NAME)
41 else:
42 pytorch_model = PyTorchModel(model_data='s3://detector-sagemaker/model.tar.gz',
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
1024 wait=wait,
1025 data_capture_config_dict=data_capture_config_dict,
-> 1026 async_inference_config_dict=async_inference_config_dict,
1027 )
1028
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict, async_inference_config_dict)
3542 self.sagemaker_client.create_endpoint_config(**config_options)
3543
-> 3544 return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
3545
3546 def expand_role(self, role):
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
3038
3039 self.sagemaker_client.create_endpoint(
-> 3040 EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
3041 )
3042 if wait:
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
399 "%s() only accepts keyword arguments." % py_operation_name)
400 # The "self" in this scope is referring to the BaseClient.
--> 401 return self._make_api_call(operation_name, kwargs)
402
403 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
729 error_code = parsed_response.get("Error", {}).get("Code")
730 error_class = self.exceptions.from_code(error_code)
--> 731 raise error_class(parsed_response, operation_name)
732 else:
733 return parsed_response
ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: One or more endpoint features are not supported using this configuration
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.80.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.10
- Python version: 3.8
- CPU or GPU: CPU and GPU
- Custom Docker image (Y/N): N
Additional context
The error also occurs when specifying a CPU instance type (I have read that GPU is not supported for serverless inference, though I'm not sure whether that is still the case). The logs are also quite cryptic, so I am not sure how to pinpoint the issue.
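For comparison, this is roughly the shape of the CreateEndpointConfig request that a serverless deployment should produce, per the SageMaker API documentation. The function and the model/config names are illustrative placeholders, and no AWS call is made here; a variant that mixes ServerlessConfig with InstanceType/InitialInstanceCount, or that references an unsupported (e.g. GPU) image, might be one way to end up with a ValidationException like the one above:

```python
# Hedged sketch: shape of a CreateEndpointConfig request for a serverless
# endpoint, per the SageMaker API docs. Names are illustrative placeholders.
def serverless_endpoint_config(model_name: str,
                               memory_size_in_mb: int = 4096,
                               max_concurrency: int = 10) -> dict:
    return {
        "EndpointConfigName": f"{model_name}-serverless",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # For serverless variants, ServerlessConfig replaces
                # InstanceType and InitialInstanceCount entirely.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_size_in_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }

config = serverless_endpoint_config("my-pytorch-model")
```

Passing such a dict to boto3's `sagemaker_client.create_endpoint_config(**config)` directly could help isolate whether the endpoint config itself is accepted, bypassing the Python SDK's deploy path.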