Describe the bug
PyTorchModel doesn't deploy in serverless mode.
To reproduce
import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.serverless import ServerlessInferenceConfig

image_uri = sagemaker.image_uris.retrieve(
    framework="pytorch",
    region="eu-central-1",
    py_version="py38",
    version="1.10",
    instance_type="ml.t2.xlarge",
    image_scope="inference",
)

# not being used at the moment due to AWS bug
serverless_config = ServerlessInferenceConfig(
    memory_size_in_mb=4096,
    max_concurrency=10,
)

pytorch_model = PyTorchModel(
    model_data=path_to_s3,
    role=role,
    image_uri=image_uri,
    source_dir="src",
    entry_point="inference.py",
    framework_version="1.10",
    py_version="py38",
)

predictor = pytorch_model.deploy(
    initial_instance_count=1,
    serverless_inference_config=serverless_config,
    endpoint_name=SERVERLESS_ENDPOINT_NAME,
)
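As a sanity check on the serverless parameters themselves, here is a minimal validator based on the limits in the SageMaker Serverless Inference documentation at the time of writing (memory between 1024 and 6144 MB in 1 GB steps, max concurrency between 1 and 200). The helper name is illustrative, not part of the SDK:

```python
# Documented SageMaker Serverless Inference limits (per AWS docs at the time
# of writing); the helper name is illustrative and not part of the SDK.
VALID_MEMORY_MB = {1024, 2048, 3072, 4096, 5120, 6144}

def serverless_params_ok(memory_size_in_mb: int, max_concurrency: int) -> bool:
    """Return True if the parameters fall within the documented limits."""
    return memory_size_in_mb in VALID_MEMORY_MB and 1 <= max_concurrency <= 200

# The values used in the reproduction above (4096 MB, concurrency 10) pass
# this check, so the failure is unlikely to be a simple parameter problem.
print(serverless_params_ok(4096, 10))
```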
Expected behavior
The model deploys successfully as a serverless endpoint.
Screenshots or logs
ClientError Traceback (most recent call last)
<ipython-input-54-593adb15b1aa> in <module>
38 framework_version='1.10',
39 py_version='py38')
---> 40 predictor = pytorch_model.deploy(initial_instance_count=1, serverless_inference_config=serverless_config, endpoint_name=SERVERLESS_ENDPOINT_NAME)
41 else:
42 pytorch_model = PyTorchModel(model_data='s3://detector-sagemaker/model.tar.gz',
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
1024 wait=wait,
1025 data_capture_config_dict=data_capture_config_dict,
-> 1026 async_inference_config_dict=async_inference_config_dict,
1027 )
1028
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict, async_inference_config_dict)
3542 self.sagemaker_client.create_endpoint_config(**config_options)
3543
-> 3544 return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
3545
3546 def expand_role(self, role):
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
3038
3039 self.sagemaker_client.create_endpoint(
-> 3040 EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
3041 )
3042 if wait:
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
399 "%s() only accepts keyword arguments." % py_operation_name)
400 # The "self" in this scope is referring to the BaseClient.
--> 401 return self._make_api_call(operation_name, kwargs)
402
403 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
729 error_code = parsed_response.get("Error", {}).get("Code")
730 error_class = self.exceptions.from_code(error_code)
--> 731 raise error_class(parsed_response, operation_name)
732 else:
733 return parsed_response
ClientError: An error occurred (ValidationException) when calling the CreateEndpoint operation: One or more endpoint features are not supported using this configuration
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.80.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.10
- Python version: 3.8
- CPU or GPU: CPU and GPU
- Custom Docker image (Y/N): N
Additional context
The error also occurs when specifying a CPU instance type (I have read that GPU is not supported for serverless inference, though I'm not sure whether that is still the case). The logs are also quite cryptic, so I am not sure how to pinpoint the issue.
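For comparison, this is roughly the shape of the CreateEndpointConfig request that a serverless deployment should produce, per the SageMaker API documentation. The function and the model/config names are illustrative placeholders, and no AWS call is made here; a variant that mixes ServerlessConfig with InstanceType/InitialInstanceCount, or that references an unsupported (e.g. GPU) image, might be one way to end up with a ValidationException like the one above:

```python
# Hedged sketch: shape of a CreateEndpointConfig request for a serverless
# endpoint, per the SageMaker API docs. Names are illustrative placeholders.
def serverless_endpoint_config(model_name: str,
                               memory_size_in_mb: int = 4096,
                               max_concurrency: int = 10) -> dict:
    return {
        "EndpointConfigName": f"{model_name}-serverless",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # For serverless variants, ServerlessConfig replaces
                # InstanceType and InitialInstanceCount entirely.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_size_in_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }

config = serverless_endpoint_config("my-pytorch-model")
```

Passing such a dict to boto3's `sagemaker_client.create_endpoint_config(**config)` directly could help isolate whether the endpoint config itself is accepted, bypassing the Python SDK's deploy path.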