Description
Describe the feature you'd like
By default, when the SageMaker Python SDK updates an inference endpoint, it discards the endpoint's current runtime desired instance count (the value set by the endpoint's autoscaling policy). Please add support for the SageMaker UpdateEndpoint API boolean parameter RetainAllVariantProperties as a way to solve this issue; see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html#API_UpdateEndpoint_RequestSyntax
How would this feature be used? Please describe.
I would pass RetainAllVariantProperties=True when updating an endpoint, so that the desired instance count currently set by the autoscaling policy is retained.
Describe alternatives you've considered
Use boto3 as a workaround
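For reference, a minimal sketch of the boto3 workaround, assuming the new endpoint config already exists. The helper names (`make_client`, `update_endpoint_retaining_variants`) and the endpoint/config names in the usage comment are hypothetical; only `update_endpoint` and its `EndpointName`, `EndpointConfigName` and `RetainAllVariantProperties` parameters come from the boto3 SageMaker client.

```python
def make_client(region_name=None):
    """Create a boto3 SageMaker client (boto3 is imported lazily here)."""
    import boto3  # requires boto3 to be installed and AWS credentials configured

    return boto3.client("sagemaker", region_name=region_name)


def update_endpoint_retaining_variants(client, endpoint_name, endpoint_config_name):
    """Hypothetical helper: call UpdateEndpoint with RetainAllVariantProperties=True.

    With this flag set, the variant properties currently in effect on the
    endpoint (such as the instance count) are kept instead of being replaced
    by the values in the new endpoint config.
    """
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
        RetainAllVariantProperties=True,
    )


# Example usage (placeholder names, not run here):
# client = make_client()
# update_endpoint_retaining_variants(client, "my-endpoint", "my-endpoint-config-v2")
```

This works, but it means bypassing the SDK's own update path, which is why a first-class option in the SDK would be preferable.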