Support retaining current desired instance count when updating endpoint #3126

@torsjonas

Describe the feature you'd like
By default, when the SageMaker Python SDK updates an inference endpoint, it discards the endpoint's current desired instance count, i.e. the value that the endpoint's autoscaling policy has set at runtime. Please add support for the boolean parameter RetainAllVariantProperties of the SageMaker UpdateEndpoint API as a way to solve this issue; see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UpdateEndpoint.html#API_UpdateEndpoint_RequestSyntax

How would this feature be used? Please describe.
I would pass RetainAllVariantProperties=True when updating an endpoint, so that the desired instance count currently set by the autoscaling policy is retained.

Describe alternatives you've considered
Call the UpdateEndpoint API directly through boto3 as a workaround.
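
For reference, a minimal sketch of what the boto3 workaround could look like. The helper name `update_endpoint_retaining_variants` and the endpoint and config names in the usage note are hypothetical; the only API call used is the boto3 SageMaker client's `update_endpoint` with `RetainAllVariantProperties=True`. It assumes the new endpoint config has already been created.

```python
from typing import Any, Optional


def update_endpoint_retaining_variants(
    endpoint_name: str,
    endpoint_config_name: str,
    client: Optional[Any] = None,
) -> dict:
    """Switch an endpoint to a new endpoint config while keeping the
    variant properties (such as the current instance count) of the
    running endpoint.

    Hypothetical helper for illustration; not part of the SageMaker
    Python SDK.
    """
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client.
        import boto3

        client = boto3.client("sagemaker")

    # RetainAllVariantProperties=True asks UpdateEndpoint to keep the
    # existing variant properties instead of resetting them to the values
    # in the new endpoint config.
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
        RetainAllVariantProperties=True,
    )
```

Usage would be along the lines of `update_endpoint_retaining_variants("my-endpoint", "my-endpoint-config-v2")`, where both names are placeholders. The drawback is that the update happens outside the SageMaker Python SDK, which is why native support is requested.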
