(sagemaker): Missing support for "AsyncInferenceConfig" in `create_endpoint_config` and `describe_endpoint_config` #8783

Moto's SageMaker mock does not support the "AsyncInferenceConfig" parameter when creating endpoint configs.

While investigating a test failure in another project that relies on async inference, I found that `create_endpoint_config` silently drops "AsyncInferenceConfig": the call succeeds, but `describe_endpoint_config` never returns the parameter. As a result, tests that depend on async inference fail unexpectedly.

**Steps to Reproduce**

<details>

```py
import boto3
from moto import mock_aws
import pytest

@pytest.fixture
def mock_sagemaker():
    with mock_aws():
        yield boto3.client("sagemaker", region_name="eu-central-1")

@pytest.mark.parametrize(
    "prefix, endpoint_cfg, expected_result",
    [
        (
            "async", {
                "AsyncInferenceConfig": {
                    "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 3},
                    "OutputConfig": {"S3OutputPath": "s3://output-bucket", "NotificationConfig": {}},
                }
            },
            True,
        ),
        ("", {}, False),
    ],
)
def test_is_async_endpoint(mock_sagemaker, prefix, endpoint_cfg, expected_result):
    # given
    _MODEL_NAME = f"{prefix}test"
    _ENDPOINT_NAME = f"{prefix}test_endpoint_name"
    sm = mock_sagemaker
    sm.create_model(
        ModelName=_MODEL_NAME,
        PrimaryContainer={
            "Image": "test_image",
            "ModelDataUrl": "s3://test_bucket/model.zip",
        },
    )
    sm.create_endpoint_config(
        EndpointConfigName=_ENDPOINT_NAME,
        ProductionVariants=[
            {
                "VariantName": "AllTraffic",
                "ModelName": _MODEL_NAME,
                "InitialInstanceCount": 1,
                "InstanceType": "ml.m5.large",
            }
        ],
        **endpoint_cfg,
    )
    sm.create_endpoint(
        EndpointName=_ENDPOINT_NAME,
        EndpointConfigName=_ENDPOINT_NAME,
    )
    # when & then
    config_name = sm.describe_endpoint(EndpointName=_ENDPOINT_NAME)["EndpointConfigName"]
    endpoint_config = sm.describe_endpoint_config(EndpointConfigName=config_name)
    assert ("AsyncInferenceConfig" in endpoint_config) is expected_result
```
</details>


**Expected Behavior**
- When specifying "AsyncInferenceConfig" in `create_endpoint_config`, Moto should store this parameter and return it when `describe_endpoint_config` is called.

**Actual Behavior**
- Moto ignores the "AsyncInferenceConfig" parameter in `create_endpoint_config` and does not surface this info in `describe_endpoint_config`, leading to test failures for async endpoint usage.

**Proposed Solution**
- Update [`FakeEndpointConfig.__init__`](https://github.com/getmoto/moto/blob/b29a9c8492e8b8caccf328214ecd0a832147e511/moto/sagemaker/models.py#L624) and [`SageMakerModelBackend.create_endpoint_config`](https://github.com/getmoto/moto/blob/b29a9c8492e8b8caccf328214ecd0a832147e511/moto/sagemaker/models.py#L3559) to accept and store the "AsyncInferenceConfig" parameter.
- Update [`SageMakerResponse.create_endpoint_config`](https://github.com/getmoto/moto/blob/b29a9c8492e8b8caccf328214ecd0a832147e511/moto/sagemaker/responses.py#L141) to read "AsyncInferenceConfig" from the request and pass it to the backend.
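
To make the idea concrete, here is a rough, self-contained sketch of the store-and-return behavior I have in mind. It is not Moto's actual code: `SketchEndpointConfig`, `SketchBackend`, and the `describe` method are hypothetical stand-ins for `FakeEndpointConfig` and `SageMakerModelBackend`, and the real classes have more fields and a different structure.

```python
from typing import Any, Dict, List, Optional


class SketchEndpointConfig:
    """Hypothetical, simplified stand-in for moto's FakeEndpointConfig."""

    def __init__(
        self,
        endpoint_config_name: str,
        production_variants: List[Dict[str, Any]],
        async_inference_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.endpoint_config_name = endpoint_config_name
        self.production_variants = production_variants
        # New: keep the async config instead of dropping it.
        self.async_inference_config = async_inference_config

    def describe(self) -> Dict[str, Any]:
        response: Dict[str, Any] = {
            "EndpointConfigName": self.endpoint_config_name,
            "ProductionVariants": self.production_variants,
        }
        # Only include the key when the caller supplied it at creation time.
        if self.async_inference_config is not None:
            response["AsyncInferenceConfig"] = self.async_inference_config
        return response


class SketchBackend:
    """Hypothetical, simplified stand-in for SageMakerModelBackend."""

    def __init__(self) -> None:
        self.endpoint_configs: Dict[str, SketchEndpointConfig] = {}

    def create_endpoint_config(
        self,
        endpoint_config_name: str,
        production_variants: List[Dict[str, Any]],
        async_inference_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Pass the optional parameter through to the stored object.
        self.endpoint_configs[endpoint_config_name] = SketchEndpointConfig(
            endpoint_config_name, production_variants, async_inference_config
        )

    def describe_endpoint_config(self, endpoint_config_name: str) -> Dict[str, Any]:
        return self.endpoint_configs[endpoint_config_name].describe()
```

With this shape, a config created without "AsyncInferenceConfig" omits the key from the describe response, which is what the second parametrized case in the reproduction test expects.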
