8 changes: 6 additions & 2 deletions sample/sagemaker/2017-07-24/service-2.json
@@ -33607,7 +33607,7 @@
},
"InferenceAmiVersion":{
"shape":"ProductionVariantInferenceAmiVersion",
"documentation":"<p>Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.</p> <p>By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions.</p> <p>The AMI version names, and their configurations, are the following:</p> <dl> <dt>al2-ami-sagemaker-inference-gpu-2</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535.54.03</p> </li> <li> <p>CUDA driver version: 12.2</p> </li> <li> <p>Supported instance types: ml.g4dn.*, ml.g5.*, ml.g6.*, ml.p3.*, ml.p4d.*, ml.p4de.*, ml.p5.*</p> </li> </ul> </dd> </dl>"
"documentation":"<p>Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads.</p> <p>By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions.</p> <p>The AMI version names, and their configurations, are the following:</p> <dl> <dt>al2-ami-sagemaker-inference-gpu-2</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535.54.03</p> </li> <li> <p>CUDA version: 12.2</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-2-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 535.54.03</p> </li> <li> <p>CUDA driver version: 12.2</p> </li> <li> <p>CUDA Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> <dt>al2-ami-sagemaker-inference-gpu-3-1</dt> <dd> <ul> <li> <p>Accelerator: GPU</p> </li> <li> <p>NVIDIA driver version: 550.144.01</p> </li> <li> <p>CUDA version: 12.4</p> </li> <li> <p>Container Toolkit with disabled CUDA-compat mounting</p> </li> </ul> </dd> </dl>"
}
},
"documentation":"<p> Identifies a model that you want to host and the resources chosen to deploy for hosting it. If you are deploying multiple models, tell SageMaker how to distribute traffic among the models by specifying variant weights. For more information on production variants, check <a href=\"https://docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html\"> Production variants</a>. </p>"
@@ -33645,7 +33645,11 @@
},
"ProductionVariantInferenceAmiVersion":{
"type":"string",
"enum":["al2-ami-sagemaker-inference-gpu-2"]
"enum":[
"al2-ami-sagemaker-inference-gpu-2",
"al2-ami-sagemaker-inference-gpu-2-1",
"al2-ami-sagemaker-inference-gpu-3-1"
]
},
"ProductionVariantInstanceType":{
"type":"string",
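Net effect of the model change: `CreateEndpointConfig` now accepts two additional `InferenceAmiVersion` values. A minimal sketch of requesting the new CUDA 12.4 image through boto3 (the endpoint config and model names are placeholders):

```python
import boto3

sm = boto3.client("sagemaker")

# Pin the production variant to the CUDA 12.4 AMI added by this change.
# "my-endpoint-config" and "my-model" are placeholder names.
sm.create_endpoint_config(
    EndpointConfigName="my-endpoint-config",
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-model",
            "InstanceType": "ml.g5.xlarge",
            "InitialInstanceCount": 1,
            "InferenceAmiVersion": "al2-ami-sagemaker-inference-gpu-3-1",
        }
    ],
)
```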
2 changes: 1 addition & 1 deletion src/sagemaker_core/main/shapes.py
@@ -4858,7 +4858,7 @@ class ProductionVariant(Base):
enable_ssm_access: You can use this parameter to turn on native Amazon Web Services Systems Manager (SSM) access for a production variant behind an endpoint. By default, SSM access is disabled for all production variants behind an endpoint. You can turn on or turn off SSM access for a production variant behind an existing endpoint by creating a new endpoint configuration and calling UpdateEndpoint.
managed_instance_scaling: Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.
routing_config: Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts.
-    inference_ami_version: Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads. By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions. The AMI version names, and their configurations, are the following: al2-ami-sagemaker-inference-gpu-2 Accelerator: GPU NVIDIA driver version: 535.54.03 CUDA driver version: 12.2 Supported instance types: ml.g4dn.*, ml.g5.*, ml.g6.*, ml.p3.*, ml.p4d.*, ml.p4de.*, ml.p5.*
+    inference_ami_version: Specifies an option from a collection of preconfigured Amazon Machine Image (AMI) images. Each image is configured by Amazon Web Services with a set of software and driver versions. Amazon Web Services optimizes these configurations for different machine learning workloads. By selecting an AMI version, you can ensure that your inference environment is compatible with specific software requirements, such as CUDA driver versions, Linux kernel versions, or Amazon Web Services Neuron driver versions. The AMI version names, and their configurations, are the following: al2-ami-sagemaker-inference-gpu-2 Accelerator: GPU NVIDIA driver version: 535.54.03 CUDA version: 12.2 al2-ami-sagemaker-inference-gpu-2-1 Accelerator: GPU NVIDIA driver version: 535.54.03 CUDA driver version: 12.2 CUDA Container Toolkit with disabled CUDA-compat mounting al2-ami-sagemaker-inference-gpu-3-1 Accelerator: GPU NVIDIA driver version: 550.144.01 CUDA version: 12.4 Container Toolkit with disabled CUDA-compat mounting
"""

variant_name: str
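On the sagemaker_core side, the same option surfaces as the `inference_ami_version` field of `ProductionVariant`. A minimal sketch, assuming the shape is constructed with keyword arguments like the other sagemaker_core shapes (the model name and instance settings are illustrative placeholders):

```python
from sagemaker_core.main.shapes import ProductionVariant

# Request the CUDA 12.2 image variant that disables CUDA-compat mounting.
# "my-model" is a placeholder; these AMIs apply to GPU instance types.
variant = ProductionVariant(
    variant_name="AllTraffic",
    model_name="my-model",
    instance_type="ml.g6.xlarge",
    initial_instance_count=1,
    inference_ami_version="al2-ami-sagemaker-inference-gpu-2-1",
)
```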