Hi Team,
Greetings!
We are able to deploy a real-time endpoint with an Elastic Inference accelerator attached, but the accelerator does not appear to be used during inference. Could you please have a look?
Note:
- We are bringing our own model, trained with PyTorch 1.10.2, to SageMaker.
- We don't find any errors in the CloudWatch logs.
- We converted our trained model to TorchScript and are able to load it during inference.
- We tried both TorchScript's trace and script modes (see the sketch after this list), but no luck.
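For reference, this is roughly how we produced the TorchScript artifacts. It is a minimal sketch: TinyNer is a hypothetical stand-in module, and the real model architecture and inputs differ.

import torch
import torch.nn as nn

# Hypothetical stand-in for our actual NER BERT model.
class TinyNer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(30522, 64)
        self.classifier = nn.Linear(64, 9)

    def forward(self, token_ids):
        return self.classifier(self.embed(token_ids))

model = TinyNer().eval()
example_input = torch.zeros(1, 128, dtype=torch.long)  # dummy token IDs

# Trace mode: records the ops executed on the example input.
traced = torch.jit.trace(model, example_input)
traced.save('model_traced.pt')

# Script mode: compiles the module's Python code, including control flow.
scripted = torch.jit.script(model)
scripted.save('model_scripted.pt')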
SageMaker version: 2.76.0
Code"
from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

endpoint_name = 'ner-bert'

# model_data and sagemaker_session are defined earlier in our notebook.
pytorch = PyTorchModel(entry_point='deploy_ei.py',
                       source_dir='code',
                       model_data=model_data,
                       role=get_execution_role(),
                       framework_version='1.3.1',
                       py_version='py3',
                       sagemaker_session=sagemaker_session)

# Deploy to a CPU instance with an EIA2 accelerator attached.
predictor = pytorch.deploy(initial_instance_count=1,
                           instance_type='ml.m5.large',
                           accelerator_type='ml.eia2.xlarge',
                           endpoint_name=endpoint_name)
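In case it helps, this is the shape of the handler in our deploy_ei.py, following the Elastic Inference examples in the AWS docs. The model filename is a placeholder, and the two-argument form of torch.jit.optimized_execution is specific to the EI-enabled PyTorch build:

import os
import torch

def model_fn(model_dir):
    # Load the TorchScript model on CPU ('model.pt' is a placeholder name).
    return torch.jit.load(os.path.join(model_dir, 'model.pt'),
                          map_location=torch.device('cpu'))

def predict_fn(input_data, model):
    with torch.no_grad():
        # Two-argument optimized_execution is specific to the EI-enabled
        # PyTorch build; 'eia:0' targets the attached accelerator.
        with torch.jit.optimized_execution(True, {'target_device': 'eia:0'}):
            return model(input_data)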
Thanks,
Vinayak