Discussed in #3638
Originally posted by monika-prajapati February 6, 2023
I have a model that I want to deploy as a SageMaker endpoint. I followed this documentation and did the following:
- Created an inference.py script with model_fn, input_fn, predict_fn, and output_fn, using this as a reference (a minimal sketch follows the tree below)
- Made the file/folder structure according to the documentation and built the model.tar.gz file:
```
.
├── code
│   ├── inference.py
│   └── requirements.txt
└── model.pth
```
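A minimal sketch of such an inference.py, assuming a TorchScript model and JSON payloads (the actual loading call and input format depend on how the model was saved):

```python
# Minimal sketch of inference.py; loading call and payload format are assumptions.
import json
import os

import torch


def model_fn(model_dir):
    # model.tar.gz is extracted into model_dir, so model.pth sits at its root.
    # Assumes the model was saved with torch.jit.save; swap in torch.load plus
    # the model class if model.pth is a plain state_dict.
    model = torch.jit.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    # Assumed payload: a JSON-encoded list of floats.
    if request_content_type == "application/json":
        return torch.tensor(json.loads(request_body))
    raise ValueError(f"Unsupported content type: {request_content_type}")


def predict_fn(input_data, model):
    with torch.no_grad():
        return model(input_data)


def output_fn(prediction, response_content_type):
    # Serialize the prediction back to JSON.
    return json.dumps(prediction.tolist())
```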
I created model.tar.gz with `.` as the root, from inside the directory containing the `code` folder.
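For completeness, a sketch of building the archive so that `code/` and model.pth end up at the archive root (equivalent to running `tar -czvf model.tar.gz model.pth code` from inside that directory):

```python
import tarfile

# Package the layout shown in the tree above: model.pth and code/ at the archive root.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.pth")
    tar.add("code")  # contains inference.py and requirements.txt
```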
My code in the SageMaker notebook looks like this:
```python
import boto3
import sagemaker
from sagemaker.pytorch import PyTorchModel

session = boto3.Session()
sagemaker_client = session.client('sagemaker')
role = sagemaker.get_execution_role()

# Define the model data location in S3
model_data = 's3://speech2textmodel/model.tar.gz'

# Define the model architecture
model1 = PyTorchModel(model_data=model_data,
                      role=role,
                      entry_point='inference.py',
                      framework_version='1.6.0',
                      py_version='py3')

predictor = model1.deploy(instance_type='ml.m5.xlarge', initial_instance_count=1)
```
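Once the endpoint is up, I would invoke it along these lines (the JSON serializer matches the input_fn sketch above; the payload itself is a hypothetical example):

```python
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Match the application/json contract assumed in the inference.py sketch.
predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

result = predictor.predict([0.1, 0.2, 0.3])  # hypothetical payload
```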
I got this error:
```
UnexpectedStatusException: Error hosting endpoint pytorch-inference-2023-02-06-09-28-21-891: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..
```
This is the error in CloudWatch:
```
ERROR - /.sagemaker/ts/models/model.mar already exists.
```