
Can't deploy pretrained model even after following the documentation #3640

@bhattbhuwan13

Description

Discussed in #3638

Originally posted by monika-prajapati February 6, 2023
I have a model that I want to deploy as a SageMaker endpoint. I followed this documentation
and did the following:

  • Created an inference.py script with model_fn, input_fn, predict_fn, and output_fn, using this as a reference (a minimal sketch is shown after the tree below)
  • Built the file/folder structure below according to the documentation and packaged it into a model.tar.gz file

```
.
├── code
│   ├── inference.py
│   └── requirements.txt
└── model.pth
```
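
For reference, the handler layout in inference.py looks roughly like this (a minimal sketch; the model loading and request (de)serialization below are placeholders, not the real script):

```python
# inference.py -- sketch only; model loading and (de)serialization are placeholders
import json
import os

import torch


def model_fn(model_dir):
    # SageMaker extracts model.tar.gz into model_dir before calling this
    model = torch.load(os.path.join(model_dir, "model.pth"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, content_type):
    if content_type == "application/json":
        return torch.tensor(json.loads(request_body))
    raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(input_data, model):
    with torch.no_grad():
        return model(input_data)


def output_fn(prediction, accept):
    # Serialize the prediction back to JSON for the response
    return json.dumps(prediction.tolist())
```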

I created model.tar.gz from inside the directory shown above, with . as the archive root.
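
That packaging step is equivalent to the following (a tarfile sketch, not the exact command I ran; it places code/ and model.pth directly at the archive root):

```python
import tarfile

# Build model.tar.gz with model.pth and code/ at the top level,
# which is what tarring with . as the root produces.
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("model.pth", arcname="model.pth")
    tar.add("code", arcname="code")
```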

My code in the SageMaker notebook looks like this:

```python
import boto3
import sagemaker
from sagemaker.pytorch import PyTorchModel

session = boto3.Session()
sagemaker_client = session.client('sagemaker')
role = sagemaker.get_execution_role()

# Define the model data location in S3
model_data = 's3://speech2textmodel/model.tar.gz'

# Define the SageMaker PyTorch model
model1 = PyTorchModel(model_data=model_data,
                      role=role,
                      entry_point='inference.py',
                      framework_version='1.6.0',
                      py_version='py3')

predictor = model1.deploy(instance_type='ml.m5.xlarge', initial_instance_count=1)
```

I got this error:

```
UnexpectedStatusException: Error hosting endpoint pytorch-inference-2023-02-06-09-28-21-891: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint..
```

This is the error in CloudWatch:

```
ERROR - /.sagemaker/ts/models/model.mar already exists.
```
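
For completeness, the endpoint logs can be pulled like this (a minimal boto3 sketch; SageMaker writes endpoint logs to the /aws/sagemaker/Endpoints/<endpoint-name> log group):

```python
import boto3

logs = boto3.client("logs")
log_group = "/aws/sagemaker/Endpoints/pytorch-inference-2023-02-06-09-28-21-891"

# Fetch the most recently active log stream for the endpoint
streams = logs.describe_log_streams(
    logGroupName=log_group, orderBy="LastEventTime", descending=True
)
latest = streams["logStreams"][0]["logStreamName"]

for event in logs.get_log_events(logGroupName=log_group, logStreamName=latest)["events"]:
    print(event["message"])
```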
