Describe the bug
I create a model and run model.deploy(initial_instance_count=1, instance_type=instance_type) as per the standard docs. It completes fine, but when I then run a predict call I get a timeout error, even though I expect predictions to take only a few seconds. Exactly the same code works well when run locally! Stranger still, I don't see any logs related to the failing predict, but there are some hard-to-debug errors coming from the deploy step, which itself reported no errors:
2021-05-19 18:54:41,249 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model loaded.
2021-05-19 18:54:50,760 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
--- but then
2021-05-19 18:54:52,112 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.BatchAggregator - Load model failed: model, error: Worker died.
2021-05-19 18:54:52,112 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py", line 182, in <module>
2021-05-19 18:54:56,578 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
These messages repeat several times in the logs, so it seems the model deployment actually fails, but nothing is reported back; the failure only surfaces when the predict call times out.
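For context, endpoint container output like the worker tracebacks above only shows up in the endpoint's CloudWatch log group, not in the notebook where deploy() ran. A minimal boto3 sketch for pulling those events (the endpoint name is a placeholder; substitute your own):

import boto3

logs = boto3.client('logs')

# SageMaker writes endpoint container output to this log group by convention.
endpoint_name = 'my-endpoint'  # placeholder: use the real endpoint name
group = '/aws/sagemaker/Endpoints/{}'.format(endpoint_name)

# Print recent events so the TorchServe worker traceback is visible.
for event in logs.filter_log_events(logGroupName=group, limit=100)['events']:
    print(event['message'])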
To reproduce
Use sagemaker==2.39.1:
TORCH_VERSION = '1.7.1'
s3_uri = 's3://...model.tar.gz'

model = PyTorchModel(
    model_data=s3_uri,
    role=role,
    py_version='py3',
    framework_version=TORCH_VERSION,
    entry_point='serve.py',
    source_dir='s3://..../sourcedir.tar.gz',
)
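Then deploy and invoke (a sketch: instance_type, the CSV serializer, and the sample payload are assumptions based on the text/csv handler in serve.py below):

from sagemaker.serializers import CSVSerializer

predictor = model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,  # e.g. 'ml.m5.xlarge'
)
predictor.serializer = CSVSerializer()

# This call times out, even though deploy() completed without errors.
predictor.predict('some sample text')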
Here's serve.py:
"""
This is the SageMaker inference entry script.
"""
import json

import numpy as np
import torch
from transformers import AutoTokenizer

from forms_sorter.modules.data_module import LabelMapper
from forms_sorter.modules.model import FormsSorterModel

CSV_CONTENT_TYPE = 'text/csv'
JSON_CONTENT_TYPE = 'text/json'

# Softmax over the class dimension; passing dim explicitly avoids the
# implicit-dimension deprecation warning.
softmax = torch.nn.Softmax(dim=1)

def model_fn(model_dir):
    """
    For serving with SageMaker.
    SageMaker deployment presumes that all required files are provided
    within model_dir. Here we require classes.txt for index-to-label
    mapping and a checkpoint named last.ckpt.
    model_fn is a reserved function name that the serving stack looks up.
    """
    device = get_device()
    model_name = 'bert-base-cased'
    preprocessor = AutoTokenizer.from_pretrained(model_name)
    label_mapper = LabelMapper('classes.txt')
    model = FormsSorterModel.load_from_checkpoint('last.ckpt')
    model = model.to(device)
    model.eval()
    return preprocessor, model, label_mapper

def predict(
    input,
    checkpoint_file='last.ckpt',
    model_name='bert-base-cased',
    labels='classes.txt'
):
    """
    For model serving outside SageMaker.
    """
    device = get_device()
    preprocessor = AutoTokenizer.from_pretrained(model_name)
    label_mapper = LabelMapper(labels)
    model = FormsSorterModel.load_from_checkpoint(checkpoint_file)
    model = model.to(device)
    model_artifacts = (preprocessor, model, label_mapper)
    results = predict_fn(input, model_artifacts)
    return results

def get_device():
    device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
    return device

def input_fn(input, content_type):
    if content_type == CSV_CONTENT_TYPE:
        records = input.split('\n')
        return records
    else:
        raise ValueError(
            'Content type {} not supported. The supported type is {}'.format(
                content_type, CSV_CONTENT_TYPE
            )
        )

def preprocess(input, preprocessor):
    # Tokenize each record and stack the input-id tensors into a single batch.
    r = []
    for i in input:
        x = preprocessor(i, padding='max_length', truncation=True)
        x = np.array(x['input_ids'])
        r.append(torch.tensor(x).unsqueeze(dim=0))
    result = torch.cat(r)
    return result

def predict_fn(input, model_artifacts):
    preprocessor, model, label_mapper = model_artifacts
    # Pre-process
    input_tensor = preprocess(input, preprocessor)
    # Copy input to GPU if available
    device = get_device()
    input_tensor = input_tensor.to(device=device)
    # Invoke
    with torch.no_grad():
        output_tensor = model(input_tensor)
    # Convert logits to probabilities with the module-level softmax
    output_tensor = softmax(output_tensor.logits)
    probs, predictions = torch.max(output_tensor, dim=1)
    classes = label_mapper.reverse_map(predictions)
    return classes, probs

def output_fn(output, accept=JSON_CONTENT_TYPE):
    if accept == JSON_CONTENT_TYPE:
        classes, probs = output
        # Tensors are not JSON-serializable, so convert probabilities to a list.
        prediction = json.dumps({'classes': classes, 'probabilities': probs.tolist()})
        return prediction, accept
    else:
        raise ValueError(
            'Content type {} not supported. The only type supported is {}'.format(
                accept, JSON_CONTENT_TYPE
            )
        )
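For what it's worth, the handler chain can be exercised locally before deploying (model_fn -> input_fn -> predict_fn -> output_fn). A minimal sketch, assuming last.ckpt and classes.txt sit in the current directory and forms_sorter is importable:

import serve

artifacts = serve.model_fn('.')
records = serve.input_fn('first document text\nsecond document text', serve.CSV_CONTENT_TYPE)
classes, probs = serve.predict_fn(records, artifacts)
body, content_type = serve.output_fn((classes, probs), serve.JSON_CONTENT_TYPE)
print(content_type, body)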
Expected behavior
The model deployment should fail with a clear, surfaced error instead of silently producing an endpoint whose workers keep dying.
System information
- SageMaker Python SDK version: 2.39.1
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
- Framework version: 1.7.1
- Python version: 3.7
- CPU or GPU: Both
- Custom Docker image (Y/N): N