Running torchscript exported model in Triton throws InferenceServerException #2594

Closed
arunsu opened this issue Mar 4, 2021 · 1 comment

Comments


arunsu commented Mar 4, 2021

Description
Running inference on a pretrained TorchVision model in Triton 21.02 throws the following exception. The same model runs fine in PyTorch 1.7.1.

InferenceServerException: PyTorch execute failure: isTensor() INTERNAL ASSERT FAILED at "/opt/tritonserver/include/torch/ATen/core/ivalue_inl.h":152, please report a bug to PyTorch. Expected Tensor but got GenericDict
Exception raised from toTensor at /opt/tritonserver/include/torch/ATen/core/ivalue_inl.h:152 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6c (0x7fc2e007044c in /opt/tritonserver/backends/pytorch/libc10.so)
frame #1: + 0x8073 (0x7fc2e00a3073 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #2: + 0x15882 (0x7fc2e00b0882 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #3: TRITONBACKEND_ModelInstanceExecute + 0x411 (0x7fc2e00b1d71 in /opt/tritonserver/backends/pytorch/libtriton_pytorch.so)
frame #4: + 0x2e4047 (0x7fc32292b047 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #5: + 0xf88a0 (0x7fc32273f8a0 in /opt/tritonserver/bin/../lib/libtritonserver.so)
frame #6: + 0xd6d84 (0x7fc322181d84 in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #7: + 0x9609 (0x7fc32261c609 in /usr/lib/x86_64-linux-gnu/libpthread.so.0)
frame #8: clone + 0x43 (0x7fc321e6f293 in /usr/lib/x86_64-linux-gnu/libc.so.6)

Triton Information
What version of Triton are you using?
21.02-py3

Are you using the Triton container or did you build it yourself?
nvcr.io/nvidia/tritonserver:21.02-py3

To Reproduce
Steps to reproduce the behavior:
Get the pretrained TorchVision RetinaNet model:

import torch
import torchvision.models as models
retina50 = models.detection.retinanet_resnet50_fpn(pretrained=True)
retina50.eval()
retina50_scripted = torch.jit.script(retina50)

Test the model with dummy inputs:
dummy_input = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)] # We should run a quick test
scripted_output = retina50_scripted(dummy_input)
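
For what it's worth, a quick check of the scripted output (not part of the original report; the exact return format depends on the torchvision version) shows it is not a plain tensor, which lines up with the "Expected Tensor but got GenericDict" failure:

# In scripting mode, recent torchvision detection models return a
# (losses, detections) tuple; detections is a list of dicts, one per image.
losses, detections = scripted_output
print(type(detections[0]))            # dict with "boxes", "scores", "labels"
print(detections[0]["boxes"].shape)   # the detections are tensors inside a dict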

Now save this model and run it in Triton:
retina50_scripted.save('retinanet50/1/model.pt')

The model configuration file (config.pbtxt):
name: "retinanet50"
platform: "pytorch_libtorch"
input [
{
name: "input__0"
data_type: TYPE_FP32
dims: [3, 480, 640]
}
]
output [
{
name: "output__boxes"
data_type: TYPE_FP32
dims: [93, 4]
},
{
name: "output__scores"
data_type: TYPE_FP32
dims: [93]
},
{
name: "output__labels"
data_type: TYPE_FP32
dims: [93]
}
]
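
A request matching this configuration could be sent with the Triton Python HTTP client roughly as follows (a minimal sketch, not from the original report; it assumes the server is reachable at localhost:8000 and uses a random 3x480x640 array as a placeholder image):

import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton instance (assumed address).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build an input matching the config: name "input__0", FP32, dims [3, 480, 640].
image = np.random.rand(3, 480, 640).astype(np.float32)  # placeholder data
infer_input = httpclient.InferInput("input__0", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

# Request the three declared outputs.
outputs = [
    httpclient.InferRequestedOutput("output__boxes"),
    httpclient.InferRequestedOutput("output__scores"),
    httpclient.InferRequestedOutput("output__labels"),
]

result = client.infer(model_name="retinanet50", inputs=[infer_input], outputs=outputs)
print(result.as_numpy("output__boxes").shape)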

Expected behavior
The output from Triton Inference Server should match the output of the model run in PyTorch.
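
The failure itself suggests the backend received a dict where it expected a tensor: torchvision detection models return their detections as a dict of tensors, while the libtorch backend expects each configured output to be a plain tensor. One possible workaround (a sketch only, not from this issue or its resolution; the RetinaNetWrapper class is hypothetical and the scripted return format may vary by torchvision version) is to wrap the model so it returns boxes, scores, and labels as separate tensors before scripting:

import torch
import torchvision.models as models

class RetinaNetWrapper(torch.nn.Module):
    # Hypothetical wrapper: unpack the detection dict into plain tensors
    # so each output declared in config.pbtxt maps to a tensor.
    def __init__(self):
        super().__init__()
        self.model = models.detection.retinanet_resnet50_fpn(pretrained=True).eval()

    def forward(self, x: torch.Tensor):
        # The detector takes a list of CHW tensors; when scripted it returns
        # a (losses, detections) tuple, with detections as a list of dicts.
        losses, detections = self.model([x])
        det = detections[0]
        return det["boxes"], det["scores"], det["labels"]

wrapped = torch.jit.script(RetinaNetWrapper().eval())
wrapped.save('retinanet50/1/model.pt')

Note that det["labels"] is an integer tensor, so output__labels in the config would likely need an integer data_type (e.g. TYPE_INT64) rather than TYPE_FP32.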

Contributor

CoderHam commented Mar 5, 2021

Duplicated by #2593
Closing in favor of #2593
