pytorch bert model error #900

Closed
taomiao opened this issue Nov 20, 2019 · 2 comments

Comments

@taomiao

taomiao commented Nov 20, 2019

Description
While loading a PyTorch BERT JIT model, an error occurred:

E1120 06:45:27.081567 410 model_repository_manager.cc:813] failed to load 'bert_pt_cws' version 1: Internal: load failed for libtorch model -> 'bert_pt_cws': [enforce fail at inline_container.cc:137] . PytorchStreamReader failed reading zip archive: failed finding central directory
frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void const*) + 0x78 (0x7f862ddc7c38 in /opt/tensorrtserver/lib/libc10.so)
frame #1: caffe2::serialize::PyTorchStreamReader::valid(char const*) + 0x8d (0x7f863293b65d in /opt/tensorrtserver/lib/libtorch.so)
frame #2: caffe2::serialize::PyTorchStreamReader::init() + 0xa6 (0x7f863293f966 in /opt/tensorrtserver/lib/libtorch.so)
frame #3: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::default_delete<caffe2::serialize::ReadAdapterInterface> >) + 0x53 (0x7f8632943813 in /opt/tensorrtserver/lib/libtorch.so)
frame #4: <unknown function> + 0x59c020f (0x7f86339a120f in /opt/tensorrtserver/lib/libtorch.so)
frame #5: torch::jit::load(std::unique_ptr<caffe2::serialize::ReadAdapterInterface, std::default_delete<caffe2::serialize::ReadAdapterInterface> >, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x3a (0x7f86339a004a in /opt/tensorrtserver/lib/libtorch.so)
frame #6: torch::jit::load(std::istream&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x79 (0x7f86339a02f9 in /opt/tensorrtserver/lib/libtorch.so)
frame #7: <unknown function> + 0x1e98e2 (0x7f86cbac18e2 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #8: <unknown function> + 0x1ea623 (0x7f86cbac2623 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #9: <unknown function> + 0x1e28d4 (0x7f86cbaba8d4 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #10: <unknown function> + 0xe2b34 (0x7f86cb9bab34 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #11: <unknown function> + 0xe38c5 (0x7f86cb9bb8c5 in /opt/tensorrtserver/lib/libtrtserver.so)
frame #12: <unknown function> + 0xbd66f (0x7f86cafc866f in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #13: <unknown function> + 0x76db (0x7f86cb6c06db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #14: clone + 0x3f (0x7f86ca68588f in /lib/x86_64-linux-gnu/libc.so.6)

TRTIS Information
What version of TRTIS are you using? 19.09-py3
Are you using the TRTIS container or did you build it yourself? container

To Reproduce
Steps to reproduce the behavior:
1. JIT-compile a BERT model (pretrained + a dense layer).
2. Start trtserver.

Expected behavior
A clear and concise description of what you expected to happen.

@CoderHam
Contributor

CoderHam commented Nov 20, 2019

A few possible causes:

  • The PyTorch version used to create the model is newer than, or otherwise does not match, the PyTorch version shipped in that release of the TRTIS container.
  • Verify the type of model being used. TRTIS uses the C++ PyTorch backend (libtorch), so it requires a traced/jitted (TorchScript) model.
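
One quick way to narrow down the second cause: the "failed finding central directory" message means the loader could not parse the file as a zip archive, and zip is the container format written by `torch.jit.save` (or `ScriptModule.save`). The sketch below uses only the Python standard library to check whether a model file is a zip archive at all. The function name and the example path are hypothetical, and a passing check does not prove that the archive is compatible with the libtorch version in the container.

```python
import zipfile


def looks_like_torchscript_archive(path):
    """Return True if `path` is a readable zip archive.

    Hypothetical helper: it only checks the container format, not
    whether the contents are a valid TorchScript module or whether
    they match the libtorch version used by the server.
    """
    # zipfile.is_zipfile returns False for missing or non-zip files.
    return zipfile.is_zipfile(path)


if __name__ == "__main__":
    # Hypothetical location of the model file inside the model repository.
    model_path = "model_repository/bert_pt_cws/1/model.pt"
    if looks_like_torchscript_archive(model_path):
        print("File is a zip archive; check the PyTorch version next.")
    else:
        print("File is not a zip archive; re-export it with torch.jit.save.")
```

If the check fails, the file was most likely written by something other than `torch.jit.save`, or it was truncated or corrupted while being copied into the model repository.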

@CoderHam
Contributor

CoderHam commented Dec 2, 2019

@taomiao I am closing this issue for now due to inactivity. Please re-open if you are still facing this issue.

@CoderHam CoderHam closed this as completed Dec 2, 2019