bert->onnx ->caffe2 weird error #633

Closed
maeotaku opened this issue May 23, 2019 · 2 comments

maeotaku commented May 23, 2019

I'm not sure this is the right place to post this, but I'm having a problem with the pretrained BERT model for sequence classification: when I try to run the ONNX export of the model with Caffe2, I get this output:

File "/usr/local/lib/python3.6/dist-packages/caffe2/python/onnx/workspace.py", line 63, in f
return getattr(workspace, attr)(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/caffe2/python/workspace.py", line 250, in RunNet
StringifyNetName(name), num_iter, allow_fail,
File "/usr/local/lib/python3.6/dist-packages/caffe2/python/workspace.py", line 211, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at pow_op.h:100] A.sizes() == B.sizes(). [4, 512, 768] vs []. Dimension mismatch - did you forget to set broadcast=1?
Error from operator:
input: "222" input: "223" output: "224" name: "" type: "Pow" device_option { device_type: 1 device_id: 3 }
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f9c8cdaf441 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #1: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x49 (0x7f9c8cdaf259 in /usr/local/lib/python3.6/dist-packages/torch/lib/libc10.so)
frame #2: <unknown function> + 0x2b63861 (0x7f9c44eed861 in /usr/local/lib/python3.6/dist-packages/torch/lib/libcaffe2_gpu.so)
frame #3: <unknown function> + 0x15a3555 (0x7f9c4392d555 in /usr/local/lib/python3.6/dist-packages/torch/lib/libcaffe2_gpu.so)
frame #4: caffe2::SimpleNet::Run() + 0x161 (0x7f9c396ac101 in /usr/local/lib/python3.6/dist-packages/torch/lib/libcaffe2.so)
frame #5: caffe2::Workspace::RunNet(std::string const&) + 0x3a (0x7f9c396e35aa in /usr/local/lib/python3.6/dist-packages/torch/lib/libcaffe2.so)
frame #6: <unknown function> + 0x4e38a (0x7f9bbe6fd38a in /usr/local/lib/python3.6/dist-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #7: <unknown function> + 0x9368e (0x7f9bbe74268e in /usr/local/lib/python3.6/dist-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #8: PyCFunction_Call + 0xf9 (0x4aeb29 in /usr/bin/python3)
frame #9: _PyEval_EvalFrameDefault + 0x7e42 (0x54d092 in /usr/bin/python3)
frame #10: /usr/bin/python3() [0x543f21]
frame #11: /usr/bin/python3() [0x54421f]
frame #12: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #13: /usr/bin/python3() [0x543f21]
frame #14: PyEval_EvalCodeEx + 0x6d (0x544cfd in /usr/bin/python3)
frame #15: /usr/bin/python3() [0x485857]
frame #16: PyObject_Call + 0x60 (0x4557a0 in /usr/bin/python3)
frame #17: _PyEval_EvalFrameDefault + 0x19e8 (0x546c38 in /usr/bin/python3)
frame #18: /usr/bin/python3() [0x543f21]
frame #19: /usr/bin/python3() [0x54421f]
frame #20: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #21: /usr/bin/python3() [0x543f21]
frame #22: /usr/bin/python3() [0x54421f]
frame #23: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #24: /usr/bin/python3() [0x5432b1]
frame #25: /usr/bin/python3() [0x544447]
frame #26: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #27: /usr/bin/python3() [0x5432b1]
frame #28: /usr/bin/python3() [0x544447]
frame #29: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #30: /usr/bin/python3() [0x543f21]
frame #31: PyEval_EvalCodeEx + 0x6d (0x544cfd in /usr/bin/python3)
frame #32: /usr/bin/python3() [0x485857]
frame #33: PyObject_Call + 0x60 (0x4557a0 in /usr/bin/python3)
frame #34: _PyEval_EvalFrameDefault + 0x19e8 (0x546c38 in /usr/bin/python3)
frame #35: /usr/bin/python3() [0x543f21]
frame #36: PyEval_EvalCodeEx + 0x6d (0x544cfd in /usr/bin/python3)
frame #37: /usr/bin/python3() [0x485857]
frame #38: PyObject_Call + 0x60 (0x4557a0 in /usr/bin/python3)
frame #39: _PyEval_EvalFrameDefault + 0x19e8 (0x546c38 in /usr/bin/python3)
frame #40: /usr/bin/python3() [0x5432b1]
frame #41: /usr/bin/python3() [0x544447]
frame #42: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #43: /usr/bin/python3() [0x5432b1]
frame #44: /usr/bin/python3() [0x544447]
frame #45: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #46: /usr/bin/python3() [0x5432b1]
frame #47: _PyFunction_FastCallDict + 0x236 (0x54d8c6 in /usr/bin/python3)
frame #48: _PyObject_FastCallDict + 0x1ef (0x455acf in /usr/bin/python3)
frame #49: _PyObject_Call_Prepend + 0xcb (0x455bcb in /usr/bin/python3)
frame #50: PyObject_Call + 0x60 (0x4557a0 in /usr/bin/python3)
frame #51: /usr/bin/python3() [0x4c9d13]
frame #52: _PyObject_FastCallDict + 0xa2 (0x455982 in /usr/bin/python3)
frame #53: /usr/bin/python3() [0x544075]
frame #54: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #55: /usr/bin/python3() [0x5432b1]
frame #56: /usr/bin/python3() [0x544447]
frame #57: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #58: /usr/bin/python3() [0x5432b1]
frame #59: /usr/bin/python3() [0x544447]
frame #60: _PyEval_EvalFrameDefault + 0xc5b (0x545eab in /usr/bin/python3)
frame #61: /usr/bin/python3() [0x5432b1]
frame #62: _PyFunction_FastCallDict + 0x236 (0x54d8c6 in /usr/bin/python3)
frame #63: _PyObject_FastCallDict + 0x1ef (0x455acf in /usr/bin/python3)

Does anyone know whether the pretrained model uses something that Caffe2 does not support?
I have also tried several input tensor shapes, such as (1, 512), (1, 128), and (1, 512, 786), in both long and float, with no luck. I also tried (4, 512), (4, 128), and (4, 512, 768), just in case, since the input I used when exporting the ONNX file had 4 samples.
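
For readers following the error message: the enforce failure compares the shapes of the two inputs of the Pow operator, [4, 512, 768] and [] (a scalar), rather than the shape of the model input. The sketch below is a pure-Python stand-in for that check, under the assumption that the operator requires identical shapes unless broadcasting is enabled. The function name `pow_shapes_ok` is hypothetical and is not part of Caffe2.

```python
def pow_shapes_ok(a_shape, b_shape, broadcast=False):
    """Hypothetical stand-in for the shape check reported at pow_op.h:100."""
    # Identical shapes always pass, as in the enforce A.sizes() == B.sizes().
    if list(a_shape) == list(b_shape):
        return True
    # Assumption: with broadcast enabled, a scalar exponent (empty shape) is accepted.
    return bool(broadcast) and len(b_shape) == 0


# Shapes taken from the error message: base tensor vs. scalar exponent.
print(pow_shapes_ok([4, 512, 768], []))                  # False
print(pow_shapes_ok([4, 512, 768], [], broadcast=True))  # True
```

If this reading is right, changing the shape of the input fed to the network would not help, because the second operand stays a scalar either way. The thread does not confirm this.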

Any pointers would be highly appreciated :)

maeotaku changed the title from "bert->onnx ->caffe2 weird" to "bert->onnx ->caffe2 weird error" on May 23, 2019

stale bot commented Jul 22, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label on Jul 22, 2019
stale bot closed this as completed on Jul 29, 2019

kkaehler commented Oct 8, 2019

@maeotaku were you ever able to figure this out? I'd be curious to see what numbers you were seeing when running in caffe2.

If you didn't figure this out, it seems similar to this pytorch issue
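
For anyone hitting the same error, one way to narrow it down is to look up the failing node in the exported graph. The operator dump in the trace names a Pow node with inputs "222" and "223" and output "224". The sketch below assumes only that the graph exposes a `.node` sequence whose items have `op_type`, `input` and `output`, which is how ONNX graph nodes are laid out. The helper `find_nodes` and the toy graph are hypothetical, and the Sub node is invented for illustration; only the Pow node's names come from the dump.

```python
from types import SimpleNamespace


def find_nodes(graph, op_type):
    """Return (index, inputs, outputs) for every node of the given op type."""
    return [
        (i, list(n.input), list(n.output))
        for i, n in enumerate(graph.node)
        if n.op_type == op_type
    ]


# Toy graph standing in for the exported model.
toy_graph = SimpleNamespace(node=[
    SimpleNamespace(op_type="Sub", input=["220", "221"], output=["222"]),  # invented
    SimpleNamespace(op_type="Pow", input=["222", "223"], output=["224"]),  # from the dump
])

pow_nodes = find_nodes(toy_graph, "Pow")
print(pow_nodes)  # [(1, ['222', '223'], ['224'])]

# With the real model (assumptions: the `onnx` package is installed and
# "bert.onnx" is a hypothetical path to the exported file):
#   import onnx
#   print(find_nodes(onnx.load("bert.onnx").graph, "Pow"))
```

Checking where input "223" comes from in the real graph would show whether the exponent was exported as a scalar constant, which is what the `[]` shape in the error suggests.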
