Static Linking C++, Op not available at runtime #111654
Labels
enhancement
Not as big of a feature, but technically not a bug. Should be easy to fix
has workaround
module: build
Build system issues
module: cpp
Related to C++ API
module: vision
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🐛 Describe the bug
When linking with static libtorch and torchvision libraries, I am able to build, but at runtime, I get an error about an "Unknown builtin op: aten::mul".
I have found references indicating that including <torchvision/vision.h> should cause the operators to be registered so they are linked in, but that doesn't seem to do the trick.
I've also found references indicating that forcing the linker to link the "whole archive" for libtorch_cpu.a should force it to include all the operators in the linked executable. I have done this, and it does overcome the problem - however, this feels a bit like a workaround, and we aren't able to use that as a long-term solution. When I link in the whole archive, the executable jumps from 87MB to 339MB.
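For readers unfamiliar with why whole-archive linking changes the behavior, here is a minimal sketch of the self-registration pattern that is commonly behind this kind of symptom. It is an illustration only, not PyTorch's actual code: registry, Registrar, call_op, and the op name demo::mul are all hypothetical. The idea is that an operator is added to a table by the constructor of a static object. If that object lives in an object file inside a static archive and nothing in the program references a symbol from that file, the linker is free to leave the file out, so the registration never runs and the lookup fails at runtime.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an operator registry (not PyTorch's real API).
using OpFn = std::function<double(double, double)>;

std::map<std::string, OpFn>& registry() {
    // Function-local static: constructed on first use, which avoids
    // static-initialization-order problems between translation units.
    static std::map<std::string, OpFn> ops;
    return ops;
}

// Constructing a Registrar adds one operator to the registry.
struct Registrar {
    Registrar(const std::string& name, OpFn fn) {
        registry().emplace(name, std::move(fn));
    }
};

// In a real library this object would sit in its own .cpp file inside the
// static archive. Here it is in the same translation unit as the caller,
// so it is always linked. When it sits in an otherwise unreferenced object
// file of a .a archive, the linker may drop that file entirely.
static Registrar register_mul("demo::mul",
                              [](double a, double b) { return a * b; });

// Look up an operator by name, failing at runtime if it was never registered.
double call_op(const std::string& name, double a, double b) {
    auto it = registry().find(name);
    if (it == registry().end()) {
        throw std::runtime_error("Unknown builtin op: " + name);
    }
    return it->second(a, b);
}
```

Calling call_op("demo::mul", 3.0, 4.0) succeeds here because the registrar is in the same translation unit; a name that was never registered throws, which mirrors the shape of the runtime error described above. Linking the whole archive sidesteps the problem by keeping every object file, which is also why the executable grows so much.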
I've also found some references suggesting calling c10::RegisterOps or torch::RegisterOps, neither of which seems to exist. I found both c10::RegisterOperators and torch::RegisterOperators, but calling them doesn't seem to have any effect. Admittedly, I might be using them incorrectly: all I did was add a call to torch::RegisterOperators(); which didn't cause any build errors, but did not overcome the runtime "Unknown builtin op: aten::mul" error.
I tried to make a minimal example:
To build this, I use the following command:
As I said, this will build successfully, but it does give a warning when building:
When I run the executable, though, I get the following error:
The minimal example runs as expected, without error, if I link the libtorch_cpu.a whole archive, by changing the corresponding line in the build command to:
but as I said, the size of the executable jumps way higher, and seems like overkill.
I wasn't sure if this should be a forum post or an issue report, but given that I thought the include of <vision.h> was supposed to manage this, it felt more like an issue report to me.
Versions
I'm not sure this is especially valuable in this situation. The example is running on an old OS with CPU-only support. The conversion to TorchScript was done on a more modern machine with Python and PyTorch installed, but the machine I am running on is a severely stripped-down machine without Python at all.
If I run minimalExample.exe on the modern machine, it behaves the same way though (i.e. it errors at runtime without the whole-archive option, but runs successfully with it). So, here's the env for that machine in case it's helpful:
cc @malfet @seemethere @jbschlosser @datumbox @vfdev-5 @pmeier