Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onnx and ORT mismatch in their handling of Protobuf_USE_STATIC_LIBS and ORT can produce incorrect warnings and failures #12867

Open
sstamenova opened this issue Sep 6, 2022 · 12 comments
Labels
build build issues; typically submitted using template feature request request for unsupported feature or enhancement

Comments

@sstamenova
Copy link
Contributor

Describe the bug
onnx supports setting ONNX_USE_PROTOBUF_SHARED_LIBS to ON or OFF on Windows and by extension setting Protobuf_USE_STATIC_LIBS to ON or OFF. The default is ONNX_USE_PROTOBUF_SHARED_LIBS set to OFF and Protobuf_USE_STATIC_LIBS set to ON.
onnxruntime, on the other hand, does not change Protobuf_USE_STATIC_LIBS, so it is set to OFF by default in the onnxruntime build and set to ON by default in the onnx build within onnxruntime resulting in build failures.
The onnxruntime cmake also produces a warning if Protobuf_USE_STATIC_LIBS is not set to ON regardless of what it is set to for onnx, so it will produce a warning even if it is intended to be set to OFF for both onnx and onnxruntime.
Instead, onnxruntime should use the same defaults as onnx, or possibly even set ONNX_USE_PROTOBUF_SHARED_LIBS itself.

For a related issue in onnx, which has been fixed, see: onnx/onnx#3345.

Urgency
None

System information

To Reproduce
Build ORT and onnx together on Windows without setting Protobuf_USE_STATIC_LIBS to ON

Expected behavior
The build succeeds without any protobuf related issues or warnings.

Instead we get warnings and failures like:

LINK : warning LNK4286: symbol '??0Message@protobuf@google@@QEAA@XZ (public: __cdecl google::protobuf::Message::Message(void))' defined in 'libprotobuf.lib(wrappers.pb.cc.obj)' is imported by 'onnx_proto.lib(onnx-data.pb.cc.obj)'
LINK : warning LNK4217: symbol '??0Message@protobuf@google@@IEAA@PEAVArena@12@_N@Z (protected: __cdecl google::protobuf::Message::Message(class google::protobuf::Arena *,bool))' defined in 'libprotobuf.lib(wrappers.pb.cc.obj)' is imported by 'onnx_proto.lib(onnx-ml.pb.cc.obj)' in function '"public: virtual class onnx::ValueInfoProto * __cdecl onnx::ValueInfoProto::New(void)const " (?New@ValueInfoProto@onnx@@UEBAPEAV12@XZ)'
LINK : warning LNK4286: symbol '??0Message@protobuf@google@@IEAA@PEAVArena@12@_N@Z (protected: __cdecl google::protobuf::Message::Message(class google::protobuf::Arena *,bool))' defined in 'libprotobuf.lib(wrappers.pb.cc.obj)' is imported by 'onnx_proto.lib(onnx-data.pb.cc.obj)'
LINK : warning LNK4217: symbol '?MaybeComputeUnknownFieldsSize@Message@protobuf@google@@IEBA_K_KPEAVCachedSize@internal@23@@Z (protected: unsigned __int64 __cdecl google::protobuf::Message::MaybeComputeUnknownFieldsSize(unsigned __int64,class google::protobuf::internal::CachedSize *)const )' defined in 'libprotobuf.lib(message.cc.obj)' is imported by 'onnx_proto.lib(onnx-ml.pb.cc.obj)' in function '"public: virtual unsigned __int64 __cdecl onnx::OperatorSetIdProto::ByteSizeLong(void)const " (?ByteSizeLong@OperatorSetIdProto@onnx@@UEBA_KXZ)'
LINK : warning LNK4286: symbol '?MaybeComputeUnknownFieldsSize@Message@protobuf@google@@IEBA_K_KPEAVCachedSize@internal@23@@Z (protected: unsigned __int64 __cdecl google::protobuf::Message::MaybeComputeUnknownFieldsSize(unsigned __int64,class google::protobuf::internal::CachedSize *)const )' defined in 'libprotobuf.lib(message.cc.obj)' is imported by 'onnx_proto.lib(onnx-data.pb.cc.obj)'
LINK : warning LNK4217: symbol '??1Message@protobuf@google@@UEAA@XZ (public: virtual __cdecl google::protobuf::Message::~Message(void))' defined in 'libprotobuf.lib(wrappers.pb.cc.obj)' is imported by 'onnx_proto.lib(onnx-ml.pb.cc.obj)' in function '"int `private: static class onnx::OperatorSetIdProto * __cdecl google::protobuf::Arena::CreateMaybeMessage<class onnx::OperatorSetIdProto>(class protobuf::Arena::dtor$0 *)'::`1'::dtor$0" (?dtor$0@?0???$CreateMaybeMessage@VOperatorSetIdProto@onnx@@$$V@Arena@protobuf@google@@CAPEAVOperatorSetIdProto@onnx@@PEAV012@@Z@4HA)'
LINK : warning LNK4286: symbol '??1Message@protobuf@google@@UEAA@XZ (public: virtual __cdecl google::protobuf::Message::~Message(void))' defined in 'libprotobuf.lib(wrappers.pb.cc.obj)' is imported by 'onnx_proto.lib(onnx-data.pb.cc.obj)'
LINK : warning LNK4217: symbol '?InternalSerializeUnknownFieldsToArray@WireFormat@internal@protobuf@google@@SAPEAEAEBVUnknownFieldSet@34@PEAEPEAVEpsCopyOutputStream@io@34@@Z (public: static unsigned char * __cdecl google::protobuf::internal::WireFormat::InternalSerializeUnknownFieldsToArray(class google::protobuf::UnknownFieldSet const &,unsigned char *,class google::protobuf::io::EpsCopyOutputStream *))' defined in 'libprotobuf.lib(wire_format.cc.obj)' is imported by 'onnx_proto.lib(onnx-ml.pb.cc.obj)' in function '"public: virtual unsigned char * __cdecl onnx::OperatorSetIdProto::_InternalSerialize(unsigned char *,class google::protobuf::io::EpsCopyOutputStream *)const " (?_InternalSerialize@OperatorSetIdProto@onnx@@UEBAPEAEPEAEPEAVEpsCopyOutputStream@io@protobuf@google@@@Z)'
LINK : warning LNK4286: symbol '?InternalSerializeUnknownFieldsToArray@WireFormat@internal@protobuf@google@@SAPEAEAEBVUnknownFieldSet@34@PEAEPEAVEpsCopyOutputStream@io@34@@Z (public: static unsigned char * __cdecl google::protobuf::internal::WireFormat::InternalSerializeUnknownFieldsToArray(class google::protobuf::UnknownFieldSet const &,unsigned char *,class google::protobuf::io::EpsCopyOutputStream *))' defined in 'libprotobuf.lib(wire_format.cc.obj)' is imported by 'onnx_proto.lib(onnx-data.pb.cc.obj)'
onnx_proto.lib(onnx-ml.pb.cc.obj) : error LNK2019: unresolved external symbol "__declspec(dllimport) public: __cdecl google::protobuf::internal::ArenaStringPtr::ArenaStringPtr(class google::protobuf::internal::ExplicitlyConstructed<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > const *)" (__imp_??0ArenaStringPtr@internal@protobuf@google@@QEAA@PEBV?$ExplicitlyConstructed@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@123@@Z) referenced in function "public: __cdecl onnx::TensorProto::TensorProto(struct google::protobuf::internal::ConstantInitialized)" (??0TensorProto@onnx@@QEAA@UConstantInitialized@internal@protobuf@google@@@Z)
onnx_proto.lib(onnx-data.pb.cc.obj) : error LNK2001: unresolved external symbol "__declspec(dllimport) public: __cdecl google::protobuf::internal::ArenaStringPtr::ArenaStringPtr(class google::protobuf::internal::ExplicitlyConstructed<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > const *)" (__imp_??0ArenaStringPtr@internal@protobuf@google@@QEAA@PEBV?$ExplicitlyConstructed@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@123@@Z)
@sstamenova sstamenova changed the title onnx and ORT mismatch in their handling of Protobuf_USE_STATIC_LIBS and ORT can produce incorrect warnings onnx and ORT mismatch in their handling of Protobuf_USE_STATIC_LIBS and ORT can produce incorrect warnings and failures Sep 6, 2022
@snnn
Copy link
Member

snnn commented Sep 6, 2022

You are right. Neither ORT nor protobuf use this Protobuf_USE_STATIC_LIBS variable. If you search it in https://github.com/protocolbuffers/protobuf, you will not find anything. So, in normal usages you should not need to worry about this macro.

@skottmckay skottmckay added the build build issues; typically submitted using template label Sep 7, 2022
@skottmckay
Copy link
Contributor

To Reproduce
Build ORT and onnx together on Windows without setting Protobuf_USE_STATIC_LIBS to ON

Protobuf_USE_STATIC_LIBS seems to only be used in a status message in the onnx cmake setup. https://github.com/onnx/onnx/search?q=Protobuf_USE_STATIC_LIBS. It doesn't seem to have any influence over how protobuf is built.

Based on that, I'm not following how not setting that value would affect a build and reproduce the issue.

@sstamenova
Copy link
Contributor Author

To Reproduce
Build ORT and onnx together on Windows without setting Protobuf_USE_STATIC_LIBS to ON

Protobuf_USE_STATIC_LIBS seems to only be used in a status message in the onnx cmake setup. https://github.com/onnx/onnx/search?q=Protobuf_USE_STATIC_LIBS. It doesn't seem to have any influence over how protobuf is built.

Based on that, I'm not following how not setting that value would affect a build and reproduce the issue.

CMake uses the value to determine which libs to use for protobuf: https://github.com/Kitware/CMake/blob/master/Modules/FindProtobuf.cmake

I recommend trying it locally to see it reproduce.

@skottmckay
Copy link
Contributor

Oh ok. I didn't realize it was a value used internally by cmake.

Could you please share the full command you're using to build?

We generally build using build.py, and there's only one place that sets Protobuf_USE_STATIC_LIBS, and that requires additional options for it to be hit. In my typical Windows build I don't use those additional options so Protobuf_USE_STATIC_LIBS is not explicitly set and I have no issues.

e.g. this is the cmake command build.py generates

2022-09-07 11:26:22,636 util.run [INFO] - Running subprocess in 'build\Windows.vs22\Debug'
  'C:\Program Files\CMake\bin\cmake.EXE' 'D:\src\github\ort\cmake' -Donnxruntime_RUN_ONNX_TESTS=OFF -Donnxruntime_GENERATE_TEST_REPORTS=ON '-DPython_EXECUTABLE=C:\Program Files\Python39\python.exe' '-DPYTHON_EXECUTABLE=C:\Program Files\Python39\python.exe' -Donnxruntime_USE_MIMALLOC=OFF -Donnxruntime_ENABLE_PYTHON=ON -Donnxruntime_BUILD_CSHARP=OFF -Donnxruntime_BUILD_JAVA=OFF -Donnxruntime_BUILD_NODEJS=OFF -Donnxruntime_BUILD_OBJC=OFF -Donnxruntime_BUILD_SHARED_LIB=ON -Donnxruntime_BUILD_APPLE_FRAMEWORK=OFF -Donnxruntime_USE_DNNL=OFF -Donnxruntime_USE_NNAPI_BUILTIN=OFF -Donnxruntime_USE_RKNPU=OFF -Donnxruntime_USE_NUPHAR_TVM=OFF -Donnxruntime_USE_LLVM=OFF -Donnxruntime_ENABLE_MICROSOFT_INTERNAL=OFF -Donnxruntime_USE_VITISAI=OFF -Donnxruntime_USE_NUPHAR=OFF -Donnxruntime_USE_TENSORRT=OFF -Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=OFF -Donnxruntime_TENSORRT_PLACEHOLDER_BUILDER=OFF -Donnxruntime_USE_TVM=OFF -Donnxruntime_TVM_CUDA_RUNTIME=OFF -Donnxruntime_TVM_USE_HASH=OFF -Donnxruntime_USE_MIGRAPHX=OFF -Donnxruntime_CROSS_COMPILING=OFF -Donnxruntime_DISABLE_CONTRIB_OPS=OFF -Donnxruntime_DISABLE_ML_OPS=OFF -Donnxruntime_DISABLE_RTTI=OFF -Donnxruntime_DISABLE_EXCEPTIONS=OFF -Donnxruntime_MINIMAL_BUILD=OFF -Donnxruntime_EXTENDED_MINIMAL_BUILD=OFF -Donnxruntime_MINIMAL_BUILD_CUSTOM_OPS=OFF -Donnxruntime_REDUCED_OPS_BUILD=OFF -Donnxruntime_ENABLE_LANGUAGE_INTEROP_OPS=OFF -Donnxruntime_USE_DML=OFF -Donnxruntime_USE_WINML=OFF -Donnxruntime_BUILD_MS_EXPERIMENTAL_OPS=OFF -Donnxruntime_USE_TELEMETRY=OFF -Donnxruntime_ENABLE_LTO=OFF -Donnxruntime_USE_ACL=OFF -Donnxruntime_USE_ACL_1902=OFF -Donnxruntime_USE_ACL_1905=OFF -Donnxruntime_USE_ACL_1908=OFF -Donnxruntime_USE_ACL_2002=OFF -Donnxruntime_USE_ARMNN=OFF -Donnxruntime_ARMNN_RELU_USE_CPU=ON -Donnxruntime_ARMNN_BN_USE_CPU=ON -Donnxruntime_ENABLE_NVTX_PROFILE=OFF -Donnxruntime_ENABLE_TRAINING=OFF -Donnxruntime_ENABLE_TRAINING_OPS=OFF -Donnxruntime_ENABLE_TRAINING_TORCH_INTEROP=OFF -Donnxruntime_ENABLE_TRAINING_ON_DEVICE=OFF -Donnxruntime_ENABLE_CPU_FP16_OPS=OFF -Donnxruntime_USE_NCCL=OFF -Donnxruntime_BUILD_BENCHMARKS=OFF -Donnxruntime_USE_ROCM=OFF -DOnnxruntime_GCOV_COVERAGE=OFF -Donnxruntime_USE_MPI=ON -Donnxruntime_ENABLE_MEMORY_PROFILE=OFF -Donnxruntime_ENABLE_CUDA_LINE_NUMBER_INFO=OFF -Donnxruntime_BUILD_WEBASSEMBLY=OFF -Donnxruntime_BUILD_WEBASSEMBLY_STATIC_LIB=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_CATCHING=ON -Donnxruntime_ENABLE_WEBASSEMBLY_EXCEPTION_THROWING=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_THREADS=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_DEBUG_INFO=OFF -Donnxruntime_ENABLE_WEBASSEMBLY_PROFILING=OFF -Donnxruntime_ENABLE_EAGER_MODE=OFF -Donnxruntime_ENABLE_LAZY_TENSOR=OFF -Donnxruntime_ENABLE_EXTERNAL_CUSTOM_OP_SCHEMAS=OFF -Donnxruntime_ENABLE_CUDA_PROFILING=OFF -Donnxruntime_ENABLE_ROCM_PROFILING=OFF -Donnxruntime_USE_XNNPACK=ON -Donnxruntime_DEV_MODE=ON -DONNX_USE_MSVC_STATIC_RUNTIME=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF -Dgtest_force_shared_crt=ON -Donnxruntime_PYBIND_EXPORT_OPSCHEMA=OFF -A x64 -T host=x64 -G 'Visual Studio 17 2022' -Donnxruntime_ENABLE_MEMLEAK_CHECKER=ON -DCMAKE_BUILD_TYPE=Debug

@snnn snnn added the feature request request for unsupported feature or enhancement label Sep 7, 2022
@sstamenova
Copy link
Contributor Author

Our team includes ort through our cmake file like so:

  set(onnxruntime_BUILD_SHARED_LIB ON)
  set(onnxruntime_BUILD_UNIT_TESTS OFF)
  set(onnxruntime_ENABLE_TRAINING ON)
  set(onnxruntime_USE_FULL_PROTOBUF ON)
  set(onnxruntime_PREFER_SYSTEM_LIB ON)
  set(Protobuf_USE_STATIC_LIBS ON) <- This was added to avoid the errors from onnx
  add_subdirectory(onnxruntime/cmake EXCLUDE_FROM_ALL)

and passing -DCMAKE_PREFIX_PATH=path\to\protobuf-install-release on the command line.

You can reproduce it by invoking cmake directly:

cmake -G Ninja \path\to\onnxruntime\cmake -Donnxruntime_BUILD_SHARED_LIB=ON -Donnxruntime_BUILD_UNIT_TESTS=OFF -Donnxruntime_ENABLE_TRAINING=ON -Donnxruntime_USE_FULL_PROTOBUF=ON -Donnxruntime_PREFER_SYSTEM_LIB=ON -DCMAKE_PREFIX_PATH=path\to\protobuf-install-release

Or using build.bat (though, at least for me, that fails with warnings treated as error in unrelated parts):

./build.bat --config Release --build_shared_lib --parallel --enable_training --skip_tests --use_full_protobuf --cmake_generator "Visual Studio 17 2022" --cmake_extra_defines CMAKE_PREFIX_PATH=path\to\protobuf-install-release onnxruntime_PREFER_SYSTEM_LIB=ON

@skottmckay
Copy link
Contributor

Got it. build.py does not set the experimental onnxruntime_PREFER_SYSTEM_LIB, and it has this comment in the cmake file.

#The onnxruntime_PREFER_SYSTEM_LIB is mainly designed for package managers like apt/yum/vcpkg.
#Please note, by default Protobuf_USE_STATIC_LIBS is OFF but it's recommended to turn it ON on Windows. You should set it properly when onnxruntime_PREFER_SYSTEM_LIB is ON otherwise you'll hit linkage errors.
#If you have already installed protobuf(or the others) in your system at the default system paths(like /usr/include), then it's better to set onnxruntime_PREFER_SYSTEM_LIB ON. Otherwise onnxruntime may see two different protobuf versions and we won't know which one will be used, the worst case could be onnxruntime picked up header files from one of them but the binaries from the other one.
#It's default OFF because it's experimental now.
option(onnxruntime_PREFER_SYSTEM_LIB "Experimental: Build with the preinstalled libraries in your system" OFF)

My initial thought would be we could change this to make the ONNX build consistent.

#Here we support two build mode:
#1. if ONNX_CUSTOM_PROTOC_EXECUTABLE is set, build Protobuf from source, except protoc.exe. This mode is mainly
# for cross-compiling
#2. if ONNX_CUSTOM_PROTOC_EXECUTABLE is not set, Compile everything(including protoc) from source code.
if (onnxruntime_PREFER_SYSTEM_LIB)
find_package(Protobuf)
endif()
if (Protobuf_FOUND)
message("Use protobuf from preinstalled system lib")
if (onnxruntime_USE_FULL_PROTOBUF)
set(PROTOBUF_LIB protobuf::libprotobuf)
#We have a check here but most of the cmake users don't know the Protobuf_USE_STATIC_LIBS
# variable exists and may leave it in a wrong state.
if (NOT Protobuf_USE_STATIC_LIBS)
# ONNX Runtime itself can work in such a setting but it may cause compatibility issues
# when ONNX Runtime is integrated with the other ONNX ecosystem softwares.
message(WARNING "Use Protobuf_USE_STATIC_LIBS to ensure compatibility with other ONNX ecosystem components.")
endif()
else()
set(PROTOBUF_LIB protobuf::libprotobuf-lite)
endif()
else()
message("Use protobuf from submodule")

by changing it to also set ONNX_USE_PROTOBUF_SHARED_LIBS

    # We have a check here but most of the cmake users don't know the Protobuf_USE_STATIC_LIBS
    # variable exists and may leave it in a wrong state.
    # Also need to make onnx build params consistent as (currently) that uses the value of ONNX_USE_PROTOBUF_SHARED_LIBS
    # to set Protobuf_USE_STATIC_LIBS. This is somewhat fragile as changes in the ONNX cmake setup may break this.
    if(NOT Protobuf_USE_STATIC_LIBS)
      # ONNX Runtime itself can work in such a setting but it may cause compatibility issues
      # when ONNX Runtime is integrated with the other ONNX ecosystem softwares.
      message(WARNING "Use Protobuf_USE_STATIC_LIBS to ensure compatibility with other ONNX ecosystem components.")
      set(ONNX_USE_PROTOBUF_SHARED_LIBS ON)
    else()
      set(ONNX_USE_PROTOBUF_SHARED_LIBS OFF)
    endif()

However this is only in a branch that is taken when onnxruntime_PREFER_SYSTEM_LIB is set. @snnn do we need to make sure ONNX_USE_PROTOBUF_SHARED_LIBS is consistent with Protobuf_USE_STATIC_LIBS for all builds?

@sstamenova
Copy link
Contributor Author

Setting ONNX_USE_PROTOBUF_SHARED_LIBS to ON if Protobuf_USE_STATIC_LIBS is not set won't work right either. For example, in our setup, we do want to use static libs and that's the version of protobuf we are linking against, so setting ONNX_USE_PROTOBUF_SHARED_LIBS to ON will break us.

It would be better, I think, to expose something like ORT_USE_PROTOBUF_SHARED_LIBS (set to OFF by default) and use that to set ONNX_USE_PROTOBUF_SHARED_LIBS and possibly Protobuf_USE_STATIC_LIBS. This will ensure that everything is consistent.

@skottmckay
Copy link
Contributor

According to https://cmake.org/cmake/help/latest/module/FindProtobuf.html the default of Protobuf_USE_STATIC_LIBS is OFF. So if that is not set or OFF wouldn't it be using the shared protobuf libs and ONNX_USE_PROTOBUF_SHARED_LIBS should be ON to make the onnx build consistent?

With that setup, is it not the case that If you wish to use static libs you can set Protobuf_USE_STATIC_LIBS. What value would adding ORT_USE_PROTOBUF_SHARED_LIBS provide apart from having a different default?

@snnn
Copy link
Member

snnn commented Sep 10, 2022

do we need to make sure ONNX_USE_PROTOBUF_SHARED_LIBS is consistent with Protobuf_USE_STATIC_LIBS for all builds?

The variable Protobuf_USE_STATIC_LIBS takes effect only when onnxruntime_PREFER_SYSTEM_LIB is ON. And the onnxruntime_PREFER_SYSTEM_LIB variable was added only for Linux package managers. Linux only. Most of the code was added by the community and maintained by the community. So, everyone is welcomed to make it better. Also, everyone is welcomed to review such changes. On the other hand, Microsoft teams should use https://github.com/microsoft/vcpkg .

Besides, the var Protobuf_USE_STATIC_LIBS was added and used by the cmake community, not Google staffs. I think from the very beginning it shouldn't exist. The dev team of protobuf team do not suggest using protobuf as a dynamical library. There are tons of issues related to it and simply nobody won't fix them. Or, won't be able to fix them. In my opinion, nobody have the need to dynamically link to protobuf. If someone does it, they are wrong.

In some very limited usage cases, it may still work. But, if it doesn't work, don't report it Google and don't report it to us. One such case is: If you want to use the full version of protobuf(onnxruntime_USE_FULL_PROTOBUF=ON), you should only use the static version of protobuf.

In ORT's CMakeLists.txt we have:

if (onnxruntime_USE_FULL_PROTOBUF)
    set(PROTOBUF_LIB protobuf::libprotobuf)
    #We have a check here but most of the cmake users don't know the Protobuf_USE_STATIC_LIBS
    # variable exists and may leave it in a wrong state.
    if (NOT Protobuf_USE_STATIC_LIBS)
      # ONNX Runtime itself can work in such a setting but it may cause compatibility issues
      # when ONNX Runtime is integrated with the other ONNX ecosystem softwares.
      message(WARNING "Use Protobuf_USE_STATIC_LIBS to ensure compatibility with other ONNX ecosystem components.")
    endif()
endif()

But initially it wasn't a warning. It was a fatal error. Then a 3rd-party user asked me to relax the check because he believed he could make it work in his use cases. So I let him do it. But I would not help fix any problem coming from it. Because I know there are many blockers there. Even the dev team of protobuf wouldn't want to touch them, why would I? And what value does it add?

@sstamenova
Copy link
Contributor Author

@snnn a couple of things:

onnxruntime_PREFER_SYSTEM_LIB is not documented as Linux only. If it is meant to only be used on Linux, I think it should be defined in such a way that it cannot be used on other platforms. Otherwise, it will be used elsewhere and it will run into issues. In our case, it was very convenient on both Windows and Linux to avoid having to build things we had prebuilt.

Protobuf_USE_STATIC_LIBS: whether we agree with the existence of this variable or not, both cmake and onnx currently leverage it which means that by extension it impacts onnxruntime. While in our case we want to statically linked against protobuf, we ran into issues because the default for the variable is different between cmake and onnx and there's no coordination between how it is used in onnxruntime (through cmake, set to OFF) and onnx (where it is explicitly set to ON by default).

I think it's a great idea to enforce that it is explicitly set to ON (and consequently that ONNX_USE_PROTOBUF_SHARED_LIBS is explicitly set to OFF) for the onnxruntime scenarios since that will ensure that users of onnxruntime don't run into random and hard diagnose issues because of mismatches between onnxruntime and onnx.

However, even if onnxruntime is not going to enforce any specific setting (as it currently doesn't), as the primary product that users are going to interact with, it still needs to be the one to ensure that there is coordination between its own usage and that of onnx. As such, it makes sense to introduce a variable (such as onnxruntime_USE_PROTOBUF_SHARED_LIBS or similar) which can be set to OFF by default and which would control both Protobuf_USE_STATIC_LIBS and ONNX_USE_PROTOBUF_SHARED_LIBS.

@snnn
Copy link
Member

snnn commented Sep 10, 2022

If we step back a bit and assume there was just one library consuming libprotobuf, then who should set this Protobuf_USE_STATIC_LIBS variable? I think it should be the user who invoke the cmake command. Only they know whether the preinstalled protobuf lib is static or dynamic. For example, if I wrote a simple C/C++ program which just read a ONNX model and print out all the names of all the ops, I can do it by directly interact with protobuf, without the need of using onnx or onnxruntime. And that simple program could either use a static protobuf library or a dynamic one. Both work good enough. So I believe the program's CMakeFile.txt should not try to set a default value for Protobuf_USE_STATIC_LIBS . I think the name "Protobuf_USE_STATIC_LIBS " is a bit misleading. It is not an ON/OFF control option. When you set it to ON, it doesn't mean cmake will help you find a static library of protobuf. Conversely, it is the way to let cmake know the lib cmake found is static or not. If you don't tell it, cmake would assume you have installed a dynamic one. So, my opinion is whenever you use https://github.com/Kitware/CMake/blob/master/Modules/FindProtobuf.cmake you should always manually specify -DProtobuf_USE_STATIC_LIBS=xxx. The software you are building doesn't know how you installed protobuf.

Then back to ONNX: the var ONNX_USE_PROTOBUF_SHARED_LIBS is just another name of Protobuf_USE_STATIC_LIBS . ONNX expect you using ONNX_USE_PROTOBUF_SHARED_LIBS instead of Protobuf_USE_STATIC_LIBS. I don't know the reason. Probably because they want to support the case that you have both static protobuf lib and dynamic protobuf in the same application. For example, in ORT while the core framework prefer to static link to protobuf, an ORT EP could still either statically link to protobuf or dynamically. Then it might be good to have two different variables to control the two different things. But, it is just my guessing. I don't know why ONNX decided to introduce a new variable to control this.

@snnn
Copy link
Member

snnn commented Sep 10, 2022

If you set onnxruntime_USE_CUDA = ON, you build ONNX Runtime with CUDA. If you set onnxruntime_USE_CUDA = OFF, you build ONNX Runtime without CUDA. But Protobuf_USE_STATIC_LIBS doesn't work in that way. On Windows You should set Protobuf_USE_STATIC_LIBS to ON if you know the protobuf lib cmake will find will definitely be a static lib. And you set it to OFF if you know it would be definitely dynamic. If you are not sure, you run cmake and see what it find, then do it again and set Protobuf_USE_STATIC_LIBS correctly according to the previous run's result. It is not intuitive but it is how this cmake module works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build build issues; typically submitted using template feature request request for unsupported feature or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants