Static initialization issue when building LibTorch (CPU) as static lib on Windows #83255

1enn0 · 2022-08-11T13:31:13Z

🐛 Describe the bug

When compiling LibTorch as a static library (BUILD_SHARED_LIBS=OFF) on Windows 11, there is a problem with static initialization. Here is a small example that reproduces the issue:

Create Conda env

conda create -n pytorch_build python=3.10
conda activate pytorch_build
conda install pyyaml typing_extensions

Build LibTorch and e.g. `op_registration_test`

For simplicity, we are building LibTorch (CPU only, without MKL etc.) and a single test target from Caffe2_CPU_TEST_SRCS. I am using plain CMake, Ninja and the MSVC compiler.

# clone repo
git clone --recursive https://github.com/pytorch/pytorch

# configure and build
cmake -G Ninja -S pytorch -B pytorch-build -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=pytorch-install -DUSE_CUDA=OFF -DUSE_DISTRIBUTED=OFF -DUSE_TENSORPIPE=OFF -DUSE_MAGMA=OFF -DUSE_NUMPY=OFF -DUSE_OPENMP=OFF -DUSE_KINETO=OFF -DUSE_QNNPACK=OFF -DBUILD_SHARED_LIBS=OFF -DBUILD_PYTHON=OFF -DBUILD_CAFFE2_OPS=OFF -DBUILD_TEST=ON -DUSE_FBGEMM=OFF -Wno-dev -DMSVC_Z7_OVERRIDE=OFF -DCAFFE2_USE_MSVC_STATIC_RUNTIME=OFF -Dprotobuf_MSVC_STATIC_RUNTIME=OFF
cmake --build pytorch-build --target op_registration_test

When running pytorch-build/bin/op_registration_test.exe in the console, you will see no output. Opening the executable in Visual Studio and running it under the debugger shows the error message

"Type class c10::intrusive_ptr<struct LinearPackedParamsBase,struct c10::detail::intrusive_target_default_null_type<struct LinearPackedParamsBase> > could not be converted to any of the known types."

For more detail, see the full call stack here.

The problem seems to be that some of the statically initialized classes (LinearPackedParamsBase and ConvPackedParamsBase, to be specific) are not yet registered when they are first used in a TORCH_LIBRARY_IMPL definition. The affected files are:

aten/src/ATen/native/quantized/qlinear_unpack.cpp
aten/src/ATen/native/quantized/qconv_unpack.cpp
aten/src/ATen/native/quantized/cpu/qlinear_prepack.cpp
aten/src/ATen/native/quantized/cpu/qlinear_dynamic.cpp
aten/src/ATen/native/quantized/cpu/qlinear.cpp
aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_unpack.cpp
aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear_prepack.cpp
aten/src/ATen/native/ao_sparse/quantized/cpu/qlinear.cpp

Calling the construct-on-first-use initialization functions (register_linear_params() and register_conv_params<>()) from the corresponding namespace at the top of the TORCH_LIBRARY_IMPL body in each of the above files fixes the issue. I have a PR ready that does this, but I am wondering whether there might be a better solution to this problem 😃

Versions

The collect_env.py script does not give any useful information here, as I am trying to build libtorch. Instead, I am including the summary output of cmake:

-- ******** Summary ********
-- General:
--   CMake version         : 3.23.1
--   CMake command         : C:/Program Files/CMake/bin/cmake.exe
--   System                : Windows
--   C++ compiler          : C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.33.31629/bin/Hostx64/x64/cl.exe
--   C++ compiler id       : MSVC
--   C++ compiler version  : 19.33.31629.0
--   Using ccache if found : OFF
--   CXX flags             : /DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO
--   Build type            : Debug
--   Compile definitions   : WIN32_LEAN_AND_MEAN;ONNX_ML=1;ONNXIFI_ENABLE_EXT=1;ONNX_NAMESPACE=onnx_torch;_CRT_SECURE_NO_DEPRECATE=1;USE_EXTERNAL_MZCRC;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS
--   CMAKE_PREFIX_PATH     :
--   CMAKE_INSTALL_PREFIX  : [...]/pytorch-install
--   USE_GOLD_LINKER       : OFF
--
--   TORCH_VERSION         : 1.13.0
--   CAFFE2_VERSION        : 1.13.0
--   BUILD_CAFFE2          : OFF
--   BUILD_CAFFE2_OPS      : OFF
--   BUILD_CAFFE2_MOBILE   : OFF
--   BUILD_STATIC_RUNTIME_BENCHMARK: OFF
--   BUILD_TENSOREXPR_BENCHMARK: OFF
--   BUILD_NVFUSER_BENCHMARK: OFF
--   BUILD_BINARY          : OFF
--   BUILD_CUSTOM_PROTOBUF : ON
--     Protobuf compiler   :
--     Protobuf includes   :
--     Protobuf libraries  :
--   BUILD_DOCS            : OFF
--   BUILD_PYTHON          : OFF
--   BUILD_SHARED_LIBS     : OFF
--   CAFFE2_USE_MSVC_STATIC_RUNTIME     : OFF
--   BUILD_TEST            : ON
--   BUILD_JNI             : OFF
--   BUILD_MOBILE_AUTOGRAD : OFF
--   BUILD_LITE_INTERPRETER: OFF
--   INTERN_BUILD_MOBILE   :
--   USE_BLAS              : 0
--   USE_LAPACK            : 0
--   USE_ASAN              : OFF
--   USE_CPP_CODE_COVERAGE : OFF
--   USE_CUDA              : OFF
--   USE_ROCM              : OFF
--   USE_EIGEN_FOR_BLAS    : ON
--   USE_FBGEMM            : OFF
--     USE_FAKELOWP          : OFF
--   USE_KINETO            : OFF
--   USE_FFMPEG            : OFF
--   USE_GFLAGS            : OFF
--   USE_GLOG              : OFF
--   USE_LEVELDB           : OFF
--   USE_LITE_PROTO        : OFF
--   USE_LMDB              : OFF
--   USE_METAL             : OFF
--   USE_PYTORCH_METAL     : OFF
--   USE_PYTORCH_METAL_EXPORT     : OFF
--   USE_MPS               : OFF
--   USE_FFTW              : OFF
--   USE_MKL               : OFF
--   USE_MKLDNN            : ON
--   USE_MKLDNN_ACL        : OFF
--   USE_MKLDNN_CBLAS      : OFF
--   USE_UCC               : OFF
--   USE_ITT               : ON
--   USE_NCCL              : OFF
--   USE_NNPACK            : OFF
--   USE_NUMPY             : OFF
--   USE_OBSERVERS         : OFF
--   USE_OPENCL            : OFF
--   USE_OPENCV            : OFF
--   USE_OPENMP            : OFF
--   USE_TBB               : OFF
--   USE_VULKAN            : OFF
--   USE_PROF              : OFF
--   USE_QNNPACK           : OFF
--   USE_PYTORCH_QNNPACK   : OFF
--   USE_XNNPACK           : ON
--   USE_REDIS             : OFF
--   USE_ROCKSDB           : OFF
--   USE_ZMQ               : OFF
--   USE_DISTRIBUTED       : OFF
--   USE_DEPLOY           : OFF
--   Public Dependencies  : caffe2::Threads
--   Private Dependencies : pthreadpool;cpuinfo;XNNPACK;ittnotify;fp16;foxi_loader;fmt::fmt-header-only
--   USE_COREML_DELEGATE     : OFF
--   BUILD_LAZY_TS_BACKEND   : ON
-- Configuring done
-- Generating done

cc @peterjc123 @mszhanyi @skyline75489 @nbcsm

xsacha · 2022-08-11T23:43:28Z

Thanks for that. I gave up on building it statically earlier. There are a few CUDA ops that don't get registered, too.

1enn0 · 2022-08-12T08:43:34Z

I haven't tried building with CUDA yet. I might get around to it next week and check whether this approach fixes those issues as well.

ppwwyyxx · 2022-10-09T17:48:49Z

I ran into the same issue and confirmed that #83258 fixed it for me.

This is a static initialization problem and is not specifically tied to Windows (I encountered it on Linux).
cc @malfet

ppwwyyxx · 2022-11-18T20:51:19Z

@cristianPanaite could you help @1enn0 merge the above PR?

cristianPanaite · 2022-11-21T10:26:44Z

Hello @1enn0 @ppwwyyxx!

To be able to proceed with merging the PR you have to sign the EasyCLA.

In the PR's checks, you can see that the EasyCLA requirement is not yet satisfied.

All you have to do is press the red button there, which will redirect you to the platform where you can sign the EasyCLA.

Keep in mind that you must use the same email address you used for this PR, and that your email visibility must be set to public (you can do that in Settings->Emails). Check these things before signing the CLA.

After these steps, the PR should be ready to merge.

ppwwyyxx · 2022-12-07T23:13:04Z

@cristianPanaite I made a new PR #90133 for this. Could you help merge it?

Fixes pytorch#83255

Code comes from pytorch#83258 after fixing merge conflicts.

Pull Request resolved: pytorch#90133
Approved by: https://github.com/soumith, https://github.com/malfet

The `TORCH_LIBRARY_IMPL` registrations in `OpsImpl.cpp` need to happen after `ProcessGroup` is registered as a torch class -- which happens in `Ops.cpp`. However, the order of the registrations is undefined between the two files. If the registration in `OpsImpl.cpp` runs before `Ops.cpp`, we get a crash at program launch similar to #83255. This happens in our internal build.

This PR moves the contents of `OpsImpl.cpp` to the end of `Ops.cpp`. Because according to the omniscient lord of chatGPT: <img width="600" alt="2022-12-04_19-25" src="https://user-images.githubusercontent.com/1381301/205542847-3535b319-3c2a-4e8e-bc11-27913f6afb39.png">

Pull Request resolved: #90149
Approved by: https://github.com/kwen2501, https://github.com/H-Huang, https://github.com/soumith

1enn0 added a commit to 1enn0/pytorch that referenced this issue Aug 11, 2022

Fix static initialization issue for static build (pytorch#83255)

6a01394

1enn0 mentioned this issue Aug 11, 2022

Fix static initialization issue for static build (#83255) #83258

Closed

soulitzer added the labels module: windows, module: abi, and triaged Aug 11, 2022

Blackhex assigned cristianPanaite Nov 9, 2022

This was referenced Dec 4, 2022

Fix static initialization issue for static build #90133

Closed

Fix a static initialization order fiasco in c10d #90149

Closed

pytorchmergebot closed this as completed in ff5a359 Dec 9, 2022

1enn0 pushed a commit to 1enn0/pytorch that referenced this issue Jan 4, 2023

Revert "Fix static initialization issue for static build (pytorch#83255…

66524c4

…)" This reverts commit 6a01394.
