
fix(mkl): remove libiomp5.so from dynamic linking to avoid OpenMP conflict #1785

Open
LHT129 wants to merge 1 commit into antgroup:main from LHT129:2026-03-31-修复-mkl与openmp双重链接导致崩溃

LHT129 (Collaborator) commented Mar 31, 2026

Summary

This PR fixes a runtime crash caused by linking both Intel OpenMP (libiomp5.so) and GNU OpenMP (libgomp.so) when using MKL as the BLAS backend. The issue occurs because diskann and other components link GNU OpenMP, while MKL's dynamic linking configuration links Intel OpenMP, creating conflicting OpenMP implementations at runtime.

Changes

  • extern/mkl/mkl.cmake: Removed libiomp5.so from the BLAS_LIBRARIES list in dynamic linking mode
  • Now only links libmkl_rt.so, allowing MKL to use the system's GNU OpenMP runtime (libgomp.so) already linked by other components

Root Cause

When using MKL as BLAS backend with dynamic linking:

  1. MKL configuration links both libmkl_rt.so and libiomp5.so (Intel OpenMP)
  2. Diskann links libgomp.so (GNU OpenMP)
  3. Two different OpenMP implementations then coexist in the same process, which causes the runtime crash described in the Summary.

Solution

Remove explicit linking of libiomp5.so, allowing MKL's libmkl_rt.so to use the GNU OpenMP runtime that diskann and other components already link. This ensures a single OpenMP implementation throughout the application.

Testing

  • Build verification: make release passes
  • Link check: Only libgomp.so appears in ldd output, no libomp.so.5 or libiomp5.so
  • Functional tests: The tests run so far pass; full test suite execution is still pending
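
The link check above can be scripted. The following is a minimal sketch rather than the project's actual test: the helper names `openmp_runtimes` and `check_single_openmp` are hypothetical, and it assumes `ldd`-style output arrives on standard input.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the PR).

# Print the distinct OpenMP runtime families named in ldd-style text on stdin.
# "libgomp.so.1" is reported as "libgomp.so", "libomp.so.5" as "libomp.so".
openmp_runtimes() {
    grep -oE 'lib(gomp|iomp5|omp)\.so' | sort -u
}

# Succeed only if at most one OpenMP runtime family appears in the input.
check_single_openmp() {
    count=$(openmp_runtimes | wc -l | tr -d ' ')
    [ "$count" -le 1 ]
}
```

A possible invocation, with a placeholder binary path, is `ldd ./path/to/binary | check_single_openmp || echo "more than one OpenMP runtime linked"`. Note that `ldd` lists only load-time dependencies, so a runtime that is loaded later via `dlopen` would not show up in this check.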

Related Issues

Fixes #1774 - core dump on eval_performance for Rabitq Build

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request removes the explicit linking of libiomp5.so from the MKL configuration in extern/mkl/mkl.cmake. Feedback suggests cleaning up redundant dependency checks for OMP_PATH that are no longer necessary, addressing potential runtime loading issues for the Intel threading layer, and consolidating redundant installation loops for MKL_INSTALL_LIBS to ensure consistent behavior.

Comment thread extern/mkl/mkl.cmake

set (BLAS_LIBRARIES
"${MKL_PATH}/libmkl_rt.so"
"${OMP_PATH}/libiomp5.so"

medium

The removal of libiomp5.so from BLAS_LIBRARIES is correct for avoiding OpenMP conflicts, but it introduces a few points for improvement:

  1. Redundant Dependency Check: The search for OMP_PATH (lines 79-86) and the mandatory check at line 96 are now unnecessary for the dynamic linking configuration. This may cause build failures on systems where MKL is present but the standalone Intel OpenMP library is not, even though it is no longer required for linking.
  2. Runtime Configuration: Since libmkl_rt.so defaults to the Intel threading layer, it will still attempt to load libiomp5.so at runtime unless the MKL_THREADING_LAYER environment variable is set to GNU (or mkl_set_threading_layer is called). Without this configuration, the application might fail to find the library at runtime since it is no longer bundled.
  3. Redundant Installation: MKL_INSTALL_LIBS is currently processed by two redundant install loops (lines 110 and 120), which should be consolidated to avoid duplicate installation rules and ensure consistent behavior (e.g., the if (EXISTS ...) check is missing in the global loop).
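
For point 2, one way to select the GNU threading layer without touching code is to set `MKL_THREADING_LAYER` for a single process. This is only a sketch: `run_with_gnu_threading` is a hypothetical helper, and the binary path in the usage line below is a placeholder.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the PR): run a command with
# MKL_THREADING_LAYER=GNU set only in that command's environment,
# leaving the calling shell's environment unchanged.
run_with_gnu_threading() {
    env MKL_THREADING_LAYER=GNU "$@"
}
```

Usage might look like `run_with_gnu_threading ./path/to/eval_performance`. Going through `env` keeps the setting scoped to the child process, which makes it easy to compare runs with and without the variable.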

@LHT129 LHT129 force-pushed the 2026-03-31-修复-mkl与openmp双重链接导致崩溃 branch 2 times, most recently from d31129c to eab55b3 Compare April 2, 2026 08:35
@pull-request-size pull-request-size Bot added size/M and removed size/S labels Apr 2, 2026
LHT129 (Collaborator, Author) commented Apr 2, 2026

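Thanks for the feedback, @gemini-code-assist! The following changes have been made:

  1. ✅ Removed the redundant OMP_PATH lookup from the dynamic-linking branch (originally lines 79-86)
  2. ✅ Removed the mandatory check on OMP_PATH
  3. ✅ Removed the redundant MKL_INSTALL_LIBS install loop

The static-linking branch (MKL_STATIC_LINK) is unchanged, because it does need Intel OpenMP.

The changes have been committed and pushed: commit eab55b3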

@gemini-code-assist
Contributor

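Understood, thanks for the confirmation and the update. These changes look very reasonable: removing the hard dependency on libiomp5.so and switching to the system's default OpenMP runtime effectively resolves the conflict with libgomp.so, while keeping the static-linking branch compatible. I will do a final check of these changes.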

@LHT129 LHT129 force-pushed the 2026-03-31-修复-mkl与openmp双重链接导致崩溃 branch from eab55b3 to ac72943 Compare April 2, 2026 09:34
mergify Bot (Contributor) commented Apr 17, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Require kind label

Wonderful, this rule succeeded.
  • label~=^kind/

🟢 Require version label

Wonderful, this rule succeeded.
  • label~=^version/

Copilot AI review requested due to automatic review settings April 21, 2026 02:51
@LHT129 LHT129 force-pushed the 2026-03-31-修复-mkl与openmp双重链接导致崩溃 branch 5 times, most recently from 11657d4 to 5f01bf3 Compare April 21, 2026 02:56
Copilot AI (Contributor) left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates the Intel MKL dynamic-link configuration to avoid runtime crashes caused by simultaneously linking Intel OpenMP (libiomp5.so) and GNU OpenMP (libgomp.so) in the same process (issue #1774).

Changes:

  • Remove explicit dynamic linking against libiomp5.so from the MKL BLAS link list.
  • Add OpenMP runtime library linkage (gomp/omp) and adjust installed MKL library list.
  • Update TSAN workflow suppressions and set MKL_THREADING_LAYER=GNU.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

  • extern/mkl/mkl.cmake: Removes libiomp5.so from dynamic linking and adjusts OpenMP/MKL library linkage + install behavior.
  • .github/workflows/tsan_build_and_test.yml: Adds GNU OpenMP + MKL suppressions and forces MKL to use GNU threading layer during TSAN runs.

Comment thread extern/mkl/mkl.cmake
Comment on lines +105 to +110
# Add GNU OpenMP for compatibility with diskann and other components
if (CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
    list (APPEND BLAS_LIBRARIES omp)
else ()
    list (APPEND BLAS_LIBRARIES gomp)
endif ()

Copilot AI Apr 21, 2026

Linking omp when compiling with Clang can reintroduce the same class of OpenMP runtime conflicts this PR is trying to avoid (e.g., if other components link libgomp). Also, Clang does not imply libomp is the OpenMP runtime in use (Clang can target libgomp). Prefer using CMake’s OpenMP integration (e.g., find_package(OpenMP) and linking OpenMP::OpenMP_CXX) so the correct runtime/library flags are selected consistently across the whole build.

Copilot uses AI. Check for mistakes.
Comment thread extern/mkl/mkl.cmake
Comment on lines +93 to 95
if (NOT MKL_PATH OR NOT MKL_INCLUDE_PATH)
message (FATAL_ERROR "Could not find Intel MKL (dynamic) libraries/headers. "
"Please check your MKL installation or disable ENABLE_INTEL_MKL.")

Copilot AI Apr 21, 2026

In dynamic mode, the config now appends an OpenMP runtime library (gomp/omp) but no longer validates that OpenMP is available/discoverable. This can turn what used to be a clear configure-time failure into a later link-time failure. If you switch to find_package(OpenMP), you can also fail fast with a targeted error when OpenMP isn’t found.

Comment thread extern/mkl/mkl.cmake
Comment on lines +67 to +68
if (EXISTS ${mkllib})
install (FILES ${mkllib} DESTINATION ${CMAKE_INSTALL_LIBDIR})

Copilot AI Apr 21, 2026

The EXISTS check and install(FILES ...) arguments should be quoted to avoid issues when paths contain spaces or semicolons (CMake list expansion). For example, use if (EXISTS "${mkllib}") and install(FILES "${mkllib}" ...).

Suggested change
- if (EXISTS ${mkllib})
- install (FILES ${mkllib} DESTINATION ${CMAKE_INSTALL_LIBDIR})
+ if (EXISTS "${mkllib}")
+ install (FILES "${mkllib}" DESTINATION ${CMAKE_INSTALL_LIBDIR})

@@ -69,8 +69,13 @@ jobs:
run: |
echo race:libomp.so > omp.supp
echo race:libomp.so.5 >> omp.supp

Copilot AI Apr 21, 2026

The suppression adds only the SONAME-suffixed libgomp.so.1. Depending on distro/toolchain, TSAN may report the library as libgomp.so (or another SONAME). Consider adding both race:libgomp.so and race:libgomp.so.1 (or using a supported wildcard pattern) to make the suppression robust across environments.

Suggested change
- echo race:libomp.so.5 >> omp.supp
+ echo race:libomp.so.5 >> omp.supp
+ echo race:libgomp.so >> omp.supp
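
The suppression file discussed in this thread could also be generated in one place, so that both the unsuffixed and the SONAME-suffixed library names are always covered. The following is a sketch under assumptions: `write_omp_suppressions` is a hypothetical helper, and the library list simply combines the names mentioned in this review.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): write a ThreadSanitizer
# suppression file with one "race:<library>" entry per OpenMP runtime name.
write_omp_suppressions() {
    supp_file=$1
    : > "$supp_file"    # create the file, or truncate an existing one
    for lib in libomp.so libomp.so.5 libgomp.so libgomp.so.1; do
        printf 'race:%s\n' "$lib" >> "$supp_file"
    done
}
```

A run could then look like `write_omp_suppressions omp.supp` followed by `TSAN_OPTIONS="suppressions=$PWD/omp.supp" ./path/to/test_binary`, where the binary path is a placeholder.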

…flict

Signed-off-by: LHT129 <tianlan.lht@antgroup.com>

Co-authored-by: opencode <opencode@users.noreply.github.com>
@LHT129 LHT129 force-pushed the 2026-03-31-修复-mkl与openmp双重链接导致崩溃 branch from 5f01bf3 to 2d02e1f Compare April 21, 2026 06:15
Development

Successfully merging this pull request may close these issues.

core dump on eval_performance for Rabitq Build

3 participants