libarcher does not work in the libomp-10-dev package #45290

Closed
jprotze opened this issue May 15, 2020 · 9 comments
Labels
bugzilla (Issues migrated from bugzilla), openmp, packaging

Comments

@jprotze
Collaborator

jprotze commented May 15, 2020

Bugzilla Link 45945
Version unspecified
OS Linux
Attachments Example OpenMP code with no data race, but false positive report in absence of Archer
CC @sylvestre

Extended Description

The idea of Archer is to complement ThreadSanitizer and provide the synchronization semantics for OpenMP. In a vanilla build of LLVM with openmp, libarcher is loaded by the OpenMP runtime (libomp.so) automatically and checks whether the TSan library is present in the execution:

$ export ARCHER_OPTIONS=verbose=1
$ clang -fopenmp -fsanitize=thread red-norace.c -g
$ OMP_NUM_THREADS=2 ./a.out
Archer detected OpenMP application with TSan, supplying OpenMP synchronization semantics
Sum: 4999950000
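The attached red-norace.c is not reproduced in this thread. As a rough sketch only, a race-free OpenMP reduction of the following shape would produce the same sum. The function name `reduce_sum` and the loop bound are assumptions inferred from the printed result, not the contents of the attachment.

```c
/* Hypothetical stand-in for the attached test case: sums 0..n-1 with an
 * OpenMP reduction. The reduction clause gives each thread a private copy
 * of `sum` and combines the copies at the end of the region, so the source
 * itself contains no data race. Without -fopenmp the pragma is ignored and
 * the loop simply runs serially. */
long long reduce_sum(long long n) {
  long long sum = 0;
#pragma omp parallel for reduction(+ : sum)
  for (long long i = 0; i < n; i++)
    sum += i;
  return sum;
}
```

Calling `reduce_sum(100000)` and printing the result gives 4999950000, which matches the `Sum:` line above.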

Using clang-10 from the package, there are several issues. First, the archer library is not found because it is not in the dl search path (as also mentioned in bug 45909):
$ clang-10 -fopenmp -fsanitize=thread red-norace.c -g
$ OMP_NUM_THREADS=2 ./a.out
... WARNING: ThreadSanitizer: data race ...

Creating a link in /lib/x86_64-linux-gnu should help with this.
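One way to check whether a library is reachable through the dl search path is to try opening it by its bare name. The helper below is a minimal sketch, assuming the runtime opens the tool library by bare name; `can_dlopen` is a hypothetical name and not part of libomp or Archer.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Tries to load a shared library by name, so that the usual dl search path
 * (LD_LIBRARY_PATH, ld.so cache, default directories) is consulted.
 * Returns 1 on success and 0 on failure; on failure the dlerror() text is
 * printed to stderr. */
int can_dlopen(const char *name) {
  void *handle = dlopen(name, RTLD_LAZY);
  if (handle == NULL) {
    fprintf(stderr, "dlopen(%s) failed: %s\n", name, dlerror());
    return 0;
  }
  dlclose(handle);
  return 1;
}
```

Calling `can_dlopen("libarcher.so")` from a small test program would show whether the link is needed on a given system. On older glibc versions the program may need to be linked with `-ldl`.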

The next issue (after I added the link):
$ OMP_NUM_THREADS=2 ./a.out
Archer detected OpenMP application without TSan stopping operation
... WARNING: ThreadSanitizer: data race ...

I could not figure out why the archer library does not use the exported dynamic symbols from the TSan runtime, which is statically linked into the application:

$ readelf --dyn-syms a.out | grep RunningOnValgrind
529: 000000000048f6f0 18 FUNC GLOBAL DEFAULT 13 RunningOnValgrind
$ readelf --dyn-syms /usr/lib/llvm-10/lib/libarcher.so | grep RunningOnValgrind
61: 0000000000002770 10 FUNC WEAK DEFAULT 12 RunningOnValgrind
$ readelf --dyn-syms /home/.../clang/10.0/lib/libarcher.so | grep RunningOnValgrind
64: 0000000000002590 10 FUNC WEAK DEFAULT 12 RunningOnValgrind

I can successfully use clang-10 with my vanilla build of libarcher. So, it seems like there is something funny going on with libarcher from the deb package.

Finally, I'm not sure why the library should not be installed whenever the compiler and the OpenMP runtime are installed. I can understand that developer packages are not installed by default, but isn't a compiler typically installed by developers?

@jprotze
Collaborator Author

jprotze commented Jun 18, 2020

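I looked into the options used for building the library in the Debian build.
The problem is the flag "-Wl,-Bsymbolic-functions" during the linking of libarcher.so.
Due to this flag, Archer uses the weak implementation of the functions rather than calling the implementation available in the application, when the statically linked TSan runtime library is present in the application.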

This is a default linker flag according to:

dpkg-buildflags --get LDFLAGS

The following removes the flag from the LDFLAGS:
export DEB_LDFLAGS_MAINT_STRIP=-Wl,-Bsymbolic-functions

I cannot find any mention of this flag in the llvm-build directory, so cmake is not aware of this flag. Therefore, I think we cannot remove the flag just for libarcher.so using cmake magic.

@jprotze
Collaborator Author

jprotze commented Aug 11, 2021

I looked into the options used for building the library in the debian build.
The problem is the flag "-Wl,-Bsymbolic-functions" during the linking of
libarcher.so.
Due to this flag, Archer uses the functions ...

Due to this flag, Archer uses the weak implementation of the functions rather than calling the implementation of the function available in the application, when the statically linked TSan runtime library is present in the application.
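The weak-fallback pattern described here can be sketched roughly as follows. This is an illustration under assumptions, not Archer's actual source: the names `run_on_tsan` and `probe_tsan` are hypothetical, and only `RunningOnValgrind` comes from the readelf output earlier in this issue.

```c
/* Assumed flag: set to 1 before probing, cleared by the fallback below. */
static int run_on_tsan = 1;

/* Weak fallback definition. In a shared library built without
 * -Bsymbolic-functions, a call to this symbol goes through the dynamic
 * symbol lookup, so a definition exported by the executable (for example
 * from a statically linked TSan runtime) is found first. With
 * -Bsymbolic-functions the linker binds the library's calls to the
 * library's own definition, so this fallback always runs. */
__attribute__((weak)) int RunningOnValgrind(void) {
  run_on_tsan = 0;
  return 0;
}

/* Returns 1 if some other definition of RunningOnValgrind was called,
 * 0 if the fallback above ran. */
int probe_tsan(void) {
  run_on_tsan = 1;
  RunningOnValgrind();
  return run_on_tsan;
}
```

Built into a plain executable without TSan, `probe_tsan()` returns 0 because only the fallback exists. That is the same outcome the packaged libarcher reaches even when TSan is present.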


@sylvestre
Collaborator

Sorry, I only just saw this bug.

Regarding "the archer library is not found as the library is not in the dl search path": is there an error message? On my system:
$ export ARCHER_OPTIONS=verbose=1

$ clang-14 -fopenmp -fsanitize=thread red-norace.c -g
$ OMP_NUM_THREADS=2 ./a.out

LLVMSymbolizer: error reading file: No such file or directory

WARNING: ThreadSanitizer: data race (pid=1521076)
  Read of size 8 at 0x7ffffbe71660 by main thread:
    #0 main /tmp/red-norace.c:11:23 (a.out+0x4c5bde)

  Previous atomic write of size 8 at 0x7ffffbe71660 by thread T1:
    #0 .omp_outlined._debug__ /tmp/red-norace.c:6:3 (a.out+0x4c5e8f)
    #1 .omp_outlined. /tmp/red-norace.c:6:3 (a.out+0x4c5f95)
    #2 __kmp_invoke_microtask (libomp.so.5+0xc8112)

  Location is stack of main thread.

  Location is global '??' at 0x7ffffbe53000 ([stack]+0x00000001e660)

  Thread T1 (tid=1521078, running) created by main thread at:
    #0 pthread_create (a.out+0x428e8d)
    #1 (libomp.so.5+0xa5b53)

SUMMARY: ThreadSanitizer: data race /tmp/red-norace.c:11:23 in main

Sum: 4999950000
ThreadSanitizer: reported 1 warnings

@jprotze
Collaborator Author

jprotze commented Aug 12, 2021

OMP_TOOL_VERBOSE_INIT=stdout enables verbose output (including dlerror) for the dynamic loading of OpenMP tools. This should help figuring out the issues.

I just installed the llvm-12 packages from the repository.

For my tests, I set LD_LIBRARY_PATH to include the directory with libarcher.so. According to apt-file search libarcher:
libomp-12-dev: /usr/lib/llvm-12/lib/libarcher.so

$ clang-12 -fopenmp -fsanitize=thread red-norace.c
$ OMP_TOOL_VERBOSE_INIT=stdout LD_LIBRARY_PATH=/usr/lib/llvm-12/lib/ ARCHER_OPTIONS=verbose=1 ./a.out
----- START LOGGING OF TOOL REGISTRATION -----
Search for OMP tool in current address space... Failed.
No OMP_TOOL_LIBRARIES defined.
...searching tool libraries failed. Using archer tool.
Opening libarcher.so... Success.
Searching for ompt_start_tool in libarcher.so... Archer detected OpenMP application without TSan stopping operation
Found but not using the OMPT interface.
No OMP tool loaded.
----- END LOGGING OF TOOL REGISTRATION -----

libarcher is loaded, but the call to RunningOnValgrind ends up in the weak implementation in libarcher instead of using the function in a.out.

For comparison, I have the clang+llvm-12.0.0-x86_64-linux-gnu-ubuntu-20.04.tar.xz release build from GitHub unpacked.

Using the OMP_TOOL_LIBRARIES environment variable, I can explicitly select libarcher from the release tarball:
$ OMP_TOOL_VERBOSE_INIT=stdout LD_LIBRARY_PATH=/usr/lib/llvm-12/lib/ OMP_TOOL_LIBRARIES=/home/jprotze/sw/UTIL/clang/12.0/lib/libarcher.so ARCHER_OPTIONS=verbose=1 ./a.out
----- START LOGGING OF TOOL REGISTRATION -----
Search for OMP tool in current address space... Failed.
Searching tool libraries...
OMP_TOOL_LIBRARIES = /home/jprotze/sw/UTIL/clang/12.0/lib/libarcher.so
Opening /home/jprotze/sw/UTIL/clang/12.0/lib/libarcher.so... Success.
Searching for ompt_start_tool in /home/jprotze/sw/UTIL/clang/12.0/lib/libarcher.so... Archer detected OpenMP application with TSan, supplying OpenMP synchronization semantics
Success.
Tool was started and is using the OMPT interface.
----- END LOGGING OF TOOL REGISTRATION -----

I did not rebuild the application and still use libomp from the Debian package.

@jprotze
Collaborator Author

jprotze commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#51117

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@vchigrin
Contributor

If somebody is interested:
What helped for me was switching to a dlsym-based approach. I just changed the #ifdefs here:
https://github.com/llvm/llvm-project/blob/main/openmp/tools/archer/ompt-tsan.cpp#L151
I replaced #if (defined __APPLE__ && defined __MACH__) with #if 1 (in that line and in a few more lines with the same condition in that file).

I don't know the full consequences of that, though; this solution may have some drawbacks.
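For readers unfamiliar with the dlsym-based approach, the idea is to resolve the symbol at run time, so the result does not depend on how the shared library's own references were bound at link time. The following is a minimal sketch, not the code in ompt-tsan.cpp: the helper names are hypothetical, the function signature is assumed, and it obtains a handle with `dlopen(NULL, ...)` rather than any particular lookup mode the real file may use.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Signature assumed for illustration: no arguments, returns int. */
typedef int (*running_on_valgrind_fn)(void);

/* Resolves RunningOnValgrind at run time. dlopen(NULL, ...) returns a
 * handle for the main program; dlsym on that handle searches the
 * executable's exported symbols and the libraries loaded with it.
 * Returns NULL if no such symbol is exported. */
static running_on_valgrind_fn lookup_running_on_valgrind(void) {
  void *self = dlopen(NULL, RTLD_LAZY);
  if (self == NULL)
    return NULL;
  running_on_valgrind_fn fn = NULL;
  /* Store through a void** to convert the object pointer from dlsym. */
  *(void **)(&fn) = dlsym(self, "RunningOnValgrind");
  dlclose(self);
  return fn;
}

/* Returns 1 if the process exports RunningOnValgrind, 0 otherwise. */
int tsan_runtime_visible(void) {
  return lookup_running_on_valgrind() != NULL;
}
```

In an executable built without `-fsanitize=thread`, `tsan_runtime_visible()` returns 0. On older glibc versions, linking with `-ldl` may be required.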

@hahnjo
Member

hahnjo commented Jan 24, 2023

jprotze added a commit that referenced this issue Jan 24, 2023
This patch fix issues reported for Ubuntu and possibly other platforms:
#45290

The latest comment on this issue points out that using dlsym rather than
the weak symbol approach to call TSan annotation functions fixes the issue
for Ubuntu.

Differential Revision: https://reviews.llvm.org/D142378
@llvmbot
Collaborator

llvmbot commented Jan 24, 2023

@llvm/issue-subscribers-openmp

@jprotze
Collaborator Author

jprotze commented Jan 24, 2023

After 7fbf122, all tests pass when setting CMAKE_SHARED_LINKER_FLAGS:STRING=-Wl,-Bsymbolic-functions. Therefore, the patch should solve the issue for the Ubuntu packages.

@jprotze jprotze closed this as completed Jan 24, 2023
CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Jan 25, 2023