Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm] Disable HandleLLVMOptions in runtimes mode #73031

Merged
merged 1 commit into from Nov 27, 2023

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Nov 21, 2023

Summary:
There are a few default options that LLVM adds that can be problematic
for runtimes builds. These options are generally intended to handle
building LLVM itself, but are also added when building in a runtimes
mode. One such issue I've run into is that in libc we deliberately use
--target to use a different device toolchain, which doesn't support
some linker arguments passed via -Wl. This is observed in
#73030 when attempting to use
these options.

This patch completely removes these default arguments.

The consensus is that any issues created by this patch should ultimately be solved on a per-runtime basis.

@jhuber6 jhuber6 requested a review from a team as a code owner November 21, 2023 20:09
@llvmbot llvmbot added the cmake Build system in general and CMake in particular label Nov 21, 2023
@@ -151,6 +151,9 @@ endif()
# Avoid checking whether the compiler is working.
set(LLVM_COMPILER_CHECKED ON)

# This can be used to detect whether we're in the runtimes build.
set(LLVM_RUNTIMES_BUILD ON)

# Handle common options used by all runtimes.
include(AddLLVM)
include(HandleLLVMOptions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't know what we're picking up from HandleLLVMOptions but it all sounds like just dangerous to do in the context of the runtimes build. A naive build locally seems to work if we remove include(HandleLLVMOptions) entirely -- this is what I would try doing instead.

@petrhosek might know more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my first inclination, but I figured this would be the less destructive option. I'm assuming that many of these options are not required for projects that consume the LLVM static / shared libraries such as openmp, but I don't know the intention here strictly enough to make an accurate judgement there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know for certain that I ran into a lot of issues back in the days where we used the LLVM_ENABLE_PROJECTS build caused by the rest of LLVM sneaking in flags via these kinds of CMake files. This is actively harmful. I think it's much better to remove it if we can, and normally our CI would tell us if we missed something.

So I'd support making the naive change that takes us closer to where we should be, and then figuring out where we went wrong in case it breaks something. You can always blame it on me :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a few years since I last looked into this, but I tried removing it (since I agree that it can be actively harmful) and the issue I hit was that many of the runtimes builds were assuming variables and options set by HandleLLVMOptions because those were always present when building with LLVM_ENABLE_PROJECTS.

I'm not sure how much the situation has improved since then, I expect it won't be a big issue for libunwind, libc++abi, libc++ which no longer support LLVM_ENABLE_PROJECTS, but might still be an issue for compiler-rt and libc which do support building with LLVM_ENABLE_PROJECTS (although I hope that won't be the case for too long).

Probably the best path forward would be to try and remove it, see what fails and address each of those cases separately. The main issue is that many of those failures would be silent (e.g. a check that was passing before might be failing now but the build itself would still continue, except the build outputs might be different).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add the include in compiler-rt and in libc if we expect those to rely on it perhaps? For libc++, libc++abi and libunwind, I am fairly confident that if we remove it and our CI doesn't break, we're okay. It might break something, but if that's the case and if it's actively being used then we have pretty good systems in place to fix it quickly, and I'm willing to take responsibility to fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was initially to unblock me on some issues with GPU compilation, but I'm happy to expand the scope if others are in agreement.

There's a handful of clang options we may want to keep to remove some regressions? Here's the set of flags that the libc project has right now:

-fPIC                                                                                                                                                                                                                   
-fno-semantic-interposition                                                                                                                                                                                             
-fvisibility-inlines-hidden                                                                                                                                                                                             
-Werror=date-time                                                                                                                                                                                                              
-Werror=unguarded-availability-new                                                                                                                                                                                             
-Wall -Wextra                                                                                                                                                                                                                  
-Wno-unused-parameter                                                                                                                                                                                                          
-Wwrite-strings -Wcast-qual                                                                                                                                                                                                    
-Wmissing-field-initializers                                                                                                                                                                                                   
-Wimplicit-fallthrough                                                                                                                                                                                                         
-Wcovered-switch-default                                                                                                                                                                                                       
-Wno-noexcept-type                                                                                                                                                                                                             
-Wnon-virtual-dtor                                                                                                                                                                                                             
-Wdelete-non-virtual-dtor                                                                                                                                                                                                      
-Wsuggest-override                                                                                                                                                                                                             
-Wstring-conversion                                                                                                                                                                                                            
-Wmisleading-indentation                                                                                                                                                                                                       
-Wctad-maybe-unsupported                                                                                                                                                                                                       
-fdiagnostics-color                                                                                                                                                                                                            
-ffunction-sections                                                                                                                                                                                                            
-fdata-sections

Do any of these look integral? We probably want to keep -fPIC -ffunction-sections -fdata-sections if nothing else. Warnings should probably also be somewhat consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add the include in compiler-rt and in libc if we expect those to rely on it perhaps? For libc++, libc++abi and libunwind, I am fairly confident that if we remove it and our CI doesn't break, we're okay. It might break something, but if that's the case and if it's actively being used then we have pretty good systems in place to fix it quickly, and I'm willing to take responsibility to fix it.

The impetus for this was attempting to get math tests working in libc for the NVPTX GPU build. The automatic inclusion of -Wl, caused the linker to error as Nvidia's linker doesn't support pretty much anything. The other project I'd be interested in is openmp, which is a full runtime that statically links a lot of the LLVM libraries. I appreciate the help on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think this is the right way to go. We'll handle the fallout as it shows up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelrj-google Any thoughts about libc?

Basically, I want to avoid creating a haunted graveyard around these flags. We all agree they're undesirable and can be harmful, so let's do our best to remove them and see what needs to be fixed in addition. Let's not do the easy thing (this PR as-is) because it just sweeps future problems under the rug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That depends whether we just want to do the minimal thing now and clean up later or whether we want to try and do both at the same time. For example, I'd like to leverage native CMake option where possible, so instead of manually adding -fPIC to compiler flags, we should set POSITION_INDEPENDENT_CODE property on targets that actually need it. Feel free to make only the necessary minimum of changes you need and leave a TODO; we can follow up with more clean up changes later.

@jhuber6 jhuber6 changed the title [llvm] Disable some LLVM arguments in runtimes mode [llvm] Disable HandleLLVMOptions in runtimes mode Nov 21, 2023
@@ -151,9 +151,11 @@ endif()
# Avoid checking whether the compiler is working.
set(LLVM_COMPILER_CHECKED ON)

# This can be used to detect whether we're in the runtimes build.
set(LLVM_RUNTIMES_BUILD ON)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still need this part of the patch now? I don't think you do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't hurt to set it earlier, but I can remove it to make it cleaner.

@jhuber6 jhuber6 force-pushed the DisableLinkerRuntimes branch 2 times, most recently from 7330ba4 to c357105 Compare November 21, 2023 23:03
@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 21, 2023

Seems some of the tests fail, but I'm not seeing any information from the CI. Any ideas?

@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 22, 2023

When I try to investigate the CI output it just says "error", seems to be a libc++ thing.

@ldionne
Copy link
Member

ldionne commented Nov 22, 2023

The CI issues kind of look like infrastructure difficulties to me. Do you mind re triggering the checks by pushing to your branch again?

@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 22, 2023

The CI issues kind of look like infrastructure difficulties to me. Do you mind re triggering the checks by pushing to your branch again?

Seems to fail with the same issue.

@ldionne
Copy link
Member

ldionne commented Nov 22, 2023

The problem is that we don't install a linker script for libc++.so in that case, since LLVM_HAVE_LINK_VERSION_SCRIPT is not detected properly. I'm working on a patch to fix that, I was able to reproduce locally in our Docker image.

Edit: BTW this is exactly the reason why I think this is the right way to go about this patch. It shakes out these kinds of dependencies that we didn't know about and really shouldn't have.

Edit: See #73151.

@ldionne
Copy link
Member

ldionne commented Nov 23, 2023

If you rebase onto main, I think this should be unblocked from the libc++ side now.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 23, 2023

If you rebase onto main, I think this should be unblocked from the libc++ side now.

I'll give it a shot.

Summary:
There are a few default options that LLVM adds that can be problematic
for runtimes builds. These options are generally intended to handle
building LLVM itself, but are also added when building in a runtimes
mode. One such issue I've run into is that in `libc` we deliberately use
`--target` to use a different device toolchain, which doesn't support
some linker arguments passed via `-Wl`. This is observed in
llvm#73030 when attempting to use
these options.
@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 24, 2023

Seems to have passed with the exception of the sanitizers. Assuming that's unrelated.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 27, 2023

I'll land this, and see how it goes.

@jhuber6 jhuber6 merged commit 79b0330 into llvm:main Nov 27, 2023
35 of 38 checks passed
jhuber6 added a commit that referenced this pull request Nov 27, 2023
This reverts commit 79b0330.

I am seeing very confusing errors with the `libomptarget` test suite
when using dlopen / other flags. Reverting to investigate why this is.
jhuber6 added a commit that referenced this pull request Nov 27, 2023
Summary:
There are a few default options that LLVM adds that can be problematic
for runtimes builds. These options are generally intended to handle
building LLVM itself, but are also added when building in a runtimes
mode. One such issue I've run into is that in `libc` we deliberately use
`--target` to use a different device toolchain, which doesn't support
some linker arguments passed via `-Wl`. This is observed in
#73030 when attempting to use
these options.

This patch completely removes these default arguments.

The consensus is that any issues created by this patch should ultimately
be solved on a per-runtime basis.
@dyung
Copy link
Collaborator

dyung commented Nov 28, 2023

@jhuber6 This change seems to be causing a linking failure in a sanitizer test UBSan-ThreadSanitizer-x86_64 :: TestCases/Misc/Linux/sigaction.cpp.

https://lab.llvm.org/buildbot/#/builders/247/builds/11523

RUN: at line 1: /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/clang  --driver-mode=g++ -fsanitize=thread  -m64  -fsanitize=undefined -shared-libsan /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/llvm-project/compiler-rt/test/ubsan/TestCases/Misc/Linux/sigaction.cpp -o /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/runtimes/runtimes-bins/compiler-rt/test/ubsan/ThreadSanitizer-x86_64/TestCases/Misc/Linux/Output/sigaction.cpp.tmp &&  /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/runtimes/runtimes-bins/compiler-rt/test/ubsan/ThreadSanitizer-x86_64/TestCases/Misc/Linux/Output/sigaction.cpp.tmp 2>&1 | FileCheck /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/llvm-project/compiler-rt/test/ubsan/TestCases/Misc/Linux/sigaction.cpp
+ /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/./bin/clang --driver-mode=g++ -fsanitize=thread -m64 -fsanitize=undefined -shared-libsan /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/llvm-project/compiler-rt/test/ubsan/TestCases/Misc/Linux/sigaction.cpp -o /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/runtimes/runtimes-bins/compiler-rt/test/ubsan/ThreadSanitizer-x86_64/TestCases/Misc/Linux/Output/sigaction.cpp.tmp
/usr/bin/ld: /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/lib/clang/18/lib/x86_64-unknown-linux-gnu/libclang_rt.tsan.so: siglongjmp: invalid version 21 (max 0)
/usr/bin/ld: /home/buildbot/buildbot-root/llvm-clang-x86_64-gcc-ubuntu/build/lib/clang/18/lib/x86_64-unknown-linux-gnu/libclang_rt.tsan.so: error adding symbols: bad value
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Note that the first build with your original change (79b0330) the test started to fail (https://lab.llvm.org/buildbot/#/builders/247/builds/11523). Then the first build after the revert, the test started to pass again (https://lab.llvm.org/buildbot/#/builders/247/builds/11529). And finally when the change was re-applied, the test started to fail again (https://lab.llvm.org/buildbot/#/builders/247/builds/11533).

This is causing issues on 2 similarly configured build bots:

Can you take a look fix the test or revert if you need time to investigate?

@EricWF
Copy link
Member

EricWF commented Nov 28, 2023

I've also bisected issues with sanitizer builds back to this commit.

This is likely because part of the handling of LLVM_USE_SANITIZER happens in HandleLLVMOptions.

@zmodem
Copy link
Collaborator

zmodem commented Nov 28, 2023

Chromium is seeing build failures in runtimes builds (https://crbug.com/1505689) and this commit is in the range. I'm trying to come up with some kind of reproducer.

@zmodem
Copy link
Collaborator

zmodem commented Nov 28, 2023

Here's a reproducer for Chromium's problem. On Linux, at ee922e6, in a build directory under llvm-project, e.g. llvm-projects/build/:

$ curl -LO https://commondatastorage.googleapis.com/chrome-linux-sysroot/toolchain/0e28d9832614729bb5b731161ff96cb4d516f345/debian_bullseye_arm64_sysroot.tar.xz
$ mkdir sysroot
$ tar -C sysroot -Jxf debian_bullseye_arm64_sysroot.tar.xz
$ curl -LO https://commondatastorage.googleapis.com/chromium-browser-clang/Linux_x64/clang-llvmorg-17-init-16420-g0c545a44-1.tar.xz
$ mkdir bootstrap
$ tar -C bootstrap -Jxf clang-llvmorg-17-init-16420-g0c545a44-1.tar.xz
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON "-DLLVM_ENABLE_PROJECTS=clang;lld" -DLLVM_ENABLE_RUNTIMES=compiler-rt "-DLLVM_TARGETS_TO_BUILD=AArch64;X86" -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON -DLLVM_ENABLE_LLD=ON -DCMAKE_C_COMPILER=$PWD/bootstrap/bin/clang -DCMAKE_CXX_COMPILER=$PWD/bootstrap/bin/clang++ -DRUNTIMES_aarch64-unknown-linux-gnu_CMAKE_SYSROOT=$PWD/sysroot -DBUILTINS_aarch64-unknown-linux-gnu_CMAKE_SYSROOT=$PWD/sysroot -DRUNTIMES_aarch64-unknown-linux-gnu_LLVM_INCLUDE_TESTS=OFF -DBUILTINS_aarch64-unknown-linux-gnu_LLVM_INCLUDE_TESTS=OFF -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_CRT=ON -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_LIBFUZZER=OFF -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_MEMPROF=OFF -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_ORC=OFF -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_PROFILE=ON -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_SANITIZERS=ON -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_BUILD_XRAY=OFF "-DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_SANITIZERS_TO_BUILD=asan;dfsan;msan;hwasan;tsan;cfi" -DRUNTIMES_aarch64-unknown-linux-gnu_COMPILER_RT_DEFAULT_TARGET_ONLY=ON "-DLLVM_BUILTIN_TARGETS=aarch64-unknown-linux-gnu" "-DLLVM_RUNTIME_TARGETS=aarch64-unknown-linux-gnu" ../llvm
$ ninja runtimes-aarch64-unknown-linux-gnu-build
...
FAILED: /work/llvm-project/build.nanana/lib/clang/18/lib/aarch64-unknown-linux-gnu/libclang_rt.hwasan.so 
...
/usr/bin/ld: unrecognised emulation mode: aarch64linux
Supported emulations: elf_x86_64 elf32_x86_64 elf_i386 elf_iamcu i386pep i386pe
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

I think the clue is in the last line: /usr/bin/ld. It's supposed to be using lld, but probably -DLLVM_ENABLE_LLD=ON isn't getting forwarded to the runtimes build as it was before.

zmodem added a commit that referenced this pull request Nov 28, 2023
This appears to have caused a variety of breakages, see comments on the PR.

> Summary:
> There are a few default options that LLVM adds that can be problematic
> for runtimes builds. These options are generally intended to handle
> building LLVM itself, but are also added when building in a runtimes
> mode. One such issue I've run into is that in `libc` we deliberately use
> `--target` to use a different device toolchain, which doesn't support
> some linker arguments passed via `-Wl`. This is observed in
> #73030 when attempting to use
> these options.
>
> This patch completely removes these default arguments.
>
> The consensus is that any issues created by this patch should ultimately
> be solved on a per-runtime basis.

This reverts commit ee922e6.
@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 28, 2023

I've also bisected issues with sanitizer builds back to this commit.

* https://github.com/llvm/llvm-project/actions/runs/7014942157/job/19085214160

This is likely because part of the handling of LLVM_USE_SANITIZER happens in HandleLLVMOptions.

Seems so, anything that appended flags will cause a regression since they are no longer handled. So, things like LLVM_USE_SANITIZER or LLVM_USE_LINKER aren't adding the flags they normally do. I think the consensus here was that the runtimes should handle this stuff separately. Maybe @ldionne can suggest how to handle this.

For now, we could try to fix it in-place, or we can revert it to get the bots green and we can try again later.

@zmodem
Copy link
Collaborator

zmodem commented Nov 28, 2023

or we can revert it to get the bots green and we can try again later.

I reverted in ea64047 already (I thought GitHub would notify this thread, but maybe it didn't do that.)

@jhuber6
Copy link
Contributor Author

jhuber6 commented Nov 28, 2023

or we can revert it to get the bots green and we can try again later.

I reverted in ea64047 already (I thought GitHub would notify this thread, but maybe it didn't do that.)

Github's really awful about handling reverted patches, the PR just stays closed. We can look into how to solve this better and track the regressions.

@petrhosek
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmake Build system in general and CMake in particular
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants