-
Notifications
You must be signed in to change notification settings - Fork 5.9k
8343698: Linux x86_64 lto build gives a lot of warnings and fails lto-wrapper: fatal error: make returned 2 exit status #22069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back mbaesken! A progress list of the required criteria for merging this PR into |
@MBaesken This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 214 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
Could we simply disable the attribute warning for the affected area with a comment explaining why the warning is disabled as a workaround rather than disabling LTO for the whole file? This current solution would cause decline in the quality of the resulting assembly when LTO is enabled |
We can try but I am not sure how to do it (and not sure if this will fix the build error) . |
From a cursory glance it seems that the problem is with trim_queue_to_threshold. I see 3 possible solutions, from most likely to work to least likely. I'll try to test them and come back to you // See 8343698 for why this is NOINLINE
// FIXME: This worsens code quality by inhibiting compiler inlining heuristics, but is the only solution for now
NOINLINE void* os::malloc(size_t size, MemTag mem_tag, const NativeCallStack& stack) DISABLED_WARNINGS_gcc_linux_g1ParScanThreadState.cpp := attribute-warning, \ // See 8343698 for the reason this workaround is in place
PRAGMA_DIAG_PUSH
PRAGMA_DISABLE_GCC_WARNING("-Wattribute-warning")
// trim_queue_to_threshold definition here
PRAGMA_DIAG_POP |
Hi Julian ,
thanks for looking into it! Maybe you find something better than disabling lto for the file (or disabling the flatten attributes in the file). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the approach being taken by the proposed changes is not the right way to
address the issue.
# disable lto in g1ParScanThreadState because of special inlining/flattening used there | ||
ifeq ($(call check-jvm-feature, link-time-opt), true) | ||
BUILD_LIBJVM_g1ParScanThreadState.cpp_CXXFLAGS := -fno-lto | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with this change I still get ~200 warnings when building with gcc13.2,
mostly -Wattribute-warning. This is consistent with what I reported previously
with ATTRIBUTE_WARNING as a nop. So this change isn't sufficient.
I'm also not sure it's desirable. We don't know the impact of LTO on top of
flattening. It might be beneficial. This change ought to come with some
performance measurements, if made at all. And not under the rubric of a
warnings fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are quite a number of things in HotSpot that seemingly wouldn't work with LTO enabled. One notable example is os::current_stack_pointer, which on many platforms simply returns the frame pointer of os::current_stack_pointer or the address of a local inside os::current_stack_pointer. With LTO enabled this becomes eligible for inlining, and the "stack pointer" ends up becoming the frame pointer of the caller, which is not correct. Fixing these issues and making LTO viable is one of the things I hope to do, although not now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm ... I thought we had LTO enabled in the past for Java SE Embedded ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And not under the rubric of a warnings fix.
Maybe the title is a bit misleading, it is more about a build fix (lto build did not work for me with gcc 11.3.0; with this fix it works).
If the lot stuff is broken and there is no intent to fix it, we might consider removing it (but fixing it would probably be better).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would hope that LTO won't be removed. I plan to try making it viable, and I also use it downstream on Windows. I'm having trouble downgrading my gcc package on WSL to reproduce the warning, so might take me some time to find a better fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, I'm seeing 3 warnings, 2 of which are for the same code. Both
are -Walloc-size-larger-than= warnings for calls to operator new[].
In gtestMain.cpp, in init_jvm, new JavaVMOption[num_jvm_options]
is being
warned about. The warning is obviously wrong, as the value it is complaining
about uint64_max, while the variable has int type.
The other is in (non-Windows) CreateArgvFromArgs in gtest-death-test.cc, and
is being too conservative about value bounds, so also wrong.
So both of these appear to be LTO bugs.
Of course, it's not possible to suppress these warnings via pragmas, because
of the bug that LTO doesn't honor such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching up with the discussion...
LTO has been working before, and was indeed enabled for the old embedded platform. Since then, it has not been regularly tested and has certainly bit-rottet.
I think it is a good thing to try to keep it alive. There is potential for big performance improvements, even if that will only be utilized for special builds. The additional build time is the main culprit here. But it is possible that, as Kim says, the entire build is perhaps "just" slowed down by a factor of 2x, and that might be acceptable, even if the Hotspot build is slowed down several magnitudes.
Also, there are many promising efforts going on right now, that is trying to get some or all of the LTO benefits in different ways. In particular, iirc, the LLVM linker lld has some kind of "partial LTO" which, iiuc, tries to pick the most important aspeect of LTOs but implemented in a way that has minimal build performance impact.
It is not entirely clear what correspondence there is between keeping the traditional gcc LTO alive, and being able to use this new clang partial LTO, but as a reasonable first guess it will be helpful to keep it working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old LTO used to set -O3 directly on both LDFLAGS and CFLAGS, bypassing OPTIMIZATION variable entirely. Just wanted to put this out there, since the topic of traditional gcc LTO came up. I changed that to set OPTIMIZATION to HIGHEST_JVM some time ago, and I'm not sure whether my change played any part in breaking LTO
LTO is currently only supported for VC and gcc. I can look into how to implement it for clang, though not now. First I'm committed to fixing all the issues with LTO, and making sure Kim's worry of poisoning virtually disappearing when LTO is enabled doesn't become a reality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, we should support LTO for gcc AND clang on Linux/macOS.
Regarding the longer link/build times , I can add some info how much longer it takes in our environment (SUSE Linux/gcc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add some info here: On Windows, with LTO enabled it takes gcc about 15-20 minutes to link the JVM, without it takes anywhere between 8-10 minutes to do so. ld is very slow on Windows due to a bug, so the numbers are skewed towards the high end. I don't have any numbers for VC, but I can assume that even LTO with VC and link.exe is very fast, because VC in general is an extremely fast compiler
make/hotspot/lib/JvmFeatures.gmk
Outdated
@@ -170,12 +170,12 @@ ifeq ($(call check-jvm-feature, link-time-opt), true) | |||
# later on if desired | |||
JVM_OPTIMIZATION := HIGHEST_JVM | |||
ifeq ($(call isCompiler, gcc), true) | |||
JVM_CFLAGS_FEATURES += -flto=auto -fuse-linker-plugin -fno-strict-aliasing \ | |||
JVM_CFLAGS_FEATURES += -DINCLUDE_LTO=1 -flto=auto -fuse-linker-plugin -fno-strict-aliasing \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why this was included in this PR (and similarly below), since
there's no use of it anywhere. However, it would be useful for
conditionalizing the definition of FORBID_C_FUNCTION, as suggested in JBS. But
just adding the defines here is not sufficient. In utilities/macros.hpp, there
should be a conditional #define to the default value (0) if the macro is not
already defined, so that we can consistently use #if INCLUDE_LTO
and
#if !INCLUDE_LTO
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was included because all JVM features have a related define (e.g. INCLUDE_JFR, INCLUDE_JVMCI) and I missed that one here.
Or is it already available and I just missed the define related to link-time-optimization ?
But in a way you are correct , currently the added define it not needed technically in this PR .
And we might think about the naming (INCLUDE_LINK_TIME_OPTIMIZATION vs. INCLUDE_LTO).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer a more meaningful name. LTO seems a little generic and opaque. If you know then you
know, but if you don't then you are puzzled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed the define to INCLUDE_LINK_TIME_OPTIMIZATION.
If you want me to change that I can add the handling of INCLUDE_LINK_TIME_OPTIMIZATION to macros.hpp . But unlikely many other macros the LTO JVM feature is usually not enabled, so this would mean we would set for every compilation unit |
make/hotspot/lib/JvmFeatures.gmk
Outdated
@@ -170,12 +170,12 @@ ifeq ($(call check-jvm-feature, link-time-opt), true) | |||
# later on if desired | |||
JVM_OPTIMIZATION := HIGHEST_JVM | |||
ifeq ($(call isCompiler, gcc), true) | |||
JVM_CFLAGS_FEATURES += -flto=auto -fuse-linker-plugin -fno-strict-aliasing \ | |||
JVM_CFLAGS_FEATURES += -DINCLUDE_LINK_TIME_OPTIMIZATION=1 -flto=auto -fuse-linker-plugin -fno-strict-aliasing \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this. I find no pre-existing INCLUDE_LINK_TIME_OPTIMIZATION ifdefs, and you are not adding any. What is the purpose of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked the same question. But as further discussion has indicated, we might need this define for
disabling the poisoning of selected C library functions when building with LTO, because LTO doesn't
support the mechanism used for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all JVM features have these INCLUDE macros so I wanted to add it it for this JVM feature too. But I can remove it from the PR if it is just wanted for the other JVM features but not this one for some reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the link-time-opt
feature is a bit different from the others -- it does not select a different set of sources to be compiled, but affects how the JVM is compiled/linked. I don't think there is a need to define these preprocessor macros just out of pure symmetry reason.
If it turns out in the future that we need to compile code differently, then we can add these defines at that time.
There's no need to set it for every compilation unit. Like other feature handling in macros.hpp, something like this
And of course, flip things if we later change to make LTO the default. |
As suggested, I moved INCLUDE_LINK_TIME_OPTIMIZATION to macros.hpp . Regarding build time (of a scratch build) I saw no significant decrease for lto compared to my normal build , it was even a little faster (but it is on a server where more stuff is running than my build so probably the results are a bit influenced by this). |
Btw. regarding the disabling of lto for g1ParScanThreadState.cpp |
Those were the build time numbers (JOBS=16, SUSE 15 Linux x86_64, gcc 11.3.0 devkit)
non-lto
both builds were opt scratch builds, without gtest (--with-gtest=... removed from my build script). |
A first nice advantage seems to be the smaller size of libjvm.so coming from the lto build;
normal build
This is in line to what TI claims here for some lto vs non-lto builds (but at TI for aarch/arm and not for x86_64) |
When switching to size-optimization (adding
normal build
|
There's no need to add the =yes at the end there, simply --enable-jvm-feature-link-time-opt or --enable-jvm-feature-opt-size will do |
On Linux ppc64le (uses also a gcc 11.3.0 devkit) the size improvements can be observed too when using lto .
lto enabled
lto enabled + opt-size enabled
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, the PR looks good from a build perspective. I think you need buy-in from at least one Hotspot developer as well.
/reviewers 2 |
Hi Magnus, thanks for the review ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I count as a HotSpot developer, and admittedly I am still not very fond of these changes since they adversely affect the quality of the resulting assembly when LTO is active, but this gets a reluctant ok from me since without it LTO doesn't even work on Linux, and Kim doesn't seem very happy about disabling the attribute warning when LTO is on. I still think you should let @kimbarrett have the final say on whether this gets the go ahead or not. I hope we can work on removing this workaround as fast as possible
Hi Julian, thanks for the review ! |
Yes, I agree that getting it working is more important. In the meantime, I am trying to find a solution to the FORBID_C_FUNCTION and LTO problem. It is my hope that I can find something that makes this workaround not needed, and quickly. But if you asked me if I had any ideas now, the only thing I can think of is to NOINLINE os::malloc, since that it where the inlined call to malloc originates from. But I cannot replicate the LTO failure on my WSL install for some reason, so have no way of verifying if it works. Either way, this is probably ok, though might want to ask Kim for the final go ahead, since I don't know if I count as a HotSpot developer |
/integrate |
Going to push as commit 9576546.
Your commit was automatically rebased without conflicts. |
I spent a lot of time trying to come up with a different and more cross platform way of poisoning code within HotSpot that doesn't break on LTO. I think I'm almost there, but a few kinks need to be worked out first: https://godbolt.org/z/o1WPPexad |
When trying LTO (configure flag --enable-jvm-feature-link-time-opt=yes) on Linux x86_64, gcc 11.3.0, we run into a lot of warnings and finally into this error :
.. tons of free and malloc related warnings ...
Looks like the flattening attribute usage in src/hotspot/share/gc/g1/g1ParScanThreadState.cpp causes the issues.
So it was discussed to remove the flattening attributes in this file but this was not seen as a good idea. So better disable lto when building this special file with gcc .
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22069/head:pull/22069
$ git checkout pull/22069
Update a local copy of the PR:
$ git checkout pull/22069
$ git pull https://git.openjdk.org/jdk.git pull/22069/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 22069
View PR using the GUI difftool:
$ git pr show -t 22069
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22069.diff
Using Webrev
Link to Webrev Comment