-
Notifications
You must be signed in to change notification settings - Fork 12.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clang fails to pass validation when compiled with gcc-9 in release build. #39893
Comments
The problem lies in the initialization of the bitfield members of ABIArgInfo. |
+hans for release bugs. |
Fix available in https://reviews.llvm.org/D57523 |
CGCall.cpp:2060 at 8.0.0-rc1 is this line:
that returns the InReg bitfield. But that bitfield does get initialized (to false) in both of ABIArgInfo's constructors. So I don't understand what the error is here exactly? |
Can you provide more information about exactly what gcc version you're using (I believe 9 doesn't even have a release branch yet), how your llvm build is configured, and in what way "validation" fails, i.e. an example of a test that's failing. |
Sure. GCC version is from the gcc-9 branch, that's the one used on fedora rawhide: svn export svn://gcc.gnu.org/svn/gcc/branches/redhat/gcc-9-branch@268371 The llvm build uses RelWithDebInfo You can get the whole setup here: https://koji.fedoraproject.org/koji/taskinfo?taskID=32372028 |
EDIT: the above link is for a rebuild of 7.01, but I had the same issue with 8.0.0rc1. |
Perfect, I'll take a look. |
@hans: same problem when rebuilding from master with the following setup: gcc built from svn export svn://gcc.gnu.org/svn/gcc/branches/redhat/gcc-9-branch@268371 llvm master CXXFLAGS -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection CFLAGS=CXXFLAGS cmake options: ../llvm-project/llvm -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_CXX_COMPILER=/opt/notnfs/sergesanspaille/install/bin/g++ -DCMAKE_C_COMPILER=/opt/notnfs/sergesanspaille/install/bin/gcc |
I was able to reproduce this with the specified gcc version, except I only got four test failures: Failing Tests (4): I'm still not sure what's going on though. Maybe the next step is to try bisecting gcc. |
Does the redhat/gcc-9-branch branch exist in gcc's git mirror? |
Never mind, I can reproduce with trunk gcc. |
My bisection last night failed, and I'll probably not have time to try again today, so if someone else wants to give it a try, please do so. |
I think my previous bisection failed because the test failure isn't completely deterministic. The same test doesn't fail each time. I tried bisecting again, building gcc, then clang, then running check-all at each step, and it points to this: 75a6f948423540b728611660963d1740440d87eb is the first bad commit
:040000 040000 73c34e4b0fc97de49ad228c609d495461e496b90 5882541f2e6fbcd712fd641f535238b4856ff045 M gccThat doesn't seem very interesting. If it is indeed the culprit, I guess it only uncovered some already-existing problem :-/ |
I can't really spend more time on this. Landing https://reviews.llvm.org/D57523 seems like the way to go. |
Patch merged, thanks Hans! |
Looks like it was reverted in r354549. Let's keep this open until it sticks and also makes it to the release_80 branch. |
Serge, do you have any ideas for how to do this without breaking various GCC versions? :-) |
Not quite :-) As gcc 9 is not officially out yet, as of https://gcc.gnu.org/releases.html at least, I guess it's okay to still do the release, and eventually backport a patch if needed? |
As we have an idea of how to work around it, it would still be nice to fix before release if possible. Of course it would be even better if GCC was fixed before released. Do you know if there's a bug tracking this on the GCC side? |
Ping? |
I'll give it another try today :-) |
Unblocking the release from this. |
On Ubuntu 18.08, I'm seeing the following Failing Tests (5):
caused by https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=268448 But these changes only caused failures in clang 7+. Bisecting clan on Alex Bradbury, |
*** Bug llvm/llvm-bugzilla-archive#42927 has been marked as a duplicate of this bug. *** |
I'm still getting around 129 test failures when building clang with gcc-9 on Fedora 31. |
Also, re-applying https://reviews.llvm.org/rL354546 fixes the failures for me. |
The patch changing inliner metrics should not trigger wrong code, so indeed it probably just uncovers latent problem. Is it understood whether this is a GCC bug and if so, do you have a testcase? Note that I have also run into misoptimization issues while compiling clang with GCC 9 and FDO. I had no chance to debug it, but it was crashing after parsing even quite simple source file. It would be great to figure out where the problem is. To compile with profile feedback you need to pass -fprofile-generate in addition to standard flags and build whole tree, then train by testsuite and then rebuilding with -fprofile-use. |
I had trouble re-applying that cleanly due to the changes in https://reviews.llvm.org/D60349 and created the attached patch. I'm not a programmer or coder, please verify the attached patch. |
One more data point. I bisected Clang against: which pointed to: Duncan, |
Interesting! It caused problems in very old (unsupported) versions of clang too, likely due to a bug in the AArch64 backend. I won’t be able to investigate GCC to figure out the bug, but let me know if you want more details about what that commit did. I suspect it’s a recent regression in GCC since that commit landed over a year ago in Clang. |
I took one last stab at bisecting gcc. This time rather trace back a current failure, I bisected against the first validation failures, which yielded: |
It would be nice to get to the bottom of this, but I don't think we can block the release on it. |
I made another attempt to bisect gcc. Rather target a specific failure, this time I bisected against any validation failure. This points to While subsequent gcc tweaks resulted in different tests failing, this commit appears to be the genesis of all of the failures. |
Have you filed a gcc bug for this? |
Done. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91758 Please add any useful information that I may have left out, as I am neither a clang nor gcc developer. |
Ok, I've got a patch. It's an issue in clang where ABIArgInfo constructor does not initialize some of the bit fields and then we end up with usage of an uninitialized variables. Following patch should fix it: diff --git a/clang/include/clang/CodeGen/CGFunctionInfo.h b/clang/include/clang/CodeGen/CGFunctionInfo.h
public:
I'm going to send the patch to Phabricator. |
Have you tested this patch with gcc-7? This similar patch was committed and reverted due to ICE in gcc-7: https://reviews.llvm.org/D57523 |
Ah, you had the patch. Yes, the patch should be re-applied! |
There was a link in the revert commit, but the logs have been deleted now: |
Ok, I'm rebuilding clang with the CC=gcc-7 CXX=g++-7 with the very same instructions like here: The ICE was a normal ICE in GCC compiler during compilation of a clang object file, right? |
I can't see the ICE with: So please reapply the patch back. |
Can you submit your patch? The old patch doesn't apply cleanly any more. |
Proposed patch |
Can you point to where the uninitialized variable gets used? I stared hard at the code but couldn't find it. Maybe initializing all of them is a good idea in any case though. |
It happens in void computeInfo(CGFunctionInfo &FI) const override {: Here there's an assignment which eventually will copy members of the ABIArgInfo.
|
Commited as https://reviews.llvm.org/rGe93aded7f02d661234ad81aac0785ccdef6a79dd |
Verified that r372281 fixes the issue. Both 'make check-all' for clang r372468 with no failures. |
Can we get this backported to 9.0.1 (if it isn't yet)? |
Merged: 7fc9f12 |
mentioned in issue llvm/llvm-bugzilla-archive#42927 |
mentioned in issue #42705 |
Extended Description
Building clang 8.0.0rc1 for fedora using gcc 9 compiles file, but validation fails in ~150 places.
When building with -O0, everything validates fine.
When building with -O1, everything validates fine.
When building with -O2, validation fails.
When building with -O2 -fsanitize=address, everything validates fine.
Running clang under valgrind gives good hint on the error location:
valgrind /builddir/build/BUILD/cfe-8.0.0rc1.src/_build/bin/clang -cc1 -internal-isystem /builddir/build/BUILD/cfe-8.0.0rc1.src/_build/lib64/clang/8.0.0/include -nostdsysteminc -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm /builddir/build/BUILD/cfe-8.0.0rc1.src/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp -o -
==2582== Conditional jump or move depends on uninitialised value(s)
==2582== at 0x880DC79: clang::CodeGen::CodeGenModule::ConstructAttributeList(llvm::StringRef, clang::CodeGen::CGFunctionInfo const&, clang::CodeGen::CGCalleeInfo, llvm::AttributeList&, unsigned int&, bool) (CGCall.cpp:2060)
==2582== by 0x8A1E749: clang::CodeGen::CodeGenModule::SetLLVMFunctionAttributes(clang::GlobalDecl, clang::CodeGen::CGFunctionInfo const&, llvm::Function*) (CodeGenModule.cpp:1171)
==2582== by 0x8A238AA: clang::CodeGen::CodeGenModule::SetFunctionAttributes(clang::GlobalDecl, llvm::Function*, bool, bool) (CodeGenModule.cpp:1548)
==2582== by 0x8A5CA16: clang::CodeGen::CodeGenModule::GetOrCreateLLVMFunction(llvm::StringRef, llvm::Type*, clang::GlobalDecl, bool, bool, bool, llvm::AttributeList, clang::CodeGen::ForDefinition_t) (CodeGenModule.cpp:2824)
==2582== by 0x87F7FAC: clang::CodeGen::CodeGenModule::getAddrOfCXXStructor(clang::CXXMethodDecl const*, clang::CodeGen::StructorType, clang::CodeGen::CGFunctionInfo const*, llvm::FunctionType*, bool, clang::CodeGen::ForDefinition_t) (CGCXX.cpp:253)
==2582== by 0x882255E: clang::CodeGen::CodeGenFunction::EmitCXXConstructorCall(clang::CXXConstructorDecl const*, clang::CXXCtorType, bool, bool, clang::CodeGen::Address, clang::CodeGen::CallArgList&, clang::CodeGen::AggValueSlot::Overlap_t, clang::SourceLocation, bool) (CGClass.cpp:2134)
==2582== by 0x882343C: clang::CodeGen::CodeGenFunction::EmitCXXConstructorCall(clang::CXXConstructorDecl const*, clang::CXXCtorType, bool, bool, clang::CodeGen::Address, clang::CXXConstructExpr const*, clang::CodeGen::AggValueSlot::Overlap_t, bool) (CGClass.cpp:2052)
==2582== by 0x88B1B79: clang::CodeGen::CodeGenFunction::EmitCXXConstructEx
(...)
The text was updated successfully, but these errors were encountered: