New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
llc aborts, clang error 70 on specific PPC64 CTR Loop bitcode at -O1 and beyond #36398
Comments
assigned to @nemanjai |
I'll have a look at this. |
This patch should fix the problem. Please let me know whether it does. Of course, it won't fix the attached test case because the attached test case is broken already (i.e. there's no mtctr for the bdnz). But it should prevent CTRLoops followed by the opt pipeline from producing such code. Index: include/llvm/IR/IntrinsicsPowerPC.td--- include/llvm/IR/IntrinsicsPowerPC.td (revision 329903) // Intrinsics used to generate ctr-based loops. These should only be
|
@Nemanja Your proposed patch does indeed fix the testcase that produced this invalid bitcode. It will take several hours to verify that all the other testcases remain unbroken, but I don't expect any new failures and the odds are you've solved the problem. Thank you! |
The entire Numba testsuite passes with no aborts, segfaults or (relevant) failures. Your change solves the problem for Numba. |
Do we need both intrinsics marked as NoDuplicate, or just int_ppc_is_decremented_ctr_nonzero? (it's not obvious to me that mtctr is special in the same way). |
The reason I marked both with the property is in case something attempts to duplicate one and not the other in some unexpected way. I am not sure if this could ever happen but if it did, what would the expected outcome be? I suppose if there is a guarantee that the new call to the mtctr intrinsic cannot possibly end up in the region between the branch target of the BDNZ and the BDNZ instruction itself, it is perfectly safe not to add the property to the mtctr. The provided test case certainly doesn't fail if the property is only on the bdnz intrinsic. Hal, do you know if there's a specific advantage to not marking the mtctr intrinsic as NoDuplicate? Or is it just that we'd prefer not to mark it if there is no clear requirement to do so? Olexa, do you think you'd be able to test out the patch below to see if it is adequate to fix the Numba failures? Index: include/llvm/IR/IntrinsicsPowerPC.td--- include/llvm/IR/IntrinsicsPowerPC.td (revision 329903)
|
I can try out that patch, yes. I take it it's just the second change of your first patch? The PPCCTRLoopsVerify, at https://llvm.org/doxygen/PPCCTRLoops_8cpp_source.html , checks that, and I quote, // Verify that all bdnz/bdz instructions are dominated by a loop mtctr before From my foggy understanding of LLVM, it seems to me that the joint combination of a ppc_is_decremented_ctr_nonzero and a conditional branch are clearly intended to map down to a BDNZ. If so, then:
Observing that
I reason that duplicating mtctr only, while not duplicating ppc_is_decremented_ctr_nonzero, results in strictly stronger domination than duplicating neither. Since duplicating neither does not abort, I would expect your proposed change not to abort either. I will try this out immediately. |
Early signs are encouraging. All of the specific testcases that used to abort when both were duplicated, and were fixed when neither were duplicated, still pass when only mtctr is duplicated. I'll launch the entire testsuite now, but I am more-or-less convinced (as I wrote above) that this will work. |
I don't have anything specific in mind, so mostly the latter. Also, mtctr doesn't seem that special. By the way, what transformation is duplicating the intrinsic? Is it something we'd like to prevent? Is the bug in the verifier? Setting NoDuplicate on these, for example, will prevent unrolling an outer loop with a ctr-based inner loop inside it, and at least that doesn't seem unsound. |
I ran the entire Numba testsuite, and it finished with identical results as with your previous patch: No segfaults, aborts or (relevant) failures. It appears that it is enough to prevent duplication of ppc_is_decremented_ctr_nonzero. |
Excellent news. If Hal agrees, I'll commit that version of the patch. Do you need it backported to LLVM 6? |
It was loop rotation that was causing the break. I agree that unrolling should be fine to do (as long as it copies the mtctr in the preheader as well as the bdnz in the latch). However, I don't know how we would encapsulate the "OK to duplicate if the corresponding mtctr is duplicated as well" constraint. |
I agree that the dominance relation is guaranteed to hold. My initial thought was that since I don't know the loop optimization pipeline well enough, I cannot be sure that we won't produce invalid CTR loops. I was just afraid we would produce a situation such as this:
BT: But clearly this does not happen or you'd likely see run-time problems such as infinite loops, incorrect numbers of iterations, etc. |
Sounds good. Yes, please commit this (with some comment about preventing loop rotation, etc.) - and with a test case if possible. Thanks! |
Yes please. In fact, the following need backporting to LLVM 6.0.1 in order to enable Numba to claim support for PowerPC:
This would mean a lot for the Numba guys, and PowerPC more broadly. At present they make the JITing and execution of linkonce_odr functions spread across several modules quite simply unusable. Thanks ahead of time! |
Hi Hal, Hans, This is one of a series of issues that the Numba PPC port has uncovered and that have now been fixed. It is rather important for us to port the fixes back to LLVM 6. Below is a list of commits and their quick descriptions. r330186 - The fix for this bug I could open separate PR's for the relocations and global entry points if you'd prefer or I can commit these to the release_60 branch if that helps. |
Makes sense to me. |
Hi Tom, the patches listed in comment 16 are approved by the code owner (Hal). Thanks Hal. Please let me know if there's anything else you'd like me to do with regard to merging these. |
Any movement on backporting the above 5 patches to 6.0.1? Once we know that LLVM 6.0.1 will contain the fixes, it would enable Numba to claim full support for PPC64 as of LLVM 6.0.1. Otherwise, we will have to wait till LLVM 7.0.0 and retest against any of its API/ABI-breaking changes. |
I've merged these into the release_60 branch: r331714, r331715, r331716. |
Thank you so much Tom! |
Extended Description
While debugging an "Invalid PPC CTR loop!" problem with LLVM JIT codegen for Numba (numba/numba#2451 (comment)), I managed to isolate an ugly, 186KB LLVM IR module that caused both llc and clang-7.0.0svn trunk to abort() with that "Invalid PPC CTR loop!" message.
Bugpoint then stripped it down to 1.6KB of bitcode that causes Clang to error out (exit code 70) instead, but llc still aborts in the same way.
Bitcode is attached. Console printouts below. All this is also viewable at the Numba Github link above.
The Clang error printout:
$ clang -O3 -v -c crashir-reduced.bc -o crashir-reduced.bc.o
clang version 7.0.0 (https://git.llvm.org/git/clang.git/ f38834898ad991aec557e2a609c6b7c3b288bf10) (https://git.llvm.org/git/llvm.git/ 32d2ff0)
Target: powerpc64le-unknown-linux-gnu
Thread model: posix
InstalledDir: /.local/bin
Found candidate GCC installation: /usr/lib/gcc/ppc64le-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/ppc64le-redhat-linux/4.8.5
Selected GCC installation: /usr/lib/gcc/ppc64le-redhat-linux/4.8.5
Candidate multilib: .;@m64
Selected multilib: .;@m64
"/.local/bin/clang-7" -cc1 -triple powerpc64le-unknown-linux-gnu -emit-obj -disable-free -main-file-name crashir-reduced.bc -mrelocation-model pic -pic-level 2 -mthread-model posix -fmath-errno -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu ppc64le -mfloat-abi hard -target-abi elfv2 -dwarf-column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -v -coverage-notes-file /NUMBA_SOURCE_PATH/bugpointwdir/crashir-reduced.bc.gcno -resource-dir /.local/lib/clang/7.0.0 -O3 -fdebug-compilation-dir /NUMBA_SOURCE_PATH/bugpointwdir -ferror-limit 19 -fmessage-length 184 -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o crashir-reduced.bc.o -x ir crashir-reduced.bc
clang -cc1 version 7.0.0 based upon LLVM 7.0.0svn default target powerpc64le-unknown-linux-gnu
fatal error: error in backend: Cannot select: intrinsic %llvm.ppc.is.decremented.ctr.nonzero
clang-7: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 7.0.0 (https://git.llvm.org/git/clang.git/ f38834898ad991aec557e2a609c6b7c3b288bf10) (https://git.llvm.org/git/llvm.git/ 32d2ff0)
Target: powerpc64le-unknown-linux-gnu
Thread model: posix
InstalledDir: /.local/bin
clang-7: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-7: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
The llc printout:
$ llc -O3 crashir-reduced.bc -o crashir-reduced.bc.o
Invalid PPC CTR loop!
UNREACHABLE executed at /LLVM_SOURCE_PATH/lib/Target/PowerPC/PPCCTRLoops.cpp:777!
LLVMSymbolizer: error reading file: No such file or directory
#0 0x000000001271c7e0 (llc+0x1271c7e0)
#1 0x000000001271c8e4 (llc+0x1271c8e4)
#2 0x000000001271a568 (llc+0x1271a568)
#3 0x000000001271bd88 (llc+0x1271bd88)
#4 0x00003fff9ab50478 0x478 __GI_abort
#5 0x00003fff9ab50478
#6 0x00003fff9ab50478 (+0x478)
#7 0x00003fff9a610d70 (/lib64/libc.so.6+0x40d70)
#8 0x00000000126666e8 (llc+0x126666e8)
#9 0x0000000010c4bcd4 (llc+0x10c4bcd4)
#10 0x0000000011555c24 (llc+0x11555c24)
#11 0x0000000011c87390 (llc+0x11c87390)
#12 0x0000000011c875f4 (llc+0x11c875f4)
#13 0x0000000011c87b80 (llc+0x11c87b80)
#14 0x0000000011c884ec (llc+0x11c884ec)
#15 0x0000000011c888c8 (llc+0x11c888c8)
#16 0x0000000010b6ceac generic_start_main.isra.0 (llc+0x10b6ceac)
#17 0x0000000010b6b3b8 __libc_start_main (llc+0x10b6b3b8)
llc(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x58)[0x1271c7e0]
llc[0x1271c8e4]
llc(_ZN4llvm3sys17RunSignalHandlersEv+0xb4)[0x1271a568]
llc[0x1271bd88]
[0x3fff9ab50478]
/lib64/libc.so.6(abort+0x280)[0x3fff9a610d70]
llc(_ZN4llvm25llvm_unreachable_internalEPKcS1_j+0x104)[0x126666e8]
llc[0x10c4bcd4]
llc(_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE+0x22c)[0x11555c24]
llc(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x170)[0x11c87390]
llc(_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE+0x74)[0x11c875f4]
llc[0x11c87b80]
llc(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x164)[0x11c884ec]
llc(_ZN4llvm6legacy11PassManager3runERNS_6ModuleE+0x38)[0x11c888c8]
llc[0x10b6ceac]
llc(main+0x580)[0x10b6b3b8]
/lib64/libc.so.6(+0x24700)[0x3fff9a5f4700]
/lib64/libc.so.6(__libc_start_main+0xc4)[0x3fff9a5f48f4]
Stack dump:
0. Program arguments: llc -O3 crashir-reduced.bc -o crashir-reduced.bc.o
Aborted
The text was updated successfully, but these errors were encountered: