-g causes different code to be generated due to PostRA scheduling differences #36588
Comments
I was under the impression that there was an existing bug for this, The .cfi_* directives are there to allow generating information that |
It seems to me that we should be able to model the relevant dependencies (e.g. by adding some implicit reads/writes to the CFI_INSTRUCTION) such that only valid re-orderings would occur. We might also need to teach the scheduler to schedule these instructions as soon as they are available, or in some other way prevent their presence from altering scheduling decisions. That being said, we'd need to weigh the complexity of doing all of this against the benefit of having slightly more consistent code generation w.r.t. debug flags. FWIW, this isn't something I am actively pursuing. |
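To make the "implicit reads/writes" idea concrete, here is a minimal toy sketch in Python. It is not LLVM code: the `Instr` class, the `may_reorder` helper, and the choice to model a CFI directive as an implicit read of `sp` are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: a name plus the resources it reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def may_reorder(a, b):
    """Return True if a and b have no RAW, WAR or WAW dependency."""
    return not (a.writes & b.reads or a.reads & b.writes or a.writes & b.writes)


# Assumption: the CFI directive is modelled as implicitly reading sp,
# so it cannot be moved across an instruction that writes sp.
stp = Instr("stp x23, x22, [sp, #-48]!",
            reads=frozenset({"sp", "x22", "x23"}),
            writes=frozenset({"sp", "mem"}))
cfi = Instr(".cfi_def_cfa_offset 48", reads=frozenset({"sp"}))
adrp = Instr("adrp x8, X1", writes=frozenset({"x8"}))

print(may_reorder(stp, cfi))   # the CFI stays pinned behind the sp write
print(may_reorder(cfi, adrp))  # an unrelated instruction can move past it
```

With dependencies modelled like this, the CFI directive would constrain only the instructions it actually describes, rather than splitting the whole block.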
It is a given, among people who do debug info seriously, that changing And if this actually interferes with other optimizations like tail merging Thanks for filing the bug! |
In the PostRA Machine Instruction Scheduler (postmisched) pass, the scheduling regions [I, RegionEnd] are broken up by CFI instructions. Tracing the code from getSchedRegions:
if (isSchedBoundary(&MI, &*MBB, MF, TII)) isSchedBoundary() TII->isSchedulingBoundary(*MI, MBB, *MF); // llvm/include/llvm/CodeGen/MachineInstr.h#L1024
|
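The effect of a boundary instruction on region formation can be shown with a small toy model in Python. This is only a sketch: `sched_regions`, `is_cfi`, and the abstract `su0`..`su3` names are hypothetical, and the real getSchedRegions walks the block bottom-up, although the resulting partition is the same.

```python
def sched_regions(block, is_boundary):
    """Split a block into maximal runs of non-boundary instructions."""
    regions, current = [], []
    for instr in block:
        if is_boundary(instr):
            # A boundary closes the current region and is not part of any region.
            if current:
                regions.append(current)
            current = []
        else:
            current.append(instr)
    if current:
        regions.append(current)
    return regions


def is_cfi(instr):
    """Toy predicate: treat any instruction named 'CFI' as a boundary."""
    return instr == "CFI"


with_debug = ["su0", "su1", "CFI", "su2", "su3"]   # block as built with -g
without_debug = ["su0", "su1", "su2", "su3"]       # same block without -g

print(sched_regions(with_debug, is_cfi))     # two small regions
print(sched_regions(without_debug, is_cfi))  # one large region
```

Because the scheduler only reorders within a region, the two builds can end up with different schedules for the same machine instructions.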
Chris wrote:
From what Paul wrote in comment 1, I believe the risk is that machine instructions will be rescheduled in such a way that the CFI instructions are no longer correct, which will mislead debuggers. Fully fixing this probably means the scheduler needs to understand CFI instructions, so that it can fix up the changes that it makes. An alternative interpretation would be that it's better to create broken debug info than to change the generated code, but that might be difficult to argue. |
In this case, CFI instructions are recognized as scheduling boundaries and break one region into two. For example, su(0) and su(1) form one region, and su(9) and su(10) form another. They should be scheduled as a single region, but the CFI barriers prevent that. After replacing MI.isPosition with MI.isLabel in isSchedulingBoundary, CFI instructions no longer break the regions; they are placed in the same region as the machine instructions and scheduled together with them. Here is the debug information:
Instruction sequence before scheduling:
After scheduling (no code change) - the schedule is unchanged, postmisched fails.
After scheduling (with code change) - the machine instructions are correct, but the CFI instruction moved after su(0).
Code change:
Could the issue be fixed with this change? This way the CFI instructions take part in scheduling and are preserved in the code. I am not sure whether this carries risk elsewhere. |
I think your idea is not a valid fix for the problem. Although the problem of having differing assembly will certainly be fixed by your change, it would move cfi instructions in an invalid way and away from stack altering instructions. Please see this discussion on the mailing list: http://lists.llvm.org/pipermail/llvm-dev/2019-September/135433.html And this patch on Phabricator: https://reviews.llvm.org/D68076 |
Hi Chris, I've been digging around and, with some help, I think I now
There is a TL;DR near the end. The following write-up follows my investigation
Clickbait title: CFI_INSTRUCTIONs don't need to be scheduling barriers.

Scheduling barriers
If an instruction, implicitly or otherwise, modifies the stack pointer

Call Frame Information (CFI) instructions
CFI instructions are used to work out offset of the Canonical Frame Address
CFI_INSTRUCTIONs that we care about for this example:

x86
When we spill registers on x86 we will most likely
Assuming the CFI_INSTRUCTIONs correctly always follow the instruction that
--- Without CFI

Goeff's aarch64 reproducer
Taken from this ticket, note the new comments in the square brackets.
stp x23, x22, [sp, #-48]! // 8-byte Folded Spill [READ-WRITE SP]
Here, there are two spills which do not modify SP but do require

[TL;DR] By definition, instructions which modify SP are both scheduling barriers

Potential fix (suggested by @jmorse offline)
Pull out CFI_INSTRUCTIONS before scheduling. Map them to the instructions

[0] http://lists.llvm.org/pipermail/llvm-dev/2019-September/135433.html |
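The potential fix described above (pull the CFI instructions out, remember which instruction each one followed, schedule, then put them back) can be sketched as a toy model in Python. Everything here is an assumption for illustration: `strip_cfi` and `reinsert_cfi` are hypothetical helpers, instructions are plain strings assumed to be unique within the block, and the instruction sequence is illustrative rather than the actual reproducer output.

```python
def strip_cfi(block):
    """Remove CFI directives, recording which instruction each one followed."""
    instrs, attached = [], {}
    for instr in block:
        if instr.startswith(".cfi"):
            # A CFI seen before any real instruction is keyed by None.
            key = instrs[-1] if instrs else None
            attached.setdefault(key, []).append(instr)
        else:
            instrs.append(instr)
    return instrs, attached


def reinsert_cfi(scheduled, attached):
    """Re-emit each CFI directly after the instruction it was attached to."""
    out = list(attached.get(None, []))
    for instr in scheduled:
        out.append(instr)
        out.extend(attached.get(instr, []))
    return out


block = [
    "stp x23, x22, [sp, #-48]!",   # modifies sp
    ".cfi_def_cfa_offset 48",
    "stp x20, x19, [sp, #32]",     # spill that does not modify sp
    ".cfi_offset w19, -8",
    "adrp x8, X1",
]

instrs, attached = strip_cfi(block)
# Hypothetical scheduler decision: hoist adrp above the second spill.
scheduled = [instrs[0], instrs[2], instrs[1]]
print(reinsert_cfi(scheduled, attached))
```

In this model the scheduler never sees the CFI directives, so the schedule is the same with or without debug info, and each directive still follows the instruction it describes.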
Thanks, Orlando, for the analysis. I am not sure whether David or someone else is working on this issue. If not, let me try to fix it. |
I am not sure whether the following solution is workable. In getSchedRegions(), regions are detected from RegionEnd back to RegionBegin. For example, the original MBB:
After reordering the CFI instructions by splicing them to the end:
|
Only aarch64/arm64 hits this issue; other platforms do not. X86, for example, disables post-MI-sched:
lib/Target/X86/X86Schedule.td
// IssueWidth is analogous to the number of decode units. Core and its
// The GenericX86Model contains no instruction schedules
|
Fix available in: https://reviews.llvm.org/D68076 |
mentioned in issue #37076 |
I think this might have been fixed; I am unable to reproduce it with clang-trunk:
|
Extended Description
When -g (or just -gline-tables-only) is passed to clang, CFI_INSTRUCTIONs are added to the function prologue by the Prologue/Epilogue insertion pass. These instructions act as scheduling region barriers when the PostRA scheduler is run. This can lead to different schedules for the prologue block, which can further lead to differences in tail merging/duplication.
A simple example that illustrates the problem:
target triple = "aarch64-linux-gnu"
@X1 = global i32 0, align 4
@X2 = global i32 0, align 4
@X3 = global i32 0, align 4
@X4 = global i32 0, align 4
define void @test(i32 %i) #0 {
entry:
%0 = load i32, i32* @X1, align 4
%x1 = add i32 %0, 1
%x2 = add i32 %0, 2
%x3 = add i32 %0, 3
%x4 = add i32 %0, 4
tail call void @foo()
store i32 %x1, i32* @X1, align 4
store i32 %x2, i32* @X2, align 4
store i32 %x3, i32* @X3, align 4
store i32 %x4, i32* @X4, align 4
ret void
}
declare void @foo()
attributes #0 = { nounwind }
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4, !5}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang version 7.0.0 (trunk 330790) (llvm/trunk 330787)", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, enums: !2)
!1 = !DIFile(filename: "test.c", directory: "")
!2 = !{}
!3 = !{i32 2, !"Dwarf Version", i32 4}
!4 = !{i32 2, !"Debug Info Version", i32 3}
!5 = !{i32 1, !"wchar_size", i32 4}
when compiled as is (i.e. with debug info), produces the following code:
but when the debug metadata is commented out, it produces the following code instead:
Note the different position in the schedule of the first adrp and ldr instructions.