[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078

Apochens · 2024-03-13T13:52:28Z

This bug caused by hoisting the instruction DivInst to the conditional predecessor without dropping its debug location. Here is the buggy line.

Please check this @SLTozer @dwblaikie . If this is confirmed, I would like to give the patch to it.

int func(int a, int b) {
    int rem = a % b;
    int ret = 3;
    if (rem == 42) {
        ret = a / b;
    } 
    return ret;
}

int main() {
    func(4, 2);
}

$ clang -S -emit-llvm -Xclang -disable-O0-optnone main.c -o main.ll -g
$ opt -S -passes=mem2reg,div-rem-pairs -debug main.ll -o main-opt.ll
$ clang main-opt.ll

$ mylldb a.out 
(lldb) target create "a.out"
Current executable set to '/data1/llvm-test/DivRemPairs/a.out' (x86_64).
(lldb) b func
Breakpoint 1: where = a.out`func + 9 at main.c:2:17, address = 0x0000000000001139
(lldb) r
Process 2794232 launched: '/data1/llvm-test/DivRemPairs/a.out' (x86_64)
Process 2794232 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555555139 a.out`func(a=4, b=2) at main.c:2:17
   1    int func(int a, int b) {
-> 2        int rem = a % b;
   3        int ret = 3;
   4        if (rem == 42) {
   5            ret = a / b;
   6        } 
   7        return ret;
(lldb) s
Process 2794232 stopped
* thread #1, name = 'a.out', stop reason = step in
    frame #0: 0x0000555555555142 a.out`func(a=4, b=2) at main.c:5:17
   2        int rem = a % b;
   3        int ret = 3;
   4        if (rem == 42) {
-> 5            ret = a / b;
   6        } 
   7        return ret;
   8    }
(lldb) v rem
(int) rem = 0

The text was updated successfully, but these errors were encountered:

llvmbot · 2024-03-13T14:00:48Z

@llvm/issue-subscribers-debuginfo

Author: Shan Huang (Apochens)

This bug caused by hoisting the instruction `DivInst` to the conditional predecessor without dropping its debug location. Here is the [buggy line](https://github.com/llvm/llvm-project/blob/9fa866020395b215ece6140c2fedc7c31950272c/llvm/lib/Transforms/Scalar/DivRemPairs.cpp#L298).

Please check this @SLTozer @dwblaikie . If this is confirmed, I would like to give the patch to it.

int func(int a, int b) {
    int rem = a % b;
    int ret = 3;
    if (rem == 42) {
        ret = a / b;
    } 
    return ret;
}

int main() {
    func(4, 2);
}

$ clang -S -emit-llvm -Xclang -disable-O0-optnone main.c -o main.ll -g
$ opt -S -passes=mem2reg,div-rem-pairs -debug main.ll -o main-opt.ll
$ clang main-opt.ll

$ mylldb a.out 
(lldb) target create "a.out"
Current executable set to '/data1/llvm-test/DivRemPairs/a.out' (x86_64).
(lldb) b func
Breakpoint 1: where = a.out`func + 9 at main.c:2:17, address = 0x0000000000001139
(lldb) r
Process 2794232 launched: '/data1/llvm-test/DivRemPairs/a.out' (x86_64)
Process 2794232 stopped
* thread #<!-- -->1, name = 'a.out', stop reason = breakpoint 1.1
    frame #<!-- -->0: 0x0000555555555139 a.out`func(a=4, b=2) at main.c:2:17
   1    int func(int a, int b) {
-&gt; 2        int rem = a % b;
   3        int ret = 3;
   4        if (rem == 42) {
   5            ret = a / b;
   6        } 
   7        return ret;
(lldb) s
Process 2794232 stopped
* thread #<!-- -->1, name = 'a.out', stop reason = step in
    frame #<!-- -->0: 0x0000555555555142 a.out`func(a=4, b=2) at main.c:5:17
   2        int rem = a % b;
   3        int ret = 3;
   4        if (rem == 42) {
-&gt; 5            ret = a / b;
   6        } 
   7        return ret;
   8    }
(lldb) v rem
(int) rem = 0

SLTozer · 2024-03-13T17:46:32Z

This looks like a bug - for both the div and the rem it looks like we're necessarily hoisting them past a conditional branch. Though is div/rem is another pair of instructions that could trap? I'm not sure if this is a case where we'd consider the debug locations to be worth preserving - it doesn't help that we don't have a formal "rule" about when to choose accurate backtraces over accurate stepping.

jryans · 2024-03-13T17:55:24Z

it doesn't help that we don't have a formal "rule" about when to choose accurate backtraces over accurate stepping.

I don't want to totally derail this issue, but I've been thinking lately we may need an additional "line table tuning" option so the user could specify their use case ("debugging", "profiling", etc.) which may then allow for different choices in how we produce the line table. Anyway, I'll save further discussion for a separate thread some day. 🙂

jmorse · 2024-03-13T21:55:56Z

I'd guess that a % b dominating the division is relied upon to guarantee that b is nonzero. However, I'd also guess that there are inputs where the mod/rem instruction can be subsequently DCE'd and cause crashy inputs to crash on the wrong line, which would be infuriating (though probably rare).

@Apochens as you're reporting a lot of these faults, would you be able to group them under an umbrella bug. On bugzilla we used to be able to connect bugs as blocking/depending, which might not be possible on github, but we can put a reference to a relevant bug into a comment. That'll help verifying/testing when there's a plan to fix this.

Apochens · 2024-03-14T01:58:34Z

It seems I am trapped into the war between stepping and backtraces. 😂

@Apochens as you're reporting a lot of these faults, would you be able to group them under an umbrella bug. On bugzilla we used to be able to connect bugs as blocking/depending, which might not be possible on github, but we can put a reference to a relevant bug into a comment. That'll help verifying/testing when there's a plan to fix this.

@jmorse I am willing to do this. Are there some examples that could help me do this properly?

SLTozer · 2024-03-14T10:50:58Z

I don't want to totally derail this issue, but I've been thinking lately we may need an additional "line table tuning" option so the user could specify their use case ("debugging", "profiling", etc.) which may then allow for different choices in how we produce the line table.

I agree, this is something that I may be working on in the near future - there are competing concerns for line information that could be better tailored to each consumer's needs. There aren't actually that many callsites where we update DebugLocs on existing instructions AFAICT, so this could be done in the not-too-distant future.

dwblaikie · 2024-03-18T19:11:59Z

This looks like a bug - for both the div and the rem it looks like we're necessarily hoisting them past a conditional branch. Though is div/rem is another pair of instructions that could trap? I'm not sure if this is a case where we'd consider the debug locations to be worth preserving - it doesn't help that we don't have a formal "rule" about when to choose accurate backtraces over accurate stepping.

I don't want to totally derail this issue, but I've been thinking lately we may need an additional "line table tuning" option so the user could specify their use case ("debugging", "profiling", etc.) which may then allow for different choices in how we produce the line table. Anyway, I'll save further discussion for a separate thread some day.

I've missed a step here - I thought we did have a fairly well formalized process here: https://llvm.org/docs/HowToUpdateDebugInfo.html

And that there isn't a tradeoff between accurate backtraces and accurate stepping - the goal is not to show code as reachable that is not reachable. That's as true for human users as it is for machine uses - at least that's been the principle of LLVM's debug info line information for some time now.

How does 'line table tuning'/backtraces-versus-stepping uncertainty apply in this case?

I do agree that it's unfortunate when we might hoist code that subsequently crashed (merge a load from two distinct branches of an if/else, then the load crashes/asan-fails) and can't tell the user either of the lines that caused it. But the alternative of possibly telling them the wrong one seems problematic too. As we've discussed - it'd be valuable to have some way to describe a location that is speculated or sunk or otherwise "this is where the instruction came from but not where it ended up" - both for human users (eg: a crash report, so it could include this nuance in the message) and for machine uses (eg: a profiler, so it could know to ignore such a location). But in the absence of such an ability to differentiate, I don't think it'd be suitable to have this information - for humans or machines - it'd be too misleading. And I thought we were all pretty well in agreement on that for some time/that was/is LLVM's philosophy... - so I'm a bit confused.

SLTozer · 2024-03-19T11:19:35Z

You're correct, the rules for updating debug info do not seek to preserve debug locations for instructions on the basis of backtrace quality. We've recently been looking at debug location updates with these concerns in mind and have tried to give more weight towards preserving locations when they may be important for backtraces (mostly memory instructions). I think there's value in this, but as discussed we may be able to introduce improvements that would make that idea redundant.

Even before that I think @jryans point about adding a flag would be a more suitable solution for what we want; adding a flag that requests that LLVM preserve debug locations for instructions that may trap allows a user who specifically wants improved crash info to get that without introducing incorrect stepping/profiling information by default (I'd be happy to open a patch for that since I've already done an appreciable amount of the work for the purpose of instrumenting dropped debug locations). With that in mind, I'm fine to say we should follow the rules and drop debug locations for these hoists.

jryans · 2024-03-19T12:28:04Z

For this issue, following the existing rules is likely the right thing to do, and they are of course the current policy, so any potential changes should have a larger discussion (e.g. RFC).

My comment suggesting a flag for use cases that are sometimes in conflict was coming from a more general place than just this issue alone (hence my worry about derailing things here). It would be better to discuss that further in its own issue / PR / RFC where multiple justifying examples can be provided and debated.

adrian-prantl · 2024-03-20T16:03:52Z

The idea of a separate tuning mode profiling is something that has come up before.

Depending on what the two alternative locations are, it may be possible to just introduce a bit in the line table: E.g., if the the options are "no location" and "some location", we could have a "backtrace-only" bit that instructs profilers and stepping to ignore this location.

SLTozer · 2024-03-21T11:19:35Z

Depending on what the two alternative locations are, it may be possible to just introduce a bit in the line table

I agree - I've been looking at debug line issues recently and it seems like if we want to preserve as much line information as possible then we need to be able to describe this case, and nothing in the current DWARF line table is suitable; I'm currently trying to estimate the cost-benefit of making that change, so I may open a discussion for this later.

dwblaikie · 2024-03-25T18:04:11Z

Yes, a bit is basically what's needed - though I'd hope/suggest that this isn't a "profiling V interactive debugging" separation - even for human debugger users, I think the distinction is important (that's been how we've framed things so far - it was valuable to Sony to reduce noisy stepping, but also I think beyond that it's useful for users to avoid incorrect conclusions about reachability/truth of conditions, etc). It's as confusing to users as it would be to profilers - the code isn't actually reachable.

So hopefully if/when we add some kind of flag for this, debuggers, sanitizers, etc can render this to a user in such a way that users have a better chance of not being confused by the reachability/truthiness of any condition.

github-actions bot added the new issue label Mar 13, 2024

EugeneZelenko added debuginfo llvm:transforms and removed new issue labels Mar 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078

[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078

Apochens commented Mar 13, 2024

llvmbot commented Mar 13, 2024

SLTozer commented Mar 13, 2024

jryans commented Mar 13, 2024 •

edited

Loading

jmorse commented Mar 13, 2024

Apochens commented Mar 14, 2024

SLTozer commented Mar 14, 2024

dwblaikie commented Mar 18, 2024

SLTozer commented Mar 19, 2024

jryans commented Mar 19, 2024

adrian-prantl commented Mar 20, 2024 •

edited

Loading

SLTozer commented Mar 21, 2024

dwblaikie commented Mar 25, 2024

[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078

[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078

Comments

Apochens commented Mar 13, 2024

llvmbot commented Mar 13, 2024

SLTozer commented Mar 13, 2024

jryans commented Mar 13, 2024 • edited Loading

jmorse commented Mar 13, 2024

Apochens commented Mar 14, 2024

SLTozer commented Mar 14, 2024

dwblaikie commented Mar 18, 2024

SLTozer commented Mar 19, 2024

jryans commented Mar 19, 2024

adrian-prantl commented Mar 20, 2024 • edited Loading

SLTozer commented Mar 21, 2024

dwblaikie commented Mar 25, 2024

jryans commented Mar 13, 2024 •

edited

Loading

adrian-prantl commented Mar 20, 2024 •

edited

Loading