-
Notifications
You must be signed in to change notification settings - Fork 12k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DebugInfo][DivRemPairs] Wrong branch execution caused by hoisting an instruction #85078
Comments
@llvm/issue-subscribers-debuginfo Author: Shan Huang (Apochens)
This bug caused by hoisting the instruction `DivInst` to the conditional predecessor without dropping its debug location. Here is the [buggy line](https://github.com/llvm/llvm-project/blob/9fa866020395b215ece6140c2fedc7c31950272c/llvm/lib/Transforms/Scalar/DivRemPairs.cpp#L298).
Please check this @SLTozer @dwblaikie . If this is confirmed, I would like to give the patch to it. int func(int a, int b) {
int rem = a % b;
int ret = 3;
if (rem == 42) {
ret = a / b;
}
return ret;
}
int main() {
func(4, 2);
} $ clang -S -emit-llvm -Xclang -disable-O0-optnone main.c -o main.ll -g
$ opt -S -passes=mem2reg,div-rem-pairs -debug main.ll -o main-opt.ll
$ clang main-opt.ll $ mylldb a.out
(lldb) target create "a.out"
Current executable set to '/data1/llvm-test/DivRemPairs/a.out' (x86_64).
(lldb) b func
Breakpoint 1: where = a.out`func + 9 at main.c:2:17, address = 0x0000000000001139
(lldb) r
Process 2794232 launched: '/data1/llvm-test/DivRemPairs/a.out' (x86_64)
Process 2794232 stopped
* thread #<!-- -->1, name = 'a.out', stop reason = breakpoint 1.1
frame #<!-- -->0: 0x0000555555555139 a.out`func(a=4, b=2) at main.c:2:17
1 int func(int a, int b) {
-> 2 int rem = a % b;
3 int ret = 3;
4 if (rem == 42) {
5 ret = a / b;
6 }
7 return ret;
(lldb) s
Process 2794232 stopped
* thread #<!-- -->1, name = 'a.out', stop reason = step in
frame #<!-- -->0: 0x0000555555555142 a.out`func(a=4, b=2) at main.c:5:17
2 int rem = a % b;
3 int ret = 3;
4 if (rem == 42) {
-> 5 ret = a / b;
6 }
7 return ret;
8 }
(lldb) v rem
(int) rem = 0 |
This looks like a bug - for both the div and the rem it looks like we're necessarily hoisting them past a conditional branch. Though is div/rem is another pair of instructions that could trap? I'm not sure if this is a case where we'd consider the debug locations to be worth preserving - it doesn't help that we don't have a formal "rule" about when to choose accurate backtraces over accurate stepping. |
I don't want to totally derail this issue, but I've been thinking lately we may need an additional "line table tuning" option so the user could specify their use case ("debugging", "profiling", etc.) which may then allow for different choices in how we produce the line table. Anyway, I'll save further discussion for a separate thread some day. 🙂 |
I'd guess that @Apochens as you're reporting a lot of these faults, would you be able to group them under an umbrella bug. On bugzilla we used to be able to connect bugs as blocking/depending, which might not be possible on github, but we can put a reference to a relevant bug into a comment. That'll help verifying/testing when there's a plan to fix this. |
It seems I am trapped into the war between stepping and backtraces. 😂
@jmorse I am willing to do this. Are there some examples that could help me do this properly? |
I agree, this is something that I may be working on in the near future - there are competing concerns for line information that could be better tailored to each consumer's needs. There aren't actually that many callsites where we update DebugLocs on existing instructions AFAICT, so this could be done in the not-too-distant future. |
I've missed a step here - I thought we did have a fairly well formalized process here: https://llvm.org/docs/HowToUpdateDebugInfo.html And that there isn't a tradeoff between accurate backtraces and accurate stepping - the goal is not to show code as reachable that is not reachable. That's as true for human users as it is for machine uses - at least that's been the principle of LLVM's debug info line information for some time now. How does 'line table tuning'/backtraces-versus-stepping uncertainty apply in this case? I do agree that it's unfortunate when we might hoist code that subsequently crashed (merge a load from two distinct branches of an if/else, then the load crashes/asan-fails) and can't tell the user either of the lines that caused it. But the alternative of possibly telling them the wrong one seems problematic too. As we've discussed - it'd be valuable to have some way to describe a location that is speculated or sunk or otherwise "this is where the instruction came from but not where it ended up" - both for human users (eg: a crash report, so it could include this nuance in the message) and for machine uses (eg: a profiler, so it could know to ignore such a location). But in the absence of such an ability to differentiate, I don't think it'd be suitable to have this information - for humans or machines - it'd be too misleading. And I thought we were all pretty well in agreement on that for some time/that was/is LLVM's philosophy... - so I'm a bit confused. |
You're correct, the rules for updating debug info do not seek to preserve debug locations for instructions on the basis of backtrace quality. We've recently been looking at debug location updates with these concerns in mind and have tried to give more weight towards preserving locations when they may be important for backtraces (mostly memory instructions). I think there's value in this, but as discussed we may be able to introduce improvements that would make that idea redundant. Even before that I think @jryans point about adding a flag would be a more suitable solution for what we want; adding a flag that requests that LLVM preserve debug locations for instructions that may trap allows a user who specifically wants improved crash info to get that without introducing incorrect stepping/profiling information by default (I'd be happy to open a patch for that since I've already done an appreciable amount of the work for the purpose of instrumenting dropped debug locations). With that in mind, I'm fine to say we should follow the rules and drop debug locations for these hoists. |
For this issue, following the existing rules is likely the right thing to do, and they are of course the current policy, so any potential changes should have a larger discussion (e.g. RFC). My comment suggesting a flag for use cases that are sometimes in conflict was coming from a more general place than just this issue alone (hence my worry about derailing things here). It would be better to discuss that further in its own issue / PR / RFC where multiple justifying examples can be provided and debated. |
The idea of a separate tuning mode profiling is something that has come up before. Depending on what the two alternative locations are, it may be possible to just introduce a bit in the line table: E.g., if the the options are "no location" and "some location", we could have a "backtrace-only" bit that instructs profilers and stepping to ignore this location. |
I agree - I've been looking at debug line issues recently and it seems like if we want to preserve as much line information as possible then we need to be able to describe this case, and nothing in the current DWARF line table is suitable; I'm currently trying to estimate the cost-benefit of making that change, so I may open a discussion for this later. |
Yes, a bit is basically what's needed - though I'd hope/suggest that this isn't a "profiling V interactive debugging" separation - even for human debugger users, I think the distinction is important (that's been how we've framed things so far - it was valuable to Sony to reduce noisy stepping, but also I think beyond that it's useful for users to avoid incorrect conclusions about reachability/truth of conditions, etc). It's as confusing to users as it would be to profilers - the code isn't actually reachable. So hopefully if/when we add some kind of flag for this, debuggers, sanitizers, etc can render this to a user in such a way that users have a better chance of not being confused by the reachability/truthiness of any condition. |
This bug caused by hoisting the instruction
DivInst
to the conditional predecessor without dropping its debug location. Here is the buggy line.Please check this @SLTozer @dwblaikie . If this is confirmed, I would like to give the patch to it.
The text was updated successfully, but these errors were encountered: