Presence of dbg.declare affects result from DeadStoreElimination #58285
Comments
I've done some debugging and found the following: I see a difference with/without debuginfo in that the
The problem seems to be related to DSE having some limit(s) where it cuts off searches.
static cl::opt<unsigned> MemorySSAUpwardsStepLimit(
"dse-memoryssa-walklimit", cl::init(90), cl::Hidden,
cl::desc("The maximum number of steps while walking upwards to find "
"MemoryDefs that may be killed (default = 90)"));
So, what I think happens is that GVN does a rewrite of the code and something happens with internal data structures (memoryssa?). Then when we get to DSE, we get a different iteration order, and we get
trying to get dominating access
visiting 18 = MemoryDef(17) ( store i64 %inc.i.i, ptr @i, align 1)
... hit walker step limit
but when we don't have the
trying to get dominating access
visiting 18 = MemoryDef(17) ( store i64 %inc.i.i, ptr @i, align 1)
Checking for reads of 18 = MemoryDef(17) ( store i64 %inc.i.i, ptr @i, align 1)
20 = MemoryPhi({if.then7.i.i,18},{if.then.i.i,17},{crit_edge,15})
... adding PHI uses
19 = MemoryDef(20) ( store i64 %conv2, ptr @i, align 1)
... skipping killing def/dom access
Checking if we can kill 18 = MemoryDef(17) ( store i64 %inc.i.i, ptr @i, align 1)
DSE: Remove Dead Store:
DEAD: store i64 %inc.i.i, ptr @i, align 1
KILLER: store i64 %conv2, ptr @i, align 1
So in a quite complicated and sort of unlucky way, the |
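To make the step-limit effect above concrete, here is a minimal toy sketch. It is not DSE's or MemorySSA's real code: `ToyAccess` and `findKilledStore` are hypothetical names, and the chain is a flat vector rather than a MemorySSA graph. It only illustrates how a bounded upward walk can find a killable store in one situation and give up in another, even though the stores themselves are the same.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for a memory access; NOT LLVM's MemoryAccess.
struct ToyAccess {
  int Location; // which memory location the store writes
};

// Walk upwards from the access just above KillerIdx, looking for an earlier
// store to the same location. Gives up once StepLimit accesses were visited.
std::optional<std::size_t> findKilledStore(const std::vector<ToyAccess> &Chain,
                                           std::size_t KillerIdx,
                                           unsigned StepLimit) {
  assert(KillerIdx < Chain.size() && "killer must be part of the chain");
  unsigned Steps = 0;
  for (std::size_t I = KillerIdx; I-- > 0;) {
    if (Steps++ == StepLimit)
      return std::nullopt; // "hit walker step limit"
    if (Chain[I].Location == Chain[KillerIdx].Location)
      return I; // earlier store to the same location: candidate dead store
  }
  return std::nullopt; // reached the top without finding anything
}
```

With a chain of five accesses where the first and last write the same location, a limit of 4 finds the earlier store while a limit of 3 gives up one step short, which is the kind of cliff the debug output above shows.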
@mikaelholmen is it possible you attached the wrong file? It's named |
@fhahn: Yes, I messed up, thanks! Also updated the original post to contain the right file. |
This is an interesting one. I think the issue is that Not sure yet what the best fix would be here. |
Slightly reduced reproducer: https://llvm.godbolt.org/z/b6f1a34EM |
@llvm/issue-subscribers-debuginfo |
Would it be ok to not invalidate existing analyses in case ADCE only has removed dbg intrinsics? That would at least be an easy fix. But maybe we really need to invalidate even in that case? |
(shooting from the hip/driveby commentary)
That limit should be applied ignoring debug info intrinsics (eg: should be N non-dbg instructions - there are iterators that expose only these non-dbg instructions)
Probably ADCE shouldn't touch the dbg intrinsics if it isn't touching any code? |
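A rough sketch of the "limit should count only non-dbg instructions" idea, as a toy model rather than LLVM code: `ToyInst` and `findsEarlierStore` are made-up names, and the `CountDebug` flag exists only to contrast the two counting policies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction; NOT llvm::Instruction.
struct ToyInst {
  bool IsDebug; // true for a debug-only pseudo instruction
  int Location; // memory location written (ignored for debug instructions)
};

// Scan backwards from StartIdx for an earlier store to the same location,
// visiting at most Limit instructions. If CountDebug is false, debug
// instructions are skipped without consuming any of the budget.
bool findsEarlierStore(const std::vector<ToyInst> &Insts, std::size_t StartIdx,
                       unsigned Limit, bool CountDebug) {
  assert(StartIdx < Insts.size() && !Insts[StartIdx].IsDebug);
  unsigned Steps = 0;
  for (std::size_t I = StartIdx; I-- > 0;) {
    const ToyInst &Inst = Insts[I];
    if (Inst.IsDebug && !CountDebug)
      continue; // invisible to the limit
    if (Steps++ == Limit)
      return false; // budget exhausted
    if (!Inst.IsDebug && Inst.Location == Insts[StartIdx].Location)
      return true;
  }
  return false;
}
```

When debug instructions are counted, adding two of them between the stores can flip the answer; when they are skipped, the result is the same with and without them.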
Thanks @dwblaikie! Maybe so. I actually tried that first but got a little put off when I saw that in e.g. @f() in
which would otherwise be removed. So I was worried about what kind of bloating would occur if ADCE wouldn't be able to remove them. |
Maybe - I'm not 100% sure either way. Might be that the right solution to all those dbg intrinsics is for them to be cleaned up closer to when they became dead?
That bit's confusing - recomputing an analysis versus using an old one shouldn't produce different results. So an optimization removing only dbg intrinsics and reporting "yeah, I changed stuff", causing an analysis to be invalidated and recomputed, shouldn't change the result of some future non-debug optimization? |
That's the ideal, but the actual requirement is just that the analysis produces correct results, though they might be better or worse than recomputing from scratch. For some analyses, there isn't even any sensible notion of what the "right" results are, because they are query-order dependent (such as SCEV or LVI, not sure whether this affects MemorySSA). |
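As an illustration of query-order dependence, here is a small toy analysis. It is not SCEV, LVI or MemorySSA; `ToyPositivityAnalysis` is a hypothetical class over an assumed value chain v_0 = 1, v_i = v_(i-1) + 1. It caches proven facts and bounds its recursion depth, so a warm cache can prove something that a freshly computed analysis cannot, while both answers remain correct (a "no" only means "could not prove it").

```cpp
#include <unordered_set>

// Toy lazy analysis: "is value v_i known to be positive?"
// Assumed definitions: v_0 = 1 and v_i = v_(i-1) + 1.
class ToyPositivityAnalysis {
public:
  explicit ToyPositivityAnalysis(unsigned DepthLimit)
      : DepthLimit(DepthLimit) {}

  // Returns true if positivity could be proven within the depth limit.
  // Returning false is conservative: "unknown", not "negative".
  bool isKnownPositive(unsigned Value) { return query(Value, 0); }

private:
  bool query(unsigned Value, unsigned Depth) {
    if (Proven.count(Value))
      return true; // reuse a fact proven by an earlier query
    if (Value == 0) {
      Proven.insert(0); // base case: v_0 = 1
      return true;
    }
    if (Depth == DepthLimit)
      return false; // give up; nothing is cached for failures
    bool Result = query(Value - 1, Depth + 1);
    if (Result)
      Proven.insert(Value);
    return Result;
  }

  unsigned DepthLimit;
  std::unordered_set<unsigned> Proven;
};
```

With a depth limit of 4, a fresh instance cannot prove v_8 positive, but an instance that was first asked about v_4 can, because the second query stops at the cached fact. Throwing the cache away and recomputing therefore changes what later queries see.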
Ah, fascinating. Not sure what to do with that, then - yeah, not sure what the right call is to remove debug info IR - respond with "this didn't change the IR" perhaps on the rationale that that response should only account for non-debug-info IR... - or some other path. |
Thanks for the discussion and explanations! There are probably several other different solutions to this as well but the most obvious to me look like either
If 1) is correct/allowed I think I'd prefer that, but I don't know if it is? What do you say? |
That was a new thing for me as well. Is that requirement written down somewhere? All these times when I've tried to narrow down pass dependency problems with stale analyses etc, and having tried to bisect by adding invalidations of analysis in various places in the pipeline to find out when the IR differs. All those times I've really had no guarantees that different IR really would indicate a bug or not. Interesting. |
Debug info is of course special as we want to get the same codegen independent of debug info or not. So given the news that re-computing MemorySSA isn't guaranteed to give the same result as before, I assume that we either want to invalidate it more consistently regardless of what a pass like ADCE is doing (to make debugging easier and avoid depending on what earlier passes did), or we should only invalidate the analysis when doing something that makes the analysis invalid. I mean, if we introduce some analysis that is about debug info, then we can't just say that ADCE did nothing and all analyses should be kept. Unless the pass manager is changed to return two statuses: "did not change anything related to non-dbg" vs "did not change anything related to dbg". |
Since we see this problem fairly often in downstream testing I'd like to try to move it forward in some way. Do you think 1) above is correct/allowed? |
@adrian-prantl @JDevlieghere for debug info thoughts |
given the "analyses just need to be correct, but may be different across computations" state of LLVM, I don't think there's a principled way of handling this; it'll have to be case by case to ensure that removing debug intrinsics causes the same analysis computation as not having debug intrinsics
if we just care about MemorySSA, the pass can just say that it preserves MemorySSA if it only removed debug intrinsics (assuming MemorySSA doesn't care about debug intrinsics). if other analyses run into similar issues, we can introduce another
but the "don't touch debug intrinsics unless other instructions are modified" approach seems better to me |
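The "preserve MemorySSA if only debug intrinsics were removed" option can be sketched as a toy decision function. This models the idea only: `ChangeKind`, `ToyPreserved` and `preservedAfter` are hypothetical names, not the pass manager's real `PreservedAnalyses` API, and the analysis names are plain strings.

```cpp
#include <set>
#include <string>

// What a toy pass reports about its own changes.
enum class ChangeKind { None, DebugOnly, NonDebug };

// Toy stand-in for a preserved-analyses set; NOT llvm::PreservedAnalyses.
struct ToyPreserved {
  bool All = false;            // everything is still valid
  std::set<std::string> Named; // individually preserved analyses

  bool preserves(const std::string &Name) const {
    return All || Named.count(Name) != 0;
  }
};

// Decide what stays cached after the pass ran. Assumption for this sketch:
// "MemorySSA" does not depend on debug instructions, so it survives a
// debug-only change, while everything else is conservatively dropped.
ToyPreserved preservedAfter(ChangeKind Kind) {
  ToyPreserved PA;
  switch (Kind) {
  case ChangeKind::None:
    PA.All = true;
    break;
  case ChangeKind::DebugOnly:
    PA.Named.insert("MemorySSA");
    break;
  case ChangeKind::NonDebug:
    break; // nothing preserved
  }
  return PA;
}
```

The point of the three-way result is that a later pass such as DSE sees the same cached MemorySSA whether or not debug instructions were present earlier in the pipeline.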
In the long run, there's an ongoing effort to eliminate debug-info instructions entirely. |
Thanks everyone. |
…oved We now limit ADCE to only remove debug intrinsics if it does something else that would invalidate cached analyses anyway. As we've seen in #58285 throwing away cached analysis info when only debug instructions are removed can lead to different code when debug info is present or not present. Differential Revision: https://reviews.llvm.org/D145051
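The approach described in this commit message can be modelled with a small toy, assuming a flat list of instructions with precomputed deadness. `ToyInst` and `removeDeadInstructions` are illustrative names and this is not the actual ADCE change: dead debug instructions are only removed when at least one dead non-debug instruction is removed as well.

```cpp
#include <algorithm>
#include <vector>

// Toy instruction; NOT llvm::Instruction.
struct ToyInst {
  bool IsDebug; // debug-only pseudo instruction
  bool IsDead;  // already determined to be removable
};

// Removes dead instructions, but leaves dead debug instructions alone unless
// some real (non-debug) instruction is removed too. Returns true if anything
// changed, i.e. if cached analyses would have to be invalidated anyway.
bool removeDeadInstructions(std::vector<ToyInst> &Insts) {
  bool RemovesRealCode =
      std::any_of(Insts.begin(), Insts.end(), [](const ToyInst &I) {
        return I.IsDead && !I.IsDebug;
      });
  if (!RemovesRealCode)
    return false; // keep dead debug instructions; report "no change"

  Insts.erase(std::remove_if(Insts.begin(), Insts.end(),
                             [](const ToyInst &I) { return I.IsDead; }),
              Insts.end());
  return true;
}
```

The trade-off raised earlier in the thread applies: dead debug instructions may linger longer when no real code is removed.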
Closing as I pushed a fix for this in 8aa9ab3 Thanks! |
Reverted the fix in f5097ed
|
Since I reverted the first fix due to compile-time regressions we need another way forward here. @nikic suggested
in https://reviews.llvm.org/D145051#4166656
Does anyone object to that or is that the way to go? @aeubanks ? |
that's fine as a workaround especially if there are efforts to remove debug info instructions altogether (although it would be good to know exactly why that patch caused major compile time regressions) |
Another attempt here then: |
As we've seen in #58285 throwing away MemorySSA when only debug instructions are removed can lead to different code when debug info is present or not present. This is the second attempt at fixing the problem. The first one was https://reviews.llvm.org/D145051 which was reverted due to a 15% compile time regression for tramp3d-v4 in NewPM-ReleaseLTO-g. Differential Revision: https://reviews.llvm.org/D145478
I pushed a fix in b9337b1
so I'll close this again and hope it can stay that way. :) Thanks for the help everyone! |
llvm commit: ac47db6
Reproduce with:
opt -passes="default<O2>" -S -o - bbi-74540_2.ll
Result:
However, if we comment out the call to the dbg.declare in the input, we instead get:
So with the dbg.declare present, the bb if.then7.i.i and its contents are left in the code.
bbi-74540_2.ll.gz