Presence of dbg.declare affects result from DeadStoreElimination #58285

mikaelholmen · 2022-10-11T11:35:41Z

llvm commit: ac47db6
Reproduce with:

opt -passes="default<O2>" -S -o - bbi-74540_2.ll

Result:

define i16 @m() local_unnamed_addr #0 {
[...]
if.then.i.i:                                      ; preds = %entry
  store i32 1, ptr getelementptr inbounds ([1 x [1 x [1 x i32]]], ptr @g, i64 1, i64 0, i64 0, i64 0), align 1
  %t7.i.i = load i16, ptr @h, align 1
  %0 = and i16 %t7.i.i, 1
  store i16 %0, ptr @h, align 1
  %tobool.not.i.i = icmp eq i16 %0, 0
  br i1 %tobool.not.i.i, label %j.exit, label %if.then7.i.i

if.then7.i.i:                                     ; preds = %if.then.i.i
  %t8.i.i = load i64, ptr @i, align 1
  %inc.i.i = add i64 %t8.i.i, 1
  store i64 %inc.i.i, ptr @i, align 1
  br label %j.exit

j.exit:                                           ; preds = %entry.j.exit_crit_edge, %if.then.i.i, %if.then7.i.i

However, if we comment out the call to the dbg.declare in the input, we iinstead get:

define i16 @m() local_unnamed_addr #0 {
[...]
if.then.i.i:                                      ; preds = %entry
  store i32 1, ptr getelementptr inbounds ([1 x [1 x [1 x i32]]], ptr @g, i64 1, i64 0, i64 0, i64 0), align 1
  %t7.i.i = load i16, ptr @h, align 1
  %0 = and i16 %t7.i.i, 1
  store i16 %0, ptr @h, align 1
  br label %j.exit

j.exit:                                           ; preds = %if.then.i.i, %entry.j.exit_crit_edge

So with the dbg.declare present, the bb if.then7.i.i and its contents are left in the code.
bbi-74540_2.ll.gz

The text was updated successfully, but these errors were encountered:

mikaelholmen · 2022-10-11T11:39:16Z

I've done some debugging and found the following:

I see a difference with/without debuginfo in that the dbg.declare is actually removed by the ADCE pass, which then makes the debug and non-debug-info cases look the same(!).
But following that, DSE then removes the contents of the if.then7.i.i block in the non-debug case, but in the debug case (where the input to DSE looks the same), the contents of if.then7.i.i are kept.

The problem seems to be related to DSE having some limit(s) where it cuts of searches.
In this case the debug variance goes away if we increase

static cl::opt<unsigned> MemorySSAUpwardsStepLimit(
    "dse-memoryssa-walklimit", cl::init(90), cl::Hidden,
    cl::desc("The maximum number of steps while walking upwards to find "
             "MemoryDefs that may be killed (default = 90)"));

So, what I think happens is that GVN does a rewrite of the code and something happens with internal data structures (memoryssa?). Then when we get to DSE, we get a different iteration order, and we get

  trying to get dominating access
   visiting 18 = MemoryDef(17) (  store i64 %inc.i.i, ptr @i, align 1)
   ...  hit walker step limit

but when we don't have the dbg.declare we instead examine 18 earlier and get:

  trying to get dominating access
   visiting 18 = MemoryDef(17) (  store i64 %inc.i.i, ptr @i, align 1)
  Checking for reads of 18 = MemoryDef(17) (  store i64 %inc.i.i, ptr @i, align 1)
   20 = MemoryPhi({if.then7.i.i,18},{if.then.i.i,17},{crit_edge,15})
    ... adding PHI uses
   19 = MemoryDef(20) (  store i64 %conv2, ptr @i, align 1)
    ... skipping killing def/dom access
 Checking if we can kill 18 = MemoryDef(17) (  store i64 %inc.i.i, ptr @i, align 1)
DSE: Remove Dead Store:
  DEAD:   store i64 %inc.i.i, ptr @i, align 1
  KILLER:   store i64 %conv2, ptr @i, align 1

So in a quite complicated and sort of unlucky way, the dbg.declare indirectly affects what code we get.

fhahn · 2022-10-11T14:44:41Z

@mikaelholmen is it possible you attached the wrong file? It's named bbi-74541.ll not bbi-74540_2.ll like in the command you shared. The one you attached optimizes to unreachable.

mikaelholmen · 2022-10-12T04:43:01Z

@fhahn: Yes, I messed up, thanks!
Now attaching the right file I hope
bbi-74540_2.ll.gz

Also updated the original post to contain the right file.

fhahn · 2022-10-12T10:00:27Z

This is an interesting one. I think the issue is that ADCE invalidates MSSA due to making a change. This means with the dbg intrinsic, DSE will use a freshly computed version of MSSA while otherwise it re-uses the one computed earlier, which may have some optimizations applied.

Not sure yet what the best fix would be here.

fhahn · 2022-10-12T10:07:38Z

Slightly reduced reproducer: https://llvm.godbolt.org/z/b6f1a34EM

llvmbot · 2022-11-22T12:10:52Z

@llvm/issue-subscribers-debuginfo

mikaelholmen · 2023-02-14T13:26:06Z

This is an interesting one. I think the issue is that ADCE invalidates MSSA due to making a change. This means with the dbg intrinsic, DSE will use a freshly computed version of MSSA while otherwise it re-uses the one computed earlier, which may have some optimizations applied.

Would it be ok to not invalidate existing analyses in case ADCE only has removed dbg intrinsics? That would at least be an easy fix. But maybe we really need to invalidate even in that case?

dwblaikie · 2023-02-14T23:32:08Z

(shooting from the hip/driveby commentary)

The problem seems to be related to DSE having some limit(s) where it cuts of searches.

That limit should be applied ignoring debug info intrinsics (eg: should be N non-dbg instructions - there are iterators that expose only these non-dbg instructions)

Would it be ok to not invalidate existing analyses in case ADCE only has removed dbg intrinsics?

Probably ADCE should'nt touch the dbg intrinsics if it isn't touching any code?

mikaelholmen · 2023-02-15T06:24:11Z

Probably ADCE should'nt touch the dbg intrinsics if it isn't touching any code?

Thanks @dwblaikie! Maybe so. I actually tried that first but got a little put off when I saw that in e.g. @f() in
test/Transforms/MergeFunc/mergefunc-preserve-debug-info.ll such a change would leave

         36:  call void @llvm.dbg.value(metadata i32 1, metadata !21, metadata !DIExpression()), !dbg !32 
         37:  call void @llvm.dbg.value(metadata i32 9, metadata !23, metadata !DIExpression()), !dbg !32 
         38:  call void @llvm.dbg.declare(metadata ptr undef, metadata !34, metadata !DIExpression()), !dbg !35 
         39:  call void @llvm.dbg.declare(metadata ptr undef, metadata !36, metadata !DIExpression()), !dbg !37 
         40:  call void @llvm.dbg.value(metadata i32 9, metadata !38, metadata !DIExpression()), !dbg !32

which would otherwise be removed. So I was worried about what kind of bloating would occur if ADCE wouldn't be able to remove them.
But perhaps it's the right thing to do anyway?

dwblaikie · 2023-02-15T19:26:40Z

Maybe - I'm not 100% sure either way.

Might be that the right solution to all those dbg intrinsics is for them to be cleaned up closer to when they became dead?

DSE will use a freshly computed version of MSSA while otherwise it re-uses the one computed earlier, which may have some optimizations applied.

That bit's confusing - recomputing an analysis versus using an old one shouldn't produce different results. So an optimization removing only dbg intrinsics and reporting "yeah, I changed stuff", causing an analysis to be invalidated and recomputed, shouldn't change the result of some future non-debug optimization?

nikic · 2023-02-16T09:48:56Z

That bit's confusing - recomputing an analysis versus using an old one shouldn't produce different results.

That's the ideal, but the actual requirement is just the the analysis produces correct results, though they might be better or worse than recomputing from scratch. For some analyses, there isn't even any sensible notion of what the "right" results are, because they are query-order dependent (such as SCEV or LVI, not sure whether this affects MemorySSA).

dwblaikie · 2023-02-16T19:21:33Z

That bit's confusing - recomputing an analysis versus using an old one shouldn't produce different results.

That's the ideal, but the actual requirement is just the the analysis produces correct results, though they might be better or worse than recomputing from scratch. For some analyses, there isn't even any sensible notion of what the "right" results are, because they are query-order dependent (such as SCEV or LVI, not sure whether this affects MemorySSA).

Ah, fascinating. Not sure what to do with that, then - yeah, not sure what the right call is to remove debug info IR - respond with "this didn't change the IR" perhaps on the rationale that that response should only account for non-debug-info IR... - or some other path.

mikaelholmen · 2023-02-17T09:37:08Z

Thanks for the discussion and explanations!

There are probably several other different solutions to this as well but the most obvious to me look like either

Let ADCE remove debug intrinsics as it does today, but only invalidate analyses if some non-debug instruction was removed as well.
or
Only let ADCE remove debug intrinsics if it also removes some non-debug instruction.

If 1) is correct/allowed I think prefer that, but I don't know if it is?
I've at least managed to run a lot of tests downstream with 1) without seeing problems but that's no real proof.

What do you say?

bjope · 2023-02-17T12:59:56Z

That bit's confusing - recomputing an analysis versus using an old one shouldn't produce different results.

That's the ideal, but the actual requirement is just the the analysis produces correct results, though they might be better or worse than recomputing from scratch. For some analyses, there isn't even any sensible notion of what the "right" results are, because they are query-order dependent (such as SCEV or LVI, not sure whether this affects MemorySSA).

That was a new thing for me as well. Is that requirement written down somewhere?

All these times when I've tried to narrow down pass dependency problems with stale analyses etc, and having tried to bisect by adding invalidations of analysis in various places in the pipeline to find out when the IR differs. All those times I've really had no guarantees that different IR really would indicate a bug or not. Interesting.

bjope · 2023-02-17T13:10:21Z

Debug info is of course special as we want to get same codegen independent of debug info or not.
But still, why would ADCE invalidate analyses like MemorySSA if only removing a dbg intrinsic. And similarly, if there is a dead or-instruction that it removes I guess that won't impact MemorySSA.

So given the news that re-computing MemorySSA isn't guaranteed to give same result as before, then I assume that we either want to invalidate it more consistently regardless of what a pass like ADCE is doing (to make debugging easier and avoiding depending on what earlier passes did) , or we should only invalidate the analysis when doing something that makes the analysis invalid.

I mean, if we introduce some analysis that is about debug info., then we can't just say that ADCE did nothing and all analyses should be kept. Unless the pass manager is changed to return two statuses "did not change anything relate to non-dbg" vs "did not change anything related to dbg".

mikaelholmen · 2023-02-28T12:53:26Z

Thanks for the discussion and explanations!

There are probably several other different solutions to this as well but the most obvious to me look like either
1. Let ADCE remove debug intrinsics as it does today, but only invalidate analyses if some non-debug instruction was removed as well.
   or

2. Only let ADCE remove debug intrinsics if it also removes some non-debug instruction.
If 1) is correct/allowed I think prefer that, but I don't know if it is? I've at least managed to run a lot of tests downstream with 1) without seeing problems but that's no real proof.

What do you say?

Since we see this problem fairly often in downstream testing I'd like to try to move it forward in some way.

Do you think 1) above is correct/allowed?
If we think it is I can put up a patch for that to at least get a short term fix even if the ideal solution is to do something more elaborate.

dwblaikie · 2023-02-28T17:50:19Z

@adrian-prantl @JDevlieghere for debug info thoughts
@aeubanks for pass manager thoughts - how should we handle passes wanting to clean up debug info, while remaining debug-info invariant for the non-debug instructions/analysis resulrts/etc?

aeubanks · 2023-02-28T20:41:27Z

given the "analyses just need to be correct, but may be different across computations" state of LLVM, I don't think there's a principled way of handling this; it'll have to be case by case to ensure that removing debug intrinsics causes the same analysis computation as not having debug intrinsics

if we just care about MemorySSA, the pass can just say that it preserves MemorySSA if it only removed debug intrinsics (assuming MemorySSA doesn't care about debug intrinsics).

if other analyses run into similar issues, we can introduce another AnalysisSetKey, say DebugInfoInvariantAnalyses, (e.g. PreservedAnalyses::preserveSet<CFGAnalyses>() preserves all analyses that only care about control flow), then have analyses like MemorySSA not be invalidated when that DebugInfoInvariantAnalyses is preserved

but the "don't touch debug intrinsics unless other instructions are modified" approach seems better to me

pogo59 · 2023-02-28T21:24:41Z

In the long run, there's an ongoing effort to eliminate debug-info instructions entirely.
But in the meantime, yeah, don't touch debug intrinsics unless real instructions are modified.

mikaelholmen · 2023-03-01T10:32:52Z

Thanks everyone.
I put up
https://reviews.llvm.org/D145051
as an attempt to do something about this.

…oved We now limit ADCE to only remove debug intrinsics if it does something else that would invalidate cached analyses anyway. As we've seen in #58285 throwing away cached analysis info when only debug instructions are removed can lead to different code when debug info is present or not present. Differential Revision: https://reviews.llvm.org/D145051

mikaelholmen · 2023-03-03T10:27:10Z

Closing as I pushed a fix for this in 8aa9ab3
[ADCE] Only remove debug intrinsics if non debug instructions are removed

Thanks!

mikaelholmen · 2023-03-03T12:14:41Z

Reverted the fix in f5097ed

Revert "[ADCE] Only remove debug intrinsics if non debug instructions are removed"

This reverts commit 8aa9ab336889ae2eb8e4188036faeb151379ab7b.

Reverting due to compile-time regressions as pointed out in
 https://reviews.llvm.org/D145051#4166656
E.g.
 "In particular tramp3d-v4 with debuginfo regressed by 15%."

mikaelholmen · 2023-03-06T06:07:03Z

Since I reverted the first fix due to compile-time regressions we need anyother way forward here.

@nikic suggested

I think what we can do instead is to mark MemorySSA as preserved if only debuginfo intrinsics have been removed. (This is
safe, MSSA explicitly never tracks debuginfo intrinsics.)

in https://reviews.llvm.org/D145051#4166656

Does anyone object to that or is that the way to go? @aeubanks ?

aeubanks · 2023-03-06T16:51:35Z

that's fine as a workaround especially if there are efforts to remove debug info instructions altogether

(although it would be good to know exactly why that patch caused major compile time regressions)

mikaelholmen · 2023-03-07T08:37:36Z

Another attempt here then:
https://reviews.llvm.org/D145478

As we've seen in #58285 throwing away MemorySSA when only debug instructions are removed can lead to different code when debug info is present or not present. This is the second attempt at fixing the problem. The first one was https://reviews.llvm.org/D145051 which was reverted due to a 15% compile time regression for tramp3d-v4 in NewPM-ReleaseLTO-g.

As we've seen in #58285 throwing away MemorySSA when only debug instructions are removed can lead to different code when debug info is present or not present. This is the second attempt at fixing the problem. The first one was https://reviews.llvm.org/D145051 which was reverted due to a 15% compile time regression for tramp3d-v4 in NewPM-ReleaseLTO-g. Differential Revision: https://reviews.llvm.org/D145478

mikaelholmen · 2023-03-08T06:48:54Z

I pushed a fix in b9337b1

[ADCE] Preserve MemorySSA if only debug instructions are removed

As we've seen in
 https://github.com/llvm/llvm-project/issues/58285
throwing away MemorySSA when only debug instructions are removed can lead
to different code when debug info is present or not present.

This is the second attempt at fixing the problem. The first one was
 https://reviews.llvm.org/D145051
which was reverted due to a 15% compile time regression for tramp3d-v4
in NewPM-ReleaseLTO-g.

Differential Revision: https://reviews.llvm.org/D145478

so I'll close this again and hope it can stay that way. :)

Thanks for the help everyone!

github-actions bot added the new issue label Oct 11, 2022

EugeneZelenko added llvm:optimizations and removed new issue labels Oct 11, 2022

mikaelholmen mentioned this issue Oct 25, 2022

Presence of dbg.declare affects result from LICM #58586

Closed

jryans added the debuginfo label Nov 22, 2022

mikaelholmen closed this as completed Mar 3, 2023

mikaelholmen reopened this Mar 3, 2023

mikaelholmen closed this as completed Mar 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Presence of dbg.declare affects result from DeadStoreElimination #58285

Presence of dbg.declare affects result from DeadStoreElimination #58285

mikaelholmen commented Oct 11, 2022 •

edited

mikaelholmen commented Oct 11, 2022 •

edited by VoltrexKeyva

fhahn commented Oct 11, 2022

mikaelholmen commented Oct 12, 2022 •

edited

fhahn commented Oct 12, 2022

fhahn commented Oct 12, 2022

llvmbot commented Nov 22, 2022

mikaelholmen commented Feb 14, 2023

dwblaikie commented Feb 14, 2023

mikaelholmen commented Feb 15, 2023

dwblaikie commented Feb 15, 2023

nikic commented Feb 16, 2023

dwblaikie commented Feb 16, 2023

mikaelholmen commented Feb 17, 2023

bjope commented Feb 17, 2023

bjope commented Feb 17, 2023

mikaelholmen commented Feb 28, 2023

dwblaikie commented Feb 28, 2023

aeubanks commented Feb 28, 2023

pogo59 commented Feb 28, 2023

mikaelholmen commented Mar 1, 2023

mikaelholmen commented Mar 3, 2023

mikaelholmen commented Mar 3, 2023

mikaelholmen commented Mar 6, 2023

aeubanks commented Mar 6, 2023

mikaelholmen commented Mar 7, 2023

mikaelholmen commented Mar 8, 2023

Presence of dbg.declare affects result from DeadStoreElimination #58285

Presence of dbg.declare affects result from DeadStoreElimination #58285

Comments

mikaelholmen commented Oct 11, 2022 • edited

mikaelholmen commented Oct 11, 2022 • edited by VoltrexKeyva

fhahn commented Oct 11, 2022

mikaelholmen commented Oct 12, 2022 • edited

fhahn commented Oct 12, 2022

fhahn commented Oct 12, 2022

llvmbot commented Nov 22, 2022

mikaelholmen commented Feb 14, 2023

dwblaikie commented Feb 14, 2023

mikaelholmen commented Feb 15, 2023

dwblaikie commented Feb 15, 2023

nikic commented Feb 16, 2023

dwblaikie commented Feb 16, 2023

mikaelholmen commented Feb 17, 2023

bjope commented Feb 17, 2023

bjope commented Feb 17, 2023

mikaelholmen commented Feb 28, 2023

dwblaikie commented Feb 28, 2023

aeubanks commented Feb 28, 2023

pogo59 commented Feb 28, 2023

mikaelholmen commented Mar 1, 2023

mikaelholmen commented Mar 3, 2023

mikaelholmen commented Mar 3, 2023

mikaelholmen commented Mar 6, 2023

aeubanks commented Mar 6, 2023

mikaelholmen commented Mar 7, 2023

mikaelholmen commented Mar 8, 2023

mikaelholmen commented Oct 11, 2022 •

edited

mikaelholmen commented Oct 11, 2022 •

edited by VoltrexKeyva

mikaelholmen commented Oct 12, 2022 •

edited