Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dEQP-VK.graphicsfuzz.complex-nested-loops-and-call #204

Closed
paulthomson opened this issue Sep 13, 2019 · 20 comments
Closed

dEQP-VK.graphicsfuzz.complex-nested-loops-and-call #204

paulthomson opened this issue Sep 13, 2019 · 20 comments

Comments

@paulthomson
Copy link

This test fails. To reproduce: amdllpc -gfxip=9.0.0 -verify-ir -o temp.out *.spv

complex-nested-loops-and-call.zip

@jaebaek
Copy link
Contributor

jaebaek commented Sep 23, 2019

I built it in Release 64bit mode based on AMDVLK and spvgen, but I cannot reproduce this bug.

Did someone fix this already?

@paulthomson
Copy link
Author

paulthomson commented Sep 24, 2019

I used a Debug build on dev branch.

@kuhar kuhar self-assigned this Sep 24, 2019
@kuhar
Copy link
Contributor

kuhar commented Sep 25, 2019

Seems like there is a bug in SILowerI1Copies.cpp (an LLVM MachineFunction pass).

It happens during the 54th iteration of the inner for loop over machine instructions in SILowerI1Copies::lowerPhis when a phi has a use in the same machine basic block:

bb.39 (%ir-block.176):
; predecessors: %bb.38
  successors: %bb.41(0x40000000), %bb.42(0x40000000); %bb.41(50.00%), %bb.42(50.00%)

  %457:vgpr_32 = PHI %458:vgpr_32, %bb.38
  %153:sreg_64 = PHI %315:vreg_1, %bb.38    ; <== def
  %154:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %155:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %156:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %320:sreg_64_xexec = COPY %153:sreg_64    ; <== use
  %319:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %320:sreg_64_xexec, implicit $exec
  %321:sreg_64 = S_MOV_B64 -1
  %318:vreg_1 = COPY %321:sreg_64
  %322:sreg_32_xm0 = S_MOV_B32 1
  %323:sreg_32 = IMPLICIT_DEF
  V_CMP_NE_U32_e32 killed %322:sreg_32_xm0, %319:vgpr_32, implicit-def $vcc, implicit $exec
  $vcc = S_AND_B64 $exec, $vcc, implicit-def $scc
  S_CBRANCH_VCCNZ %bb.41, implicit $vcc
  S_BRANCH %bb.42

The pass attempts to create a new register and replace the Phi with a new register placed at the end of the block. This is how the machine basic blocks looks at the end of that loop iteration:

bb.39 (%ir-block.176):
; predecessors: %bb.38
  successors: %bb.41(0x40000000), %bb.42(0x40000000); %bb.41(50.00%), %bb.42(50.00%)

  %457:vgpr_32 = PHI %458:vgpr_32, %bb.38
  %584:sreg_64 = PHI %315:vreg_1, %bb.38      ; <== NewReg (old instruction, will be erased)
  %154:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %155:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %156:vgpr_32 = PHI %316:vgpr_32, %bb.38
  %320:sreg_64_xexec = COPY %153:sreg_64      ; <== use
  %319:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %320:sreg_64_xexec, implicit $exec
  %321:sreg_64 = S_MOV_B64 -1
  %318:vreg_1 = COPY %321:sreg_64
  %322:sreg_32_xm0 = S_MOV_B32 1
  %323:sreg_32 = IMPLICIT_DEF
  V_CMP_NE_U32_e32 killed %322:sreg_32_xm0, %319:vgpr_32, implicit-def $vcc, implicit $exec
  $vcc = S_AND_B64 $exec, $vcc, implicit-def $scc
  %153:sreg_64 = IMPLICIT_DEF                ; <== DstReg (new instruction, to be used instead)
  S_CBRANCH_VCCNZ %bb.41, implicit $vcc
  S_BRANCH %bb.42

After this iteration, the function verifiers complains that the definition doesn't dominate the use. I think that the new value from the MachineSSAUpdater should be placed before the use, instead of at the end of the block, which would require a small modification to the MachineSSAUpdater.

I hope that this makes sense and I'd really appreciate some tips/relevant documentation if you have any. I'll resume the investigation tomorrow from this point.

@trenouf
Copy link
Member

trenouf commented Sep 26, 2019

Hi Jakub

I'll pass this on to Matt and Stas (AMD engineers on the compute side), who look like the people most linked with SILowerI1Copies, and see what we get back.

@nhaehnle
Copy link
Member

nhaehnle commented Sep 26, 2019

I haven't been able to do a full analysis on this and will have to take a closer look at the CFG overall. What I can say is that the new value %153 is almost certainly correct where it is (or at least "more correct" than moving it would be). This is because we certainly want a defined value to be used in the V_CNDMASK_B32, since thats what's happens in the original code.

That is to say, it seems more likely that the correct fix would be to change the use of %153 into a use of some other newly generated instruction that does dominate.

@arsenm
Copy link

arsenm commented Sep 26, 2019

This is some confusing shadowing to me; I would assume the second MBB really intended to be the current iterated block, not the predecessor
for (MachineBasicBlock *MBB : PIA.predecessors()) SSAUpdater.AddAvailableValue(MBB, insertUndefLaneMask(*MBB));

@kuhar
Copy link
Contributor

kuhar commented Sep 26, 2019

Thanks @trenouf!

@nhaehnle, @arsenm thank you for looking into this and your insights.
Can you recommend me some resources to learn more about what SILowerI1Copies is trying to achieve and why? I don't fully understand the transformation after reading the header and the implementation files.
To be specific, I'm confused about what values are supposed to flow into the uses in the middle of blocks and what the semantics of implicit defs are.
Without a better understanding of the pass I don't know if I can be of much use when it comes to this bug, unless somebody tells me exactly what is broken here.

@arsenm
Copy link

arsenm commented Sep 26, 2019

Fundamentally SILowerI1Copies is a SelectionDAG workaround. In the DAG we don't have the necessary context to know how to treat a boolean value. We use the pseudo register-class VReg_1, and then SILowerI1Copies sorts out when it's appropriate to use a real lane mask for these values.

@trenouf
Copy link
Member

trenouf commented Sep 26, 2019

Stas said:

If it did replace %153 with %584 it should probably also replace its uses with the new %584. I see that COPY left using %153.

@kuhar kuhar removed their assignment Sep 26, 2019
@kuhar
Copy link
Contributor

kuhar commented Sep 26, 2019

@trenouf

Stas said:

If it did replace %153 with %584 it should probably also replace its uses with the new %584. I see that COPY left using %153.

Yes, but the new value %584 does not dominate the use, even if the use gets updated.

@trenouf
Copy link
Member

trenouf commented Sep 26, 2019

Nicolai said (in reply to Stas's comment):

%584 may just be an artefact of how the registers are swapped around in order to erase instructions.

@nhaehnle
Copy link
Member

@kuhar, I've root-caused a first issue to a subtle bug in the MachineSSAUpdater, but there's a follow-on issue that I'm currently looking at.

@kuhar
Copy link
Contributor

kuhar commented Sep 27, 2019

@nhaehnle OK. Let me know if you need help with this issue.

nhaehnle added a commit to nhaehnle/llvm-project that referenced this issue Sep 27, 2019
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

TODO: Add a MIR test for SILowerI1Copies

Issue: GPUOpen-Drivers/llpc#204
nhaehnle added a commit to nhaehnle/llvm-project that referenced this issue Sep 27, 2019
@nhaehnle
Copy link
Member

@kuhar, the branch here with two simple changes allows the problematic shader to compile: https://github.com/nhaehnle/llvm-project/tree/wip-github-llpc-204

I'm currently travelling and not setup to run amber tests. Could you please check whether those changes fully fix the issue in the dEQP-VK test?

@kuhar
Copy link
Contributor

kuhar commented Sep 27, 2019

@nhaehnle Applying the patches seems to fix the issue - MIR verifieas correctly now.

@nhaehnle
Copy link
Member

llvm-git-migration pushed a commit to llvm/llvm-project that referenced this issue Oct 8, 2019
Summary:
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68183

llvm-svn: 374040
llvm-git-migration pushed a commit to llvm/llvm-project that referenced this issue Oct 8, 2019
Summary: Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68184

llvm-svn: 374041
dtzWill pushed a commit to llvm-mirror/llvm that referenced this issue Oct 8, 2019
Summary:
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374040 91177308-0d34-0410-b5e6-96231b3b80d8
dtzWill pushed a commit to llvm-mirror/llvm that referenced this issue Oct 8, 2019
Summary: Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374041 91177308-0d34-0410-b5e6-96231b3b80d8
earl pushed a commit to earl/llvm-mirror that referenced this issue Oct 8, 2019
Summary:
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374040 91177308-0d34-0410-b5e6-96231b3b80d8
earl pushed a commit to earl/llvm-mirror that referenced this issue Oct 8, 2019
Summary: Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374041 91177308-0d34-0410-b5e6-96231b3b80d8
@nhaehnle
Copy link
Member

nhaehnle commented Oct 8, 2019

@amdrexu @paulthomson @kuhar We should probably close this issue given that the fix has gone into upstream LLVM, but I don't seem to have the rights to actually do that.

@paulthomson
Copy link
Author

paulthomson commented Oct 8, 2019

Or we could close it when the LLVM version is updated, possibly adding a "fixed in LLVM" label in the meantime (or something like that). We don't have continuous running of these tests yet, but when we do, the automation could create/re-open issues like this one automatically unless the dev branch is passing.

@paulthomson
Copy link
Author

This still seems to fail.

@paulthomson
Copy link
Author

This appears to be fixed.

arichardson pushed a commit to arichardson/llvm-project that referenced this issue Nov 16, 2019
Summary:
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68183

llvm-svn: 374040
arichardson pushed a commit to arichardson/llvm-project that referenced this issue Nov 16, 2019
Summary: Issue: GPUOpen-Drivers/llpc#204

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68184

llvm-svn: 374041
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants