Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscompile after changes to MachineCopyPropagation in commit cae46f6210293ba4d3568eb2 #73512

Closed
bjope opened this issue Nov 27, 2023 · 5 comments

Comments

@bjope
Copy link
Collaborator

bjope commented Nov 27, 2023

We started to see miscompiles downstream after the update to MachineCopyPropagation here: cae46f6

I managed to reduce one failure and convert it into this (just picking AMDGCPU as target to get something with register tuples):

# RUN: llc -mtriple=amdgcn-- -mcpu=gfx900 -run-pass machine-cp -o - %s

---
name:            test1
body:             |
  bb.0:
    liveins: $sgpr4_sgpr5, $sgpr6_sgpr7

    $sgpr2_sgpr3 = COPY $sgpr6_sgpr7
    $sgpr0 = COPY $sgpr3
    S_NOP 0, implicit-def $sgpr0
    $sgpr3 = COPY killed $sgpr5
    $sgpr2_sgpr3 = COPY $sgpr6_sgpr7
    S_NOP 0, implicit $sgpr2_sgpr3
...

Without the changes to MCP:

> llc -mtriple=amdgcn-- -mcpu=gfx900 -o - -run-pass machine-cp mcp.mir

    $sgpr2_sgpr3 = COPY $sgpr6_sgpr7
    S_NOP 0, implicit-def $sgpr0
    $sgpr2_sgpr3 = COPY $sgpr6_sgpr7
    S_NOP 0, implicit $sgpr2_sgpr3

After cae46f6:

> llc -mtriple=amdgcn-- -mcpu=gfx900 -o - -run-pass machine-cp mcp.mir

    $sgpr2_sgpr3 = COPY $sgpr6_sgpr7
    S_NOP 0, implicit-def $sgpr0
    $sgpr3 = COPY killed $sgpr5                       <-- clobbers $sgpr3 but this is a dead def in the test case
    S_NOP 0, implicit $sgpr2_sgpr3                   <- this is incorrect, $sgpr3 should have the value of $sgrp7
@bjope
Copy link
Collaborator Author

bjope commented Nov 27, 2023

I reverted the offending commit.

@LWenH
Copy link
Member

LWenH commented Nov 28, 2023

Sorry for the late reply. I would like to ask if there are a lot of other miscompiles in the downstream?
If so, could you provide some more test samples(one or two are fine), so that I can address the root cause and
make further changes? Thank you.

@bjope
Copy link
Collaborator Author

bjope commented Nov 28, 2023

Sorry for the late reply. I would like to ask if there are a lot of other miscompiles in the downstream? If so, could you provide some more test samples(one or two are fine), so that I can address the root cause and make further changes? Thank you.

I don't have any other reproducers that I can share (and that would be meaningful). Took me quite some time just to reduce the test case for the downstream target. And then I needed to figure out how to convert that into something showing the problem for an in-tree target.
Hopefully it's just one main problem, and if the reproducer from this ticket pass then I can try it out on the larger downstream test cases.

With that said, my first suspicion here was around implicit operands. Not quite sure if that is the case here (i.e. those S_NOP:s with implicit operands can probaby be any instruction with non-implicit defs/uses, if you just know how to write AMDPPU mir).

Anyway, afaict lots of code in MachineCopyPropagation is inspecting all operands (including implicit operands). But then there are several places where things like CopyOperands->Destination->getReg() and CopyOperands->Source->getReg() is used. And then I wonder what happens if the COPY instruction also has implicit operands. Maybe that is uncommon, but at least downstream we sometimes need to add implicit defs operands on COPY instruction to model that writing to a subreg might have side-effect on other parts of the register. No idea if this cause problems in reality. It just made me suspicious about the pass in general since I don't know exactly how it works :-)

@LWenH
Copy link
Member

LWenH commented Nov 28, 2023

Yeah, I'll try to debug the pass and figure out the root cause for this test case later.

Essentially, when we encounter:
$sgpr3 = COPY killed $sgpr5,
the preceding instruction:
$sgpr2_sgpr3 = COPY $sgpr6_sgpr7,
will become unpropagatable.

Consequently, it prevents the removal of the subsequent instruction during the basic block scanning procedure.
However, it's possible that this behavior is also influenced by the implicit source operand, necessitating deeper investigation to
pinpoint the exact cause. Hopefully it's just the only special test case here :-)

Thank you for taking the time to investigate this issue.

LWenH added a commit that referenced this issue Dec 26, 2023
…74239)

Machine Copy Propagation Pass may lose some opportunities to further
remove the redundant copy instructions during the ForwardCopyPropagateBlock
procedure. When we Clobber a "Def" register, we also need to remove the record 
from the copy maps that indicates "Src" defined "Def" to ensure the correct semantics
of the ClobberRegister function.  This patch reapplies #70778 and addresses the corner 
case bug  #73512 specific to the AMDGPU backend. Additionally, it refines the criteria 
for removing empty records from the copy maps, thereby enhancing overall safety.

For more information, please see the C++ test case generated code in 
"vector.body" after the MCP Pass: https://gcc.godbolt.org/z/nK4oMaWv5.
@dtcxzyw
Copy link
Member

dtcxzyw commented Feb 16, 2024

Fixed by #74239.

@dtcxzyw dtcxzyw closed this as completed Feb 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants