
likely arm64 miscompile due to global isel #55129

Closed
regehr opened this issue Apr 27, 2022 · 7 comments

Comments

@regehr
Contributor

regehr commented Apr 27, 2022

this function:

define i64 @lsr_zext_i1_i64(i1 %b) {
  %1 = zext i1 %b to i64
  %2 = lshr i64 %1, 1
  ret i64 %2
}

gets turned into this arm64 by global isel, using top of tree:

regehr@john-home:~/arm-tests/error_39$ ~/llvm-project/for-alive/bin/llc test-0101865.bc -o - --march=arm64 -global-isel
	.text
	.file	"fast-isel-shift.ll"
	.globl	lsr_zext_i1_i64                 // -- Begin function lsr_zext_i1_i64
	.p2align	2
	.type	lsr_zext_i1_i64,@function
lsr_zext_i1_i64:                        // @lsr_zext_i1_i64
	.cfi_startproc
// %bb.0:
                                        // kill: def $w0 killed $w0 def $x0
	lsl	x0, x0, #63
	ret
.Lfunc_end0:
	.size	lsr_zext_i1_i64, .Lfunc_end0-lsr_zext_i1_i64
	.cfi_endproc
                                        // -- End function
	.section	".note.GNU-stack","",@progbits
regehr@john-home:~/arm-tests/error_39$ 

when %b == 1, it seems clear that the LLVM code should return zero. on the other hand, the arm64 code ends up returning 0x8000000000000000
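A minimal sketch (plain Python, not LLVM code) simulating both sides of the mismatch for %b == 1, masking to 64 bits:

```python
# Simulate the IR semantics vs. the emitted arm64 for %b == 1.

MASK64 = (1 << 64) - 1

def ir_semantics(b: int) -> int:
    """zext i1 %b to i64, then lshr i64 by 1 -- always 0 for a 1-bit input."""
    zext = b & 0x1               # zext i1 -> i64: value is 0 or 1
    return (zext >> 1) & MASK64  # logical shift right by 1

def emitted_asm(w0: int) -> int:
    """The miscompiled arm64: lsl x0, x0, #63."""
    return (w0 << 63) & MASK64

print(hex(ir_semantics(1)))  # 0x0
print(hex(emitted_asm(1)))   # 0x8000000000000000
```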

cc @nunoplopes @nbushehri @ryan-berger @ornata

@llvmbot
Collaborator

llvmbot commented Apr 27, 2022

@llvm/issue-subscribers-backend-aarch64

@DavidSpickett
Collaborator

This affects LLVM 14 but not 13. The cause is:

1300677f976e9975c21fd7bc4aa4e38e1bffeb27 is the first bad commit
commit 1300677f976e9975c21fd7bc4aa4e38e1bffeb27
Author: Jon Roelofs <jonathan_roelofs@apple.com>
Date:   Thu Oct 14 12:15:35 2021 -0700

    [AArch64][GlobalISel] combine and + [la]sr => ubfx

https://reviews.llvm.org/D111839

The asm from llvm 13 looked like:

lsr_zext_i1_i64:                        // @lsr_zext_i1_i64
        .cfi_startproc
// %bb.0:
                                        // kill: def $w0 killed $w0 def $x0
        and     x8, x0, #0x1
        lsr     x0, x8, #1
        ret

So it makes sense this change would do something here.

@jroelofs

@DavidSpickett
Collaborator

I can see what's going on, but I'm not sure at which layer it should be fixed.

When we get to CombinerHelper::matchBitfieldExtractFromShrAnd we find an and mask of 0x1 and a shift right amount of 0x1.
We recognise the and/shr combo and make a UBFX that has a least significant bit (the start of the bitfield) of 1 with a width of 0. So basically, throw away the result. So far, nothing is wrong as such.
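A rough sketch (not the actual CombinerHelper code) of how the lsb/width parameters fall out for this input, assuming the mask is a contiguous run of low bits:

```python
# For (x & mask) >> shift with a low-bit mask of n bits, the combine
# produces ubfx with lsb = shift and width = n - shift.

def ubfx_params(and_mask: int, shr_amt: int):
    mask_bits = and_mask.bit_length()
    # the combine only applies when the mask is a contiguous run of low bits
    assert and_mask == (1 << mask_bits) - 1
    lsb = shr_amt
    width = mask_bits - shr_amt
    return lsb, width

print(ubfx_params(0x1, 1))  # (1, 0): a zero-width extract, i.e. "throw away the result"
```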

That goes to AArch64InstPrinter::printInst, which does something I don't fully understand, where some bitfield extracts can be represented as shifts. E.g. a bitfield extract from lsb 4 with width 60 could just be a shift right by 4.

For our UBFX it chooses to use an lsl with a shift of 63, because it sees that the width plus 1 is equal to the bitfield start position (that's the bit I don't understand). I don't know if this code ever expected to handle a width of 0 though, so we get the shift left of 63.

      } else if (Opcode == AArch64::UBFMXri && imms != 0x3f &&
                 ((imms + 1 == immr))) {
        AsmMnemonic = "lsl";
        shift = 63 - imms;

That 63 - imms makes me think we're assuming a minimum width of 1. The architecture docs say the width must be at least one, at least for an actual ubfx instruction (https://developer.arm.com/documentation/ddi0596/2020-12/Base-Instructions/UBFX--Unsigned-Bitfield-Extract--an-alias-of-UBFM-). I'm not sure whether that constraint applies to the MIR as well.
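A hedged sketch of simplified 64-bit UBFM semantics (my own model, not LLVM or Arm pseudocode; assumes immr/imms in 0..63) showing why the zero-width extract prints as lsl #63:

```python
# When imms >= immr, UBFM behaves as the extract (ubfx) everyone expects;
# when imms < immr it is the wrap-around form covered by the lsl/ubfiz aliases.

def ubfm64(x: int, immr: int, imms: int) -> int:
    if imms >= immr:  # ubfx: width = imms - immr + 1
        return (x >> immr) & ((1 << (imms - immr + 1)) - 1)
    # wrap-around case: keep the low imms+1 bits, shifted up by 64 - immr
    return ((x & ((1 << (imms + 1)) - 1)) << (64 - immr)) & ((1 << 64) - 1)

# The "ubfx lsb=1, width=0" the combiner asked for encodes as immr=1, imms=0.
# imms + 1 == immr, so it lands in the wrap-around case: exactly lsl #63.
print(hex(ubfm64(1, 1, 0)))  # 0x8000000000000000
```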

If I simply return false from the combiner you see this just before the asm step:

bb.0 (%ir-block.0):
  liveins: $w0
  renamable $w0 = KILL killed $w0, implicit-def $x0
  renamable $x8 = ANDXri killed renamable $x0, 4096
  renamable $x0 = UBFMXri killed renamable $x8, 1, 63
  RET undef $lr, implicit killed $x0

Which is a bit odd but since we know that the top bits will be 0, extracting a field from lsb of 1 with width 63 gets us 0. So the correct result but a different path.

Perhaps it would make sense to have the combiner generate a UBFX like that one. Or just not combine when (and mask >> shift right amount) is 0, meaning we're in a "throw away the result" sequence.
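The second option can be sketched as a one-line guard (a sketch of the idea, not the actual LLVM patch):

```python
# Refuse the combine when mask >> shift is zero, since the whole
# and+lshr sequence then just produces 0 and there is nothing to extract.

def should_combine_to_ubfx(and_mask: int, shr_amt: int) -> bool:
    return (and_mask >> shr_amt) != 0

print(should_combine_to_ubfx(0x1, 1))  # False: the miscompiled case in this issue
print(should_combine_to_ubfx(0xF, 2))  # True: a genuine 2-bit extract remains
```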

@DavidSpickett
Collaborator

DavidSpickett commented Apr 27, 2022

According to https://llvm.org/docs/GlobalISel/GenericOpcode.html#g-sbfx-g-ubfx width could be zero.

With the combiner being at that level, a width of 0 is OK. It's the instruction selection in the AArch64 backend that should account for a width of 0 for G_UBFX/G_SBFX, or perhaps it should have been legalised somewhere before that? (I don't know all the steps very well)

Ignore that, I read the spec wrong.

0 <= lsb < lsb + width <= source bitwidth, where all values are unsigned

So for lsb < lsb + width to be true, width must be at least 1. The combiner shouldn't produce a UBFX with a width of zero.
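Checking that constraint against the two extracts seen in this thread (a small sketch of the spec rule, not LLVM code): the combiner's zero-width extract is illegal, while the lsb=1/width=63 UBFMXri seen with the combine disabled is legal.

```python
# Spec constraint from the GlobalISel docs:
#   0 <= lsb < lsb + width <= source bitwidth (all unsigned)

def ubfx_is_legal(lsb: int, width: int, src_bits: int = 64) -> bool:
    return 0 <= lsb < lsb + width <= src_bits

print(ubfx_is_legal(1, 0))   # False: the combiner's zero-width extract
print(ubfx_is_legal(1, 63))  # True: the extract seen when the combine is disabled
```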

@ornata

ornata commented Apr 27, 2022

cc @aemerson too

@jroelofs
Contributor

> The combiner shouldn't produce a UBFX with a width of zero.

Agreed. The combiner should detect this case, and emit a zero instead.

I'll put together a patch next week.

@jroelofs jroelofs self-assigned this Apr 27, 2022
@jroelofs
Contributor

jroelofs commented May 3, 2022
