Skip to content

Commit

Permalink
[ARM] Allow ARMVectorRegCast to match bitconverts too. (NFC)
Browse files Browse the repository at this point in the history
Summary:
When we start putting instances of `ARMVectorRegCast` in complex isel
patterns, it will be awkward that they're often turned into the more
standard `bitconvert` in little-endian mode. We'd rather not have to
write separate isel patterns for the two endiannesses, matching
different but equivalent cast operations.

This change aims to fix that awkwardness in advance, by turning the
Tablegen record `ARMVectorRegCast` from a simple `SDNode` instance
into a `PatFrags` that can match either kind of cast – with a
predicate that prevents it matching a bitconvert in the big-endian
case, where bitconvert isn't semantically identical.

No existing code generation should be affected by this change, but it
will enable the patterns introduced by D74336 to work in both
endiannesses.

Reviewers: dmgreen

Reviewed By: dmgreen

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74716
  • Loading branch information
statham-arm committed Feb 18, 2020
1 parent 68b49f7 commit 9dcc166
Show file tree
Hide file tree
Showing 3 changed files with 31 additions and 4 deletions.
17 changes: 16 additions & 1 deletion llvm/lib/Target/ARM/ARMInstrInfo.td
Original file line number Diff line number Diff line change
Expand Up @@ -307,7 +307,22 @@ def ARMLoopDec : SDNode<"ARMISD::LOOP_DEC", SDTIntBinOp, [SDNPHasChain]>;
// because that's what (MVE) VSTRH.16 followed by VLDRB.8 would do. So the
// bitconvert would have to emit a VREV16.8 instruction, whereas the
// VECTOR_REG_CAST emits no code at all if the vector is already in a register.
def ARMVectorRegCast : SDNode<"ARMISD::VECTOR_REG_CAST", SDTUnaryOp>;
def ARMVectorRegCastImpl : SDNode<"ARMISD::VECTOR_REG_CAST", SDTUnaryOp>;

// In little-endian, VECTOR_REG_CAST is often turned into bitconvert during
// lowering (because in that situation they're identical). So an isel pattern
// that needs to match something that's _logically_ a VECTOR_REG_CAST must
// _physically_ match a different node type depending on endianness.
//
// This 'PatFrags' instance is a centralized facility to make that easy. It
// matches VECTOR_REG_CAST in either endianness, and also bitconvert in the
// endianness where it's equivalent.
def ARMVectorRegCast: PatFrags<
(ops node:$x), [(ARMVectorRegCastImpl node:$x), (bitconvert node:$x)], [{
// Reject a match against bitconvert (aka ISD::BITCAST) if big-endian
return !(CurDAG->getDataLayout().isBigEndian() &&
N->getOpcode() == ISD::BITCAST);
}]>;

//===----------------------------------------------------------------------===//
// ARM Flag Definitions.
Expand Down
8 changes: 7 additions & 1 deletion llvm/lib/Target/ARM/ARMInstrMVE.td
Original file line number Diff line number Diff line change
Expand Up @@ -4013,9 +4013,15 @@ let Predicates = [HasMVEInt] in {
(VT (COPY_TO_REGCLASS (VT2 VCCR:$src), VCCR))>;
}

// Here we match the specific SDNode type 'ARMVectorRegCastImpl'
// rather than the more general 'ARMVectorRegCast' which would also
// match some bitconverts. If we use the latter in cases where the
// input and output types are the same, the bitconvert gets elided
// and we end up generating a nonsense match of nothing.

foreach VT = [ v16i8, v8i16, v8f16, v4i32, v4f32, v2i64, v2f64 ] in
foreach VT2 = [ v16i8, v8i16, v8f16, v4i32, v4f32, v2i64, v2f64 ] in
def : Pat<(VT (ARMVectorRegCast (VT2 MQPR:$src))), (VT MQPR:$src)>;
def : Pat<(VT (ARMVectorRegCastImpl (VT2 MQPR:$src))), (VT MQPR:$src)>;
}

// end of MVE compares
Expand Down
10 changes: 8 additions & 2 deletions llvm/lib/Target/ARM/ARMInstrNEON.td
Original file line number Diff line number Diff line change
Expand Up @@ -7532,13 +7532,19 @@ let Predicates = [IsBE,HasNEON] in {
}

let Predicates = [HasNEON] in {
// Here we match the specific SDNode type 'ARMVectorRegCastImpl'
// rather than the more general 'ARMVectorRegCast' which would also
// match some bitconverts. If we use the latter in cases where the
// input and output types are the same, the bitconvert gets elided
// and we end up generating a nonsense match of nothing.

foreach VT = [ v16i8, v8i16, v8f16, v4i32, v4f32, v2i64, v2f64 ] in
foreach VT2 = [ v16i8, v8i16, v8f16, v4i32, v4f32, v2i64, v2f64 ] in
def : Pat<(VT (ARMVectorRegCast (VT2 QPR:$src))), (VT QPR:$src)>;
def : Pat<(VT (ARMVectorRegCastImpl (VT2 QPR:$src))), (VT QPR:$src)>;

foreach VT = [ v8i8, v4i16, v4f16, v2i32, v2f32, v1i64, f64 ] in
foreach VT2 = [ v8i8, v4i16, v4f16, v2i32, v2f32, v1i64, f64 ] in
def : Pat<(VT (ARMVectorRegCast (VT2 DPR:$src))), (VT DPR:$src)>;
def : Pat<(VT (ARMVectorRegCastImpl (VT2 DPR:$src))), (VT DPR:$src)>;
}

// Use VLD1/VST1 + VREV for non-word-aligned v2f64 load/store on Big Endian
Expand Down

0 comments on commit 9dcc166

Please sign in to comment.