[SelectionDAG][RISCV] Fix break of vnsrl pattern in issue #94265 #95563

Merged (7 commits) on Jul 14, 2024
Changes from 2 commits
14 changes: 14 additions & 0 deletions llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4339,6 +4339,20 @@ class TargetLowering : public TargetLoweringBase {
return isTypeLegal(VT);
}

/// Same as isTypeDesirableForOp(unsigned Opc, EVT VT), but also checks
/// whether the target can truncate or extend OldVT to NewVT using only the
/// given node type, with no explicit trunc or ext. E.g. with the RISC-V
/// Vector extension, vnsrl.wi can narrow <n x i32> to <n x i16> as part of
/// the shift, so no extra trunc operation is needed.
virtual bool isTypeDesirableForOp(unsigned Opc, EVT NewVT, EVT OldVT) const {
// Fallback to isTypeDesirableForOp(unsigned Opc, EVT VT).
if (NewVT == OldVT) {
return isTypeDesirableForOp(Opc, NewVT);
}
// Most instructions cannot do this, so return false by default.
return false;
}

/// Return true if it is profitable for dag combiner to transform a floating
/// point op of specified opcode to a equivalent op of an integer
/// type. e.g. f32 load -> i32 load can be profitable on ARM.
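
Reviewer note: the doc comment above relies on vnsrl.wi folding the truncate into the shift. A minimal scalar sketch of that per-lane behaviour, assuming 32-bit source lanes and 16-bit result lanes as in the comment's example. The function names are hypothetical and are not part of LLVM or this patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one vnsrl.wi lane: logically shift the wide
// (32-bit) source right, then keep only the low 16 bits. Only the low 5 bits
// of the shift amount are used for a 32-bit source.
inline std::uint16_t narrowingShiftRightLane(std::uint32_t Wide,
                                             unsigned Shamt) {
  return static_cast<std::uint16_t>(Wide >> (Shamt & 31u));
}

// Apply the lane model to every element, mimicking <N x i32> -> <N x i16>.
template <std::size_t N>
std::array<std::uint16_t, N>
narrowingShiftRight(const std::array<std::uint32_t, N> &Src, unsigned Shamt) {
  std::array<std::uint16_t, N> Dst{};
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = narrowingShiftRightLane(Src[I], Shamt);
  return Dst;
}
```

The shift and the narrowing happen in one step, which is why a separate trunc node in front of the shift would only get in the way of selecting this instruction.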
4 changes: 3 additions & 1 deletion llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -2597,7 +2597,9 @@ bool TargetLowering::SimplifyDemandedBits(
HighBits.lshrInPlace(ShVal);
HighBits = HighBits.trunc(BitWidth);

- if (!(HighBits & DemandedBits)) {
+ if (!isTypeDesirableForOp(ISD::SRL, Op.getValueType(),
+ Src.getValueType()) &&
+ !(HighBits & DemandedBits)) {
// None of the shifted in bits are needed. Add a truncate of the
// shift input, then shift it.
SDValue NewShAmt =
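
Reviewer note: a worked instance of the HighBits check for the case in pr94265.ll may help here. The sketch below uses plain integers instead of APInt and fixes the widths at 32 -> 16 bits. It assumes HighBits starts out as the mask of source bits above the truncated width (that initialisation is outside this hunk), and the demanded mask 0x003F is my own derivation from the later shl by 10, not something stated in the patch. Function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the APInt steps in the hunk for a 32 -> 16 bit truncate of
// (srl x, ShVal). Assumes ShVal < 32.
inline std::uint16_t shiftedInHighBits(unsigned ShVal) {
  std::uint32_t HighBits = 0xFFFF0000u; // assumed start: bits above bit 15
  HighBits >>= ShVal;                   // HighBits.lshrInPlace(ShVal)
  return static_cast<std::uint16_t>(HighBits); // HighBits.trunc(BitWidth)
}

// True when none of the bits shifted in from the upper half are demanded,
// i.e. the condition !(HighBits & DemandedBits) from the hunk.
inline bool noShiftedInBitsDemanded(unsigned ShVal,
                                    std::uint16_t DemandedBits) {
  return (shiftedInHighBits(ShVal) & DemandedBits) == 0;
}
```

With ShVal = 6 the mask is 0xFC00. A following shl by 10 on i16 only needs the low 6 bits (0x003F), so the old condition holds and the combine rewrites trunc(srl x, 6) as srl(trunc x, 6). The added isTypeDesirableForOp check is what stops that rewrite on targets that prefer the original form.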
8 changes: 8 additions & 0 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -17462,6 +17462,14 @@ bool RISCVTargetLowering::isDesirableToCommuteWithShift(
return true;
}

bool RISCVTargetLowering::isTypeDesirableForOp(unsigned Opc, EVT NewVT,
EVT OldVT) const {
if (Subtarget.hasStdExtV() && NewVT.isVector() && OldVT.isVector()) {
return true;
}
return TargetLowering::isTypeDesirableForOp(Opc, NewVT, OldVT);
}

bool RISCVTargetLowering::targetShrinkDemandedConstant(
SDValue Op, const APInt &DemandedBits, const APInt &DemandedElts,
TargetLoweringOpt &TLO) const {
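
Reviewer note: to see how the base hook, the RISC-V override, and the new guard in SimplifyDemandedBits interact, here is a self-contained toy model. Everything in it (ToyVT, ToyLowering, ToyRISCVLowering, kSRL, shouldNarrowShift) is a hypothetical stand-in, not LLVM API; the two-argument hook simply returns true in place of isTypeLegal.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::EVT.
struct ToyVT {
  bool IsVector;
  unsigned ScalarBits;
  unsigned NumElts;
  bool operator==(const ToyVT &O) const {
    return IsVector == O.IsVector && ScalarBits == O.ScalarBits &&
           NumElts == O.NumElts;
  }
};

constexpr unsigned kSRL = 1; // hypothetical opcode value

class ToyLowering {
public:
  virtual ~ToyLowering() = default;
  // Stand-in for the existing hook; the real one defaults to isTypeLegal(VT).
  virtual bool isTypeDesirableForOp(unsigned /*Opc*/, ToyVT /*VT*/) const {
    return true;
  }
  // Mirrors the new base hook: defer when the types match, else say no.
  virtual bool isTypeDesirableForOp(unsigned Opc, ToyVT NewVT,
                                    ToyVT OldVT) const {
    if (NewVT == OldVT)
      return isTypeDesirableForOp(Opc, NewVT);
    return false;
  }
};

class ToyRISCVLowering : public ToyLowering {
  bool HasV;

public:
  explicit ToyRISCVLowering(bool HasV) : HasV(HasV) {}
  using ToyLowering::isTypeDesirableForOp; // keep the two-argument overload visible
  // Mirrors the override as it stands in this diff: any vector-to-vector pair.
  bool isTypeDesirableForOp(unsigned Opc, ToyVT NewVT,
                            ToyVT OldVT) const override {
    if (HasV && NewVT.IsVector && OldVT.IsVector)
      return true;
    return ToyLowering::isTypeDesirableForOp(Opc, NewVT, OldVT);
  }
};

// Mirrors the patched guard: rewrite only if the target does not want the
// original trunc(srl) form and no shifted-in bits are demanded.
inline bool shouldNarrowShift(const ToyLowering &TLI, ToyVT NewVT, ToyVT OldVT,
                              bool ShiftedInBitsDemanded) {
  return !TLI.isTypeDesirableForOp(kSRL, NewVT, OldVT) &&
         !ShiftedInBitsDemanded;
}
```

In this model a default target still narrows the shift, while a target with the vector extension keeps trunc(srl) intact. Note that the override as written ignores Opc, so it answers true for every opcode on vector types.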
2 changes: 2 additions & 0 deletions llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -708,6 +708,8 @@ class RISCVTargetLowering : public TargetLowering {
bool isDesirableToCommuteWithShift(const SDNode *N,
CombineLevel Level) const override;

bool isTypeDesirableForOp(unsigned Opc, EVT NewVT, EVT OldVT) const override;

/// If a physical register, this returns the register that receives the
/// exception address on entry to an EH pad.
Register
31 changes: 31 additions & 0 deletions llvm/test/CodeGen/RISCV/pr94265.ll
@@ -0,0 +1,31 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=riscv32-- -mattr=+v | FileCheck -check-prefix=RV32I %s
; RUN: llc < %s -mtriple=riscv64-- -mattr=+v | FileCheck -check-prefix=RV64I %s

define <8 x i16> @PR94265(<8 x i32> %a0) #0 {
; RV32I-LABEL: PR94265:
; RV32I: # %bb.0:
; RV32I-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; RV32I-NEXT: vsra.vi v10, v8, 31
; RV32I-NEXT: vsrl.vi v10, v10, 26
; RV32I-NEXT: vadd.vv v8, v8, v10
; RV32I-NEXT: vsetvli zero, zero, e16, m1, ta, ma
; RV32I-NEXT: vnsrl.wi v10, v8, 6
; RV32I-NEXT: vsll.vi v8, v10, 10
; RV32I-NEXT: ret
;
; RV64I-LABEL: PR94265:
; RV64I: # %bb.0:
; RV64I-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; RV64I-NEXT: vsra.vi v10, v8, 31
; RV64I-NEXT: vsrl.vi v10, v10, 26
; RV64I-NEXT: vadd.vv v8, v8, v10
; RV64I-NEXT: vsetvli zero, zero, e16, m1, ta, ma
; RV64I-NEXT: vnsrl.wi v10, v8, 6
; RV64I-NEXT: vsll.vi v8, v10, 10
; RV64I-NEXT: ret
%t1 = sdiv <8 x i32> %a0, <i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64>
%t2 = trunc <8 x i32> %t1 to <8 x i16>
%t3 = shl <8 x i16> %t2, <i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10>
ret <8 x i16> %t3
}
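
Reviewer note: as a sanity check on the CHECK lines, the expected instruction sequence can be modelled one lane at a time and compared against the IR semantics (sdiv by 64, trunc to i16, shl by 10). This is a rough scalar sketch with hypothetical function names; it models lanes only and ignores vsetvli, register grouping, and tail/mask policy.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of one lane of the IR in the test.
inline std::uint16_t referenceLane(std::int32_t X) {
  std::int32_t Quot = X / 64; // sdiv rounds toward zero, as C++ does
  std::uint16_t Trunc =
      static_cast<std::uint16_t>(static_cast<std::uint32_t>(Quot));
  return static_cast<std::uint16_t>(Trunc << 10); // shl on i16
}

// One lane of the expected assembly, using unsigned arithmetic throughout.
inline std::uint16_t expectedAsmLane(std::int32_t X) {
  std::uint32_t V8 = static_cast<std::uint32_t>(X);
  std::uint32_t V10 = (V8 >> 31) ? 0xFFFFFFFFu : 0u; // vsra.vi v10, v8, 31
  V10 >>= 26;                                        // vsrl.vi v10, v10, 26
  V8 += V10;                                         // vadd.vv v8, v8, v10
  std::uint16_t Narrow =
      static_cast<std::uint16_t>(V8 >> 6);           // vnsrl.wi v10, v8, 6
  return static_cast<std::uint16_t>(Narrow << 10);   // vsll.vi v8, v10, 10
}
```

The logical narrowing shift stands in for the arithmetic shift of the division because the two differ only in the top 6 bits of the 32-bit result, and those are dropped by the narrowing to 16 bits.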