Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDAG][RISCV] vnsrl pattern is broken after PR93182 #94265

Closed
dtcxzyw opened this issue Jun 3, 2024 · 8 comments · Fixed by #95563
Closed

[SDAG][RISCV] vnsrl pattern is broken after PR93182 #94265

dtcxzyw opened this issue Jun 3, 2024 · 8 comments · Fixed by #95563

Comments

@dtcxzyw
Copy link
Member

dtcxzyw commented Jun 3, 2024

See the diff of file result/rvv/193fc4d8b7725d32.S in dtcxzyw/llvm-codegen-benchmark@c134d37#diff-bc8f186427aa6515f00cd4d764feea719d5d90240f5602f6a8acf2e114cc9660

; llc -mtriple=riscv64 -mattr=+v test.ll -debug -o -
define <8 x i16> @func0000000000000000(<8 x i32> %0) #0 {
entry:
  %1 = sdiv <8 x i32> %0, <i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64>
  %2 = trunc <8 x i32> %1 to <8 x i16>
  %3 = shl <8 x i16> %2, <i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10>
  ret <8 x i16> %3
}

Before:

        vsetivli zero, 8, e32, m2, ta, ma
        vsra.vi v10, v8, 31
        vsrl.vi v10, v10, 26
        vadd.vv v8, v8, v10
        vsetvli zero, zero, e16, m1, ta, ma
        vnsrl.wi        v10, v8, 10
        vsll.vi v8, v10, 6
        ret

After #93182:

        vsetivli zero, 8, e32, m2, ta, ma
        vsra.vi v10, v8, 31
        vsrl.vi v10, v10, 26
        vadd.vv v8, v8, v10
        vsetvli zero, zero, e16, m1, ta, ma
        vnsrl.wi        v10, v8, 0
        vsll.vi v8, v10, 4
        li      a0, -1024
        vand.vx v8, v8, a0
        ret

I think the following code breaks vnsrl (i.e., trunc (srl X, C)) pattern:

if (!(HighBits & DemandedBits)) {
// None of the shifted in bits are needed. Add a truncate of the
// shift input, then shift it.
SDValue NewShAmt =
TLO.DAG.getShiftAmountConstant(ShVal, VT, dl, TLO.LegalTypes());
SDValue NewTrunc =
TLO.DAG.getNode(ISD::TRUNCATE, dl, VT, Src.getOperand(0));
return TLO.CombineTo(
Op, TLO.DAG.getNode(ISD::SRL, dl, VT, NewTrunc, NewShAmt));
}

Do we need a TLI hook to query whether (trunc (srl X, C)) is natively supported?

cc @RKSimon @topperc

@llvmbot
Copy link
Collaborator

llvmbot commented Jun 3, 2024

@llvm/issue-subscribers-backend-risc-v

Author: Yingwei Zheng (dtcxzyw)

See the diff of file `result/rvv/193fc4d8b7725d32.S` in https://github.com/dtcxzyw/llvm-codegen-benchmark/commit/c134d37047d3e6ee2336279936f98a6eab922e77#diff-bc8f186427aa6515f00cd4d764feea719d5d90240f5602f6a8acf2e114cc9660
; llc -mtriple=riscv64 -mattr=+v test.ll -debug -o -
define &lt;8 x i16&gt; @<!-- -->func0000000000000000(&lt;8 x i32&gt; %0) #<!-- -->0 {
entry:
  %1 = sdiv &lt;8 x i32&gt; %0, &lt;i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64, i32 64&gt;
  %2 = trunc &lt;8 x i32&gt; %1 to &lt;8 x i16&gt;
  %3 = shl &lt;8 x i16&gt; %2, &lt;i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10, i16 10&gt;
  ret &lt;8 x i16&gt; %3
}

Before:

        vsetivli zero, 8, e32, m2, ta, ma
        vsra.vi v10, v8, 31
        vsrl.vi v10, v10, 26
        vadd.vv v8, v8, v10
        vsetvli zero, zero, e16, m1, ta, ma
        vnsrl.wi        v10, v8, 10
        vsll.vi v8, v10, 6
        ret

After #93182:

        vsetivli zero, 8, e32, m2, ta, ma
        vsra.vi v10, v8, 31
        vsrl.vi v10, v10, 26
        vadd.vv v8, v8, v10
        vsetvli zero, zero, e16, m1, ta, ma
        vnsrl.wi        v10, v8, 0
        vsll.vi v8, v10, 4
        li      a0, -1024
        vand.vx v8, v8, a0
        ret

I think the following code breaks vnsrl (i.e., trunc (srl X, C)) pattern:

if (!(HighBits & DemandedBits)) {
// None of the shifted in bits are needed. Add a truncate of the
// shift input, then shift it.
SDValue NewShAmt =
TLO.DAG.getShiftAmountConstant(ShVal, VT, dl, TLO.LegalTypes());
SDValue NewTrunc =
TLO.DAG.getNode(ISD::TRUNCATE, dl, VT, Src.getOperand(0));
return TLO.CombineTo(
Op, TLO.DAG.getNode(ISD::SRL, dl, VT, NewTrunc, NewShAmt));
}

Do we need a TLI hook to query whether (trunc (srl X, C)) is natively supported?

cc @RKSimon @topperc

@RKSimon
Copy link
Collaborator

RKSimon commented Jun 3, 2024

Do we need a TLI hook to query whether (trunc (srl X, C)) is natively supported?

There is a isTypeDesirableForOp check after type legalization - is that not working?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jun 4, 2024

Do we need a TLI hook to query whether (trunc (srl X, C)) is natively supported?

There is a isTypeDesirableForOp check after type legalization - is that not working?

RISC-V target doesn't override this method. I think we need a new hook like isNarrowingProfitableForOp.

@RKSimon
Copy link
Collaborator

RKSimon commented Jun 4, 2024

Why can't it use isTypeDesirableForOp ?

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jun 4, 2024

Why can't it use isTypeDesirableForOp ?

ISD::SRL is legal for both v8i32 and v8i16 on RISCV with rvv.

@RKSimon
Copy link
Collaborator

RKSimon commented Jun 4, 2024

What about if we updated isTypeDesirableForOp to take NewVT + OldVT arguments?

I always worry when we add yet another TLI hook :(

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Jun 4, 2024

What about if we updated isTypeDesirableForOp to take NewVT + OldVT arguments?

It sounds reasonable.

@Fros1er
Copy link
Contributor

Fros1er commented Jun 13, 2024

May I get assigned to this issue? I'm guided by @dtcxzyw.

aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this issue Jul 14, 2024
…lvm#95563)

Added a RISCV overload of `isTruncateFree` to fix the break of vnsrl described in issue llvm#94265.

Fixes llvm#94265
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants