Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Make InitUndef handle undef operand #65755

Merged
merged 6 commits into from
Dec 1, 2023
Merged

Conversation

BeMg
Copy link
Contributor

@BeMg BeMg commented Sep 8, 2023

#65704.


When operand be mark as undef, the InitUndef will miss this case.

This patch

  1. support the undef operand case
  2. Merge the code for handleImplictDef and fixupUndefOperandOnly to reduce the searching logic.

@BeMg BeMg requested a review from a team as a code owner September 8, 2023 13:22
@lukel97
Copy link
Contributor

lukel97 commented Sep 8, 2023

Hi, if it's of any use to you I reduced the test case last night with llvm-reduce:

target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8>, i64 immarg) #0

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8>, <16 x i8>, i64 immarg) #0

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
declare <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8>, <vscale x 8 x i8>, i64, i64, i64 immarg) #1

define void @foo(<vscale x 8 x i8> %0) #2 {
  %2 = tail call <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8> undef, <16 x i8> undef, i64 0)
  %3 = tail call <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8> undef, <16 x i8> poison, i64 0)
  br label %4

4:                                                ; preds = %4, %1
  %5 = tail call <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8> zeroinitializer, <vscale x 8 x i8> %2, i64 0, i64 0, i64 0)
  %6 = tail call <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8> %5, i64 0)
  %7 = bitcast <16 x i8> %6 to <2 x i64>
  %8 = extractelement <2 x i64> %7, i64 0
  %9 = insertvalue [2 x i64] zeroinitializer, i64 %8, 0
  %10 = tail call <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8> %0, <vscale x 8 x i8> %3, i64 0, i64 0, i64 0)
  %11 = tail call <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8> %10, i64 0)
  %12 = bitcast <16 x i8> %11 to <2 x i64>
  %13 = extractelement <2 x i64> %12, i64 0
  %14 = insertvalue [2 x i64] zeroinitializer, i64 %13, 0
  %15 = tail call fastcc [2 x i64] null([2 x i64] %9, [2 x i64] %14, [2 x i64] zeroinitializer)
  br label %4
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind willreturn memory(none) }
attributes #2 = { "target-features"="+64bit,+a,+d,+f,+m,+relax,+v,+zicsr,+zifencei,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b,-c,-e,-h,-save-restore,-svinval,-svnapot,-svpbmt,-unaligned-scalar-mem,-unaligned-vector-mem,-xcvalu,-xcvbi,-xcvbitmanip,-xcvmac,-xcvsimd,-xsfcie,-xsfvcp,-xventanacondops,-zawrs,-zba,-zbb,-zbc,-zbkb,-zbkc,-zbkx,-zbs,-zca,-zcb,-zcd,-zce,-zcf,-zcmp,-zcmt,-zdinx,-zfh,-zfhmin,-zfinx,-zhinx,-zhinxmin,-zicbom,-zicbop,-zicboz,-zicntr,-zihintntl,-zihintpause,-zihpm,-zk,-zkn,-zknd,-zkne,-zknh,-zkr,-zks,-zksed,-zksh,-zkt,-zmmul,-zvfh,-zvl1024b,-zvl16384b,-zvl2048b,-zvl256b,-zvl32768b,-zvl4096b,-zvl512b,-zvl65536b,-zvl8192b" }

@BeMg
Copy link
Contributor Author

BeMg commented Sep 11, 2023

Hi, if it's of any use to you I reduced the test case last night with llvm-reduce:

target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
target triple = "riscv64-unknown-linux-gnu"

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8>, i64 immarg) #0

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8>, <16 x i8>, i64 immarg) #0

; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
declare <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8>, <vscale x 8 x i8>, i64, i64, i64 immarg) #1

define void @foo(<vscale x 8 x i8> %0) #2 {
  %2 = tail call <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8> undef, <16 x i8> undef, i64 0)
  %3 = tail call <vscale x 8 x i8> @llvm.vector.insert.nxv8i8.v16i8(<vscale x 8 x i8> undef, <16 x i8> poison, i64 0)
  br label %4

4:                                                ; preds = %4, %1
  %5 = tail call <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8> zeroinitializer, <vscale x 8 x i8> %2, i64 0, i64 0, i64 0)
  %6 = tail call <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8> %5, i64 0)
  %7 = bitcast <16 x i8> %6 to <2 x i64>
  %8 = extractelement <2 x i64> %7, i64 0
  %9 = insertvalue [2 x i64] zeroinitializer, i64 %8, 0
  %10 = tail call <vscale x 8 x i8> @llvm.riscv.vslideup.nxv8i8.i64(<vscale x 8 x i8> %0, <vscale x 8 x i8> %3, i64 0, i64 0, i64 0)
  %11 = tail call <16 x i8> @llvm.vector.extract.v16i8.nxv8i8(<vscale x 8 x i8> %10, i64 0)
  %12 = bitcast <16 x i8> %11 to <2 x i64>
  %13 = extractelement <2 x i64> %12, i64 0
  %14 = insertvalue [2 x i64] zeroinitializer, i64 %13, 0
  %15 = tail call fastcc [2 x i64] null([2 x i64] %9, [2 x i64] %14, [2 x i64] zeroinitializer)
  br label %4
}

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind willreturn memory(none) }
attributes #2 = { "target-features"="+64bit,+a,+d,+f,+m,+relax,+v,+zicsr,+zifencei,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b,-c,-e,-h,-save-restore,-svinval,-svnapot,-svpbmt,-unaligned-scalar-mem,-unaligned-vector-mem,-xcvalu,-xcvbi,-xcvbitmanip,-xcvmac,-xcvsimd,-xsfcie,-xsfvcp,-xventanacondops,-zawrs,-zba,-zbb,-zbc,-zbkb,-zbkc,-zbkx,-zbs,-zca,-zcb,-zcd,-zce,-zcf,-zcmp,-zcmt,-zdinx,-zfh,-zfhmin,-zfinx,-zhinx,-zhinxmin,-zicbom,-zicbop,-zicboz,-zicntr,-zihintntl,-zihintpause,-zihpm,-zk,-zkn,-zknd,-zkne,-zknh,-zkr,-zks,-zksed,-zksh,-zkt,-zmmul,-zvfh,-zvl1024b,-zvl16384b,-zvl2048b,-zvl256b,-zvl32768b,-zvl4096b,-zvl512b,-zvl65536b,-zvl8192b" }

Thanks, very helpful!

@BeMg
Copy link
Contributor Author

BeMg commented Sep 11, 2023

Use the reduce version testcase

continue;
if (!isVectorRegClass(UseMO.getReg()))
continue;
if (UseMO.getReg() == 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already checked it is virtual. It can't be virtual and 0 can it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. $noreg is not belong to virtual register.

Remove the reg == 0 .

; RUN: llc -mtriple=riscv64 -mattr=+v,+f,+m,+zfh,+zvfh \
; RUN: < %s | FileCheck %s

; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop Function Attrs comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

; Function Attrs: mustprogress nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare <vscale x 2 x i32> @llvm.vector.insert.nxv2i32.v4i32(<vscale x 2 x i32>, <4 x i32>, i64 immarg) #3

define void @foo(<vscale x 8 x i8> %0) #2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2 doesn't exist. You can remove it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@BeMg BeMg requested a review from topperc September 18, 2023 03:52
@BeMg
Copy link
Contributor Author

BeMg commented Oct 11, 2023

ping

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Bug report from llvm#65704.

The InitUndef pass miss the pattern that operand is not implict_def but undef directly.

This patch support this pattern in InitUndef pass.
They share the same pattern of replacing the Operand with PseudoRVVInitUndef.

This patch

1. reduces the logic for finding MachineInstr that needs to be fixed.
2. emit PseudoRVVInitUndef just before the user's operand to reduce register pressure (shorter LiveInterval).
@BeMg BeMg merged commit 5ecb37b into llvm:main Dec 1, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants