[NVPTX] Lower v2f16 and v2bf16 stores as 32-bit scalars.
This avoids the unnecessary vector splitting that the vectorized store
instruction required.

Differential Revision: https://reviews.llvm.org/D152593
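
A host-side sketch of why the 32-bit scalar store is equivalent: two 16-bit lanes written element by element leave the same bytes in memory as one 32-bit write of the register that already holds both lanes. This models only the memory layout, not NVPTX code generation. The function names are hypothetical, and `std::uint16_t` stands in for the raw bit pattern of a `half` value.

```cpp
#include <cstdint>
#include <cstring>

// Write two 16-bit lanes one element at a time, lane 0 at the lower address.
// This mirrors what a two-element vector store does to memory.
inline void storeAsTwoLanes(unsigned char *dst, std::uint16_t lane0,
                            std::uint16_t lane1) {
  std::memcpy(dst, &lane0, sizeof(lane0));
  std::memcpy(dst + sizeof(lane0), &lane1, sizeof(lane1));
}

// Write both lanes with a single 32-bit scalar store.
inline void storeAsScalar(unsigned char *dst, std::uint32_t packed) {
  std::memcpy(dst, &packed, sizeof(packed));
}

// Build the packed 32-bit value by copying bytes, so the sketch makes no
// assumption about host endianness.
inline std::uint32_t packLanes(std::uint16_t lane0, std::uint16_t lane1) {
  const std::uint16_t lanes[2] = {lane0, lane1};
  std::uint32_t packed = 0;
  std::memcpy(&packed, lanes, sizeof(packed));
  return packed;
}

// True when both store strategies leave identical bytes in memory.
inline bool sameBytes(std::uint16_t lane0, std::uint16_t lane1) {
  unsigned char viaLanes[4] = {0, 0, 0, 0};
  unsigned char viaScalar[4] = {0, 0, 0, 0};
  storeAsTwoLanes(viaLanes, lane0, lane1);
  storeAsScalar(viaScalar, packLanes(lane0, lane1));
  return std::memcmp(viaLanes, viaScalar, sizeof(viaLanes)) == 0;
}
```

For example, `sameBytes(0x3C00, 0x4000)` compares the two strategies for the half-precision bit patterns of 1.0 and 2.0. Because the value is already packed in a 32-bit register after the load, unpacking it into two 16-bit registers just to store them is extra work.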
Artem-B committed Jun 23, 2023
1 parent a67208e commit 60941f1
Showing 2 changed files with 5 additions and 2 deletions.
4 changes: 4 additions & 0 deletions llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -2465,6 +2465,10 @@ SDValue NVPTXTargetLowering::LowerSTORE(SDValue Op, SelectionDAG &DAG) const {
                                       VT, *Store->getMemOperand()))
     return expandUnalignedStore(Store, DAG);

+  // v2f16 and v2bf16 don't need special handling.
+  if (VT == MVT::v2f16 || VT == MVT::v2bf16)
+    return SDValue();
+
   if (VT.isVector())
     return LowerSTOREVector(Op, DAG);
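
The placement of the new check matters: v2f16 and v2bf16 are themselves vector types, so the early return has to come before the `VT.isVector()` test or they would still reach `LowerSTOREVector`. A toy model of that ordering follows. The enums and `chooseStorePath` are hypothetical stand-ins rather than LLVM APIs, and the final fallthrough is an assumption, since the rest of the function is not shown in this hunk.

```cpp
// Hypothetical stand-ins for a few value types; this is not LLVM's MVT.
enum class ValueType { f32, v2f16, v2bf16, v4f32 };

// Which path a store takes in this toy model.
enum class StorePath { Default, CustomVector };

// In this model every type except f32 is a vector type.
constexpr bool isVector(ValueType vt) { return vt != ValueType::f32; }

constexpr StorePath chooseStorePath(ValueType vt) {
  // The two packed 16-bit pair types are checked first, so they never
  // reach the generic vector path below.
  if (vt == ValueType::v2f16 || vt == ValueType::v2bf16)
    return StorePath::Default;

  // All remaining vector types take the custom vector-store path.
  if (isVector(vt))
    return StorePath::CustomVector;

  // Assumption: scalars fall through to the default path.
  return StorePath::Default;
}
```

With the first check removed, `chooseStorePath(ValueType::v2f16)` would return `StorePath::CustomVector`, which corresponds to the behavior before this commit.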

3 changes: 1 addition & 2 deletions llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
@@ -276,8 +276,7 @@ define <2 x half> @test_frem(<2 x half> %a, <2 x half> %b) #0 {
 ; CHECK-DAG: ld.param.u64 %[[A:rd[0-9]+]], [test_ldst_v2f16_param_0];
 ; CHECK-DAG: ld.param.u64 %[[B:rd[0-9]+]], [test_ldst_v2f16_param_1];
 ; CHECK-DAG: ld.b32 [[E:%r[0-9]+]], [%[[A]]]
-; CHECK: mov.b32 {[[E0:%rs[0-9]+]], [[E1:%rs[0-9]+]]}, [[E]];
-; CHECK-DAG: st.v2.b16 [%[[B]]], {[[E0]], [[E1]]};
+; CHECK-DAG: st.b32 [%[[B]]], [[E]];
 ; CHECK: ret;
 define void @test_ldst_v2f16(ptr %a, ptr %b) {
   %t1 = load <2 x half>, ptr %a