[SelectionDAG][RISCV] Mask constants to narrow size in TargetLowering::expandUnalignedStore. #66567
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-risc-v
Changes
If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.
On the first split we create two i32 trunc stores and a srl to shift the high part down. The srl gets constant folded and produces a new i32 constant, but the truncstore for the low store still uses the original constant.
This original constant then gets converted to a constant pool before we revisit the stores to further split them. The constant pool prevents further constant folding of the additional srls.
After legalization is done, we run DAGCombiner and get some constant folding of srl via computeKnownBits, which can peek through the constant pool load. This can create new constants that also need a constant pool.
Full diff: https://github.com/llvm/llvm-project/pull/66567.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index bd1940994a87f0f..1a7799816711c03 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -9558,6 +9558,14 @@ SDValue TargetLowering::expandUnalignedStore(StoreSDNode *ST,
   SDValue ShiftAmount = DAG.getConstant(
       NumBits, dl, getShiftAmountTy(Val.getValueType(), DAG.getDataLayout()));
   SDValue Lo = Val;
+  // If Val is a constant, replace the upper bits with 0. The SRL will constant
+  // fold and not use the upper bits. A smaller constant may be easier to
+  // materialize.
+  if (auto *C = dyn_cast<ConstantSDNode>(Lo); C && !C->isOpaque())
+    Lo = DAG.getNode(
+        ISD::AND, dl, VT, Lo,
+        DAG.getConstant(APInt::getLowBitsSet(VT.getSizeInBits(), NumBits), dl,
+                        VT));
   SDValue Hi = DAG.getNode(ISD::SRL, dl, VT, Val, ShiftAmount);
   // Store the two parts
diff --git a/llvm/test/CodeGen/RISCV/unaligned-load-store.ll b/llvm/test/CodeGen/RISCV/unaligned-load-store.ll
index 429f0543b41b3db..ce0d8fedbfb88f2 100644
--- a/llvm/test/CodeGen/RISCV/unaligned-load-store.ll
+++ b/llvm/test/CodeGen/RISCV/unaligned-load-store.ll
@@ -415,3 +415,44 @@ define void @merge_stores_i32_i64(ptr %p) {
   store i32 0, ptr %p2
   ret void
 }
+
+define void @store_large_constant(ptr %x) {
+; SLOW-LABEL: store_large_constant:
+; SLOW: # %bb.0:
+; SLOW-NEXT: li a1, 254
+; SLOW-NEXT: sb a1, 7(a0)
+; SLOW-NEXT: li a1, 220
+; SLOW-NEXT: sb a1, 6(a0)
+; SLOW-NEXT: li a1, 186
+; SLOW-NEXT: sb a1, 5(a0)
+; SLOW-NEXT: li a1, 152
+; SLOW-NEXT: sb a1, 4(a0)
+; SLOW-NEXT: li a1, 118
+; SLOW-NEXT: sb a1, 3(a0)
+; SLOW-NEXT: li a1, 84
+; SLOW-NEXT: sb a1, 2(a0)
+; SLOW-NEXT: li a1, 50
+; SLOW-NEXT: sb a1, 1(a0)
+; SLOW-NEXT: li a1, 16
+; SLOW-NEXT: sb a1, 0(a0)
+; SLOW-NEXT: ret
+;
+; RV32I-FAST-LABEL: store_large_constant:
+; RV32I-FAST: # %bb.0:
+; RV32I-FAST-NEXT: lui a1, 1043916
+; RV32I-FAST-NEXT: addi a1, a1, -1384
+; RV32I-FAST-NEXT: sw a1, 4(a0)
+; RV32I-FAST-NEXT: lui a1, 484675
+; RV32I-FAST-NEXT: addi a1, a1, 528
+; RV32I-FAST-NEXT: sw a1, 0(a0)
+; RV32I-FAST-NEXT: ret
+;
+; RV64I-FAST-LABEL: store_large_constant:
+; RV64I-FAST: # %bb.0:
+; RV64I-FAST-NEXT: lui a1, %hi(.LCPI16_0)
+; RV64I-FAST-NEXT: ld a1, %lo(.LCPI16_0)(a1)
+; RV64I-FAST-NEXT: sd a1, 0(a0)
+; RV64I-FAST-NEXT: ret
+  store i64 18364758544493064720, ptr %x, align 1
+  ret void
+}
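As a sanity check on the CHECK lines in the new test, here is a small standalone C++ sketch (not part of the patch) that decodes the stored constant. It assumes the usual RV32 semantics of `lui` followed by `addi`; the helper names `byteAt` and `luiAddi` and the constant name `kStoredValue` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// The IR literal 18364758544493064720 written in hex for readability.
constexpr std::uint64_t kStoredValue = 0xFEDCBA9876543210ULL;
static_assert(kStoredValue == 18364758544493064720ULL,
              "hex and decimal spellings must agree");

// Hypothetical helper: the byte of Value at little-endian offset Offset.
constexpr std::uint8_t byteAt(std::uint64_t Value, unsigned Offset) {
  return static_cast<std::uint8_t>(Value >> (8 * Offset));
}

// Hypothetical helper: the 32-bit value left in a register by
// `lui rd, Hi20` followed by `addi rd, rd, Lo12` (Lo12 is sign-extended).
constexpr std::uint32_t luiAddi(std::uint32_t Hi20, std::int32_t Lo12) {
  return (Hi20 << 12) + static_cast<std::uint32_t>(Lo12);
}

// SLOW: one `li` + `sb` per byte, 16 at offset 0 up to 254 at offset 7.
static_assert(byteAt(kStoredValue, 0) == 16, "sb at 0(a0)");
static_assert(byteAt(kStoredValue, 7) == 254, "sb at 7(a0)");

// RV32I-FAST: two 32-bit halves, each built with lui + addi.
static_assert(luiAddi(1043916, -1384) == 0xFEDCBA98u, "sw at 4(a0)");
static_assert(luiAddi(484675, 528) == 0x76543210u, "sw at 0(a0)");
```

With the masking in place, each byte store in the SLOW output is fed by a plain 8-bit immediate, and each word store in the RV32I-FAST output by one half of the original constant.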
LGTM
LGTM
If the SRL for Hi constant folds, but we don't remove those bits from the Lo, we can end up with strange constant folding. I've only seen this with constants being lowered to constant pools during lowering on RISC-V.
On the first split we create two i32 trunc stores and a srl to shift the high part down. The srl gets constant folded and produces a new i32 constant, but the truncstore for the low store still uses the original constant.
This original constant then gets converted to a constant pool before we revisit the stores to further split them. The constant pool prevents further constant folding of the additional srls.
After legalization is done, we run DAGCombiner and get some constant folding of srl via computeKnownBits which can peek through the constant pool load. This can create new constants that also need a constant pool.
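To make the effect of the mask concrete, here is a rough standalone C++ model of the repeated splitting. It is only a sketch under simplifying assumptions: it uses plain 64-bit integers instead of SelectionDAG nodes, it ignores constant pools and DAGCombiner entirely, and the function names `collectByteOperands` and `byteOperands` are hypothetical. It records the value operand that would feed each final byte-wide truncating store, with and without masking Lo.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Recursively split a truncating store of Width bits (a power of two, at
// most 64) into byte stores, recording the value operand of each byte store.
// MaskLo models the AND that the patch adds when the value is a constant.
inline void collectByteOperands(std::uint64_t Val, unsigned Width, bool MaskLo,
                                std::vector<std::uint64_t> &Out) {
  if (Width == 8) {
    Out.push_back(Val);
    return;
  }
  const unsigned NumBits = Width / 2;
  const std::uint64_t LowMask = (std::uint64_t{1} << NumBits) - 1;
  const std::uint64_t Lo = MaskLo ? (Val & LowMask) : Val;
  const std::uint64_t Hi = Val >> NumBits; // the SRL, already constant folded
  collectByteOperands(Lo, NumBits, MaskLo, Out); // lower addresses first
  collectByteOperands(Hi, NumBits, MaskLo, Out);
}

// Convenience wrapper for a 64-bit store.
inline std::vector<std::uint64_t> byteOperands(std::uint64_t Val, bool MaskLo) {
  std::vector<std::uint64_t> Out;
  collectByteOperands(Val, 64, MaskLo, Out);
  return Out;
}
```

In this model, `byteOperands(0xFEDCBA9876543210ULL, true)` yields only values below 256, while the unmasked variant keeps the full original constant as the operand of the lowest byte store. The low 8 bits agree in both cases, so the stored bytes are the same; the difference is only in how wide the constants are that have to be materialized.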