-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[WebAssembly] Fix lowering of (truncating) stores from addrspace(1) globals #163136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[WebAssembly] Fix lowering of (truncating) stores from addrspace(1) globals #163136
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-backend-webassembly Author: Demetrius Kanios (QuantumSegfault) ChangesVery similar to #155937. Implements custom truncating store logic to correctly handle such to The truncating stores work such that they preserve the upper When storing into a global, bits beyond what are declared in the LLVM IR global are considered undefined and may be clobbered. For instance, an i8 store into a Full diff: https://github.com/llvm/llvm-project/pull/163136.diff 2 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 163bf9ba5b089..59bac54f210ab 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -111,6 +111,12 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
}
}
+ setTruncStoreAction(MVT::i32, MVT::i8, Custom);
+ setTruncStoreAction(MVT::i32, MVT::i16, Custom);
+ setTruncStoreAction(MVT::i64, MVT::i8, Custom);
+ setTruncStoreAction(MVT::i64, MVT::i16, Custom);
+ setTruncStoreAction(MVT::i64, MVT::i32, Custom);
+
setOperationAction(ISD::GlobalAddress, MVTPtr, Custom);
setOperationAction(ISD::GlobalTLSAddress, MVTPtr, Custom);
setOperationAction(ISD::ExternalSymbol, MVTPtr, Custom);
@@ -1683,6 +1689,11 @@ static bool IsWebAssemblyGlobal(SDValue Op) {
if (const GlobalAddressSDNode *GA = dyn_cast<GlobalAddressSDNode>(Op))
return WebAssembly::isWasmVarAddressSpace(GA->getAddressSpace());
+ if (Op->getOpcode() == WebAssemblyISD::Wrapper)
+ if (const GlobalAddressSDNode *GA =
+ dyn_cast<GlobalAddressSDNode>(Op->getOperand(0)))
+ return WebAssembly::isWasmVarAddressSpace(GA->getAddressSpace());
+
return false;
}
@@ -1704,15 +1715,84 @@ SDValue WebAssemblyTargetLowering::LowerStore(SDValue Op,
const SDValue &Base = SN->getBasePtr();
const SDValue &Offset = SN->getOffset();
+ const EVT ValueType = Value.getValueType();
+ const EVT MemType = SN->getMemoryVT();
+
if (IsWebAssemblyGlobal(Base)) {
if (!Offset->isUndef())
report_fatal_error("unexpected offset when storing to webassembly global",
false);
- SDVTList Tys = DAG.getVTList(MVT::Other);
- SDValue Ops[] = {SN->getChain(), Value, Base};
- return DAG.getMemIntrinsicNode(WebAssemblyISD::GLOBAL_SET, DL, Tys, Ops,
- SN->getMemoryVT(), SN->getMemOperand());
+ if (!ValueType.isInteger()) {
+ SDVTList Tys = DAG.getVTList(MVT::Other);
+ SDValue Ops[] = {SN->getChain(), Value, Base};
+ return DAG.getMemIntrinsicNode(WebAssemblyISD::GLOBAL_SET, DL, Tys, Ops,
+ SN->getMemoryVT(), SN->getMemOperand());
+ }
+
+ EVT GT = MVT::INVALID_SIMPLE_VALUE_TYPE;
+
+ if (auto *GA = dyn_cast<GlobalAddressSDNode>(
+ Base->getOpcode() == WebAssemblyISD::Wrapper ? Base->getOperand(0)
+ : Base))
+ GT = EVT::getEVT(GA->getGlobal()->getValueType());
+
+ if (GT != MVT::i8 && GT != MVT::i16 && GT != MVT::i32 && GT != MVT::i64)
+ report_fatal_error("encountered unexpected global type for Base when "
+ "storing into webassembly global",
+ false);
+
+ EVT PromotedGT = getTypeToTransformTo(*DAG.getContext(), GT);
+
+ if (PromotedGT == ValueType && ValueType == MemType) {
+ SDVTList Tys = DAG.getVTList(MVT::Other);
+ SDValue Ops[] = {SN->getChain(), Value, Base};
+ return DAG.getMemIntrinsicNode(WebAssemblyISD::GLOBAL_SET, DL, Tys, Ops,
+ SN->getMemoryVT(), SN->getMemOperand());
+ }
+
+ SDValue ModifiedNewValue = Value;
+ if (MemType.bitsLT(ValueType)) {
+ APInt Mask = APInt::getLowBitsSet(ValueType.getFixedSizeInBits(),
+ MemType.getFixedSizeInBits());
+ ModifiedNewValue = DAG.getNode(ISD::AND, DL, ValueType, ModifiedNewValue,
+ DAG.getConstant(Mask, DL, ValueType));
+ }
+ if (ValueType != PromotedGT) {
+ ModifiedNewValue = DAG.getAnyExtOrTrunc(ModifiedNewValue, DL, PromotedGT);
+ }
+
+ SDValue Chain = SN->getChain();
+
+ SDValue CombinedValue = ModifiedNewValue;
+ if (GT.bitsGT(MemType)) {
+ MachineMemOperand &StoreMMO = *SN->getMemOperand();
+ MachineMemOperand *NewMMO = DAG.getMachineFunction().getMachineMemOperand(
+ StoreMMO.getPointerInfo(),
+ (StoreMMO.getFlags() & ~MachineMemOperand::Flags::MOStore) |
+ MachineMemOperand::Flags::MOLoad,
+ LLT(PromotedGT.getSimpleVT()), StoreMMO.getBaseAlign(),
+ StoreMMO.getAAInfo(), StoreMMO.getRanges(), StoreMMO.getSyncScopeID(),
+ StoreMMO.getSuccessOrdering(), StoreMMO.getFailureOrdering());
+
+ SDValue OrigValue = DAG.getLoad(PromotedGT, DL, Chain, Base, NewMMO);
+ Chain = OrigValue.getValue(1);
+ APInt Mask = APInt::getHighBitsSet(GT.getFixedSizeInBits(),
+ GT.getFixedSizeInBits() -
+ MemType.getFixedSizeInBits());
+ SDValue ModifiedOrigValue =
+ DAG.getNode(ISD::AND, DL, PromotedGT, OrigValue,
+ DAG.getConstant(Mask, DL, PromotedGT));
+ CombinedValue = DAG.getNode(ISD::OR, DL, PromotedGT, ModifiedNewValue,
+ ModifiedOrigValue);
+ }
+
+ MachineMemOperand *MMO = SN->getMemOperand();
+ MMO->setType(LLT(PromotedGT.getSimpleVT()));
+
+ SDValue NewStore = DAG.getStore(Chain, DL, CombinedValue, Base, MMO);
+
+ return NewStore;
}
if (std::optional<unsigned> Local = IsWebAssemblyLocal(Base, DAG)) {
diff --git a/llvm/test/CodeGen/WebAssembly/lower-store-wasm-global.ll b/llvm/test/CodeGen/WebAssembly/lower-store-wasm-global.ll
new file mode 100644
index 0000000000000..c03559efc032f
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/lower-store-wasm-global.ll
@@ -0,0 +1,167 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s | FileCheck %s
+
+; Test various stores from WASM (address space 1) globals lower as intended
+
+target triple = "wasm32-unknown-unknown"
+
+
+@globalI8 = local_unnamed_addr addrspace(1) global i8 undef
+@globalI32 = local_unnamed_addr addrspace(1) global i32 undef
+@globalI64 = local_unnamed_addr addrspace(1) global i64 undef
+
+define void @store_i8_to_i32(i8 %val) {
+; CHECK-LABEL: store_i8_to_i32:
+; CHECK: .functype store_i8_to_i32 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.const 255
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: global.get globalI32
+; CHECK-NEXT: i32.const -256
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: i32.or
+; CHECK-NEXT: global.set globalI32
+; CHECK-NEXT: # fallthrough-return
+ store i8 %val, ptr addrspace(1) @globalI32
+ ret void
+}
+
+define void @store_i16_to_i32(i16 %val) {
+; CHECK-LABEL: store_i16_to_i32:
+; CHECK: .functype store_i16_to_i32 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.const 65535
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: global.get globalI32
+; CHECK-NEXT: i32.const -65536
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: i32.or
+; CHECK-NEXT: global.set globalI32
+; CHECK-NEXT: # fallthrough-return
+ store i16 %val, ptr addrspace(1) @globalI32
+ ret void
+}
+
+define void @store_i32_to_i32(i32 %val) {
+; CHECK-LABEL: store_i32_to_i32:
+; CHECK: .functype store_i32_to_i32 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: global.set globalI32
+; CHECK-NEXT: # fallthrough-return
+ store i32 %val, ptr addrspace(1) @globalI32
+ ret void
+}
+
+define void @store_i64_to_i32(i64 %val) {
+; CHECK-LABEL: store_i64_to_i32:
+; CHECK: .functype store_i64_to_i32 (i64) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.wrap_i64
+; CHECK-NEXT: global.set globalI32
+; CHECK-NEXT: # fallthrough-return
+ store i64 %val, ptr addrspace(1) @globalI32
+ ret void
+}
+
+define void @store_i8_to_i64(i8 %val) {
+; CHECK-LABEL: store_i8_to_i64:
+; CHECK: .functype store_i8_to_i64 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.const 255
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: i64.extend_i32_u
+; CHECK-NEXT: global.get globalI64
+; CHECK-NEXT: i64.const -256
+; CHECK-NEXT: i64.and
+; CHECK-NEXT: i64.or
+; CHECK-NEXT: global.set globalI64
+; CHECK-NEXT: # fallthrough-return
+ store i8 %val, ptr addrspace(1) @globalI64
+ ret void
+}
+
+define void @store_i16_to_i64(i16 %val) {
+; CHECK-LABEL: store_i16_to_i64:
+; CHECK: .functype store_i16_to_i64 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.const 65535
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: i64.extend_i32_u
+; CHECK-NEXT: global.get globalI64
+; CHECK-NEXT: i64.const -65536
+; CHECK-NEXT: i64.and
+; CHECK-NEXT: i64.or
+; CHECK-NEXT: global.set globalI64
+; CHECK-NEXT: # fallthrough-return
+ store i16 %val, ptr addrspace(1) @globalI64
+ ret void
+}
+
+define void @store_i32_to_i64(i32 %val) {
+; CHECK-LABEL: store_i32_to_i64:
+; CHECK: .functype store_i32_to_i64 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i64.extend_i32_u
+; CHECK-NEXT: global.get globalI64
+; CHECK-NEXT: i64.const -4294967296
+; CHECK-NEXT: i64.and
+; CHECK-NEXT: i64.or
+; CHECK-NEXT: global.set globalI64
+; CHECK-NEXT: # fallthrough-return
+ store i32 %val, ptr addrspace(1) @globalI64
+ ret void
+}
+
+define void @store_i64_to_i64(i64 %val) {
+; CHECK-LABEL: store_i64_to_i64:
+; CHECK: .functype store_i64_to_i64 (i64) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: global.set globalI64
+; CHECK-NEXT: # fallthrough-return
+ store i64 %val, ptr addrspace(1) @globalI64
+ ret void
+}
+
+define void @store_i8_to_i8(i8 %val) {
+; CHECK-LABEL: store_i8_to_i8:
+; CHECK: .functype store_i8_to_i8 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.const 255
+; CHECK-NEXT: i32.and
+; CHECK-NEXT: global.set globalI8
+; CHECK-NEXT: # fallthrough-return
+ store i8 %val, ptr addrspace(1) @globalI8
+ ret void
+}
+
+define void @store_i32_to_i8(i32 %val) {
+; CHECK-LABEL: store_i32_to_i8:
+; CHECK: .functype store_i32_to_i8 (i32) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: global.set globalI8
+; CHECK-NEXT: # fallthrough-return
+ store i32 %val, ptr addrspace(1) @globalI8
+ ret void
+}
+
+define void @store_i64_to_i8(i64 %val) {
+; CHECK-LABEL: store_i64_to_i8:
+; CHECK: .functype store_i64_to_i8 (i64) -> ()
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: i32.wrap_i64
+; CHECK-NEXT: global.set globalI8
+; CHECK-NEXT: # fallthrough-return
+ store i64 %val, ptr addrspace(1) @globalI8
+ ret void
+}
|
Very similar to #155937.
Implements custom truncating store logic to correctly handle such to
addrspace(1)
/WASM globals. Includes inserting a truncate when trying to store an i64 into an i32 global.The truncating stores work such that they preserve the upper
N - M
bits of aM
-bit store into anN
-bit global (that is, an i8 truncating store loads the global, masks the upper 24-bits, then combines that with the 8-bit value, storing the 32-bit result into the global).When storing into a global, bits beyond what are declared in the LLVM IR global are considered undefined and may be clobbered. For instance, an i8 store into a
addrspace(1) global i8
will make no attempt to preserve the upper 24 bits of the result WASM i32 global, and simply store the i8 (extended to i32) value into the global. Similarly, storing an i64 into an i32 global simply discards the extraneous bits of the input.