@QuantumSegfault
Contributor

@QuantumSegfault QuantumSegfault commented Nov 6, 2025

Adds lowering of addrspacecast [0 -> 20] to allow easy conversion of function pointers to Wasm funcref

When given a constant function pointer, it lowers to a direct ref.func. Otherwise it lowers to a table.get from __indirect_function_table using the provided pointer as the index.
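The two cases described above can be sketched as a small standalone model. This is only an illustration, not the LLVM implementation: the types `ConstantFunction`, `RuntimePointer`, and `CastSource` and the function `lowerCastToFuncref` are hypothetical names, and the returned strings simply mirror the instruction sequences the test file checks for.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-ins for the two operand kinds the lowering distinguishes.
struct ConstantFunction { std::string name; };    // e.g. `ptr @foo`
struct RuntimePointer { std::string producer; };  // e.g. a parameter or a non-function global

using CastSource = std::variant<ConstantFunction, RuntimePointer>;

// Returns the Wasm instructions that the cast becomes in this toy model.
std::string lowerCastToFuncref(const CastSource &src) {
  // A constant function pointer becomes a direct ref.func.
  if (const auto *fn = std::get_if<ConstantFunction>(&src)) {
    assert(!fn->name.empty() && "function symbol must be named");
    return "ref.func " + fn->name;
  }
  // Anything else is treated as an index into the indirect function table.
  const auto &ptr = std::get<RuntimePointer>(src);
  return ptr.producer + "\ntable.get __indirect_function_table";
}
```

In this model, `lowerCastToFuncref(ConstantFunction{"foo"})` yields `ref.func foo`, while a `RuntimePointer{"local.get 0"}` yields the `local.get 0` / `table.get __indirect_function_table` pair.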

@llvmbot
Member

llvmbot commented Nov 6, 2025

@llvm/pr-subscribers-backend-webassembly

Author: Demetrius Kanios (QuantumSegfault)

Changes

Adds lowering of addrspacecast [0 -> 20] to allow easy conversion of function pointers to WASM funcref

When given a constant function pointer, it lowers to a direct ref.func. Otherwise it lowers to a table.get from __indirect_function_table using the provided pointer as the index.


Full diff: https://github.com/llvm/llvm-project/pull/166820.diff

4 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+58)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h (+1)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td (+6-1)
  • (added) llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll (+55)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index af322982d5355..782b878350b42 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -409,6 +409,10 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
   setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);
   setOperationAction(ISD::INTRINSIC_VOID, MVT::Other, Custom);
 
+  // Allow converting function ptrs in address space 0 to WASM funcref (address
+  // space 20)
+  setOperationAction(ISD::ADDRSPACECAST, MVT::funcref, Custom);
+
   setMaxAtomicSizeInBitsSupported(64);
 
   // Always convert switches to br_tables unless there is only one case, which
@@ -1733,6 +1737,8 @@ SDValue WebAssemblyTargetLowering::LowerOperation(SDValue Op,
     return LowerMUL_LOHI(Op, DAG);
   case ISD::UADDO:
     return LowerUADDO(Op, DAG);
+  case ISD::ADDRSPACECAST:
+    return LowerADDRSPACECAST(Op, DAG);
   }
 }
 
@@ -1876,6 +1882,58 @@ SDValue WebAssemblyTargetLowering::LowerUADDO(SDValue Op,
   return DAG.getMergeValues(Ops, DL);
 }
 
+SDValue WebAssemblyTargetLowering::LowerADDRSPACECAST(SDValue Op,
+                                                      SelectionDAG &DAG) const {
+  SDLoc DL(Op);
+
+  AddrSpaceCastSDNode *ACN = cast<AddrSpaceCastSDNode>(Op.getNode());
+
+  if (ACN->getSrcAddressSpace() !=
+          WebAssembly::WasmAddressSpace::WASM_ADDRESS_SPACE_DEFAULT ||
+      ACN->getDestAddressSpace() !=
+          WebAssembly::WasmAddressSpace::WASM_ADDRESS_SPACE_FUNCREF)
+    return Op;
+
+  if (ACN->getValueType(0) != MVT::funcref) {
+    reportFatalInternalError("Cannot addrspacecast to funcref addrspace with "
+                             "results other than MVT::funcref");
+  }
+
+  SDValue Src = ACN->getOperand(0);
+
+  // Lower addrspacecasts of direct/constant function ptrs to ref.func
+  if (auto *GA = dyn_cast<GlobalAddressSDNode>(
+          Src->getOpcode() == WebAssemblyISD::Wrapper ? Src->getOperand(0)
+                                                      : Src)) {
+    auto *GV = GA->getGlobal();
+
+    if (const Function *F = dyn_cast<Function>(GV)) {
+      SDValue FnAddress = DAG.getTargetGlobalAddress(F, DL, MVT::i32);
+
+      SDValue RefFuncNode =
+          DAG.getNode(WebAssemblyISD::REF_FUNC, DL, MVT::funcref, FnAddress);
+      return RefFuncNode;
+    }
+  }
+
+  // Lower everything else to a table.get from the indirect function table
+  const MachineFunction &MF = DAG.getMachineFunction();
+
+  MVT PtrVT = getPointerTy(MF.getDataLayout());
+
+  MCSymbolWasm *Table =
+      WebAssembly::getOrCreateFunctionTableSymbol(MF.getContext(), Subtarget);
+  SDValue TableSym = DAG.getMCSymbol(Table, PtrVT);
+
+  SDValue TableSlot = Op.getOperand(0);
+
+  SDValue Result(DAG.getMachineNode(WebAssembly::TABLE_GET_FUNCREF, DL,
+                                    MVT::funcref, TableSym, TableSlot),
+                 0);
+
+  return Result;
+}
+
 SDValue WebAssemblyTargetLowering::Replace128Op(SDNode *N,
                                                 SelectionDAG &DAG) const {
   assert(Subtarget->hasWideArithmetic());
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
index f7052989b3c75..c3cca072f1958 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.h
@@ -121,6 +121,7 @@ class WebAssemblyTargetLowering final : public TargetLowering {
   SDValue LowerMUL_LOHI(SDValue Op, SelectionDAG &DAG) const;
   SDValue Replace128Op(SDNode *N, SelectionDAG &DAG) const;
   SDValue LowerUADDO(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerADDRSPACECAST(SDValue Op, SelectionDAG &DAG) const;
 
   // Custom DAG combine hooks
   SDValue
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td
index 304c4f3fcb028..2589ab758638c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrRef.td
@@ -11,6 +11,11 @@
 ///
 //===----------------------------------------------------------------------===//
 
+def WebAssemblyRefFunc_t : SDTypeProfile<1, 1, [SDTCisVT<0, funcref>, SDTCisPtrTy<1>]>;
+def WebAssemblyRefFunc :
+    SDNode<"WebAssemblyISD::REF_FUNC", WebAssemblyRefFunc_t,
+           []>;
+
 multiclass REF_I<WebAssemblyRegClass rc, ValueType vt, string ht> {
   defm REF_NULL_#rc : I<(outs rc:$dst), (ins),
                         (outs), (ins),
@@ -42,7 +47,7 @@ defm REF_TEST_FUNCREF : I<(outs I32:$res), (ins TypeIndex:$type, FUNCREF:$ref),
                         Requires<[HasGC]>;
 
 defm REF_FUNC : I<(outs FUNCREF:$res), (ins function32_op:$func),
-                    (outs), (ins function32_op:$func), [],
+                    (outs), (ins function32_op:$func), [(set FUNCREF:$res, (WebAssemblyRefFunc tglobaladdr:$func))],
                     "ref.func\t$func", "ref.func $func", 0xd2>,
                 Requires<[HasReferenceTypes]>;
 
diff --git a/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll b/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll
new file mode 100644
index 0000000000000..1ae676f1c99c8
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll
@@ -0,0 +1,55 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple=wasm32-unknown-unknown -mattr=+reference-types < %s | FileCheck -check-prefixes=CHECK,WASM32 %s
+; RUN: llc -mtriple=wasm64-unknown-unknown -mattr=+reference-types < %s | FileCheck -check-prefixes=CHECK,WASM64 %s
+
+%funcref = type ptr addrspace(20) ;; addrspace 20 is nonintegral
+
+declare void @foo();
+
+@global_var = local_unnamed_addr global i32 undef
+
+define %funcref @cast_const_funcptr() {
+; CHECK-LABEL: cast_const_funcptr:
+; CHECK:         .functype cast_const_funcptr () -> (funcref)
+; CHECK-NEXT:  # %bb.0:
+; CHECK-NEXT:    ref.func foo
+; CHECK-NEXT:    # fallthrough-return
+  %result = addrspacecast ptr @foo to ptr addrspace(20)
+  ret %funcref %result
+}
+
+define %funcref @cast_const_not_funcptr() {
+; WASM32-LABEL: cast_const_not_funcptr:
+; WASM32:         .functype cast_const_not_funcptr () -> (funcref)
+; WASM32-NEXT:  # %bb.0:
+; WASM32-NEXT:    i32.const global_var
+; WASM32-NEXT:    table.get __indirect_function_table
+; WASM32-NEXT:    # fallthrough-return
+;
+; WASM64-LABEL: cast_const_not_funcptr:
+; WASM64:         .functype cast_const_not_funcptr () -> (funcref)
+; WASM64-NEXT:  # %bb.0:
+; WASM64-NEXT:    i64.const global_var
+; WASM64-NEXT:    table.get __indirect_function_table
+; WASM64-NEXT:    # fallthrough-return
+  %result = addrspacecast ptr @global_var to ptr addrspace(20)
+  ret %funcref %result
+}
+
+define %funcref @cast_param_funcptr(ptr %funcptr) {
+; WASM32-LABEL: cast_param_funcptr:
+; WASM32:         .functype cast_param_funcptr (i32) -> (funcref)
+; WASM32-NEXT:  # %bb.0:
+; WASM32-NEXT:    local.get 0
+; WASM32-NEXT:    table.get __indirect_function_table
+; WASM32-NEXT:    # fallthrough-return
+;
+; WASM64-LABEL: cast_param_funcptr:
+; WASM64:         .functype cast_param_funcptr (i64) -> (funcref)
+; WASM64-NEXT:  # %bb.0:
+; WASM64-NEXT:    local.get 0
+; WASM64-NEXT:    table.get __indirect_function_table
+; WASM64-NEXT:    # fallthrough-return
+  %result = addrspacecast ptr %funcptr to ptr addrspace(20)
+  ret %funcref %result
+}

@github-actions

github-actions bot commented Nov 6, 2025

✅ With the latest revision this PR passed the undef deprecator.

@QuantumSegfault
Contributor Author

@dschuff

Requesting review.

@QuantumSegfault
Contributor Author

Ping

@dschuff

Member

@dschuff dschuff left a comment

Sorry I started a review last time but forgot to submit it 🤦

Member

@dschuff dschuff left a comment

LGTM, thanks!

@dschuff dschuff merged commit d3b9fd0 into llvm:main Dec 5, 2025
10 checks passed
@llvm-ci
Collaborator

llvm-ci commented Dec 5, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-ubuntu running on as-builder-4 while building llvm at step 7 "test-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/187/builds/14282

Here is the relevant piece of the build log for the reference
Step 7 (test-check-all) failure: Test just built components: check-all completed (failure)
******************** TEST 'LLVM :: CodeGen/WebAssembly/addrspacecast-funcref.ll' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 2
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=wasm32-unknown-unknown -mattr=+reference-types < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -check-prefixes=CHECK,WASM32 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=wasm32-unknown-unknown -mattr=+reference-types
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -check-prefixes=CHECK,WASM32 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll
# RUN: at line 3
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=wasm64-unknown-unknown -mattr=+reference-types < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -check-prefixes=CHECK,WASM64 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/WebAssembly/addrspacecast-funcref.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=wasm64-unknown-unknown -mattr=+reference-types
# .---command stderr------------
# | 
# | # After Post-RA pseudo instruction expansion pass
# | # Machine code for function cast_const_not_funcptr: NoPHIs, TracksLiveness, TiedOpsRewritten
# | Function Live Ins: $arguments
# | 
# | bb.0 (%ir-block.0):
# |   liveins: $arguments
# |   %0:i64 = CONST_I64 @global_var, implicit-def dead $arguments
# |   %2:i32 = COPY_I32 %0:i64, implicit-def $arguments
# |   %1:funcref = TABLE_GET_FUNCREF <mcsymbol __indirect_function_table>, %2:i32, implicit-def dead $arguments
# |   RETURN %1:funcref, implicit-def dead $arguments
# | 
# | # End machine code for function cast_const_not_funcptr.
# | 
# | *** Bad machine code: Illegal virtual register for instruction ***
# | - function:    cast_const_not_funcptr
# | - basic block: %bb.0  (0x5b54c32d4820)
# | - instruction: %2:i32 = COPY_I32 %0:i64, implicit-def $arguments
# | - operand 1:   %0:i64
# | Expected a I32 register, but got a I64 register
# | LLVM ERROR: Found 1 machine code errors.
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace and instructions to reproduce the bug.
# | Stack dump:
# | 0.	Program arguments: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=wasm64-unknown-unknown -mattr=+reference-types
# | 1.	Running pass 'Function Pass Manager' on module '<stdin>'.
# | 2.	Running pass 'Verify generated machine code' on function '@cast_const_not_funcptr'
# |  #0 0x00005b54a8ca1488 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc+0x80b8488)
# |  #1 0x00005b54a8c9eb95 llvm::sys::RunSignalHandlers() (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc+0x80b5b95)
# |  #2 0x00005b54a8ca2251 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
# |  #3 0x0000760d6bc45330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
# |  #4 0x0000760d6bc9eb2c pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x9eb2c)
# |  #5 0x0000760d6bc4527e raise (/lib/x86_64-linux-gnu/libc.so.6+0x4527e)
# |  #6 0x0000760d6bc288ff abort (/lib/x86_64-linux-gnu/libc.so.6+0x288ff)
# |  #7 0x00005b54a8c04ff5 llvm::report_fatal_error(llvm::Twine const&, bool) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc+0x801bff5)
# |  #8 0x00005b54a7dc45fb (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc+0x71db5fb)
# |  #9 0x00005b54a7dc55db (anonymous namespace)::MachineVerifierLegacyPass::runOnMachineFunction(llvm::MachineFunction&) MachineVerifier.cpp:0:0
# | #10 0x00005b54a7ca0d43 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc+0x70b7d43)
...

@dschuff
Member

dschuff commented Dec 5, 2025

Looks like this test is failing in the LLVM configuration with expensive checks enabled. I'm going to go ahead and revert it for now. If you build with that configuration (see the LLVM_ENABLE_EXPENSIVE_CHECKS option in https://llvm.org/docs/CMake.html) it should be pretty easy to reproduce. Then we can reland it.

dschuff added a commit that referenced this pull request Dec 5, 2025
Reverts #166820
There was a failure in the ENABLE_EXPENSIVE_CHECKS configuration.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Dec 5, 2025
…(#170785)

Reverts llvm/llvm-project#166820
There was a failure in the ENABLE_EXPENSIVE_CHECKS configuration.
@QuantumSegfault
Contributor Author

I see what's happening.

There's a mismatch in Wasm64 between the table types, the type TABLE_GET expects, and what I'm giving it.

TABLE_GET only ever expects an i32 index. In Wasm64, however, tables can have either i32 or i64 indices, and we aren't consistent about which one we use: it seems that __indirect_function_table uses i64 indices, but other module-defined tables are still i32. I was trying to pass in an i64 index, which TABLE_GET can't support.

Should we move to i64 across the board (why wasn't this done?)? That way we can make TABLE_GET and TABLE_SET address-mode aware.

Either that, or we need a separate ISD node that is 64-bit aware while the existing one stays 32-bit only.
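To make the mismatch concrete, here is a minimal standalone sketch of the check that fails. It is a toy model under stated assumptions, not the MachineVerifier: `RegClass`, `kTableGetIndexClass`, `pointerClass`, and `checkTableGetIndex` are hypothetical names, and the only facts taken from the thread are that TABLE_GET takes an i32 index and that pointers are i64 on wasm64.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class RegClass { I32, I64, Funcref };

// Printable name in the style used by the verifier output in the build log.
const char *name(RegClass rc) {
  switch (rc) {
  case RegClass::I32: return "I32";
  case RegClass::I64: return "I64";
  case RegClass::Funcref: return "FUNCREF";
  }
  assert(false && "unhandled register class");
  return "?";
}

// Assumption from the discussion: TABLE_GET's index operand is always i32.
constexpr RegClass kTableGetIndexClass = RegClass::I32;

// Pointers (and so the value being cast) follow the address mode.
constexpr RegClass pointerClass(bool isWasm64) {
  return isWasm64 ? RegClass::I64 : RegClass::I32;
}

// Returns a diagnostic if the pointer cannot be used directly as the table
// index, or std::nullopt if the register classes agree.
std::optional<std::string> checkTableGetIndex(bool isWasm64) {
  const RegClass got = pointerClass(isWasm64);
  if (got == kTableGetIndexClass)
    return std::nullopt;
  return std::string("Expected a ") + name(kTableGetIndexClass) +
         " register, but got a " + name(got) + " register";
}
```

With `isWasm64 == false` the check passes; with `isWasm64 == true` it produces the same wording as the failure in the build log.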

@dschuff
Member

dschuff commented Dec 5, 2025

+cc @sbc100
The original version of the memory64 proposal didn't actually have 64-bit-addressed tables at all (because nobody really needs tables with more than 4G entries). But table64 was added later in the process to allow producers to be consistent in their address types (and allow a small improvement in code size, since e.g. LLVM previously stored all the pointers as 64-bit values but then had to truncate them before every indirect call). So LLVM moved to using a 64-bit __indirect_function_table but didn't follow up with the other module-defined tables.

So the options are basically as you say. It seems like it would probably be more straightforward to move to table64 across the board for wasm64 and then have everything be architecture/address-mode dependent as we do for pointers. In principle we could also have 32-bit module-defined tables but there doesn't seem to be a lot of benefit or need for that right now AFAIK.
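The difference between the current state and the "table64 across the board" option could be sketched roughly as follows. This is a hedged illustration only: `IndexType`, `pointerIndexType`, `currentTableIndexType`, and `proposedTableIndexType` are hypothetical names that do not exist in LLVM, and the "current" behaviour encodes only what is stated in this thread.

```cpp
#include <cassert>
#include <string>

enum class IndexType { I32, I64 };

// Index type that matches the pointer width for the address mode.
constexpr IndexType pointerIndexType(bool isWasm64) {
  return isWasm64 ? IndexType::I64 : IndexType::I32;
}

// State described in the thread: only __indirect_function_table follows the
// address mode; other module-defined tables stay i32 even on wasm64.
IndexType currentTableIndexType(const std::string &table, bool isWasm64) {
  assert(!table.empty() && "table symbol must be named");
  if (table == "__indirect_function_table")
    return pointerIndexType(isWasm64);
  return IndexType::I32;
}

// Proposed direction: every table follows the address mode, as pointers do.
IndexType proposedTableIndexType(const std::string &table, bool isWasm64) {
  assert(!table.empty() && "table symbol must be named");
  return pointerIndexType(isWasm64);
}
```

Under the proposed rule, a table access could pick its index type from the address mode alone, without needing to know which table it targets.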
