@hmelder hmelder commented Nov 21, 2025

This pull request adds initial support for compiling Objective-C to WebAssembly. I tested my changes with libobjc2 and the swift-corelibs-blocksruntime.

There are two outstanding issues that I cannot fix myself, because they require deeper knowledge of the subsystems involved:

  1. Symbols marked as explicitly hidden in code generation are exported
  2. Clang crashes in SelectionDAG when compiling an Objective-C try/catch block with -fwasm-exceptions

First Issue

Emscripten processes the generated .wasm file in emscripten.py and checks that every exported symbol is a valid JavaScript identifier (tools/js_manipulation.py#L104). However, hidden symbols such as .objc_init are intentionally not valid C identifiers.
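To make the constraint concrete, here is a minimal sketch of such a check. It is not Emscripten's actual implementation: `isAsciiJsIdentifier` is a hypothetical helper, it only handles the ASCII subset of JavaScript identifiers (letters, digits, `_`, `$`, no leading digit), and it ignores reserved words. The symbol name `OBJC_CLASS_Foo` is made up for illustration.

```cpp
#include <cctype>
#include <string_view>

// Simplified, ASCII-only approximation of "is this a valid JavaScript
// identifier?". Real JavaScript also accepts many Unicode characters and
// rejects reserved words; neither is modelled here.
bool isAsciiJsIdentifier(std::string_view name) {
  if (name.empty())
    return false;

  // Characters allowed at the start of an identifier.
  auto isStart = [](unsigned char c) {
    return std::isalpha(c) || c == '_' || c == '$';
  };
  // Characters allowed after the first position.
  auto isPart = [&](unsigned char c) { return isStart(c) || std::isdigit(c); };

  if (!isStart(static_cast<unsigned char>(name.front())))
    return false;
  for (unsigned char c : name.substr(1))
    if (!isPart(c))
      return false;
  return true;
}
```

Under these assumptions, `isAsciiJsIdentifier(".objc_init")` is false because of the leading dot, while a `$_`-prefixed name such as `$_OBJC_CLASS_Foo` is accepted.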

The core of the problem is that symbols with the WASM_SYMBOL_NO_STRIP attribute are exported when targeting Emscripten (https://reviews.llvm.org/D62542). This attribute is added to the symbol during relocation in WasmObjectWriter::recordRelocation. As a result, we accidentally export many hidden symbols, not only the ones generated by ObjC CodeGen.

I'm currently hacking around this by not exporting no-strip symbols. This is the default behaviour for Wasm.
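The effect of the workaround can be sketched as a small flag computation. This is a standalone illustration, not the code in WasmObjectWriter: `SymbolInfo`, `symbolFlags`, and the `exportNoStrip` switch are hypothetical, and the bit values are placeholders for the sketch (the real constants are declared in llvm/BinaryFormat/Wasm.h).

```cpp
#include <cstdint>

// Placeholder bit values for this sketch only; see llvm/BinaryFormat/Wasm.h
// for the real WASM_SYMBOL_* constants.
constexpr std::uint32_t kHidden = 0x04;
constexpr std::uint32_t kExported = 0x20;
constexpr std::uint32_t kNoStrip = 0x80;

// Hypothetical stand-in for the symbol properties the writer looks at.
struct SymbolInfo {
  bool hidden;
  bool noStrip;
};

// exportNoStrip = true models the existing Emscripten behaviour (no-strip
// implies exported); false models the workaround described above.
std::uint32_t symbolFlags(const SymbolInfo &sym, bool targetIsEmscripten,
                          bool exportNoStrip) {
  std::uint32_t flags = 0;
  if (sym.hidden)
    flags |= kHidden;
  if (sym.noStrip) {
    flags |= kNoStrip;
    if (targetIsEmscripten && exportNoStrip)
      flags |= kExported;
  }
  return flags;
}
```

With `exportNoStrip` set, a hidden no-strip symbol ends up both hidden and exported, which is the contradiction described above. With it cleared, the symbol keeps its no-strip flag but is no longer exported.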

Second Issue

Here is a minimal example that triggers the crash.

#include <stdio.h>

int main(void) {
	int ret = 0;
	@try {
	}
	@catch (id a)
	{
		ret = 1;
		puts("abc");
	}

	return ret;
}

The following assertion is triggered:

clang: /home/vm/llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyExceptionInfo.cpp:124: void llvm::WebAssemblyExceptionInfo::recalculate(MachineFunction &, MachineDominatorTree &, const MachineDominanceFrontier &): Assertion `EHInfo' failed.

Here is the crash report main-c3884.zip.

You can use emcc with a modified LLVM build by exporting EM_LLVM_ROOT before sourcing emsdk/emsdk_env.sh:

emcc -fobjc-runtime=gnustep-2.2 -fwasm-exceptions -c main.m

or just invoke clang directly:

/home/vm/llvm-build-wasm/bin/clang -target wasm32-unknown-emscripten -mllvm -combiner-global-alias-analysis=false -mllvm -wasm-enable-sjlj -mllvm -wasm-use-legacy-eh=false -mllvm -disable-lsr --sysroot=/home/vm/emsdk/upstream/emscripten/cache/sysroot -DEMSCRIPTEN -fobjc-runtime=gnustep-2.2 -fwasm-exceptions -c main.m

Building libobjc2 and the BlocksRuntime

Building the BlocksRuntime

cmake -DCMAKE_TOOLCHAIN_FILE=$EMSDK/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake   -DCMAKE_INSTALL_PREFIX=/home/vm/demo-install -DCMAKE_BUILD_TYPE=Debug -B build -G Ninja

Building libobjc2

cmake -DCMAKE_TOOLCHAIN_FILE=$EMSDK/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake   -DCMAKE_INSTALL_PREFIX=/home/vm/demo-install -DBlocksRuntime_LIBRARIES=/home/vm/demo-install/lib/libBlocksRuntime.a -DBlocksRuntime_INCLUDE_DIR=/home/vm/demo-install/include/BlocksRuntime -DEMBEDDED_BLOCKS_RUNTIME=OFF -DTESTS=OFF  -B build  -DCMAKE_BUILD_TYPE=Debug  -G Ninja

@llvmbot llvmbot added backend:WebAssembly clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:mc Machine (object) code labels Nov 21, 2025

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-backend-webassembly

Author: Hugo Melder (hmelder)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/169043.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGObjCGNU.cpp (+10-4)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+2-1)
  • (modified) llvm/lib/MC/WasmObjectWriter.cpp (-3)
diff --git a/clang/lib/CodeGen/CGObjCGNU.cpp b/clang/lib/CodeGen/CGObjCGNU.cpp
index 06643d4bdc211..3b9f9f306829d 100644
--- a/clang/lib/CodeGen/CGObjCGNU.cpp
+++ b/clang/lib/CodeGen/CGObjCGNU.cpp
@@ -179,8 +179,15 @@ class CGObjCGNU : public CGObjCRuntime {
       (R.getVersion() >= VersionTuple(major, minor));
   }
 
-  std::string ManglePublicSymbol(StringRef Name) {
-    return (StringRef(CGM.getTriple().isOSBinFormatCOFF() ? "$_" : "._") + Name).str();
+  const std::string ManglePublicSymbol(StringRef Name) {
+    auto triple = CGM.getTriple();
+
+    // Exported symbols in Emscripten must be a valid Javascript identifier.
+    if (triple.isOSBinFormatCOFF() || triple.isOSBinFormatWasm()) {
+      return (StringRef("$_") + Name).str();
+    } else {
+      return (StringRef("._") + Name).str();
+    }
   }
 
   std::string SymbolForProtocol(Twine Name) {
@@ -4106,8 +4113,7 @@ llvm::Function *CGObjCGNU::ModuleInitFunction() {
   if (!ClassAliases.empty()) {
     llvm::Type *ArgTypes[2] = {PtrTy, PtrToInt8Ty};
     llvm::FunctionType *RegisterAliasTy =
-      llvm::FunctionType::get(Builder.getVoidTy(),
-                              ArgTypes, false);
+        llvm::FunctionType::get(BoolTy, ArgTypes, false);
     llvm::Function *RegisterAlias = llvm::Function::Create(
       RegisterAliasTy,
       llvm::GlobalValue::ExternalWeakLinkage, "class_registerAlias_np",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 30d3e5293a31b..6cbec5e17ae1a 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -8001,7 +8001,8 @@ ObjCRuntime Clang::AddObjCRuntimeArgs(const ArgList &args,
     if ((runtime.getKind() == ObjCRuntime::GNUstep) &&
         (runtime.getVersion() >= VersionTuple(2, 0)))
       if (!getToolChain().getTriple().isOSBinFormatELF() &&
-          !getToolChain().getTriple().isOSBinFormatCOFF()) {
+          !getToolChain().getTriple().isOSBinFormatCOFF() &&
+          !getToolChain().getTriple().isOSBinFormatWasm()) {
         getToolChain().getDriver().Diag(
             diag::err_drv_gnustep_objc_runtime_incompatible_binary)
           << runtime.getVersion().getMajor();
diff --git a/llvm/lib/MC/WasmObjectWriter.cpp b/llvm/lib/MC/WasmObjectWriter.cpp
index 15590b31fd07f..d882146e21b8a 100644
--- a/llvm/lib/MC/WasmObjectWriter.cpp
+++ b/llvm/lib/MC/WasmObjectWriter.cpp
@@ -1794,9 +1794,6 @@ uint64_t WasmObjectWriter::writeOneObject(MCAssembler &Asm,
       Flags |= wasm::WASM_SYMBOL_UNDEFINED;
     if (WS.isNoStrip()) {
       Flags |= wasm::WASM_SYMBOL_NO_STRIP;
-      if (isEmscripten()) {
-        Flags |= wasm::WASM_SYMBOL_EXPORTED;
-      }
     }
     if (WS.hasImportName())
       Flags |= wasm::WASM_SYMBOL_EXPLICIT_NAME;

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-llvm-mc

github-actions bot commented Nov 21, 2025

🐧 Linux x64 Test Results

  • 111606 tests passed
  • 4467 tests skipped

hmelder commented Nov 28, 2025

@sunfishcode, I see that you are the original author of https://reviews.llvm.org/D62542. As @dschuff said in the review back then:

I'm hoping we can make that export behavior nicer soon; I find the attribute(used) -> export behavior a bit odd too. Once we drop fastcomp it will be easier to redefine EMSCRIPTEN_KEEPALIVE and other things.

This was in 2019. Is this hack still required now that fastcomp is deprecated? The problem with the current behaviour is that hidden no-strip symbols, added during codegen, are exported.

hmelder commented Nov 28, 2025

@davidchisnall the changes in codegen are trivial:

  1. Mangle public symbols with '$' instead of '.', as the latter is not allowed in a JavaScript identifier.
  2. Fix the function signature of class_registerAlias_np to return a bool instead of void.
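The first change amounts to choosing the prefix from the object format. A standalone sketch of that selection follows; the `ObjectFormat` enum and the free function `manglePublicSymbol` are stand-ins invented for this example (the patch itself queries the target triple inside CGObjCGNU), and `OBJC_CLASS_Foo` is a made-up symbol name.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the object formats the triple can report.
enum class ObjectFormat { ELF, COFF, Wasm };

// Pick the public-symbol prefix: "$_" where '.' is not acceptable in a
// symbol name (COFF, and now Wasm), "._" otherwise.
std::string manglePublicSymbol(ObjectFormat format, std::string_view name) {
  const bool needsDollar =
      format == ObjectFormat::COFF || format == ObjectFormat::Wasm;
  std::string mangled = needsDollar ? "$_" : "._";
  mangled += name;
  return mangled;
}
```

For example, `manglePublicSymbol(ObjectFormat::Wasm, "OBJC_CLASS_Foo")` yields `$_OBJC_CLASS_Foo`, while the ELF variant keeps the `._` prefix.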

hmelder commented Nov 28, 2025

Assuming that the new WASM exception implementation implements the mandatory functions and data structures of the Itanium EH ABI correctly, not much needs to be done to get EH working with libobjc2. I just need to find the root cause of the crash...

@davidchisnall davidchisnall left a comment

The Objective-C bits look fine to me; the MC bit should possibly be a separate PR.

github-actions bot commented Nov 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 28, 2025

@davidchisnall davidchisnall left a comment

Code LGTM once clang-format is happy.

We should probably have a test in clang/test/CodeGenObjC checking that the mangling is correct for Wasm.
