From 19a35b89e7c99d0df55d9c27d956c4ae3b3466e1 Mon Sep 17 00:00:00 2001
From: Dan Gohman
Date: Mon, 6 Mar 2023 13:58:44 -0800
Subject: [PATCH 1/4] Add a toolchain-independent ABI document, and propose `_initialize`

The Wasm ecosystem is currently not consistent in how "constructors" such as C++ static initializers and similar features in other languages are implemented. As a result, some users report constructors running multiple times, and others report constructors not getting run when they should.

WASI has [defined a convention] using an exported function named `_initialize`; however, not all users are using WASI conventions. In particular, users of what is sometimes called "wasm32-unknown-unknown" are not expecting to follow WASI conventions. However, they still need constructors to work reliably.

To address this, I propose moving this out of WASI and defining it as a toolchain-independent ABI, here in tool-conventions. This would recognize the `_initialize` function as the toolchain-independent way to ensure that constructors are properly called before other exports are accessed.

Related activities
------------------

In the component model, there is a proposal to add a [second initialization phase]. If that's done, then component-model toolchains could arrange for this `_initialize` function to be called automatically by this second initialization mechanism.

Considered alternatives
-----------------------

It is tempting to use the [Wasm start function] for C++ constructors; this has been [extensively discussed]. The short answer is that the Wasm start function is often called at a time when the outside environment can't access the module's exports, and C++ constructors can run arbitrary user code, which may generate calls to things that need to access the module's exports.

It's also tempting to propose defining a second initialization phase in core Wasm.
I'm not opposed to this, but it is more complex at the core Wasm level than at the component-model level. For example, in Emscripten, Wasm modules depend on JS code being able to run after the exports are available but before the initialization function is called, which wouldn't be possible if we simply called the initialization function as part of the linking step.

Wasm-ld has a [`__wasm_call_ctors` function], and in theory we could use that name instead of `_initialize`, but wasm-ld already inserts some initialization in addition to just constructors, so I think it makes sense to use `_initialize` as the exported function, which may call `__wasm_call_ctors` in its body.

Process
-------

We don't have a formal process defined for tool-convention proposals, but because this proposal has potentially wide-ranging impacts, I propose the following process:

 - I'm starting by posting this PR here, and people can comment on it. If a better alternative emerges, I'll close this PR.
 - After discussion here settles, if a better alternative hasn't emerged, I plan to request a CG meeting agenda item to present this topic to the CG, and seek feedback there, to ensure that it has CG-level visibility.
 - If the CG is in favor of it, then I'd propose we merge this PR.

[defined a convention]: https://github.com/WebAssembly/WASI/blob/main/legacy/application-abi.md
[second initialization phase]: https://github.com/WebAssembly/component-model/issues/146
[Wasm start function]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-start
[extensively discussed]: https://github.com/WebAssembly/design/issues/1160
[`__wasm_call_ctors` function]: https://github.com/WebAssembly/tool-conventions/blob/main/Linking.md#start-section
---
 ToolchainIndependentABI.md | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
 create mode 100644 ToolchainIndependentABI.md

diff --git a/ToolchainIndependentABI.md b/ToolchainIndependentABI.md
new file mode 100644
index 0000000..10d7151
--- /dev/null
+++ b/ToolchainIndependentABI.md
@@ -0,0 +1,16 @@
+Toolchain-independent ABI
+=========================
+
+There are many different ways to use Wasm modules, and many different
+conventions and toolchain-specific ABIs. This document describes ABI features
+intended to be common across all ABIs.
+
+## The `_initialize` function
+
+If a module exports a function named `_initialize` with no arguments and no
+return values, and does not export a function named `_start`, the toolchain
+that produced it may assume that on any instance of the module, this `_initialize`
+function is called before any other exports are accessed.
+
+This is intended to support language features such as C++ static constructors,
+as well as "top-level scripts" in many scripting languages.

From 710ec66fe333667f3c8c4cac25197466bc7fed0a Mon Sep 17 00:00:00 2001
From: Dan Gohman
Date: Tue, 7 Mar 2023 10:32:03 -0800
Subject: [PATCH 2/4] Rename to "Basic Module ABI".
---
 ToolchainIndependentABI.md => BasicModuleABI.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
 rename ToolchainIndependentABI.md => BasicModuleABI.md (75%)

diff --git a/ToolchainIndependentABI.md b/BasicModuleABI.md
similarity index 75%
rename from ToolchainIndependentABI.md
rename to BasicModuleABI.md
index 10d7151..3f2619e 100644
--- a/ToolchainIndependentABI.md
+++ b/BasicModuleABI.md
@@ -1,9 +1,9 @@
-Toolchain-independent ABI
-=========================
+Basic Module API
+================

 There are many different ways to use Wasm modules, and many different
-conventions and toolchain-specific ABIs. This document describes ABI features
-intended to be common across all ABIs.
+conventions, language-specific ABIs, and toolchain-specific ABIs. This
+document describes ABI features intended to be common across all ABIs.

 ## The `_initialize` function


From 875879d0821e105538a2e0f841cf02dd2d610923 Mon Sep 17 00:00:00 2001
From: Dan Gohman
Date: Tue, 7 Mar 2023 13:30:23 -0800
Subject: [PATCH 3/4] Update BasicModuleABI.md

Co-authored-by: Derek Schuff
---
 BasicModuleABI.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/BasicModuleABI.md b/BasicModuleABI.md
index 3f2619e..d2fc460 100644
--- a/BasicModuleABI.md
+++ b/BasicModuleABI.md
@@ -1,4 +1,4 @@
-Basic Module API
+Basic Module ABI
 ================

 There are many different ways to use Wasm modules, and many different

From 8e073603d65e5c8a99955fde83cbe0732d250aee Mon Sep 17 00:00:00 2001
From: Dan Gohman
Date: Tue, 7 Mar 2023 13:36:26 -0800
Subject: [PATCH 4/4] Explain when we can and can't use the wasm start function.

---
 BasicModuleABI.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/BasicModuleABI.md b/BasicModuleABI.md
index d2fc460..afe8043 100644
--- a/BasicModuleABI.md
+++ b/BasicModuleABI.md
@@ -13,4 +13,9 @@ that produced it may assume that on any instance of the module, this `_initialize`
 function is called before any other exports are accessed.

 This is intended to support language features such as C++ static constructors,
-as well as "top-level scripts" in many scripting languages.
+as well as "top-level scripts" in many scripting languages, which can't use
+the [wasm start function] because they may call imports that need to access
+the module's exports. In use cases that don't need this, the wasm start
+function should be used.
+
+[wasm start function]: https://webassembly.github.io/spec/core/syntax/modules.html#syntax-start
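
For illustration only, and not part of the patch: a minimal sketch of what the `_initialize` rule could look like from the embedder's side. The helper name `initializeIfNeeded`, the `ModuleExports` type, and the mock exports object are all hypothetical; a real host would pass the exports of an instantiated module instead. It assumes the embedder calls this helper exactly once per instance, before touching any other export.

```typescript
// Hypothetical helper; not defined by tool-conventions or any toolchain.
// `ModuleExports` stands in for the exports object of a Wasm instance.
type ModuleExports = Record<string, unknown>;

// Calls `_initialize` if the module follows the convention described in
// BasicModuleABI.md: it exports `_initialize` and does not export `_start`.
// Returns true if `_initialize` was called.
function initializeIfNeeded(exports: ModuleExports): boolean {
  const init = exports["_initialize"];
  if (typeof init !== "function" || "_start" in exports) {
    // Either there is nothing to initialize, or the module exports
    // `_start` and so falls outside this convention.
    return false;
  }
  (init as () => void)();
  return true;
}

// Mock exports used only to demonstrate the call order.
const calls: string[] = [];
const mockExports: ModuleExports = {
  _initialize: () => { calls.push("_initialize"); },
  get_value: () => { calls.push("get_value"); return 42; },
};

// Initialize first, then access other exports.
const ran = initializeIfNeeded(mockExports);
const value = (mockExports["get_value"] as () => number)();
```

The sketch deliberately does nothing for modules that export `_start`, since the text above only states the guarantee for modules that export `_initialize` without `_start`.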