Add support for multi-call executables #281
`_start` is the default export which is called when the user doesn't select a
specific function to call. Commands may also export additional functions,
(similar to "multi-call" executables), which may be explicitly selected by the
user to run instead.
What is the use case for making this first class like this? Why not just do that multiplexing at a higher level, like with the traditional argv0 trick?
Are all the possible entry points required to have the same signature as `_start` (void -> void)?
Do we need to mark these entry point exports in some way, or are all function exports considered to be entry points? Either way, this seems like quite a big change. I may have missed the meeting where this was discussed.
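For readers unfamiliar with it, the argv0 trick mentioned above can be sketched roughly as follows. This is only an illustration in Python, not anything from the patch; the applet names `ls_main` and `cat_main`, the `APPLETS` table, and the `dispatch` helper are all hypothetical.

```python
import os

def ls_main(args):
    # Hypothetical applet: stands in for a real "ls" implementation.
    return "ls " + " ".join(args)

def cat_main(args):
    # Hypothetical applet: stands in for a real "cat" implementation.
    return "cat " + " ".join(args)

# Maps the name the program was invoked as to the applet to run.
APPLETS = {"ls": ls_main, "cat": cat_main}

def dispatch(argv):
    # The single entry point inspects argv[0] to choose an applet,
    # then passes the remaining arguments through unchanged.
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        raise SystemExit(f"unknown applet: {name}")
    return applet(argv[1:])
```

Here the multiplexing lives entirely inside the program, behind one entry point, which is the "higher level" alternative the question refers to.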
I think we can do both. When we add toolchain support for this, we can connect things such that when a program is called from an alternate entrypoint, it sets argv[0] accordingly. And this way, we can also support environments that don't have traditional argv strings.
Other entrypoints aren't limited to void->void. That'll also be up to toolchains to use.
I'm proposing all function exports from a command are entry points.
The multi-call part is new; I'm proposing it here.
The rest of the patch here is just explicitly stating assumptions that we're effectively already making. If the restrictions here feel too limiting for some use cases, it's possible those use cases don't actually want commands, in the sense used here, and instead want reactors.
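One possible reading of the "do both" idea, sketched as a toy model in Python: the host selects an export (defaulting to `_start`), and when an alternate entry point is chosen, glue code reports that name as argv[0]. Exports are modeled as plain callables taking an argv list; `select_entry` and `run_command` are hypothetical names, and real entry points would not be limited to this signature.

```python
def select_entry(exports, requested=None):
    # Pick the export to call: the requested one, or `_start` by default.
    name = requested if requested is not None else "_start"
    if name not in exports:
        raise KeyError(f"command has no export named {name!r}")
    return name, exports[name]

def run_command(exports, argv, requested=None):
    name, func = select_entry(exports, requested)
    if requested is not None:
        # Assumption for this sketch: an alternate entry point sees its
        # own name as argv[0], so argv-based code keeps working.
        argv = [name] + list(argv[1:])
    return func(argv)
```

An environment without traditional argv strings could skip the rewrite step and call the selected export directly.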
design/application-abi.md
Outdated
user to run instead.

Functions exported from a command are available to be called without a
pre-existing instance. When they are called, the module is instantiated and used
This doesn't make a lot of sense in terms of the current JS embedding, where an instance is required to even get hold of the export. Even the current node wasi implementation requires an instance before its exports can be used. Perhaps we could re-word to keep the intent but allow for such implementations?
How about something like: "Functions exported from a command are required to run in a fresh instance each time they are called, and when the call returns, the instance is considered terminated and should not be accessed."
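The suggested wording can be modeled with a small Python sketch. The `Instance` class below is a stand-in invented for illustration, not a real embedding API: each call gets a fresh instance, and the instance refuses further access once the call has returned.

```python
class Instance:
    """Stand-in for a WebAssembly instance (hypothetical, for illustration)."""

    def __init__(self, exports):
        self._exports = exports
        self.memory = {}          # per-instance state, never shared
        self.terminated = False

    def call(self, name, *args):
        if self.terminated:
            raise RuntimeError("instance is terminated and must not be accessed")
        try:
            return self._exports[name](self, *args)
        finally:
            # Once the call returns (or fails), the instance is done.
            self.terminated = True

def call_command(exports, name, *args):
    # A fresh instance is created for every call to a command export.
    return Instance(exports).call(name, *args)
```

Under this model an embedding is still free to create the instance before looking up the export, as the JS API requires; what matters is that no state survives from one call to the next.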
That's a good point; I've reworded this section to phrase it in terms of what instances may assume rather than in terms of how instantiation actually works.
For compatibility with existing toolchains, modules may also export globals
named `__heap_base` and `__data_end`. Environments shall not access them.
This provision is deprecated and toolchains are encouraged to avoid providing
these exports.
I'm not sure why this last paragraph is needed. Did we have some toolchain version where those symbols were exported by default? Presumably wasi modules are free to export any number of additional things on top of what wasi specifies? If I'm wrong and additional exports are not permitted, then we should explicitly state that here.
Yes; they're exported by default in the Rust toolchain, due to a series of coincidences. I've now added additional wording clarifying this.
These requirements are just the subtyping rules of module types, right? I agree we should describe them in this friendly way, but it may be good to note that we're talking about the same thing as the module types/linking proposal discusses.
It seems like there are three parts to this change:
Maybe you can split out (1) and we can focus on (2) and (3) here?
For (2), it's not clear we have a strong use case for this. Existing multi-call binaries are quite happy to go through argv. Any naive refactoring would still involve some kind of argv vector for the remaining arguments (busybox
For (3), having the core wasi spec start to interpret all function exports as command entry points seems like quite a large change, and perhaps unnecessarily restrictive? Now my command with a single entry point is not able to export any other functions without them being interpreted by the wasi runtime as alternative entry points.
We have a use case in node for wasi binaries which use a `_napi_init` entry point (and don't have `_start`).
This pulls out the parts of WebAssembly#281 which document existing practice.
Ok, I've now added #282 which splits out the parts which document existing practice and assumptions.
I have a use case that doesn't need traditional I believe it's important to let wasm's unique advantages shine through for applications which wish to take advantage of them. Compatibility is also important, and I expect we can achieve it in a reasonable way here. Would it help to discuss this in a meeting? I'd be happy to discuss it further.
If a command is calling out to an instance it imports from, and the instance is importing from the command to call one of its exports, that would create a cyclic dependency. We do do a little of this with "memory" and "__indirect_call_table", but it makes APIs non-polyfillable and depends on host-specific magic, and the goal is to move away from using those once interface types gives us better alternatives. So it's desirable in general to find other ways to do this. For example if you want the environment to call back in to a module to call its
If your function is called init, aren't you defining a reactor? This discussion is more about commands (things which can only be used once and then require a new instance).
Yeah, I think that would be good. I think this multi-command thing is kind of a new entry in the space between reactors and commands. I'm sure your use case is reasonable, but it would be good to hear a little more about it (assuming you can share) in the next meeting.
I agree, in general. I was mostly worried about over-specifying needlessly. Why not let modules with extra function exports validate as valid wasi modules (as they do today)? This PR seems to suggest that extra function exports are now meaningful as additional entry points when they were not before. I wonder if we can avoid this? How about an alternative: Any function export that begins with
@devsnek If you want a
modules that don't have Modules that we run as commands today already don't really support extra non-entrypoint exports, because the tools already assume that you're not calling into an instance before or after the call to Is the problem that there are people with modules containing a
Ah, sorry, I misunderstood the intention of multi-call.
I'm not sure I understand the problem this presents. Wouldn't such a user know right away they had made a mistake because they would get runtime errors like:
Isn't the solution to force such users to build as a reactor and avoid _start completely? How does adding multi-call modules help with users that have made this mistake? Or are you saying that such users could be transitioned to multi-call rather than reactor? This makes me realize that multi-call has another issue in terms of program startup, because each of the entry points would need to call the libc init and static constructor functions. I'm not sure we want to be exposing those details to user entry points. Having a single entry point means that startup code exists in a single location.
I made a mistake here in introducing multi-call along with doc changes that I expected were just making existing assumptions explicit, but the assumptions turned out to be more interesting than I thought, so I've now split them out into #282. Let's discuss that first.
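The startup concern can be illustrated with a toy Python model. `runtime_init` stands in for libc initialization and static constructors, and the `entry_point` decorator stands in for whatever glue a toolchain would have to wrap around every exported entry point; all names here are hypothetical.

```python
import functools

init_calls = []  # records every time the startup code runs

def runtime_init():
    # Stand-in for libc init and static constructors.
    init_calls.append("init")

def entry_point(func):
    # Glue that every exported entry point would need, since each call
    # to a multi-call command starts from an uninitialized instance.
    @functools.wraps(func)
    def wrapper(*args):
        runtime_init()
        return func(*args)
    return wrapper

@entry_point
def start():
    return "start"

@entry_point
def compress():
    return "compress"
```

With a single `_start`, the equivalent of `runtime_init` is called from exactly one place; with several entry points, the same prologue has to be attached to each of them.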
* Elaborate on the definitions of commands and reactors. This pulls out the parts of #281 which document existing practice.
* Simplify the text about __heap_base and __data_end. We no longer need to say "applications may export these", but it's still useful to say that environments shouldn't access them.
This expands the definitions of command to allow alternate functions to be exported, in the manner of multi-call executables.
This also includes a description of the `__indirect_function_table` export, as well as the `__heap_base` and `__data_end` exports.