Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weak imports #36

Closed
sunfishcode opened this issue May 11, 2019 · 27 comments · Fixed by #47
Closed

Weak imports #36

sunfishcode opened this issue May 11, 2019 · 27 comments · Fixed by #47
Labels
discussion A discussion that doesn't yet have a specific conclusion or actionable proposal. module-system Issues target at the module system

Comments

@sunfishcode
Copy link
Member

sunfishcode commented May 11, 2019

Wasm doesn't have a "weak" import mechanism, which would allow one to (import "foo" "bar") where "bar" may or may not exist in "foo" (or perhaps where "foo" itself may not exist). This would be one way to allow feature detection in APIs.

As one possible solution, it's expected that the wasm GC proposal will eventually give wasm the concept of optional/nullable references, which would permit a kind of weak import where one imports global variables with reference type which can be either references to the desired entities or absent/null. However, GC may not be available in wasm for some time, and even when it is, it may not be implemented in all wasm engines, so it's not entirely obvious whether we should depend on it for something so fundamental to WASI.

Another is to essentially add a weak-import system on top of Wasm. For example, we could use a naming convention, such as a ".weak" suffix, to indicate a weak import, and a ".is_present" suffix on a global variable import to hold a boolean indicating whether a weak import was successfully resolved, like this:

  (func $foo (import "wasi_something" "foo.weak"))
  (global $do_we_have_foo (import "wasi_something" "foo.is_present") i32)

A similar scheme could enable detection of modules too:

   (func $foo (import "wasi_something.weak" "foo"))
   (global $do_we_have_wasi_something (import "wasi_something.is_present" "foo") i32)

or even both at the same time:

   (func $qux (import "wasi_something.weak" "bar.weak"))
   (global $do_we_have_it (import "wasi_something.is_present" "foo.is_present") i32)

Another option would be to have a dynamic API that takes a string, however while we can do that, passing around strings introduces some complexity and it seems like if have a naming convention for ".weak", we might as well have ".is_present" too.

I'm not attached to any of the details here, I'm mainly just looking to start a discussion.

@sunfishcode sunfishcode added discussion A discussion that doesn't yet have a specific conclusion or actionable proposal. module-system Issues target at the module system labels May 11, 2019
@programmerjake
Copy link
Contributor

I would assume that calling wasi_something.weak when it is not present would result in a trap.

@programmerjake
Copy link
Contributor

One solution is to support nullable references only for importing variables/functions as a subset of the GC proposal. That seems trivial enough that I would expect all WASM implementations to be able to easily implement that, since it doesn't actually require GC.

@devsnek
Copy link
Member

devsnek commented May 11, 2019

this definitely seems more like a general wasm problem than a wasi problem. i do like what programmerjake said.

@dumblob
Copy link

dumblob commented May 11, 2019

it may not be implemented in all wasm engines, so it's not entirely obvious whether we should depend on it for something so fundamental to WASI.

I'm definitely raising my hand for not making WASI dependent on a garbage collector. GC is meant to be fully optional and an efficient implementation will be a significant code base compared to the rest of WASM runtime.

In that sense, I'm supporting what @programmerjake proposed as it's exactly as @devsnek says - not a WASI problem, but rather a WASM problem.

Btw. I find any naming convention as wrong solution when proposed from the beginning as to me naming conventions try to solve rather some problems discovered later - and having two such conventions already in the standard makes a really bad impression. In my opinion weak imports shall be first class citizens in WASI/WASM.

@rossberg
Copy link
Member

FWIW, there has been a general sentiment that the typed function references proposal, which brings typed references into Wasm, should also include nullable ones. While I didn't have time yet to update the proposal, that's the current plan. So nullable refs are to be split off from the GC proposal already.

Whether they are the best way to provide feature detection is a different question. Dynamic feature detection is a dead end anyway, since you cannot opt into types that way. That, however, would be necessary for APIs that introduce new host types (e.g., imported reference types). So in general, you'll need link-time feature detection.

@sbc100
Copy link
Member

sbc100 commented May 12, 2019

I can think of two other possible solutions:

  1. Switch to a an interface query model where a WASI module only imports a single "query_interface" function which can retrieve the set of function pointers (table entries) for a given WASI module. All the types for the module would be known statically but the function addresses would be retrieved dynamically.

  2. Functions could be imported statically but as addresses (table entries) rather than functions:

(global $foo_addr (import "wasi_somthing" "foo_addr") i32)

These could then be compared with 0 (or some other fixes, invalid, table entry).

@rossberg
Copy link
Member

The general case I have in mind is an API that defines its own host reference types. Consider a file system module for example, which might provide a file type. If you want to avoid using anyref for everything (and thereby dynamic type checks in the API), then this type has to be imported into Wasm, to be even able to express the types of corresponding functions, whether on imports or on call_indirect statements. There is no way in which you can make that import dynamic.

(Also, I'd argue that the implication of requiring call_indirect for every API call is defeating half the purpose of the Wasm module system.)

The problem with dynamic feature checking in general is that it conflates program execution with program configuration. For scalable modular software it's better to keep those separate -- at least that's what all larger software systems do. Feature testing is a hack that was invented in languages that did not have any proper support for modularity or versioning at the time and where programs remained small enough to keep that tractable.

@titzer
Copy link

titzer commented May 13, 2019

If I understand Dan's proposal correctly, the ".weak" and ".is_resolved" suffixes would require no changes to wasm's import mechanism, it is a just a convention in WASI. (Presumably, as @programmerjake suggested, trying to call a non-resolved weak import would trap). That seems relatively clean, and preferable to crossing into the dynamic world of nullable function references. Those have some overhead, both in metadata and possibly runtime, whereas imports have less overhead since they are only bound once at instantiation. And I agree with @rossberg that import calls via references and indirection seem like the wrong workaround.

@sbc100
Copy link
Member

sbc100 commented May 13, 2019

Surely calling .is_present or .is_resolved is runtime still feature detection?

It does avoid the call_indirect and allow direct imports I agree is preferable. On the downside it does double the number of imports and require the embedder to create fake trapping functions for each weakly undefined functions (one per signature I guess?)

@sunfishcode
Copy link
Member Author

Yes, I meant to say above that an unresolved ".weak" function would trap.

And yes, ".is_present" enables dynamic feature detection. This is useful. Requiring programs to be split into modules for each API feature they need to detect would be a considerable burden, even on programs written in languages with modern module systems. if (global.get $foo.is_present) is dramatically simpler than putting code in a separate module, attempting instantiation, and catching the link failure.

I even wonder whether we could extend the ".weak" runtime detection to work for types too. Rough idea: Allow ".weak" on type import names, and then have any imported function with a signature that mentions an unresolved weak-import type trap. Unresolved weak-import types would otherwise act like anyref (after validation). It seems like that should be compatible with wasm's existing import mechanism as well.

@PoignardAzur
Copy link

I don't really understand the interest of a specialized weak import mechanism, given the possibility of a more general optional type for imports.

Would .weak and .is_present have use cases that optref wouldn't cover?

@sbc100
Copy link
Member

sbc100 commented May 17, 2019

My understanding is that this proposal is designed to work in lieu of a first class weak import in the core spec and that we could/should move over to spec'd weak imports once such a thing exists.

Its also a goal to avoid the extra level of indirection involved with call_indirect since these imports are essentially static. Its not clear to me that using call.ref will have a similar indirection overhead to call_indirect or not, but it seems likely.

@binji
Copy link
Member

binji commented May 17, 2019

call.ref on an optref should be faster than a call_indirect. call_indirect has to perform a table.get (with bounds check), null check, type comparison, and then indirect jump. call.ref would only have to check whether the reference is null, then indirect jump.

The weak import mechanism could be slightly faster since you can codegen a stub that traps instead of performing a null check, and you can use a direct jump.

@programmerjake
Copy link
Contributor

@binji If the optref is the result of a table.get with constant parameters then a cast, the amount of static analysis required to convert to a direct call is pretty minimal.

@binji
Copy link
Member

binji commented May 17, 2019

You wouldn't be able to do static analysis on a table.get, since there's no way to make an immutable table (though perhaps there should be).

@PoignardAzur
Copy link

PoignardAzur commented May 18, 2019

The weak import mechanism could be slightly faster since you can codegen a stub that traps instead of performing a null check, and you can use a direct jump.

Isn't that something VMs could also do with function optrefs?

I don't think there's anything in the proposal that says that null refs have to be encoded as a 0x0 pointer.

My understanding is that this proposal is designed to work in lieu of a first class weak import in the core spec and that we could/should move over to spec'd weak imports once such a thing exists.

Not sure how relevant this is then. Given their respective age and complexity, I'd expect the funcref proposal to be standardized way before WASI.

@titzer
Copy link

titzer commented May 21, 2019

Indeed, as @binji mentions, the fastest option would be the weak import mechanism where the embedder provides a stub that traps. The call sequence would be exactly the same as other imports; a load from an internal array and an indirect call.

@sunfishcode
Copy link
Member Author

Another concern with the optref in this context is that it isn't composable. It isn't "Option T" for any T, so it doesn't easily support things like optional i32 or optional optional funcref. And as @rossberg pointed out, it doesn't support other kinds of entities like imported types. While "WASI core" is entirely functions today, on one hand WASI will grow and include other kinds of things in the future, and on the other, API feature detection will be needed in many more places than just WASI.

Weak imports still leave the question of what to do with things other than functions. Perhaps:

  • global.get on an absent global:
    • Returns the "default" (zero) value for the type?
    • Traps? (tidier, but requires modifications to wasm core)
  • an absent memory:
    • Behaves like an empty memory?
    • Traps for all accesses including memory.size? (tidier, but requires modifications to wasm core)
  • an absent type:
    • Behaves like anyref?
    • Behaves like anyref and causes imported functions with that type in their signature to trap? (although maybe that's not actually important?)

This may be pointing in the direction of adding weak imports to the wasm core spec.

sbc100 added a commit that referenced this issue May 30, 2019
I wrote up this as initially proposed by Dan, using name mangling rather
than a custom section.  I believe there is still room to switch a custom
section if people feel strongly about it.

The main downsides I see of using such mangling are:

1. It is that reserves the ".weak" and ".is_present" suffixes
2. It might not play nicely with possible future attributes we might
   want to add to imports.

Fixes #36
sbc100 added a commit that referenced this issue May 30, 2019
I wrote up this as initially proposed by Dan, using name mangling rather
than a custom section.  I believe there is still room to switch a custom
section if people feel strongly about it.

The main downsides I see of using such mangling are:

1. It is that reserves the ".weak" and ".is_present" suffixes
2. It might not play nicely with possible future attributes we might
   want to add to imports.

Fixes #36
sbc100 added a commit that referenced this issue May 30, 2019
I wrote up this as initially proposed by Dan, using name mangling rather
than a custom section.  I believe there is still room to switch a custom
section if people feel strongly about it.

The main downsides I see of using such mangling are:

1. It is that reserves the ".weak" and ".is_present" suffixes
2. It might not play nicely with possible future attributes we might
   want to add to imports.

Fixes #36
@sbc100
Copy link
Member

sbc100 commented Jul 11, 2019

I've been thinking about and alternative to #47 based a custom section rather than magic name mangling. My initial plan was to enumerate all the weak import in a list. Each element would be module and and function import name, and a corresponding global import to signal presence of he function. One of the downsides of this approach is that it requires 3x as much space in the binary for each weak import (since it names once in the import section and twice in the custom section). It also doesn't avoid the name mangling issue, since the "is_present" global will still need to be added to the imports section, possibly colliding with existing imports.

So using a custom section doesn't get us away from adding some form of name mangling.

Also, given a name mangling scheme such as the ".is_present" and/or ".weak" suffixes its already simple for the embedder to enumerate all the weak imports, and therefore enumerating them in the custom section seems redundant.

So here are the current directions that I think could work:

  1. Use the existence of the foo.is_present import to signal 'foo as a weak import. Here a function foo would be imported as normal, but if foo.is_present is also imported then foo is implicitly weak.
  2. Same as (1) but additionally mangle foo with the .weak suffix to be more explicit. This is what is currently in Add a writeup of current optional imports design #47.
  3. Same as (1) or (2) but specify the mangling suffixes/prefixes in a custom section rather then hard-coding them in the spec.
  4. Explicitly list each weak import and its corresponding global import by name in a custom section. No naming rules needed.
  5. Explicitly list each weak import in the custom section. Imply the existence of a mangled is_present global import for each weak import. Possibly specify the suffix/prefix in the custom section rather than spec'ing it?

Right now I'm leaning back towards (2) or possibly (3) (in order to avoid spec'ing the name mangling).

Thoughts?

@pchickey
Copy link
Contributor

@sbc100: I'm curious why the import functions and globals need to be specified by name in the custom section, rather than by index

@devsnek
Copy link
Member

devsnek commented Jul 11, 2019

is it unlikely that WebAssembly/design#1281 is going to move forward?

@sbc100
Copy link
Member

sbc100 commented Jul 11, 2019

@sbc100: I'm curious why the import functions and globals need to be specified by name in the custom section, rather than by index

Good question, I seem to remember @lukewagner had opinions on this. It seems like it would be strange to invent an import index space, but I suppose we could use function and/global index space. Luke can you remember your arguments against using indexes here?

@sbc100
Copy link
Member

sbc100 commented Jul 11, 2019

is it unlikely that WebAssembly/design#1281 is going to move forward?

Not in the timeframe that we want for WASI I fear. Also, that fact that we can do this in userspace doesn't help our case for building something first class.

@devsnek
Copy link
Member

devsnek commented Jul 11, 2019

As long as it is planned to move forward, I feel like we should work on that instead. Coming up with our own system seems antithetical to Wasm's design goals.

@rossberg
Copy link
Member

I suppose we could use function and/global index space.

Index spaces are an internal naming mechanism, and as such, an implementation detail of a module that is not observable externally. A custom section mechanism using them would require special access to module internals and would not be polyfillable.

@pchickey
Copy link
Contributor

Got it, thanks @rossberg. I am happiest with option 4 then - I'd prefer to put names in the custom section than defining some way to mangle names.

@guybedford
Copy link
Contributor

Not in the timeframe that we want for WASI I fear. Also, that fact that we can do this in userspace doesn't help our case for building something first class.

I'm not sure I follow this logic - what is the primary friction seen to be? For example implementing this in Node.js would be trivial.

sbc100 added a commit that referenced this issue Aug 14, 2019
I wrote up this as initially proposed by Dan, using name mangling rather
than a custom section.  I believe there is still room to switch a custom
section if people feel strongly about it.

The main downsides I see of using such mangling are:

1. It is that reserves the ".weak" and ".is_present" suffixes
2. It might not play nicely with possible future attributes we might
   want to add to imports.

Fixes #36
sunfishcode pushed a commit that referenced this issue Sep 4, 2019
I wrote up this as initially proposed by Dan, using name mangling rather
than a custom section.  I believe there is still room to switch a custom
section if people feel strongly about it.

The main downsides I see of using such mangling are:

1. It is that reserves the ".weak" and ".is_present" suffixes
2. It might not play nicely with possible future attributes we might
   want to add to imports.

Fixes #36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion A discussion that doesn't yet have a specific conclusion or actionable proposal. module-system Issues target at the module system
Projects
None yet
Development

Successfully merging a pull request may close this issue.