
WebAssembly Response API / Web Embedding #167

Closed
3 of 5 tasks
flagxor opened this issue Apr 12, 2017 · 20 comments
Assignees
Labels
Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review

Comments

@flagxor

flagxor commented Apr 12, 2017

Hello TAG!

I'm requesting a TAG review of:

Typical usage pattern:

var memory = new WebAssembly.Memory({initial: 15});
WebAssembly.instantiateStreaming(fetch('https://wasmurl...'),
                        {memory: memory}
).then(({module, instance}) => {
  instance.exports.foo(123);
});
  • Primary contacts: flagxor, mtrofin

Further details (optional):

We'd prefer the TAG provide feedback as (please select one):

  • open issues in our Github repo for each point of feedback
    https://github.com/WebAssembly/design/issues
  • open a single issue in our Github repo for the entire review
  • leave review feedback as a comment in this issue and @-notify [github usernames]
@cynthia
Member

cynthia commented Apr 29, 2017

From the document, it seems like CompileError doesn't contain any information. Related to this, another question we had is whether there are other cases where this can happen aside from the ones listed (incorrect types).

A higher level question (from @slightlyoff) is whether the concept of compiling should be exposed to the user or not.

@travisleithead
Contributor

Just a syntax note for clarity in the doc:
This: Promise<{module:WebAssembly.Module, instance:WebAssembly.Instance}> needs an intermediary definition to be more clearly understood. Since you've got one foot in the door using WebIDL, I'd recommend:

dictionary WebAssemblyInstantiatedSource {
   required WebAssemblyModule module;
   required WebAssemblyInstance instance;
};

Promise<WebAssemblyInstantiatedSource> instantiate(Response source, optional ?? importObject)
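For illustration, here is a minimal sketch of consuming that dictionary shape by destructuring. It uses `WebAssembly.instantiate(BufferSource)`, which resolves with the same `{module, instance}` pair as `instantiateStreaming`; the bytes below are just the 8-byte wasm header (an empty module), used as a stand-in for a real binary:

```javascript
// The promise resolves with the WebAssemblyInstantiatedSource dictionary:
// module (cacheable/transferable) plus instance (carries the exports).
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

WebAssembly.instantiate(emptyModule).then(({ module, instance }) => {
  console.assert(module instanceof WebAssembly.Module);
  console.assert(instance instanceof WebAssembly.Instance);
});
```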

@torgo
Member

torgo commented Apr 29, 2017

We're still working on this. We're going to file issues on their tracker.

@flagxor
Author

flagxor commented May 18, 2017

We've revised our doc to match the syntax you suggested (thanks!).
We've also gone with separate names instantiateStreaming + compileStreaming.

It's an interesting point that CompileError doesn't tell you more about the failure. I suppose we could expose the byte offset that fails to validate, though that might open up a bunch of subtlety around how precise it should be (i.e. the position of the opcode, or the immediate). Seems like something we could add if there's a use case?

The reason we want CompileError separate from other errors is to highlight that there was a particular sort of invalid input (i.e. the bytes don't validate), as opposed to say a TypeError if you pass an unexpected input.
As we expand the format, the intention was to use validation to allow feature testing (compile a small module to check if SIMD opcodes are supported, that sort of thing).
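A sketch of the validation-based feature testing and the CompileError/TypeError distinction described above. The probe bytes are just the 8-byte module header (the smallest valid module); a real feature test would append a tiny function body using the opcode under test:

```javascript
// Feature test: WebAssembly.validate answers "do these bytes validate?"
// without throwing. Header-only bytes (magic + version) form a valid
// empty module; a real SIMD probe would append a function using a SIMD opcode.
const probe = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const ok = WebAssembly.validate(probe);

// Invalid bytes surface as CompileError rather than TypeError, signalling
// "the input did not validate" rather than "wrong kind of argument".
let err;
try {
  new WebAssembly.Module(new Uint8Array([0, 1, 2, 3])); // bad magic word
} catch (e) {
  err = e;
}
```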

The higher-level question of whether compilation itself should be user-visible is tricky. We've offered explicit compilation because this is a domain where performance characteristics can't be abstracted away (or else we'd just be targeting JS). On the other hand, that constrains implementer flexibility.

The recent experience with asm.js, which was basically a bag of implicitly specified behavior, suggests to me, however, that if developers desperately want a particular behavior, giving them a way to ask for it is better. Without that, they'll instead figure out what each browser does, do user-agent detection, and tailor their input to each engine, sometimes even doing that client side (making changing the implementation very hard). Much of the asm.js in the wild does textual search-and-replace + eval to "strip out" float32 use because this wasn't fast in Chrome at some point.
There may be contexts where actual compilation may not be something we can offer, but not having the strong hint seems worse.
We have had some discussion about adding a parameter to specify a preferred compilation strategy (lazy, AOT, shareable, etc.) at some point. I think that seems like a better path if someone wants to ask for non-AOT.

@ghost

ghost commented May 18, 2017

We've offered explicit compilation because this is a domain where performance characteristics can't be abstracted away (or else we'd just be targeting JS).

The proposed design also limits what is possible in terms of performance, limits the capability to schedule the download and compilation, limits the capability to cache the data along the pipeline.

If a web browser wants to cache the compiled code resulting from a streaming compilation it helps a lot if it can describe the inputs without comparing their entire content, for example a URL and version.

If the input to the streaming compilation comes from JS code, then these inputs could come from anywhere, so the products could not be cached using named resources; this badly limits caching.

Further, if the input comes from JS code, then the web browser is not in control of when that code generation (or translation, etc.) is run and scheduled. It cannot re-run it as it needs to, for example if it has decided to flush some data from a cache and needs to regenerate it.

Much of the asm.js in the wild does textual search-and-replace + eval to "strip out" float32 use because this wasn't fast in Chrome at some point.

So a transform stage is expected to be required. The proposed design does not support this use case well: it does not allow caching of the output based on the versions of the inputs, and it does not allow the web browser to schedule the translation and compilation based on what it knows (information that may not be practical to expose for security and privacy reasons).

Let's say wasm is shown to be rather flawed after more research and experience. We don't want the web to be stuck there. If there is a translation stage that works well, then there is a path to change.

With a good design, a cloud proxy might even pre-evaluate and cache the compilation pipeline. This could potentially go as far as delivering code in a proprietary encoding for a particular client-side web browser component. There are precedents for this type of optimization already.

An interface that allows a procedural JS context to control the compilation cannot solve these use cases, as it just cannot be given the information and control needed to work well, for security and privacy reasons - it should be possible to draw that conclusion now. A build manifest resource is needed.

Let's put some thought into the architecture now to set the stage for future work.

@flagxor
Author

flagxor commented May 23, 2017

The proposed design also limits what is possible in terms of performance, limits the capability to schedule the download and compilation, limits the capability to cache the data along the pipeline.

If a web browser wants to cache the compiled code resulting from a streaming compilation it helps a lot if it can describe the inputs without comparing their entire content, for example a URL and version.

Service workers provide a mechanism to intercede and rewrite streams as they are downloaded, while still keeping them URL-bound. URLs streamed and rewritten by a service worker can be cached until either the URL changes or the service worker that did the rewriting changes. This seems nice in that it works the same for other resources too (imagine an image format converter).
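A minimal sketch of that interception pattern, written as a handler factory so the rewriting logic is visible; `transformBytes` is a hypothetical rewriting function, and in a real service worker you would register the result with `self.addEventListener('fetch', handler)`:

```javascript
// Returns a fetch-event handler that rewrites .wasm responses in flight
// while leaving every other request to the default network path.
function makeWasmRewriteHandler(transformBytes) {
  return (event) => {
    const url = new URL(event.request.url);
    if (!url.pathname.endsWith('.wasm')) return; // not ours: default handling
    event.respondWith(
      fetch(event.request)
        .then((response) => response.arrayBuffer())
        .then((bytes) => new Response(transformBytes(bytes), {
          headers: { 'Content-Type': 'application/wasm' },
        }))
    );
  };
}
```

Because the rewrite happens between the network and the consumer, `instantiateStreaming(fetch(url))` on the page sees the rewritten bytes while the cache entry stays keyed by the original URL.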

An interface that allows a procedural JS context to control the compilation cannot solve these use cases, as it just cannot be given the information and control needed to work well, for security and privacy reasons - it should be possible to draw that conclusion now. A build manifest resource is needed.

The foreign fetch proposal is one possible general path to provide rewriting that applies to other web resources as well (it allows a service worker attached to cross origin requests to get the chance to rewrite content).
https://developers.google.com/web/updates/2016/09/foreign-fetch
I'm not sure what you mean by a build manifest? If you mean some sort of list of alternate URLs for wasm resources depending on what a browser implements, this might be useful, but it also has the potential to be insufficiently general to encompass future feature-testing needs. It seems likely that as future features or format changes are added, natural patterns to express detecting them will emerge. By predesigning them, we'd likely guess wrong about what degrees of freedom are needed.

@ghost

ghost commented May 23, 2017

Service workers provide a mechanism to intercede and rewrite streams as they are downloaded, while still keeping them URL-bound. URLs streamed and rewritten by a service worker can be cached until either the URL changes or the service worker that did the rewriting changes. This seems nice in that it works the same for other resources too (imagine an image format converter).

If a service worker can rewrite the wasm binary, then the web browser cannot control the caching. It cannot know what inputs the product (the rewritten stream) depends on. It might know that if the upstream source changes then the rewritten output has potentially changed, but it cannot know that the rewritten output has not changed. So it cannot cache the downstream products of that rewrite using upstream URLs and versions alone; rather, it would be forced to use the rewritten stream itself as a key, and given that these could be large blobs, that approach is a far greater burden.

It appears to be a requirement of effective caching that the rewriting depend on inputs that can be used as effective cache keys, and on those alone. Note that the key might be a provenance trail; for example, the input to the compiler might be the key (source URL A version abc, rewritten with source URL B version abc and constants i, j and k). This means that the code doing the rewriting needs to be isolated from other inputs, that any inputs must be well defined (there is a little more to it), and, for the web browser to be able to depend on this, that the web browser needs to isolate such code. The current architecture and framework is not going to work.

By a 'build manifest' is meant something like a Makefile that specifies the source inputs to the data flow, such as a URL, and the stages of rewriting along the data-flow path. The web browser can then schedule the work and effectively cache the intermediate products or the compiled code as it chooses, and it can re-run these stages as needed to regenerate code.
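Purely as a sketch of the idea (no such format exists; every name and URL below is hypothetical), such a manifest might name the sources and rewrite stages so a browser could key its cache on (URL, version) pairs alone:

```javascript
// Hypothetical build-manifest shape: sources, isolated rewrite stages with
// declared constants, and a final product. Cache key = the provenance trail.
const manifest = {
  sources: [
    { url: 'https://example.com/app.wasm', version: 'abc' },
  ],
  stages: [
    { tool: 'https://example.com/strip-simd.js', version: 'abc',
      constants: { i: 1, j: 2, k: 3 } },
  ],
  product: 'compiled-code',
};
```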

We don't need to speculate about what future feature detection is needed; we know what is required to efficiently implement caching with rewriting in general, and we should design for that now.

We can test proposals using existing web browser support. For example: can wasm be rewritten to asm.js, and can caching work effectively? Can the compiled asm.js code be cached using the wasm source URL and version, plus the source and version of the code that does the rewriting, alone? Can the web browser choose to flush this cache and re-run the pipeline as needed? Can this cached computation be shared safely across origins? These are some tests by which a proposal can be evaluated now, and the proposals being submitted here do not meet them.

There are other resource-scheduling issues too. For example, the wasm code might need a large linear block of memory for its linear memory, and it would be more efficient to allocate that after compilation. Compiling on multiple threads can consume a lot of memory, and pre-allocating the linear memory can mean that code will just not run on low-memory devices.

Some of the inputs to the pipeline might also be implementation-specific matters, and might even be speculative, requiring the web browser to re-run the pipeline on a miss.

If this architecture were addressed well now, it would create an efficient framework for future improvement, and it would support some variation between web browsers and thus some experimentation, which seems much more healthy for the web. So this seems very important to get right.

Another test might be: can I promote an alternative web browser that uses a wasm variant with some different performance characteristics and that consumes a different upstream binary encoding, but that can also rewrite to wasm with only a small overhead? An effective pipeline is needed to make that practical.

We still do not have high-performance AOT compilers, and better ones are quite likely to come; they may well be able to consume alternative encodings with a smaller and faster compiler.

Since this is directed to the TAG, and the use case of images was mentioned above, I'll add that the data-flow description might be more generally useful; the big difference with wasm is the size of the resources, as these are not small images. I would also add that some of the processing stages might not be described by a URL and version, but rather might be defined in terms of the data formats of their inputs and/or outputs and the tasks the stage performs. This might allow the web browser and/or user to substitute alternatives for these stages - for example, to fix issues not anticipated by the web developer, or to use an alternative optimized for the web browser, or an alternative for development work, etc.

@cynthia
Member

cynthia commented May 23, 2017

It's an interesting point that CompileError doesn't tell you more about the failure. I suppose we could expose the byte offset that fails to validate, though that might open up a bunch of subtlety around how precise it should be (i.e. the position of the opcode, or the immediate). Seems like something we could add if there's a use case?

I think one use case was mentioned in the long discussion above (although this would be a feature oriented to developers, but that's a given considering this is an API for compiling we are discussing): if for some reason the service-worker-initiated stream rewrite has a bug and corrupts the code, knowing at least the offset might be useful for debugging. Another use case that comes to mind would be in the context of building a new binary toolchain implementation.

I do understand you are not supposed to hand-write wasm, but even the bare minimum of information that the implementation can provide in the error interface could be useful compared to not having any information to work with at all. (Imagine invoking gas and having it return only 0 or 1 as the exit code. That doesn't sound too fun.)

@torgo torgo modified the milestones: tag-f2f-london-2017-07-25, tag-telcon-2017-07-18 Jul 25, 2017
@plinss plinss assigned hadleybeeman and unassigned hadleybeeman Jul 25, 2017
@travisleithead
Contributor

Taken up again at TAG London F2F:

Glad to see progress in the spec. We note that compileStreaming and instantiateStreaming address our concerns about synchronous loading. We note that Chrome limits synchronous compilation to specific sizes; we'd like to see this included in the standard (for interop).

We're happy to see issue WebAssembly/design#1092, and look forward to seeing that conclude.

We have general concerns about how WebAssembly and JavaScript/DOM will interoperate in the future. While this may be a bit down the road, we'd like to see exploration of a general mechanism for JS objects to interoperate with WebAssembly (e.g., supporting JavaScript's dynamic prototype behavior, accessor/data properties, property descriptors and their states, and basic types like Array, Object, Number, etc.). Our concern is that the WebAssembly folks will focus on trying to create specific custom bridges for specific DOM features like Canvas, Web Audio, and Input (in support of some of their use cases), without spending the effort to generally support JavaScript itself. Such an effort would enable their chosen scenarios in the short term, but be a long-term friction on the web platform's desire to align DOM with JavaScript. We believe focusing on a general interop mechanism for WebAssembly to work with JavaScript itself will also bring DOM along for the ride, and reduce friction long term.

@annevk
Member

annevk commented Jul 27, 2017

Once you have wasm bindings for IDL, what's left?

@travisleithead
Contributor

Insofar as IDL != JavaScript, you then have JavaScript left, if I understand you correctly. Is wasm not interested in general bindings for JavaScript interop?

@annevk
Member

annevk commented Jul 27, 2017

I think insofar that makes sense that's covered, but @lukewagner can maybe elaborate.

@torgo torgo removed the extra time label Jul 27, 2017
@lukewagner

We note that Chrome limits synchronous compilation to specific sizes; we'd like to see this included in the standard (for interop).

I'm not sure this is necessary if we have new WebAssembly.Module imply pure JIT compilation. Because of the design of wasm's binary format, the synchronous cost of new WebAssembly.Module(hugeBuffer) should be practically the same as new ArrayBuffer(hugeBuffer). The latter of course is synchronous and has no size limits, so I don't think the former needs to either.

I think insofar that makes sense that's covered, but @lukewagner can maybe elaborate.

Today, wasm can import and synchronously call, and be synchronously called by, JS functions. These JS functions can of course be the JS functions defined by WebIDL so this can be a direct path from wasm into DOM and other Web APIs w/o thunking through JS. The only limitation is in the expressiveness of wasm's import signatures: wasm only has i32/i64/f32/f64. So we can call performance.now just fine today, but calling the nodeType getter is a problem :) At the wasm CG meeting last week we discussed how to add sufficient types to call Web APIs both through GC integration and a shorter-term JS binding section. So this would cover calling Web platform functions and holding references to Web platform objects (both WebIDL and JS).
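The import path Luke describes can be sketched with a small hand-assembled module. The bytes below encode `(module (import "env" "now" (func (result f64))) (func (export "callNow") (result f64) (call 0)))`; the "env"/"now" names are illustrative, not anything the proposal prescribes:

```javascript
// Hand-assembled wasm: imports env.now (() -> f64) and exports callNow,
// which forwards straight to the imported JS function -- the direct
// wasm -> JS call path, no thunking layer in between.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7c,       // type section: () -> f64
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,       // import section: "env"
  0x03, 0x6e, 0x6f, 0x77, 0x00, 0x00,             //   "now" (func, type 0)
  0x03, 0x02, 0x01, 0x00,                          // func section: one func, type 0
  0x07, 0x0b, 0x01, 0x07, 0x63, 0x61, 0x6c, 0x6c, // export section: "callNow"
  0x4e, 0x6f, 0x77, 0x00, 0x01,                    //   (func index 1)
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x10, 0x00, 0x0b, // code: call 0; end
]);

const instance = new WebAssembly.Instance(
  new WebAssembly.Module(bytes),
  { env: { now: () => performance.now() } }
);
// instance.exports.callNow() now calls through to performance.now
```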

Additionally, the GC integration feature would allow wasm to directly load and store from JS Typed Objects allowing very high-performance interop with these sorts of objects. There are not yet any plans to add direct support for untyped JS objects, but in the meantime JS objects could be accessed indirectly by calling imported functions, as mentioned above.

Hope that helps! I'm happy to discuss more with anyone on this topic.

@torgo torgo added Progress: pending external feedback The TAG is waiting on response to comments/questions asked by the TAG during the review and removed Progress: in progress labels Sep 26, 2017
@torgo torgo modified the milestones: tag-f2f-london-2017-07-25, tag-telcon-2017-10-31 Sep 26, 2017
@cynthia
Member

cynthia commented Sep 26, 2017

Taken up at Nice F2F.

@lukewagner Thank you for elaborating. And apologies for the delayed response from our side. So one of the bits that we would like to see is this being integrated as part of an existing spec for a final review pass (web.md isn't quite something that can be considered a spec) - especially for the JS API parts.

As for the design bits, I believe we are happy with how it looks, and it seems like the major concerns we raised have been addressed. (Although I am still curious about the compile errors, but that is more of a personal curiosity.)

Could you update us when you have the JS APIs formulated as a spec?

@lukewagner

Will do. There is indeed a draft in progress.

@littledan

Note that this is a very early draft! Within a month, ahead of the upcoming TPAC, I plan to have something much further along.

@travisleithead
Contributor

@domenic as FYI, since we talked about this a bit before the meeting.

I followed up on this issue in the WebAssembly WG meeting today at TPAC. Thanks @flagxor for the great overview and slides (would love pointers to those, BTW). My objective was to review the question of whether JS was going to be considered a first-class member of the binding proposal (we'd heard, and derived from the draft-in-progress, that JS objects might be out of scope - this was a misunderstanding of the non-goal line "Provide a general purpose managed object solution"). My takeaway was that the proposal is still trying to make generic JS object binding work.

Some notes:

  • I mentioned that I failed to see how basic [[Get]] might work when given an object_ref. Mark Miller noted Reflect.get, etc., that could be used by wasm to be the operation that performs the [[Get]], etc., calls when passing generic JS objects into wasm.
  • In talking with @lukewagner afterward, he described two scenarios:
    1. two-way interop with generic JS objects. For example, handling an ‘any’ type and exposing conversion primitives like toUint32, etc., (basically the steps described in WebIDL’s binding)
    2. a wasm-only module: consider a wasm module defined in such a way that the UA pre-creates the DOM's set of static entry points into the wasm binding table, and therefore exposes all the DOM's C++ call sites natively to wasm - taking JavaScript completely out of the picture within that module. In other words, no GC or script realm is necessary (performance ensues). Caveat: no one (including me) has thought through what a JS-less wasm runtime environment would do, say, for example, when creating an iframe or new window or worker where new Realms are involved :-).
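Mark Miller's Reflect.get suggestion can be sketched as JS glue that hands wasm opaque integer handles and performs [[Get]] on its behalf; the handle table and import names here are hypothetical, just to show the shape:

```javascript
// Hypothetical glue: wasm holds integer handles into a JS-side table;
// [[Get]] runs in JS via Reflect.get, so wasm never needs to understand
// the object representation, prototypes, or accessors.
const handles = [];
const toHandle = (value) => handles.push(value) - 1;

const jsImports = {
  // wasm calls get(objHandle, keyHandle) and receives a handle to the
  // property value, which it can pass back into later imported calls.
  get: (obj, key) => toHandle(Reflect.get(handles[obj], handles[key])),
};
```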

@torgo
Member

torgo commented Nov 29, 2017

Discussed on telcon - 29 November 2017. Discussed that where Web Assembly meets the rest of the web platform we should be involved. Our concerns with WebAssembly Response API / Web Embedding have been addressed. Please come back with any new features, especially those that intersect with DOM or the rest of the web platform.

@torgo torgo closed this as completed Nov 29, 2017
@jfbastien

@torgo Additions to WebAssembly occur through its W3C CG, and all discussions have scheduled agendas which are announced on the public mailing list and on https://github.com/WebAssembly/meetings

As the CG chair I'm happy to ping the TAG as appropriate. Who should I ping and through which venue? Is it sufficient to tag you on the GitHub schedule for a particular meeting? Or is that not necessary if someone from the TAG follows WebAssembly meetings closely?

For example, are you interested in host bindings, CORS, threads?

@slightlyoff
Member

Hey @jfbastien: per usual, filing new issues on this repo is the right way to get our attention. Looking forward to collaborating more.
