Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dependent module support #3034

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

lukaslihotzki
Copy link
Contributor

@lukaslihotzki lukaslihotzki commented Aug 16, 2022

This PR now provides a draft of dependent modules on top of the already available linked modules.

Import handling and linked modules are both very useful. The combination of these features can be used to implement nearly all possible approaches of WASM integration into workers and worklets (pure JS worker, shared WASM module with message passing without threads, shared WASM memory to run threads).

This is a proof-of-concept. It is already usable, but there are a lot of remaining issues, like unclean argument parsing and that it is only tested on --target web.

Fixes #3019

@alexcrichton
Copy link
Contributor

Sorry but this is in the same boat for me as #3069 where I don't have the time/energy to review major changes like this.

@lukaslihotzki lukaslihotzki force-pushed the dependent-module branch 2 times, most recently from b449736 to d7287aa Compare February 2, 2023 14:53
@lukaslihotzki
Copy link
Contributor Author

@Liamolucko, in #3168 (comment) you say that you want to support the feature provided by this PR somehow. This wasn't clear to me, so that was encouraging.

This draft has some problems: It adds a big dependency and it cannot compile_error! on JS parser errors, because that could break existing code. Silently failing could break new projects, so that would be bad too. Also, parsing ES modules recursively would prevent #2787, because there is no const fn JS parser and const fn cannot read additional files. Additionally, parsing JS code in wasm_bindgen can feel too magical.

An option to solve these issues is JS parsing in another crate, that uses wasm-bindgen-macro-support as an interface to wasm_bindgen. wasm_bindgen then needs to expose a simple interface that could also be used directly. For example, wasm_bindgen could allow InlineJsPart arrays as inline_js parameters:

enum InlineJsPart<'a> {
  Raw(&'a str),
  EsModuleShim,
}

// example:
wasm_bindgen::link_to!(inline_js = [
  InlineJsPart::Raw("import * as bindgen from '"), InlineJsPart::EsModuleShim, InlineJsPart::Raw("';

  registerProcessor('WasmProcessor', class WasmProcessor extends AudioWorkletProcessor {
    constructor(options) {
        super();
        let [module, memory, handle] = options.processorOptions;
        bindgen.initSync(module, memory);
        this.processor = bindgen.WasmAudioProcessor.unpack(handle);
    }
    process(inputs, outputs) {
        return this.processor.process(outputs[0][0]);
    }
  });"),
])

Additionally, setting both module and inline_js attributes should place the code of inline_js in the file with the name of module. This way, a JS parsing crate could pass an InlineJsPart array to inline_js and still preserve the original file name.
This option needs minimal work inside wasm_bindgen, but has greater overall complexity than the original draft. If we stick to the original approach, we should make the JS parsing a Cargo feature and enable it explicitly by specifying another attribute (parsed_module) attribute instead of module, so wasm_bindgen can error on invalid code.

No matter which approach we choose, wasm_bindgen should support multiple shims (#3247 (comment)), so the linked module can choose the expected shim format by InlineJsPart or the specific import (for example, import * from '$wbg_es_module_shim').

@Liamolucko, what do you think?

@daxpedda daxpedda mentioned this pull request Mar 16, 2023
@Liamolucko
Copy link
Collaborator

An option to solve these issues is JS parsing in another crate, that uses wasm-bindgen-macro-support as an interface to wasm_bindgen. wasm_bindgen then needs to expose a simple interface that could also be used directly. For example, wasm_bindgen could allow InlineJsPart arrays as inline_js parameters:


enum InlineJsPart<'a> {

  Raw(&'a str),

  EsModuleShim,

}



// example:

wasm_bindgen::link_to!(inline_js = [

  InlineJsPart::Raw("import * as bindgen from '"), InlineJsPart::EsModuleShim, InlineJsPart::Raw("';



  registerProcessor('WasmProcessor', class WasmProcessor extends AudioWorkletProcessor {

    constructor(options) {

        super();

        let [module, memory, handle] = options.processorOptions;

        bindgen.initSync(module, memory);

        this.processor = bindgen.WasmAudioProcessor.unpack(handle);

    }

    process(inputs, outputs) {

        return this.processor.process(outputs[0][0]);

    }

  });"),

])

Additionally, setting both module and inline_js attributes should place the code of inline_js in the file with the name of module. This way, a JS parsing crate could pass an InlineJsPart array to inline_js and still preserve the original file name.

This option needs minimal work inside wasm_bindgen, but has greater overall complexity than the original draft. If we stick to the original approach, we should make the JS parsing a Cargo feature and enable it explicitly by specifying another attribute (parsed_module) attribute instead of module, so wasm_bindgen can error on invalid code.

That's an interesting idea, but I think most of the problems you mentioned can be solved without farming things out to other crates.

The idea I had for the syntax of picking a shim was something along the lines of link_to!(..., shim = "esm"). If we go with this syntax, it means that whether or not the linked module needs access to the shim URL is known up front, and if it doesn't, there's no need to parse the JavaScript. This dodges the potential breakage of JS parser errors triggering compile errors in code that used to work, since that code won't specify shim = "..." and so the JS won't get parsed.

As for the large swc dependency and lack of const-generated inline_js support, I think those can be solved by moving JS parsing to the CLI. This means that users won't have to build swc, as it'll already be baked in. That could potentially increase the CLI's binary size by quite a bit, though.

It doesn't solve the const issue, but it reduces it to the same difficulty as for regular snippets, since the macro doesn't need to parse the JS anymore and so doesn't need to know what it contains.

Side-note about potentially reducing the size of the JS parsing dependency: we don't really need a full JS parser to simply replace "$wbg_main" strings. You only really need a lexer for that, simplifying things a bunch. But that also leads to a potential round 2 of issues with JS parsing errors if we ever need a JS parser for other reasons in the future, so that's probably not a great idea.

Also, parsing ES modules recursively...

Huh, I only just noticed now that this PR makes all of the dependencies of linked modules (and regular snippets) get included as snippets too.

The fact that this applies to normal snippets too is a potential breaking change. Previously, writing import "/foo.js" in a snippet would result in that import getting resolved at runtime to https://<hostname>/foo.js, whereas after this PR it'll get resolved by wasm_bindgen to <crate dir>/foo.js.

Do you think there's much need for this? I always imagined linked modules being pretty small wrappers that just served to get the wasm module up and running, which wouldn't have any need for dependencies (other than the shim).

...so the linked module can choose the expected shim format by InlineJsPart or the specific import (for example, import * from '$wbg_es_module_shim').

Hm, that's an interesting way of doing it. As I mentioned above, I was imagining something more like link_to!(..., shim = "esm"). I feel like that works better, since as I mentioned it provides a solution to the parsing problem. Your approach does allow having the same linked module reference multiple shims, but I'm not sure why you'd need that.

Comment on lines +204 to +206
if val == "$wbg_main" {
pl.push((imp.src.span.lo.0 + 1, imp.src.span.hi.0 - 1));
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little weird that "$wbg_main" only gets expanded when used as an import specifier. That means that you can't do import("$wbg_main"), importScripts("$wbg_main"), require("$wbg_main"), etc. I think it'd be better if all instances of "$wbg_main" were expanded, not just the ones in import specifiers.

Also, I think $wbg_shim might be a better name.

@daxpedda daxpedda added the needs discussion Requires further discussion label May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs discussion Requires further discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dependent module support
4 participants