Interaction between Rust wasm/foreign modules and encapsulation #92

Closed
dflemstr opened this issue Mar 17, 2018 · 20 comments

@dflemstr

I hope this is the right place for this discussion; I tried finding a forum that would reach the right target audience and this seemed like the best place.

I've been working with Rust/wasm for quite a while. For that I've been using wasm-bindgen quite heavily (for a not-yet-released project) using a Webpack loader I've authored: https://github.com/dflemstr/rust-native-wasm-loader

One thing I have noticed is that there doesn't seem to be a clear direction regarding how code modules are to be treated between Rust and foreign modules (such as ES6 modules or other wasm modules) on the source level.

Incidentally, it seems to be working similarly to C, but not consistently. This behavior is quite confusing when interfacing between two languages that both support modules/namespacing (Rust+JS).

Remember, this is the perhaps naive impression of somebody using Rust together with wasm. I'm aware of how the current wasm FFI is closely modeled after the C FFI.

Imagine the following tree (with the module hierarchy like you would expect):

.
└── src
    ├── lib.rs
    ├── reducer
    │   └── editor.js
    └── util
        ├── leftpad.js
        └── leftpad.rs

What should the semantics be around visibility and scope for these files? This is what my intuition would tell me:

  • editor.js should import leftpad.rs as ../util/leftpad.rs, or import some myrustlib.wasm module where the symbol is available as util::leftpad::xyz in some fashion.
  • leftpad.rs should import editor.js as ../reducer/editor.js.
  • (Stretch:) There should be a separate notion of "exported in my Rust crate" and "available to Javascript"
  • (Stretch:) It should be possible to mark a symbol as being available only in the util module, so that editor.js would not see such a symbol defined in leftpad.rs, but leftpad.js would be able to.

With that said, here is what current tools are doing:

  • Raw rustc symbols that are marked extern "C" are exported in a global shared namespace.
  • wasm-bindgen will export everything marked with #[wasm_bindgen] also in a global shared namespace (including more advanced things such as structs/classes etc).
  • wasm-bindgen will resolve all imports (#[wasm_bindgen(module = "...")]) relative to the location of the final generated module artifact, wherever that may be.
  • wasm-bindgen only supports symbols that are both crate-public and exported.
  • All imports/exports have no notion of encapsulation.
  • (Feel free to suggest other items if we want to make a complete list; I just highlighted some examples)

I find this behavior surprising and feel like one could do either of these things:

  • Prohibit imports/exports except on the crate top level (lib.rs/main.rs) and resolve them relative to that location.
  • Properly namespace imports/exports such that they behave as I described above, or in some other intuitive fashion.

Does this sound reasonable? What are everybody's thoughts on this?

@mgattozzi
Contributor

Hey @dflemstr I would say this is the right place since this is where the working group for wasm and Rust coordinates! I would like to hear what @alexcrichton thinks on this considering he's done most of the compiler work regarding the wasm32 target. I'm sure many others here have thoughts on the matter as well!

@Pauan

Pauan commented Mar 18, 2018

@dflemstr You're definitely in the right place. Let me give my thoughts on your post:

editor.js should import leftpad.rs as ../util/leftpad.rs, or import some myrustlib.wasm module where the symbol is available as util::leftpad::xyz in some fashion.

I'm undecided about this. On the one hand importing the .rs file makes sense, but on the other hand Rust compilation units are at the crate level, not the module level.

So I guess it depends on whether we want .js files to be seen as a "part of the crate", or as an external thing which imports the Rust crate. I suppose both use-cases will need to be supported.

leftpad.rs should import editor.js as ../reducer/editor.js.

If we treat .js files as being a part of the crate, then yes absolutely I would expect that.

(Stretch:) There should be a separate notion of "exported in my Rust crate" and "available to Javascript"

I believe Rust already has a notion like that: https://doc.rust-lang.org/book/first-edition/ffi.html#callbacks-from-c-code-to-rust-functions

I imagine it would look like this:

extern "wasm" fn foo() -> i32 {
  5 + 10
}

(Stretch:) It should be possible to mark a symbol as being available only in the util module, so that editor.js would not see such a symbol defined in leftpad.rs, but leftpad.js would be able to.

I'm not so sure about this one. That sounds like it's trying to apply Rust's module system and scoping rules to JS, but JS has its own module system.


As for the rest of your post, I generally agree. I think this is less a matter of "the Rust team decided things should be this way" and more a matter of "we haven't decided how things should be yet, so the current status-quo is a bit wonky".

@alexcrichton
Contributor

FWIW right now wasm and rustc have no knowledge of import paths really, they're sort of just opaque strings. AFAIK it's bundlers that attach meaning to these import paths to allow files to import one another. In that sense I'm not entirely sure what we'd do differently in rustc/wasm-bindgen/etc?

Specifically on the topic of FFI and "what symbols are exported" or "what is their ABI", that's quite similar to https://github.com/rust-lang-nursery/rust-wasm/issues/29 I think

@Pauan

Pauan commented Mar 20, 2018

@alexcrichton It might be necessary for rustc to carefully generate the relative imports in such a way that they work correctly after running wasm-pack. That will require us to figure out exactly how the custom sections and relative paths work.

@Pauan

Pauan commented Mar 20, 2018

Actually, the situation is much more complicated than I thought. The plan is for rustc to bundle up all the .rs and .js code into a single .wasm file, and then wasm-pack grabs the .js files out of the .wasm file.

But that means if the .js files try to import a .rs file, it doesn't work because the .rs files don't exist!

So we either have to give up on the idea of .js files importing .rs files, or we have to parse the .js and rewrite the imports to point to the .wasm instead.

And then there's issues about name collisions, if multiple Rust crates export the same variable name...

@alexcrichton
Contributor

@Pauan oh but my point is that rustc isn't generating imports at all, authors are writing them down in source code and rustc is moving those strings along to the output file.

@Pauan

Pauan commented Mar 20, 2018

@alexcrichton Yes, but there's all sorts of problems if the imports are just passed through unchanged:

  1. The author of a crate called mycrate has a src/subdir/foo.rs file and a src/bar.js file.

  2. Inside of foo.rs it has #[wasm_import_module = "../bar.js"]

  3. When that crate is compiled and run through wasm-pack it gets converted into a mycrate.wasm and bar.js file, but inside of mycrate.wasm it has an import for ../bar.js, which is wrong.

There's a bunch of other nasty stuff that can happen, that's just a small taste.

Doing this right will be hard, since authors will want to import file paths relative to the .rs file, but the compiled .wasm file won't necessarily have the same directory structure as the original .rs file.

@alexcrichton
Contributor

Oh sure, but this is also serious hypothetical territory right now. Nothing in the toolchain supports specification of the import module, much less actually resolving it. I think before we debate what should happen we should get something working, see how it works, and go from there.

@CryZe

CryZe commented Mar 20, 2018

Does it make sense to use a path specific to JS there at all anyway? That means the Rust crates would be written with specific JS files in mind, which honestly I don't see working out if we consider normal Rust crates like chrono that you have as a dependency. So for something like chrono it makes a lot more sense to just define something like this:

#[wasm_import_module = "chrono"]
extern {
    fn time() -> f64; // Made up signature
}

That way you have a clean namespacing on the rust side that isn't prone to potential conflicts.

However I think it definitely does make sense to be able to specify JS dependencies there too, so it probably makes sense to have a custom section in the wasm that specifies how those map to JS.
WAT-like pseudo-code:

(JSImportMappings ("chrono" "chrono-helper.js"))

And on Rust's side it could look like:

#[js_import_mapping = "chrono-helper.js"]
#[wasm_import_module = "chrono"]
extern {
    fn time() -> f64; // Made up signature
}

@Pauan

Pauan commented Mar 21, 2018

@alexcrichton Absolutely, I'm not trying to stop the prototyping work. But since ES6 modules and npm are stabilized, we can discuss theoretical use-cases, to make sure that we will be able to (eventually) support them.


@CryZe That would require that any Rust crate that wants to use JS files would have to publish those JS files as an npm module. That's true even if the Rust crate only wants to include a tiny snippet of JS code.

Also how would the naming go? Would the Rust crate be published to npm as chrono, and the JS snippets published to npm as chrono-js? That seems really weird (and unusual for npm).

And then there's things like wasm-bindgen and stdweb which need to be able to generate JS files. So we will need some way of including JS code in a Rust crate. Maybe it would be inline like the stdweb js! macro, or maybe it would be an external .js file (which is much cleaner), but either way the use case is definitely necessary.

And on Rust's side it could look like [...]

How is that better than specifying a relative path to the .js file? No matter what API we decide on, it will be compiled into wasm custom headers, but your API seems a lot more cumbersome than the relative path API.


Here is my proposal: when compiling a crate to wasm32-unknown-*, rustc will add in a custom header to the .wasm file. This header would contain the various metadata we need. This header contains the metadata for all the crates which were compiled (including direct and transitive dependencies).

As such, the header needs to be able to uniquely identify each crate. I assume rustc already has some way of uniquely disambiguating crates, so that can be used. Or perhaps it could use UUIDs. Or it could use absolute filepath mangling. Or it could use an incrementing counter. But I don't really care what method is used, as long as each crate can be uniquely identified.

When a crate uses #[wasm_import_module = "./path/to/foo.js"], rustc would resolve the relative path based upon the .rs file which contains the wasm_import_module. And then it would insert three things into the wasm custom header:

  1. The unique identifier for the crate.

  2. The absolute filepath to the crate.

  3. The relative path from the crate to the .js file which is referenced within wasm_import_module

In addition, when rustc is generating the wasm import for wasm_import_module, it would contain the unique identifier for the crate followed by the relative path from the crate to the .js file.

As an example, let's consider a Rust crate located at /path/to/crate:

// File `src/foo/bar.rs`
#[wasm_import_module = "../qux.js"]
extern {
    fn time() -> f64;
}

#[wasm_import_module = "./corge.js"]
extern {
    fn random() -> f64;
}

// File `src/qux.js`
export function time() {
    return Date.now();
}

// File `src/foo/corge.js`
export function random() {
    return Math.random();
}

And let's suppose that rustc gives the above crate the unique identifier some_unique_identifier.

The .wasm custom header would contain this information:

[
    {
        "id": "some_unique_identifier",
        "path": "/path/to/crate",
        "imports": ["src/qux.js", "src/foo/corge.js"]
    }
]

In this example there's only one crate, but keep in mind that it would contain the information for all the crates (including all dependencies).

For the sake of clarity I described the data format using JSON, but I don't really care about the format: it could be binary or anything else.

In addition, when generating the .wasm it would contain the following imports:

(import "some_unique_identifier/src/qux.js" "time" (func $time (result f64)))
(import "some_unique_identifier/src/foo/corge.js" "random" (func $random (result f64)))

Now that the .wasm file contains the above information, it is now possible for wasm-pack to correctly handle everything. It can parse the .wasm file, extract the information out of the custom header, copy the .js files into the current folder, and then fix the wasm imports to point to the correct filepaths.

Of course wasm-pack will need to handle filename conflicts, and also relative imports in the .js files, but that's an issue for wasm-pack to solve, not rustc. The important thing is that rustc gives wasm-pack enough information so that it's possible for wasm-pack to solve those problems.
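One way the import-fixing step could sidestep filename conflicts is to keep each crate's files under a folder named after its identifier. The sketch below only illustrates that idea; the function name `rewrite_import`, the `./<id>/<path>` output layout, and the list of known identifiers are all assumptions made for the example, not anything wasm-pack is known to do.

```rust
/// Rewrite a wasm import module name of the form `<crate id>/<path>` into a
/// relative path inside the output folder (hypothetical layout: `./<id>/<path>`).
/// Names that do not start with a known crate id, such as npm package names,
/// are returned unchanged.
fn rewrite_import(module: &str, known_ids: &[&str]) -> String {
    match module.split_once('/') {
        // The part before the first `/` is a crate id taken from the header.
        Some((id, rest)) if known_ids.contains(&id) => format!("./{}/{}", id, rest),
        // Anything else is left for the bundler to resolve.
        _ => module.to_string(),
    }
}
```

Because every crate gets its own folder, two crates that both ship a `src/qux.js` would not overwrite each other, and a bare name such as `chrono` passes through untouched.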

Everything I described above doesn't require rustc to do anything complicated: it doesn't need to parse .js files or anything like that.

rustc already knows the absolute filepath to the crate, and it already knows the relative path to the .rs file, so all it needs to do is resolve the wasm_import_module path relative to the .rs file. It's just simple filepath manipulation.
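As a rough illustration of that filepath manipulation, here is a minimal sketch using only the standard library. The function name `crate_relative_import` is made up for this example, and rejecting imports that escape the crate root is an assumption of the sketch rather than part of the proposal.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `import` (the string written in `wasm_import_module`) against the
/// directory of `rs_file`, where `rs_file` is given relative to the crate root.
/// The resolution is purely lexical and never touches the filesystem.
/// Returns `None` if the import would escape the crate root.
fn crate_relative_import(rs_file: &Path, import: &str) -> Option<PathBuf> {
    let dir = rs_file.parent()?;
    let mut resolved = PathBuf::new();
    for part in dir.join(import).components() {
        match part {
            Component::CurDir => {}
            Component::ParentDir => {
                // `pop` returns false when there is nothing left to remove,
                // which means the path climbed above the crate root.
                if !resolved.pop() {
                    return None;
                }
            }
            Component::Normal(name) => resolved.push(name),
            // Absolute paths and Windows prefixes are out of scope here.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```

Called with `src/foo/bar.rs` and `../qux.js` this yields `src/qux.js`, and with `./corge.js` it yields `src/foo/corge.js`, matching the `imports` list in the example header.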

In addition, the above custom header format is generic, so it can work with other languages like C++. In other words, it's not tied specifically to Rust, it simply contains some filepath information, nothing else. So it can become a de-facto wasm standard (similar to the work being done on embedding package.json information inside the wasm custom headers).

And because it simply contains filepaths, it doesn't need to be limited to .js: the same header format could be used to import .wasm files, .css files, etc.

So let's say somebody is compiling to wasm32-unknown-unknown and running it in a non-JS host. They would probably be importing .wasm files rather than .js files, but they can still use wasm_import_module for that, because the header format only cares about filepaths, nothing else.

The above proposal doesn't allow for .js files to import .rs files, but at least it allows .rs files to import .js files. I think importing .rs files inside .js will require a lot more design work (and probably a new header format, separate from the above header format).

Also, I'm not sure if it's even a good idea to allow for importing .rs files within .js, so I don't think it should block the above proposal.

@Pauan

Pauan commented Mar 21, 2018

Oh, also, in my above proposal I assumed that wasm_import_module would be importing relative paths (since that's very intuitive, and consistent with ES6 modules).

I still think that's the best idea, however there is an acceptable alternative:

#[wasm_import_module = "/src/qux.js"]

In this case the import is relative to the crate, not relative to the current .rs file.

The custom header format would be exactly the same, the imports within .wasm would be exactly the same, and wasm-pack would be exactly the same, so the only difference is the syntax of wasm_import_module.

However, I don't like this because it's inconsistent with filepaths in ES6 modules. I think it's really nice if wasm_import_module mirrors ES6 imports as much as possible (especially because we're already committed to importing npm modules).

But if other people are convinced that this approach is better, then I'll accept it. The exact syntax for wasm_import_module isn't that important, the important thing is the support for the custom header format (which is the same regardless of how wasm_import_module works).

@Pauan

Pauan commented Mar 21, 2018

Also, I mentioned the need to uniquely identify each crate. Rather than generating a unique identifier, it could instead simply use the index to uniquely identify each crate.

Looking at the /path/to/crate example again, it could generate this custom header:

[
    {
        "path": "/path/to/crate",
        "imports": ["src/qux.js", "src/foo/corge.js"]
    }
]

And generate these imports within the .wasm:

(import "0/src/qux.js" "time" (func $time (result f64)))
(import "0/src/foo/corge.js" "random" (func $random (result f64)))

Rather than using some_unique_identifier it is instead using the index within the JSON array (which in this case is 0)

Either approach works fine, but this index-based approach might be easier to implement.
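To make the index-based naming concrete, here is a small sketch that derives the import module names from the header entries. The `CrateEntry` struct and the `import_module_names` function are hypothetical names chosen for the example; the header could be stored in any format.

```rust
/// One entry of the custom header, mirroring the JSON shown above.
#[allow(dead_code)] // `path` is carried for the packaging tool, unused in this sketch
struct CrateEntry {
    path: String,
    imports: Vec<String>,
}

/// Build the wasm import module names as `<index>/<crate-relative path>`,
/// where `<index>` is the position of the crate within the header.
fn import_module_names(header: &[CrateEntry]) -> Vec<String> {
    header
        .iter()
        .enumerate()
        .flat_map(|(index, entry)| {
            entry
                .imports
                .iter()
                .map(move |import| format!("{}/{}", index, import))
        })
        .collect()
}
```

For the single-crate header above this produces `0/src/qux.js` and `0/src/foo/corge.js`; a second crate in the header would get the prefix `1/`.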

@icefoxen

icefoxen commented Mar 22, 2018

I'm new to this discussion but... in general this is a well-exercised but somewhat arcane problem called "linking". So I'm going to ramble a little bit and try to describe the problem from that perspective, mainly so I can try to think about it.

WASM modules describe all the information needed for linking: what names a module exports, what module names and symbol names a module imports. However, it says nothing at all about how that relates to Javascript, Rust, or anything else outside of wasm itself.

The first problem is one of uniqueness of names, and Rust traditionally solves this by mangling names. If the names need to be something that other programming languages can understand, there's extern (usually extern "C" which uses C's name and function call conventions). Mangled names exported from crates are guaranteed to be unique because they include crate names and you can't have two crates in the same project with the same name. If they're not unique, it's a compile-time problem and it doesn't compile, or if dynamic linking is involved the program doesn't run (in wasm all linking between wasm modules is dynamic). This is kindasorta-not-a-problem for wasm because names live in different modules, and so it's the programmer's job to just make sure modules have unique names. wasm modules don't actually have a built-in name, the host decides what to name them, and in general name conflicts are the programmer's problem, not the wasm host's. So, we have an easy way of dealing with this.

It seems like if you want to treat JS files like Rust you have to fit them into Rust's assumptions (modules, crates, etc) in a reasonable way, and conflating symbol names and module names with file names is not the way to do this. If you want to specify the location of a JS file relative to a Rust file you could easily import the JS file into the Rust code as a string, but I don't know what that accomplishes (I frankly know very little about the JS <-> Rust bridging process, just how it looks from Rust and wasm.) From the Rust side, it seems semi-obvious that if you want to be able to call JS code from Rust, it needs to be treated like a Rust module. This probably means just making it a Rust module and filling it with FFI code generated by wasm_bindgen or some other FFI-generator. There may be crates consisting purely of these FFI bindings, and that's fine; there are plenty of foo-sys crates that do exactly that but with C code. They usually don't include C code themselves but they do usually have code to link against the C libraries on various platforms... but not to download and build the C library themselves. Usually. If we're talking about including raw JS in Rust code, then that's basically what we're doing: building a library in a different language, and providing an FFI layer as a Rust module. So I dunno about incorporating JS files into Rust crates, but making Rust crates that link to JS files seems like a problem that just gets solved the same way it gets solved for any other language.

From the perspective of the wasm host, it's the wasm host that's in charge of (dynamically) linking things at runtime. wasm itself has no idea how any of this language/crate/module/function nonsense works; it just knows about names in wasm modules. The host provides modules full of names, and how it gets them is none of wasm's business. A crate is not really a unit of dynamic linking, and Rust has no language-level story for dynamic linking, so a .wasm file built from Rust source is going to either include all the code necessary for running the whole program, or just a subset of it, a list of symbols it imports and exports, and a big sign saying "no longer rustc's problem". If we take the cue from the C FFI, then the foo-js-sys FFI crate will simply fail to link at runtime if the wasm host can't provide the foo module somehow or another (presumably by loading foo.js). If we want to be able to guarantee at compile-time that foo.js exists and will work the way foo-js-sys expects it to, then we need to be building some loader infrastructure off in Web land and I have no idea how any of that works.
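The runtime link check described here can be modelled in a few lines. This is only a toy model under stated assumptions: a real wasm host also checks types and signatures, and the `Host` alias and `unresolved_imports` function are names invented for the sketch.

```rust
use std::collections::{HashMap, HashSet};

/// Modules the host can provide, keyed by module name, each with the set of
/// names it exports.
type Host = HashMap<String, HashSet<String>>;

/// Return every `(module, name)` import that the host cannot satisfy.
/// An empty result means linking would succeed in this simplified model.
fn unresolved_imports<'a>(
    imports: &[(&'a str, &'a str)],
    host: &Host,
) -> Vec<(&'a str, &'a str)> {
    imports
        .iter()
        .copied()
        .filter(|&(module, name)| {
            let provided = host
                .get(module)
                .map_or(false, |exports| exports.contains(name));
            !provided
        })
        .collect()
}
```

If the host only provides a `foo` module exporting `time`, then an import of `foo.random` or of anything from a missing `bar` module shows up in the result, which corresponds to the link failing at runtime.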

For calling Rust from JS, I don't know much about how JS modules or the wasm<->JS host interface operates, but a Rust program will export some (non-mangled, probably) names of functions or data and JS will ask the wasm host to provide them.

Binary distribution of a library is a different ball-game; rustc helps you not at all there, as far as I know, it's generally up to the programmer and the operating system to make sure things slot together the way they're supposed to. In this case, the web browser/wasm host is our operating system, wasm-pack is our linker it seems, and it is trying to put everything together so the wasm host doesn't need to figure out where to find multiple Rust libraries and Javascript files. I have no idea how it incorporates JS files into this, but if it's dealing with Rust crates and they're interacting with JS via FFI the same way they interact with everything else then I don't think there's any need for a header in your wasm file. Maybe it will output dependency info for JS functions as part of the linking process but that's a purely temporary thing.

I don't have any real answers, I just want to make sure everyone is 100% aware of what is going on: we're talking about building a format and tools for dynamic linking. WebAssembly is our .dll format, and it provides fairly complete but also quite minimal information. We need to figure out how to map the assumptions of Rust crates and Javascript modules onto the information it provides us. Rust provides assumptions that make linking easy: mangled names, no dynamic linking, no unlinked libraries hanging around, dedicated FFI crates. wasm works fine with these assumptions. So it feels like the real question is "how do we easily make nice FFI crates out of Javascript and bundle it all up to deliver to a web browser to make it as hard as possible for dynamic linking to fail?"

The other question is "how do we nicely expose functions from Rust wasm modules to JS?" and that's not something I know how to handle, but the Rust -> JS part is pretty obvious, so if we can figure out the JS -> Rust part, then making JS -> Rust -> JS -> Rust -> whatever work should be entirely possible.

Minor edits for clarity and correctness.

@Pauan

Pauan commented Mar 23, 2018

@icefoxen I appreciate you sharing your thoughts. Allow me to respond to some of the things you said:

If you want to specify the location of a JS file relative to a Rust file you could easily import the JS file into the Rust code as a string, but I don't know what that accomplishes (I frankly know very little about the JS <-> Rust bridging process, just how it looks from Rust and wasm.)

I think it's important for this discussion to understand exactly how Rust, JS, and wasm interact with each other (which includes the linking process).

This is our current plan (though it might change):

Let's suppose that you want to compile a Rust crate to wasm:

  • This Rust crate contains dependencies on other Rust crates (by using Cargo.toml, like usual).

  • Each Rust crate might have some dependencies on npm packages. These npm dependencies are specified in a package.json file which is within each crate.

    You can think of package.json as being similar to Cargo.toml, except it describes npm dependencies rather than Rust dependencies.

    It's important to note that each Rust crate has their own separate package.json, just like how each Rust crate has their own separate Cargo.toml

  • Each Rust crate might have some .js files that they use.

Within the .rs files, the Rust crates can use the following to import the JS files:

#[wasm_import_module = "./foo.js"]
extern {
    fn foo() -> f64;
}

Or they can use the following to import npm packages (which are specified in the package.json file within the crate):

#[wasm_import_module = "foo"]
extern {
    fn foo() -> f64;
}

The above import syntax is standard for JavaScript:

  • It imports individual .js files.

  • The import filepath is either an npm package, or a relative path from the current file to the .js file.

After compiling the Rust crate (using rustc or cargo build) the final output is a single .wasm file (which is put into the target folder, as usual).

This .wasm file contains the compiled Rust code for the current crate, and also the compiled Rust code for all the dependencies of the current crate (recursively).

In other words, all of the Rust static linking has already occurred, there is no dynamic linking between Rust crates.

For each usage of wasm_import_module the .wasm file contains a wasm import which will load the .js file at runtime (i.e. dynamic linking, as you said).

However, the .wasm file does not contain the .js files or npm dependencies, those are kept separate (they are not statically linked).

At this point rustc has successfully compiled the Rust crates into .wasm, and therefore its job is complete.

But we're not quite done yet. You now need to use the wasm-pack tool, which does three things:

  1. It copies the .wasm file from the target folder into the pkg folder.

  2. For each Rust crate it reads the package.json file and merges all of the npm dependencies together. It then generates a package.json file in the pkg folder which contains the merged dependencies.

  3. It copies the .js files from all of the Rust crates into the pkg folder.
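The dependency-merging step (step 2) could look roughly like the sketch below. It assumes each crate's npm dependencies have already been read into a name-to-version-range map, and it treats two different ranges for the same package as an error, which is a simplification chosen for the example (npm itself can install several versions side by side). The names `Deps` and `merge_dependencies` are hypothetical.

```rust
use std::collections::BTreeMap;

/// npm dependencies of one crate: package name -> version range.
type Deps = BTreeMap<String, String>;

/// Merge the dependencies of all crates into one map.
/// Returns an error message if two crates request different version ranges
/// for the same package (a simplifying assumption of this sketch).
fn merge_dependencies(per_crate: &[Deps]) -> Result<Deps, String> {
    let mut merged = Deps::new();
    for deps in per_crate {
        for (name, version) in deps {
            if let Some(existing) = merged.get(name) {
                if existing != version {
                    return Err(format!(
                        "conflicting versions for {}: {} vs {}",
                        name, existing, version
                    ));
                }
            } else {
                merged.insert(name.clone(), version.clone());
            }
        }
    }
    Ok(merged)
}
```

A real tool would more likely try to reconcile compatible semver ranges before giving up, but the shape of the merge stays the same.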

The end result is that the pkg folder contains three things:

  1. A package.json file which describes all of the npm dependencies for all of the Rust crates.

  2. A single .wasm file which contains all of the compiled Rust code for all of the Rust crates.

  3. Multiple .js files which were copied from all of the Rust crates.

At this point the .js files still haven't been linked into the .wasm file, they are still separate.

Now you must link the .wasm and .js files together. There are two ways to do that:

  1. You can simply run the .wasm file in the browser, since the browser will automatically link the imports. This only works in browsers that support both WebAssembly and ES6 modules.

  2. You can use Webpack, Rollup, or Parcel to do the linking between .wasm and .js

And now you can finally run the program, because everything has been successfully linked together.

There will be a tool which streamlines the above process and makes it easy (by calling cargo build + wasm-pack + webpack all in one step), but for the sake of understanding my proposal I needed to explain how it all works under the hood.

Most of what I just said has already been implemented. The part that hasn't been implemented (and what my proposal is trying to fix) is allowing Rust crates to contain .js files.

By using #[wasm_import_module = "./foo.js"] it's currently possible for Rust crates to import .js files.

However, the import paths in the compiled .wasm file are wrong, which causes the final linking step to fail.

My proposal fixes that by adding a tiny bit of filepath information to the .wasm file so that wasm-pack can then read that information and fix the import paths.

Note: my proposal has nothing at all to do with variable or function names. As you said that is a solved problem with ES6 modules.

My proposal also has nothing at all to do with FFI interop between JS <-> Rust, that's already handled by extern (and wasm-bindgen / stdweb)

My proposal is only about making sure that the import filepaths within .wasm are correct, so that the files can be linked together in the final step.


From the Rust side, it seems semi-obvious that if you want to be able to call JS code from Rust, it needs to be treated like a Rust module. This probably means just making it a Rust module and filling it with FFI code generated by wasm_bindgen or some other FFI-generator.

Yes of course Rust will need to use extern blocks (and probably wasm-bindgen) to describe which JS files to import (and which variables to import from those JS files).

But that's completely separate to my proposal. The problem my proposal is trying to fix is the following:

  1. A Rust crate (let's call it foo) includes a foo.js file, which is intended to be imported from within Rust.

  2. The foo crate uses #[wasm_import_module = "./foo.js"] or whatever to import the .js file. The exact mechanism for importing the file isn't important.

  3. Somebody else creates a Rust crate (let's call it bar) which uses the foo crate as a dependency.

  4. After compiling the bar crate the end result is that it has a single bar.wasm file in the target folder which contains the Rust code for both the foo and bar crates.

    However, the foo.js file doesn't exist in the target folder, and so things break.

    Or maybe the foo.js file does exist, but the ./foo.js import path inside of the .wasm file is wrong.

    My proposal fixes both of those problems.


They usually don't include C code themselves but they do usually have code to link against the C libraries on various platforms...

The situation with JS is very different from the situation with C. Being able to include .js files inside of a Rust crate is a critical use-case (since it will often be needed to create glue code which converts the JS API into an API that can be used within Rust).


but making Rust crates that link to JS files seems like a problem that just gets solved the same way it gets solved for any other language. [...] If we take the cue from the C ffi, then the foo-js-sys FFI crate will simply fail to link at runtime if the wasm host can't provide the foo module somehow or another (presumably by loading foo.js).

Unfortunately it's not that simple: because of the way that wasm + JS + npm work, things cannot just be handled in the same way they are handled with C.


If we want to be able to guarantee at compile-time that foo.js exists and will work the way foo-js-sys expects it to, then we need to be building some loader infrastructure off in Web land and I have no idea how any of that works.

That infrastructure already exists (Webpack, Rollup, Parcel, etc.). And the plan is for us to reuse that infrastructure as much as possible.

The purpose of my proposal is just to ensure that the filepaths are correct so that Webpack/Rollup/Parcel/etc. can do the final linking.


I have no idea how it incorporates JS files into this, but if it's dealing with Rust crates and they're interacting with JS via FFI the same way they interact with everything else then I don't think there's any need for a header in your wasm file.

If we can avoid the header information then that would be great! I'm definitely open to alternative proposals which fix the filepath problem.

However, the status quo is not sufficient to fix the filepath problem. So we need some solution (not necessarily my solution).


Maybe it will output dependency info for JS functions as part of the linking process but that's a purely temporary thing.

Yes, after wasm-pack has done its work it deletes the header from the .wasm file. Though in my proposal the header has nothing to do with functions, it only has to do with filepaths.

@Pauan

Pauan commented Mar 23, 2018

By the way, you might be wondering why it needs to go through those multiple steps (running cargo build which then runs rustc which adds header information to the .wasm, then running wasm-pack which fixes things based upon the header information, then running Webpack/Rollup/Parcel/etc.)

There are a few reasons:

  1. This minimizes the amount of work that rustc needs to do. In other words, rustc doesn't need to know (or care) about JS files or npm dependencies or anything like that.

    rustc can focus solely on compiling Rust to .wasm, with wasm-pack handling all the JS-specific stuff.

  2. The wasm-pack, wasm-bindgen, Webpack, Rollup, and Parcel tools are not tied to Rust, so they can be used for other languages which want to compile to JS (like C++).

    Being able to have a unified system for compiling any language to JS would be very nice.

  3. The JS ecosystem is very complex and it's constantly changing: many years of work have gone into JS linkers (like Webpack).

    We really do not want to have to reverse-engineer and duplicate that work in Rust.

    So the best solution is to reuse existing JS tools (like Webpack) as much as possible.

@sendilkumarn
Member

Do we have any issues opened for this on wasm-pack?

The JS ecosystem is very complex and it's constantly changing

I could not agree more on this.

Multiple .js files which were copied from all of the Rust crates.

How will the npm dependencies be handled here? I agree this will work great if you have JS dependencies (as separate files), but how do we relatively link the entire lib inside?

@Pauan

Pauan commented Apr 2, 2018

@sendilkumarn Thanks for taking a look!

Do we have any issues opened for this on wasm-pack?

I don't think so. There's nothing wasm-pack can do until after rustc (or wasm-bindgen) supports this.

How will the npm dependencies be handled here? I agree this will work great if you have JS dependencies (as separate files), but how do we relatively link the entire lib inside?

They'll be handled the same way wasm-pack handles them right now: it creates a package.json file and then you run npm install (or yarn install).

Bundlers (like Webpack, Rollup, and Parcel) understand the node_modules structure, so the final linking step will work just fine.

@sendilkumarn
Member

@Pauan Thanks for all those detailed explanations. I really enjoy reading them 👍

They'll be handled the same way wasm-pack handles them right now: it creates a package.json file and then you run npm install (or yarn install).

Even inside the .rs file? Say I want to use an npm library inside the .rs file. How will this be handled? How will the linking happen here?

P.S.: There are a lot of use cases. Mapping and framing them in a single mental model is really difficult (at least for me).

@Pauan

Pauan commented Apr 2, 2018

Even inside the .rs file? Say I want to use an npm library inside the .rs file. How will this be handled? How will the linking happen here?

You import it with normal npm import syntax, like this:

#[wasm_import_module = "foo"]
extern {
    fn foo() -> f64;
}

That will import the foo npm package (which must be specified in that crate's package.json file).

Then that will translate into this wasm import:

(import "foo" "foo" (func $foo (result f64)))

I haven't tested it, but I believe the above wasm import will Just Work(tm) with the JS bundlers (Webpack, etc.).

So in the case of npm modules, the only thing wasm-pack needs to do is generate the package.json, the bundlers handle everything else.

I suppose there might be some conflicts between my proposal and npm packages (e.g. if there is an npm package called 0, then is 0 an npm package, or is it a unique identifier for a Rust crate?).

That can be solved easily enough by using prefixes/namespaces:

(import "crate:0/src/qux.js" "qux" (func $qux (result f64)))
(import "npm:foo" "foo" (func $foo (result f64)))

So the crate: namespace means to import the file from the Rust crate (which has the unique id of 0), and npm: means to import the npm module called foo.

Just to be clear, these namespace prefixes would be created by rustc (or wasm-bindgen): the wasm_import_module syntax remains exactly the same. So the namespaces are just an internal implementation detail.

The rule for wasm_import_module is this: if the import starts with ./ or ../, it is treated as a relative import (and compiled with the crate: namespace); otherwise it is treated as an npm import (and compiled with the npm: namespace).
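
As a rough illustration of that rule, here is a minimal sketch in Rust. The names ImportKind, classify_import, and namespace_prefix are hypothetical and are not part of rustc, wasm-bindgen, or wasm-pack; the sketch only shows the prefix check, and it leaves out how a relative path would be resolved against the crate root.

```rust
/// Which namespace a `wasm_import_module` string would be compiled into.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, PartialEq)]
enum ImportKind {
    /// A `.js` file that ships inside a Rust crate.
    Crate,
    /// A package resolved through `node_modules`.
    Npm,
}

/// Relative imports (`./` or `../`) belong to the crate; everything else
/// is assumed to be an npm package name.
fn classify_import(module: &str) -> ImportKind {
    if module.starts_with("./") || module.starts_with("../") {
        ImportKind::Crate
    } else {
        ImportKind::Npm
    }
}

/// The internal prefix that would be attached to the wasm import string.
fn namespace_prefix(kind: &ImportKind) -> &'static str {
    match kind {
        ImportKind::Crate => "crate:",
        ImportKind::Npm => "npm:",
    }
}
```

With this check, an import written as "0" is classified as an npm package, so it cannot be mistaken for the crate with unique id 0.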

Now, when running wasm-pack, it would:

  • Copy the .js files from the crate: imports into the pkg folder (rewriting the wasm import path as appropriate).

  • Strip out the npm: namespace.

So the end result would be something like this:

(import "./0/src/qux.js" "qux" (func $qux (result f64)))
(import "foo" "foo" (func $foo (result f64)))

Notice that it copied the .js file into the pkg/0/src folder, and it changed the import path to be relative. And it stripped out npm: as well.

Now everything is set up correctly so that Webpack (etc.) can correctly do the final linking.

This handles the conflict correctly, because crate:0 and npm:0 would be treated as different by wasm-pack.
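
The rewriting step could be sketched like this. It is only an illustration of the proposal, not how wasm-pack is actually implemented: the Rewritten struct and the rewrite_import function are hypothetical, and the pkg/ destination is taken from the example above.

```rust
/// Result of rewriting one namespaced import. (Hypothetical type.)
#[derive(Debug, PartialEq)]
struct Rewritten {
    /// Module string to write back into the `.wasm` import.
    module: String,
    /// Destination under `pkg/` for the copied `.js` file, if there is one.
    copy_to: Option<String>,
}

/// Turns `crate:<id>/<path>` into a relative import plus a copy target,
/// and strips the `npm:` prefix from package imports.
/// Returns `None` for strings that carry neither namespace.
fn rewrite_import(module: &str) -> Option<Rewritten> {
    if let Some(rest) = module.strip_prefix("crate:") {
        Some(Rewritten {
            module: format!("./{}", rest),
            copy_to: Some(format!("pkg/{}", rest)),
        })
    } else if let Some(rest) = module.strip_prefix("npm:") {
        Some(Rewritten {
            module: rest.to_string(),
            copy_to: None,
        })
    } else {
        None
    }
}
```

Under these assumptions, crate:0 becomes ./0 while npm:0 becomes 0, so the two stay distinct after rewriting.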

@alexcrichton
Contributor

I think this is handled by RFCs like rustwasm/rfcs#8 and rustwasm/rfcs#6, so closing.
