Work out core npm integration story #5

Closed
aturon opened this Issue Jan 12, 2018 · 18 comments

@aturon
Contributor

aturon commented Jan 12, 2018

There are a lot of questions around the precise workflow and metadata expression for integrating Cargo and npm packages.

Design constraints

  • Consumers of Rust/wasm-based packages should be completely unaware that Rust
    is involved. In particular, using such a package should not require a local
    Rust toolchain.

    • This means that publication to npm is done in binary form: we upload a
      .wasm file containing the fully-compiled Rust code.
  • You should be able to work on the Rust portion of the library using standard
    Cargo workflows.

  • There should be a straightforward way to express npm metadata (i.e. the
    contents of package.json) for a Rust/wasm project.

    • That means, in particular, that a Rust project might pull in several crates,
      each of which pulls in its own npm package dependencies.
  • There should be an easy way to publish such a project to npm, handling all
    needed transitive dependencies.

  • Ultimately, JS bundlers (like WebPack and Parcel) will need to understand
    wasm-based npm packages and generate the appropriate module instantiation.
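As a rough illustration of the "several crates, each with its own npm dependencies" constraint, here is a minimal sketch of flattening per-crate npm requirements into one package.json `dependencies` map. The `npmDependencies` field, the crate and package names, the file names, and the "fail on conflicting ranges" policy are all assumptions made for the example; none of them are existing Cargo or npm features.

```javascript
// Hypothetical metadata: each crate in the graph lists the npm packages it needs.
// The crate names, package names, and version ranges are placeholders.
const crateGraph = [
  { crate: "my-lib", npmDependencies: { "pkg-a": "^1.0.0" } },
  { crate: "dep-one", npmDependencies: { "pkg-b": "^2.1.0" } },
  { crate: "dep-two", npmDependencies: { "pkg-b": "^2.1.0", "pkg-c": "^0.3.0" } },
];

// Merge every crate's npm requirements into a single flat map.
// Assumed policy: identical ranges are deduplicated, differing ranges are an error.
function mergeNpmDependencies(crates) {
  const seen = {};
  for (const { crate, npmDependencies } of crates) {
    for (const [name, range] of Object.entries(npmDependencies)) {
      if (name in seen && seen[name].range !== range) {
        throw new Error(
          `conflicting ranges for ${name}: ${seen[name].range} ` +
            `(from ${seen[name].crate}) vs ${range} (from ${crate})`
        );
      }
      if (!(name in seen)) seen[name] = { range, crate };
    }
  }
  // Emit the names in sorted order so the output is deterministic.
  const dependencies = {};
  for (const name of Object.keys(seen).sort()) {
    dependencies[name] = seen[name].range;
  }
  return dependencies;
}

// The package that would be published in binary form: glue JS plus the .wasm.
const packageJson = {
  name: "my-lib",
  version: "0.1.0",
  main: "my_lib.js",
  files: ["my_lib.js", "my_lib.wasm"],
  dependencies: mergeNpmDependencies(crateGraph),
};

console.log(JSON.stringify(packageJson, null, 2));
```

A real tool would presumably need a smarter policy than failing on any mismatch, for example intersecting semver ranges, but that is outside this sketch.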

@aturon aturon added the packaging label Jan 12, 2018

@aturon

Contributor

aturon commented Jan 12, 2018

@steveklabnik

Contributor

steveklabnik commented Jan 12, 2018

https://github.com/choojs/bankai is a real-life project that's doing this; we may want to look at what they're doing, and possibly coordinate with yosh.

@linclark

linclark commented Jan 13, 2018

This diagram reflects the conversation about the toolchain that happened at the Austin All Hands and should give us a good starting place for a more detailed discussion.

[Image: wasm-rust-toolchain diagram]

@aturon

Contributor

aturon commented Jan 13, 2018

Some concrete questions, pulled in part from @linclark's summary diagram:

  • How does a Rust crate express JS dependencies?
    • ... on .js files (if we allow this)
    • ... on npm packages
  • Can JS included in a Rust crate depend on the functions the Rust code exports?
    • How is that expressed?
  • How do we approach general "host bindings" -- functionality we expect the host to provide that we don't intend to get through npm?
    • One possibility is to require explicit .js files that define this functionality.
    • Ultimately this data needs to be fed in to the wasm module instantiation; ideally that would be "automatic"
  • If we include .js files in Rust crates, how are these managed in the resulting artifact?
    • As out-of-band .js files?
    • As part of a custom section in the .wasm?
    • How does this work at the .rlib level?
  • How does the bundler ultimately know how to instantiate the wasm module?
  • How do we consolidate the npm dependencies across a crate graph into a form that is digestible by npm itself?
  • Are Rust crates published "directly" to npm (using some tool), or is there a separate, explicit npm package (perhaps as part of the same repo)?

@aturon

Contributor

aturon commented Jan 17, 2018

At some point soon, we should probably shard this issue into finer-grained ones tracking the various pieces. However, we need some amount of consensus around what those pieces should be :-)

@aturon

Contributor

aturon commented Jan 17, 2018

One point I realized tonight: there may be some trickiness "flattening" a crate dependency graph that uses JS imports within multiple crates. The final .wasm blob has to have a single, flat import list. If we want to allow any of those imports to have meaningful names that are manually provided, we're going to need some kind of mangling scheme to avoid collisions with imports that we want to provide automatically.

It may be that the best strategy is to always provide all imports automatically (i.e., require you to at the very least write little .js files defining the functions you expect to be present), but even there some mangling will be needed.
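To make the collision problem concrete, here is a minimal sketch of one possible mangling scheme: every import is keyed as `<crate>::<name>` in the flat import object handed to instantiation. The `::` separator, the `env` import module name, and the shape of `crateImports` are assumptions for illustration only, not something decided in this thread. The separator is unambiguous only because Cargo crate names cannot contain `:`.

```javascript
// Hypothetical per-crate import definitions, e.g. gathered from the little
// .js files each crate ships. Both crates define an import called `log`.
const crateImports = {
  "crate-a": { log: (x) => `a:${x}` },
  "crate-b": { log: (x) => `b:${x}` },
};

// Build a collision-free flat name for `name` as imported by `crate`.
function mangle(crate, name) {
  return `${crate}::${name}`;
}

// Flatten all per-crate imports into the single import object that would be
// passed to WebAssembly instantiation. "env" is an assumed module name.
function flattenImports(perCrate) {
  const flat = {};
  for (const [crate, fns] of Object.entries(perCrate)) {
    for (const [name, fn] of Object.entries(fns)) {
      const key = mangle(crate, name);
      if (key in flat) {
        throw new Error(`duplicate import ${key}`);
      }
      flat[key] = fn;
    }
  }
  return { env: flat };
}

const importObject = flattenImports(crateImports);
console.log(Object.keys(importObject.env));
```

The compiled .wasm would have to reference the same mangled names in its import section, so the compiler side and the JS side would need to agree on the scheme.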

@alexcrichton

Contributor

alexcrichton commented Jan 17, 2018

One thing I was also thinking about yesterday was that I'm not sure how useful this will be without a bindgen-like tool. I'd expect most JS libraries on npm to use more than numbers/floats, so actually calling functionality from Rust will involve holding on to some sort of JS object and calling methods or accessing fields. In that sense, @aturon, requiring little .js files to do this translation initially seems like a good idea, but in the long run I'm not sure what that means...

@fitzgen

Member

fitzgen commented Jan 17, 2018

It is not clear to me that npm or whatever bundler should be instantiating modules automatically (or it should at least be configurable).

I am exploring designs where instead of forcing JS programmers to learn manual memory management, I create a wasm module instance encapsulated within its JS interface object and let the GC manage the lifetime of the module instance, since the only GC edge is from the JS interface object.

Essentially the .wasm would have a single global piece of data that all of its exported functions implicitly operate on, and if we want multiple instances of that data type, we create multiple wasm module instances.

Now JS programmers don't need to learn manual memory management (which I can tell you from my experience writing heap profiling tools for JS, most of them will have a hard time with).

@lukewagner

Contributor

lukewagner commented Jan 18, 2018

@aturon Yes, sharding makes sense, probably along the lines of @linclark 's diagram above.

@fitzgen

It is not clear to me that npm or whatever bundler should be instantiating modules automatically (or it should at least be configurable).

If we take the perspective that the bundler (not npm, but webpack or parcel or rollup or browserify) is just generating JS API calls (to instantiate() et al) as a polyfill for the eventual wasm ES module integration, then the wasm instantiation is directly implied by the semantics of ES modules; it's what the browser itself will do when it implements wasm ES module integration and, at that point, no explicit JS API calls will be needed.
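For readers unfamiliar with the JS API being referred to, here is a minimal sketch of the kind of glue a bundler might generate for something like `import { add } from "./add.wasm"`. The helper name `loadWasmModule` and the hand-assembled `add` module are assumptions for illustration; a real bundler would fetch the .wasm file rather than inline its bytes.

```javascript
// A tiny hand-assembled wasm module exporting `add(i32, i32) -> i32`,
// standing in for a compiled Rust crate.
const addWasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body
]);

// Hypothetical bundler-generated glue: instantiate asynchronously and hand
// back the exports so they can be used like an ES module namespace.
async function loadWasmModule(bytes, imports = {}) {
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance.exports;
}

loadWasmModule(addWasmBytes).then((ns) => {
  console.log(ns.add(2, 3)); // logs 5
});
```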

I am exploring designs where instead of forcing JS programmers to learn manual memory management, I create a wasm module instance encapsulated within its JS interface object and let the GC manage the lifetime of the module instance, since the only GC edge is from the JS interface object.

Unless I'm missing your meaning, that's what we have with the above proposal; with the wasm instance being kept alive like any other ES module. The underlying assumption here is each instance is creating its own Memory and so, with wasm-bindgen, most clients of a wrapped wasm module shouldn't be doing any manual memory management.

@alexcrichton

Contributor

alexcrichton commented Jan 19, 2018

@lukewagner

One thing I've actually wondered about in the past: right now wasm compilation is generally async (or at least I think you want it to be), but does that play well with the ES module integration bundlers already have? At least when working in glimmer so far, I think all import directives are intended to be "synchronous", in the sense that there's no room for an import that would otherwise resolve to a wasm-instantiated module to get filled in at some later point.

Although I think the ES module specification allows for async modules? Do you think bundlers (if they don't actually have async modules right now, I could be wrong) would basically just wait for all wasm modules to be instantiated before running any code?
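The "wait for everything, then run" strategy asked about here could look roughly like the sketch below. The `instantiateAll` and `boot` helpers are hypothetical names, not an existing bundler API, and an empty 8-byte wasm module stands in for real compiled code.

```javascript
// The smallest valid wasm module (magic number + version, no sections),
// used here as a placeholder for real modules.
const emptyWasm = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Instantiate every module concurrently and resolve to a map of
// module name -> exports object.
async function instantiateAll(modules) {
  const entries = await Promise.all(
    Object.entries(modules).map(async ([name, bytes]) => {
      const { instance } = await WebAssembly.instantiate(bytes, {});
      return [name, instance.exports];
    })
  );
  return Object.fromEntries(entries);
}

// Run the application's entry point only after all wasm modules are ready,
// so the application code itself can treat them as synchronously available.
async function boot(modules, main) {
  const wasm = await instantiateAll(modules);
  return main(wasm);
}

boot({ parser: emptyWasm, codec: emptyWasm }, (wasm) => {
  console.log("ready:", Object.keys(wasm).sort());
});
```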

Unless I'm missing your meaning, that's what we have with the above proposal; with the wasm instance being kept alive like any other ES module.

Oh I interpreted @fitzgen's comment as something like instead of code like this

class A {
    constructor() {
        this._privateField = myWasmModule.create_new_foo();
    }

    // call to deallocate memory in wasm, although unfortunately it's not 
    // always clear when to call this
    free() {
        myWasmModule.free_foo(this._privateField);
    }
}

you'd instead do something like

class A {
    constructor() {
        this._wasmModule = new WebAssembly.Instance(..);
        // exported functions live on the instance's `exports` object
        this._privateField = this._wasmModule.exports.create_new_foo();
    }

    // no need for `free`!
}

Although I could also be wrong too!

This aspect of memory management I don't think is directly related to npm integration, though, other than "writing idiomatic npm code will be hard" because some form of memory management will probably need to show up.

@linclark

linclark commented Jan 19, 2018

@alexcrichton Some bundlers are making it possible for wasm to be loaded async. For example, Parcel did this recently. I hope to dig into how the different bundlers are doing this so that we can decide on a best practice to recommend.

@alexcrichton

Contributor

alexcrichton commented Jan 19, 2018

Oh nice! Sounds like that'll be a non-issue then!

@fitzgen

Member

fitzgen commented Jan 19, 2018

Oh I interpreted @fitzgen's comment as something like instead of code like this

Yes, exactly. Thanks for clearing this up, and sorry that I didn't explain it very well :-p

@fitzgen

Member

fitzgen commented Jan 19, 2018

Although, you wouldn't even need this._privateField anymore, since it can be implicit:

class SourceMapConsumer {
  constructor(rawSourceMap) {
    this._wasm = new WebAssembly.Instance(...);
    // note: no need to save a pointer to the wasm's `Mappings` instance, it
    // just maintains an implicit global in the wasm instance's heap.
    this._wasm.exports.parse(rawSourceMap);
  }

  query(line, column) {
    // Again, the Rust structure inside the wasm heap is implicit, not passing
    // an explicit pointer argument.
    return this._wasm.exports.query(line, column);
  }

  // no need for manual deallocation or a `free` method. Since `this._wasm`'s
  // heap is not shared with any other `SourceMapConsumer`, we can just let
  // the JS GC reclaim the whole module instance when this
  // `SourceMapConsumer` is collected.
}

@lukewagner

Contributor

lukewagner commented Jan 19, 2018

@alexcrichton @linclark Agreed, I think we should go straight for async (and streaming :). In fact, to discourage devs from using the sync Module ctor, Chrome actually throws if you pass it a too-big ArrayBuffer (and last I heard the limit was tiny, like 4 KB), so sync is basically not an option.

@alexcrichton @fitzgen Ah hah, thanks for clearing that up! That's a very interesting idea and I think that could indeed be a useful design pattern for a type of library where the object was heavyweight and one expected, say, <10 of them in a web app. One reason for this is that Memorys are pretty heavyweight (with sizes that are a multiple of 64 KiB and, on multiple browsers, "fast memories" on 64-bit make a 4-8 GB virtual address space reservation per Memory). Then again, if there is only 1 (because it's a global singleton), that's basically equivalent to an ESM, so might as well use that :)
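The one-instance-per-object pattern under discussion can be tried out with a toy module. The sketch below is an illustration under assumptions, not code from the thread: the `Counter` class and the hand-assembled module (a mutable i32 global plus an exported `inc` function) are invented for the example, and the module uses a global rather than a Memory, so it does not show the per-Memory cost mentioned above. It uses the synchronous constructors only because the module is a few dozen bytes.

```javascript
// Hand-assembled wasm module: one mutable i32 global (initially 0) and an
// exported `inc() -> i32` that increments the global and returns its new value.
const counterWasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type: () -> i32
  0x03, 0x02, 0x01, 0x00, // one function of type 0
  0x06, 0x06, 0x01, 0x7f, 0x01, 0x41, 0x00, 0x0b, // global: mut i32 = 0
  0x07, 0x07, 0x01, 0x03, 0x69, 0x6e, 0x63, 0x00, 0x00, // export "inc"
  0x0a, 0x0d, 0x01, 0x0b, 0x00, // code section, one body, no locals
  0x23, 0x00, 0x41, 0x01, 0x6a, // global.get 0; i32.const 1; i32.add
  0x24, 0x00, 0x23, 0x00, 0x0b, // global.set 0; global.get 0; end
]);

// Compile once; each JS object then gets its own instance of the module.
const counterModule = new WebAssembly.Module(counterWasmBytes);

class Counter {
  constructor() {
    // The instance's state is private to this object. When the Counter is
    // garbage collected, the instance becomes unreachable too, so there is
    // no `free` method.
    this._wasm = new WebAssembly.Instance(counterModule, {});
  }

  increment() {
    // No pointer argument: the state is the instance's implicit global.
    return this._wasm.exports.inc();
  }
}

const first = new Counter();
const second = new Counter();
first.increment();
first.increment();
console.log(first.increment(), second.increment());
```

Because the two `Counter` objects hold separate instances, incrementing one does not affect the other.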

@fitzgen

Member

fitzgen commented Jan 19, 2018

@alexcrichton @fitzgen Ah hah, thanks for clearing that up! That's a very interesting idea and I think that could indeed be a useful design pattern for a type of library where the object was heavyweight and one expected, say, <10 of them in a web app. One reason for this is that Memorys are pretty heavyweight (with sizes that are a multiple of 64 KiB and, on multiple browsers, "fast memories" on 64-bit make a 4-8 GB virtual address space reservation per Memory). Then again, if there is only 1 (because it's a global singleton), that's basically equivalent to an ESM, so might as well use that :)

Yeah, there can be multiple, but most of the time a handful or fewer.

@ashleygwilliams

Member

ashleygwilliams commented Jan 30, 2018

hey, i filed #34 to talk more specifically about packaging up wasm, i.e. what we need to make a valid package.json

@aturon

Contributor

aturon commented Jan 30, 2018

@aturon aturon closed this Jan 30, 2018

blixt added a commit to blixt/rust-wasm that referenced this issue Mar 29, 2018

Update README.md
Fixes a few broken links.

Note that `[npm interop]` tracked back to rustwasm#5 which has been split up and closed. Since I saw no umbrella task for the new issues, I removed the paragraph in favor of the more specific links.

@blixt blixt referenced this issue Mar 29, 2018

Merged

Update README.md #99
