Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory export vs import #502

Closed
yamt opened this issue Jan 6, 2023 · 8 comments
Closed

memory export vs import #502

yamt opened this issue Jan 6, 2023 · 8 comments

Comments

@yamt
Copy link
Contributor

yamt commented Jan 6, 2023

currently a module exports its memory for wasi.
for wasi-threads, a module imports the memory too.
it's a bit redundant to both import and export a memory.
regardless of what to do for wasi-threads, for a longer term, it might makes more sense to make a wasi-wide (ie. not specific to wasi-threads) switch from export to import.

wasi-threads issue: WebAssembly/wasi-threads#22
an old discussion: #48 (comment)

@abrown
Copy link
Contributor

abrown commented Jan 12, 2023

@sbc100, does this correctly summarize the objections to this you raised in the WASI meeting?

  • you felt that exporting the memory made modules more self-contained (versus importing them)
  • you mentioned that it is less ergonomic to instantiate a memory-importing module in JS, since one now has to add in something like new WebAssembly.Memory()

@sbc100
Copy link
Member

sbc100 commented Jan 12, 2023

The ergonomics in JS are a little worse that just code size. Having an imported memory would make it hard to write a module loader JS given the current JS API, since the Module.imports object doesn't give you enough information to know how big to make the memory. IIUC it just gives you "kind": "memory": https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/Module/imports. There is reflection API proposal that can probably help but I don't know how far along that is.

Simply having the wasm instantiation process take care of constructing the memory IMHO keeps things simple and straight forward. In emscripten we can avoid including this entire file when the memory is exported rather than imported: https://github.com/emscripten-core/emscripten/blob/main/src/runtime_init_memory.js

Could we specify that multi-threaded modules import their memory (because they have to), while single threaded ones export theirs?

Another argument in favor of exporting the memory is that we can depend on the BSS region being zero without having to include explicit zero bits. When memory is imported we don't know its initial content so the linker can't elide the BSS segment: https://github.com/llvm/llvm-project/blob/4d4894ab92ee7b884a5e57b6cbc6772e6cd1fe88/lld/wasm/Writer.cpp#L577-L582 (May be mute point since bulk-memory also allows us to work around this).

Similarly, binaryen and other optimizers might be able to reason less about the contents of an imported memory (for example during the start function).

One day, we might even have first class memory slices, or some other way to communicate memory chunks to the embedder, in which case keeping the memory completely private and internal could yield even more optimization opportunities and provide even better encapsulation.

Finally, some WASI module might be pure compute, and might be able to completely avoid exporting their memory, even today. Exporting of the memory is only really needed if certain API calls are made. Maybe such modules are vanishingly small in number? but its an interesting possibility.

@yamt
Copy link
Contributor Author

yamt commented Jan 13, 2023

The ergonomics in JS are a little worse that just code size. Having an imported memory would make it hard to write a module loader JS given the current JS API, since the Module.imports object doesn't give you enough information to know how big to make the memory. IIUC it just gives you "kind": "memory": https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/Module/imports. There is reflection API proposal that can probably help but I don't know how far along that is.

do you mean that, with the current api, a loader have no way to know the size or the "shared" bit of the memory import?
does it mean it's impossible to implement "import memory" for wasi-threads?

Simply having the wasm instantiation process take care of constructing the memory IMHO keeps things simple and straight forward. In emscripten we can avoid including this entire file when the memory is exported rather than imported: https://github.com/emscripten-core/emscripten/blob/main/src/runtime_init_memory.js

Could we specify that multi-threaded modules import their memory (because they have to), while single threaded ones export theirs?

Another argument in favor of exporting the memory is that we can depend on the BSS region being zero without having to include explicit zero bits. When memory is imported we don't know its initial content so the linker can't elide the BSS segment: https://github.com/llvm/llvm-project/blob/4d4894ab92ee7b884a5e57b6cbc6772e6cd1fe88/lld/wasm/Writer.cpp#L577-L582 (May be mute point since bulk-memory also allows us to work around this).

can't we simply say "the memory is zero-filled by the host"?

Similarly, binaryen and other optimizers might be able to reason less about the contents of an imported memory (for example during the start function).

One day, we might even have first class memory slices, or some other way to communicate memory chunks to the embedder, in which case keeping the memory completely private and internal could yield even more optimization opportunities and provide even better encapsulation.

Finally, some WASI module might be pure compute, and might be able to completely avoid exporting their memory, even today. Exporting of the memory is only really needed if certain API calls are made. Maybe such modules are vanishingly small in number? but its an interesting possibility.

so, your suggestion is:

  • a module should import memory if wasi-threads is used
  • otherwise, a module should export memory if it uses pointer-based wasi api
  • otherwise, a module doesn't need to import or export memory

right?

@sbc100
Copy link
Member

sbc100 commented Jan 13, 2023

The ergonomics in JS are a little worse that just code size. Having an imported memory would make it hard to write a module loader JS given the current JS API, since the Module.imports object doesn't give you enough information to know how big to make the memory. IIUC it just gives you "kind": "memory": https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/Module/imports. There is reflection API proposal that can probably help but I don't know how far along that is.

do you mean that, with the current api, a loader have no way to know the size or the "shared" bit of the memory import? does it mean it's impossible to implement "import memory" for wasi-threads?

It would be hard to write a generic loader, but not impossible, you would just need to read the bytes of the import section using JS code.

However, the browers/engines that support multi-threading are very likely also modern and more likely to support the new JS reflection proposal. So requiring the new proposal just for threaded applications seems more reasonable that requiring it for all wasi modules.

Simply having the wasm instantiation process take care of constructing the memory IMHO keeps things simple and straight forward. In emscripten we can avoid including this entire file when the memory is exported rather than imported: https://github.com/emscripten-core/emscripten/blob/main/src/runtime_init_memory.js
Could we specify that multi-threaded modules import their memory (because they have to), while single threaded ones export theirs?
Another argument in favor of exporting the memory is that we can depend on the BSS region being zero without having to include explicit zero bits. When memory is imported we don't know its initial content so the linker can't elide the BSS segment: https://github.com/llvm/llvm-project/blob/4d4894ab92ee7b884a5e57b6cbc6772e6cd1fe88/lld/wasm/Writer.cpp#L577-L582 (May be mute point since bulk-memory also allows us to work around this).

can't we simply say "the memory is zero-filled by the host"?

We could do that yes, but the toolchain would have to be aware of that. We may need some kind --assume-imported-memory-is-zero-filled linker flag.

In general the toolchain can't assume that all imported memories are zero-filled. There are use cases where one might want to populate the memory before instantiation, or re-use a memory from one instance to the next.

Similarly, binaryen and other optimizers might be able to reason less about the contents of an imported memory (for example during the start function).
One day, we might even have first class memory slices, or some other way to communicate memory chunks to the embedder, in which case keeping the memory completely private and internal could yield even more optimization opportunities and provide even better encapsulation.
Finally, some WASI module might be pure compute, and might be able to completely avoid exporting their memory, even today. Exporting of the memory is only really needed if certain API calls are made. Maybe such modules are vanishingly small in number? but its an interesting possibility.

so, your suggestion is:

  • a module should import memory if wasi-threads is used
  • otherwise, a module should export memory if it uses pointer-based wasi api
  • otherwise, a module doesn't need to import or export memory

right?

Exactly. Although the third use case may well be hypothetical.

@yamt
Copy link
Contributor Author

yamt commented Jan 13, 2023

The ergonomics in JS are a little worse that just code size. Having an imported memory would make it hard to write a module loader JS given the current JS API, since the Module.imports object doesn't give you enough information to know how big to make the memory. IIUC it just gives you "kind": "memory": https://developer.mozilla.org/en-US/docs/WebAssembly/JavaScript_interface/Module/imports. There is reflection API proposal that can probably help but I don't know how far along that is.

do you mean that, with the current api, a loader have no way to know the size or the "shared" bit of the memory import? does it mean it's impossible to implement "import memory" for wasi-threads?

It would be hard to write a generic loader, but not impossible, you would just need to read the bytes of the import section using JS code.

However, the browers/engines that support multi-threading are very likely also modern and more likely to support the new JS reflection proposal. So requiring the new proposal just for threaded applications seems more reasonable that requiring it for all wasi modules.

ok. it makes sense. thank you for explanation.

Simply having the wasm instantiation process take care of constructing the memory IMHO keeps things simple and straight forward. In emscripten we can avoid including this entire file when the memory is exported rather than imported: https://github.com/emscripten-core/emscripten/blob/main/src/runtime_init_memory.js
Could we specify that multi-threaded modules import their memory (because they have to), while single threaded ones export theirs?
Another argument in favor of exporting the memory is that we can depend on the BSS region being zero without having to include explicit zero bits. When memory is imported we don't know its initial content so the linker can't elide the BSS segment: https://github.com/llvm/llvm-project/blob/4d4894ab92ee7b884a5e57b6cbc6772e6cd1fe88/lld/wasm/Writer.cpp#L577-L582 (May be mute point since bulk-memory also allows us to work around this).

can't we simply say "the memory is zero-filled by the host"?

We could do that yes, but the toolchain would have to be aware of that. We may need some kind --assume-imported-memory-is-zero-filled linker flag.

In general the toolchain can't assume that all imported memories are zero-filled. There are use cases where one might want to populate the memory before instantiation, or re-use a memory from one instance to the next.

i was taking specifically about imported memory for wasi-threads, not in general.
a host which is aware of wasi-threads needs to create a shared memory and provide it for a module which uses wasi-threads. an extra requirement to zero-fill it should not be a burden.

Similarly, binaryen and other optimizers might be able to reason less about the contents of an imported memory (for example during the start function).
One day, we might even have first class memory slices, or some other way to communicate memory chunks to the embedder, in which case keeping the memory completely private and internal could yield even more optimization opportunities and provide even better encapsulation.
Finally, some WASI module might be pure compute, and might be able to completely avoid exporting their memory, even today. Exporting of the memory is only really needed if certain API calls are made. Maybe such modules are vanishingly small in number? but its an interesting possibility.

so, your suggestion is:

  • a module should import memory if wasi-threads is used
  • otherwise, a module should export memory if it uses pointer-based wasi api
  • otherwise, a module doesn't need to import or export memory

right?

Exactly. Although the third use case may well be hypothetical.

ok. it sounds reasonable to me.

@yamt
Copy link
Contributor Author

yamt commented Feb 9, 2023

a possible concern about the "export only when necessary" approach is, things like wasi pollyfill needs to be more dynamic.

@jcbhmr
Copy link

jcbhmr commented Sep 12, 2023

hello! 👋 complete newbie chiming in here.

i like the idea of importing memory. i also like the idea of exporting memory particularly for uniformity with wasm modules that aren't wasi modules but still return strings (or whatever) and need to expose their memory.

as @yamt mentioned, having wasi import a previously created memory means that your wasi impl functions like clock_time_get() that use ptrs then can have an implicit known memory (that they made and exported in the underlying js or whatever) instead of doing underlyingWasi.setMemory(libUsingWasi.memory) after importing your lib_using_wasi.wasm module.

ex from nodejs codebase of wasi implementation using exported memory post-instantiation which IMO isn't great when using esm integration with wasm. https://github.com/nodejs/node/blob/v20.6.1/lib/wasi.js#L43

idea i had idk if its a good one
;; lib_using_wasi.wasm
(module
  (import "wasi_snapshot_preview100" "memory" (memory $memory 1)) ;; imported!
  (import "wasi_snapshot_preview100" "clock_time_get" (func $clock_time_get (...)))
  (func (export "current_date_as_string") ...) ;; returns ptr to c-string in memory
  (export "memory" (memory $memory)) ;; and the same thing is exported
)
// mywasi.js
import WASI from "actual-wasi-impl-lib/wasi_snapshot_preview100"
const wasi = new WASI({
  date: new Date('1970-07-04T00:00:00.000Z')
})
export const {
  memory,
  clock_time_get,
  // ...
} = wasi
<script type=importmap>
{
  "scopes": {
    "./lib_using_wasi.wasm": {
      "wasi_snapshot_preview100": "./mywasi.js"
    }
  }
}
</script>

<script type=module>
import { CString } from "wasm-type-magic-lib"
import { current_date_as_string, memory } from "./lib_using_wasi.wasm"
coonst s = ptr => CString(memory, ptr)
console.log(s(current_date_as_string()))
//=> 'July 4th, 1970'
</script>

idk just thought id chime in to say that importing memory makes esm-integration wasm wasi polyfills a bit more ergonomic for consumers.

@pchickey
Copy link
Contributor

WASI is on top of the component model now, so any decisions about representation of components using modules and the canonical ABI is now the responsibility of that proposal - this is no longer a relevant question to ask about WASI. The wasi-threads proposal is not compatible with the component model in its current form, and the champions are working on a new approach which will add thread spawning to core wasm. This is both required to fix performance issues inherent in the existing way it is specified, and to be compatible with the component model going forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants