Updating import/export tables in the binary format #522

titzer · 2016-01-21T13:02:50Z

Pending #520, the binary format should be updated to match the design.

The difference between the current format and AstSemantics.md is as follows:

In the binary format, there is a single function declaration table with a single index space for functions, including imported functions. An entry in this table can be either: named or unnamed, either: an imported function or a defined function, either: exported or not exported.

Consequences of this:

call_function can index both imported and defined functions.
imported functions can appear in the indirect function table.
imported functions can be immediately re-exported

In the AstSemantics.md, there are imported functions and defined functions. The two index spaces are separate and there are separate call_function and call_import operators.

Consequences of this:

call_function and call_import index separate tables.
imports cannot appear in the indirect function table.
imported functions cannot be immediately re-exported

So this issue proposes updating the binary format to match the AstSemantics, introducing an import table and an export table. If there aren't any strong objections I will start working on implementing it and wait to describe it when we make more progress on #520.

The text was updated successfully, but these errors were encountered:

ghost · 2016-01-21T13:19:35Z

Perhaps there is consensus, or a pending decision, to move to the v8 encoding behaviour. It seems a feature to be able to include imports in the function table. Does anyone object to changing keeping the v8 behaviour in the experimental encoding? If you want to revisit this then can you live with it as-is in the v8 encoding and change later?

lukewagner · 2016-01-21T14:43:27Z

@JSStats With @titzer filing this issue, I think there is now consensus (this was discussed in #421) on it. It's pretty easy to make a function wrapper if you want to put an import in the func-ptr table, but as explained in #421, imports are not the familiar dllimports, they're more like syscalls and so I don't think this will be common. E.g., talking with @sunfishcode, the most natural way to expose call_import in clang might be to have some __builtin_call_import("func", "module", arg1, ...) which you use like a syscall and incidentally can't take the address of.

An entry in this table can be either: named or unnamed

@titzer I was implementing this recently and thinking that we could keep the same logical functionality of optionally naming functions but instead have a separate "function names" section that goes at the end of the module (that was simply a sequence of utf8 cstrings). The reason is that if the engine is doing streaming+parallel compilation, you want to get the code up front and pulling out the strings improves density. On the subject, I was also thinking about an optional "local names" that contains an array of arrays of utf8 cstrings naming each local of each function since this will greatly improve readability of the generated text format. Combined, this could be our MVP "symbol section".

ghost · 2016-01-21T21:47:54Z

@lukewagner A trampoline is less efficient than a direct call. 'familar dllimports' is not a technical reason. The 'clang' language's choices are not technical matters, rather it's own limitations. The environment could add new imported function table entries too and pass indexes to them into the code. Could the technical rationale be written up before people start work on changes?

The internal function label names would need to be at the end of the file, not just the end of the sections, as it seems that the bulk of the data will be OOL after the sections. Having these names in one after the other in order of their reference would help as if reduces redundancy in the encoding, as opposed to a separate file offset for each as the current encoding has for the import/export names. Perhaps you should also clarify that the import/example names might be needed early by the runtime, just not the meta information. So a runtime would have read all the sections and the import/export named before starting compilation.

@titzer Every time a new table is added there is redundancy in the ordering of entries in the table. Adding flags to function table entries for exports avoids this ordering redundancy - they use the same order as the functions. For example the names could be in the order of their reference in the functions section, reading one name each time that a function is flagged as an import or export. Less redundancy will make it easier to compare the output between producers for checking them against each other (sexp-asm and binaryen emit the names in different orders), and also helps a lot trying to avoid loss in a textual format (otherwise the ordering in these tables also needs to be encoded which might be meaningless!).

ghost · 2016-02-04T21:25:45Z

WebAssembly/wabt#33

axic · 2016-07-21T20:57:26Z

@lukewagner

E.g., talking with @sunfishcode, the most natural way to expose call_import in clang might be to have some __builtin_call_import("func", "module", arg1, ...) which you use like a syscall and incidentally can't take the address of.

Since clang seems to create imports for methods explicitly marked external this doesn't seem to be entirely needed:

extern unsigned long long test();

becomes

(import $test "env" "test" (result i64))

One drawback though it has the environment hardcoded as env. I think adding an attribute might be an alternative solution: __attribute(environment("myModule"))__

lukewagner · 2016-07-21T21:14:40Z

Good observation and totally agreed: one of the important consequences, and major motivations, for #682 is that there is now just one "kind" of import and it can be used to achieve both C/C++ shared-state dynamic linking as well as separate/encapsulated ES6-module-style imports: it's just a matter of what (if any) state (Memory/Table) is exported/imported. Given this, unresolved external imports is a fine way to tell a C-to-wasm compiler to emit a wasm import. (Personally, though, I'd like a way to tell the compiler the difference between an external symbol I expect to be resolved by static linking and which external symbols are intended to be wasm imports—I've suggested using ELF default visibility (__attribute__ ((visibility ("default"))) in the past—but this is more of a subjective/EIBTI question.)

Given the title, I think this issue is subsumed by #727, so I'll close it, but feel free to reopen if I've missed something.

lukewagner mentioned this issue Jan 21, 2016

binary encoding: remove the function name-present flag and require a name for imports and exports. #521

Closed

This was referenced Jan 23, 2016

Tables can include imports and their index space includes imports. WebAssembly/wabt#22

Closed

Allow imports in functions tables. #517

Closed

sunfishcode added this to the MVP milestone Feb 1, 2016

jfbastien added the binary format label Apr 5, 2016

lukewagner closed this as completed Jul 21, 2016

axic mentioned this issue Jul 27, 2016

Investigate C/C++ ewasm/design#16

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updating import/export tables in the binary format #522

Updating import/export tables in the binary format #522

titzer commented Jan 21, 2016

ghost commented Jan 21, 2016

lukewagner commented Jan 21, 2016

ghost commented Jan 21, 2016

ghost commented Feb 4, 2016

axic commented Jul 21, 2016

lukewagner commented Jul 21, 2016

Updating import/export tables in the binary format #522

Updating import/export tables in the binary format #522

Comments

titzer commented Jan 21, 2016

ghost commented Jan 21, 2016

lukewagner commented Jan 21, 2016

ghost commented Jan 21, 2016

ghost commented Feb 4, 2016

axic commented Jul 21, 2016

lukewagner commented Jul 21, 2016