Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating import/export tables in the binary format #522

Closed
titzer opened this issue Jan 21, 2016 · 6 comments
Closed

Updating import/export tables in the binary format #522

titzer opened this issue Jan 21, 2016 · 6 comments
Milestone

Comments

@titzer
Copy link

titzer commented Jan 21, 2016

Pending #520, the binary format should be updated to match the design.

The difference between the current format and AstSemantics.md is as follows:

In the binary format, there is a single function declaration table with a single index space for functions, including imported functions. An entry in this table can be either: named or unnamed, either: an imported function or a defined function, either: exported or not exported.

Consequences of this:

  • call_function can index both imported and defined functions.
  • imported functions can appear in the indirect function table.
  • imported functions can be immediately re-exported

In the AstSemantics.md, there are imported functions and defined functions. The two index spaces are separate and there are separate call_function and call_import operators.

Consequences of this:

  • call_function and call_import index separate tables.
  • imports cannot appear in the indirect function table.
  • imported functions cannot be immediately re-exported

So this issue proposes updating the binary format to match the AstSemantics, introducing an import table and an export table. If there aren't any strong objections I will start working on implementing it and wait to describe it when we make more progress on #520.

@ghost
Copy link

ghost commented Jan 21, 2016

Perhaps there is consensus, or a pending decision, to move to the v8 encoding behaviour. It seems a feature to be able to include imports in the function table. Does anyone object to changing keeping the v8 behaviour in the experimental encoding? If you want to revisit this then can you live with it as-is in the v8 encoding and change later?

@lukewagner
Copy link
Member

@JSStats With @titzer filing this issue, I think there is now consensus (this was discussed in #421) on it. It's pretty easy to make a function wrapper if you want to put an import in the func-ptr table, but as explained in #421, imports are not the familiar dllimports, they're more like syscalls and so I don't think this will be common. E.g., talking with @sunfishcode, the most natural way to expose call_import in clang might be to have some __builtin_call_import("func", "module", arg1, ...) which you use like a syscall and incidentally can't take the address of.

An entry in this table can be either: named or unnamed

@titzer I was implementing this recently and thinking that we could keep the same logical functionality of optionally naming functions but instead have a separate "function names" section that goes at the end of the module (that was simply a sequence of utf8 cstrings). The reason is that if the engine is doing streaming+parallel compilation, you want to get the code up front and pulling out the strings improves density. On the subject, I was also thinking about an optional "local names" that contains an array of arrays of utf8 cstrings naming each local of each function since this will greatly improve readability of the generated text format. Combined, this could be our MVP "symbol section".

@ghost
Copy link

ghost commented Jan 21, 2016

@lukewagner A trampoline is less efficient than a direct call. 'familar dllimports' is not a technical reason. The 'clang' language's choices are not technical matters, rather it's own limitations. The environment could add new imported function table entries too and pass indexes to them into the code. Could the technical rationale be written up before people start work on changes?

The internal function label names would need to be at the end of the file, not just the end of the sections, as it seems that the bulk of the data will be OOL after the sections. Having these names in one after the other in order of their reference would help as if reduces redundancy in the encoding, as opposed to a separate file offset for each as the current encoding has for the import/export names. Perhaps you should also clarify that the import/example names might be needed early by the runtime, just not the meta information. So a runtime would have read all the sections and the import/export named before starting compilation.

@titzer Every time a new table is added there is redundancy in the ordering of entries in the table. Adding flags to function table entries for exports avoids this ordering redundancy - they use the same order as the functions. For example the names could be in the order of their reference in the functions section, reading one name each time that a function is flagged as an import or export. Less redundancy will make it easier to compare the output between producers for checking them against each other (sexp-asm and binaryen emit the names in different orders), and also helps a lot trying to avoid loss in a textual format (otherwise the ordering in these tables also needs to be encoded which might be meaningless!).

@ghost
Copy link

ghost commented Feb 4, 2016

WebAssembly/wabt#33

@axic
Copy link

axic commented Jul 21, 2016

@lukewagner

E.g., talking with @sunfishcode, the most natural way to expose call_import in clang might be to have some __builtin_call_import("func", "module", arg1, ...) which you use like a syscall and incidentally can't take the address of.

Since clang seems to create imports for methods explicitly marked external this doesn't seem to be entirely needed:

extern unsigned long long test();

becomes

(import $test "env" "test" (result i64))

One drawback though it has the environment hardcoded as env. I think adding an attribute might be an alternative solution: __attribute(environment("myModule"))__

@lukewagner
Copy link
Member

Good observation and totally agreed: one of the important consequences, and major motivations, for #682 is that there is now just one "kind" of import and it can be used to achieve both C/C++ shared-state dynamic linking as well as separate/encapsulated ES6-module-style imports: it's just a matter of what (if any) state (Memory/Table) is exported/imported. Given this, unresolved external imports is a fine way to tell a C-to-wasm compiler to emit a wasm import. (Personally, though, I'd like a way to tell the compiler the difference between an external symbol I expect to be resolved by static linking and which external symbols are intended to be wasm imports—I've suggested using ELF default visibility (__attribute__ ((visibility ("default"))) in the past—but this is more of a subjective/EIBTI question.)

Given the title, I think this issue is subsumed by #727, so I'll close it, but feel free to reopen if I've missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants