custom section that documents producing toolchain? #63

Closed
lukewagner opened this issue Sep 11, 2018 · 14 comments
Comments

@lukewagner
Member

In a few contexts now, it has seemed useful to have a standardized custom section that declares the toolchain that produced the .wasm module. For example, browsers could use such a section to measure the rate of usage of various toolchains and languages via telemetry to help understand usage trends on the web. WDYT?
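As a rough illustration of how a consumer might locate such a section, here is a minimal sketch that walks the sections of a wasm binary and returns the payload of a custom section with a given name. The function names and the section name `toolchain-info` used in the usage note are hypothetical; nothing in this thread fixes a name or payload layout, and the sketch does no bounds checking on malformed input.

```python
from typing import Optional, Tuple


def read_leb128_u32(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def find_custom_section(module: bytes, wanted: str) -> Optional[bytes]:
    """Return the payload of the first custom section named `wanted`, or None."""
    if module[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_leb128_u32(module, pos + 1)
        end = pos + size
        if section_id == 0:  # id 0 marks a custom section
            name_len, name_start = read_leb128_u32(module, pos)
            name = module[name_start:name_start + name_len].decode("utf-8")
            if name == wanted:
                # Everything after the name is the section's own payload.
                return module[name_start + name_len:end]
        pos = end
    return None
```

A telemetry consumer could call `find_custom_section(module_bytes, "toolchain-info")` and skip the module when the result is `None`.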

@lukewagner lukewagner changed the title custom section that declares toolchain? custom section that documents producing toolchain? Sep 11, 2018
@xtuc
Contributor

xtuc commented Sep 12, 2018

I have often had to ask where a wasm file came from; ideally that could be encoded in the binary.

@sbc100
Member

sbc100 commented Sep 13, 2018

LLVM produces a .comment section with this information, but we currently don't include that in the wasm object format. I tried adding it in the past, but it relies on the SHF_STRINGS + SHF_MERGE support in the linker. I can take another look.

One question is whether to include this as a custom section called ".comment" or to include it as a data segment. From a clarity point of view it's probably best as a custom section, but I think the linker implementation might be a little harder in this case.

@dschuff
Member

dschuff commented Sep 13, 2018

The general idea makes sense to me. As @sbc100 said, this is common in ELF. I think it would be much simpler (to make use of) as a custom section rather than putting it in the data section. I agree that would probably need some extra linker support; I think something like SHF_MERGE should be enough though. If I'm interpreting the Sun doc correctly, adding SHF_STRINGS is partly an optimization allowing it to be interned in the string table (which we don't have), and partly to allow the linker to locate divisions between multiple items. But for this use case (and probably some others) we could have a simple merge that would treat the whole section as a unit.
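To make the difference concrete, here is a loose sketch of the two merge strategies. It is one possible reading of the idea, not how any existing linker implements it, and both helper names are hypothetical. The first splits each input on NUL terminators and deduplicates individual strings (the SHF_STRINGS-style behavior); the second treats each input section as one opaque item and only drops exact duplicates.

```python
from typing import Iterable, List


def merge_as_strings(sections: Iterable[bytes]) -> bytes:
    """Split each input on NUL, drop duplicate strings, and re-join them."""
    seen: List[bytes] = []
    for section in sections:
        for item in section.split(b"\0"):
            if item and item not in seen:
                seen.append(item)
    return b"".join(item + b"\0" for item in seen)


def merge_as_units(sections: Iterable[bytes]) -> bytes:
    """Treat every input section as one opaque item; drop exact duplicates."""
    seen: List[bytes] = []
    for section in sections:
        if section not in seen:
            seen.append(section)
    return b"".join(seen)
```

The whole-unit version never needs to know where one item ends and the next begins, which is the simplification described above; the cost is that partially overlapping inputs are not deduplicated.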

@lukewagner
Member Author

lukewagner commented Sep 14, 2018

Great to see the interest! Next suggestion: rather than having a single string that will inevitably go the way of browser UA-strings, how about having a few fields that are intentionally coarse-grained and contain a fixed (but growable) set of values that we maintain in this repo? This would assist in implementing privacy-preserving telemetry that gave a good high-level picture of the toolchain usage. E.g.:

  • language (such as: wat, C++, Rust, C#, AssemblyScript, ...)
  • processed-by (a comma-delimited set containing all tools that were used in the pipeline to produce the final wasm, such as: llvm, binaryen, wabt, webpack, wasm-bindgen, ...)
  • sdk (the top-level tool the developer installed to produce the wasm: Emscripten, Blazor, Unity, Rust-wasm-pack, ...)

Edit: additionally, each of the above field values could be followed by a parenthesized version string, so that tools trying to build coarse-grained summaries can easily skip the versions.
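A minimal sketch of how a tool might read one such field value, assuming the textual form floated above: comma-delimited entries, each optionally followed by a parenthesized version. The function name and the assumption that versions contain no commas or parentheses are illustrative only; this is not a ratified encoding.

```python
import re
from typing import List, Optional, Tuple

# A tool name, optionally followed by a parenthesized version string.
ENTRY = re.compile(r"^([^()]+?)\s*(?:\(([^()]*)\))?$")


def parse_field(value: str) -> List[Tuple[str, Optional[str]]]:
    """Split a field value into (name, version) pairs; version may be None."""
    entries = []
    for raw in value.split(","):
        match = ENTRY.match(raw.strip())
        if match is None:
            raise ValueError(f"malformed entry: {raw!r}")
        entries.append((match.group(1), match.group(2)))
    return entries
```

A coarse-grained summary would keep only the names, for example `[name for name, _ in parse_field(value)]`, and ignore the versions entirely.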

@est31

est31 commented Sep 14, 2018

rather than having a single string that will inevitably go the way of browser UA-strings

Just wanted to point out that this is precisely what happened to the non-wasm analogue of such a string value, the producer value. Please see this commit that made the Rust compiler claim to be clang because of a check by gdb, as well as the corresponding issue here.

@xtuc
Contributor

xtuc commented Sep 14, 2018

I would also suggest adding a date and ensuring that the processed-by/sdk includes a version.

@ashleygwilliams

i'm def in favor of this with @xtuc's caveat of adding a version!

@MaxGraey

It would also be great to add a signature or checksum, as done in this project: https://github.com/frehberg/wasm-sign

@lukewagner
Member Author

@xtuc @ashleygwilliams Good point. Perhaps then we would standardize how the version was embedded (say, between parens right after the tool/sdk name) so that it could be mechanically skipped over.

@MaxGraey A signature/checksum seems mostly orthogonal to the use cases raised here and probably belongs in a separate custom section, which could be proposed in a separate issue.

@binji
Member

binji commented Oct 2, 2018

We discussed this in the Oct 2 CG meeting. There was general agreement that this is useful. There was a question about how "official" this custom section should be. We agreed that for now it should be documented in the tool-conventions repo, but not in the core spec.

@lukewagner
Member Author

Yep! Unless anyone else is itching to, I'll make a PR to add a BinaryEncoding.md-esque description of the binary format.

@MaxGraey

It would also be great to have a metadata field that records which post-MVP features were used during compilation.

@lukewagner
Member Author

That's an interesting idea, but I wonder if the use case is slightly different, deserving a different custom section. In particular, I can see the utility of a wasm module being able to declare which post-MVP features are allowed to be used in the wasm module, as a way to prevent intermediate optimization steps from mistakenly using something they shouldn't. Such a section would be stripped at the end of the compilation, though, since in theory the wasm engine can directly observe which post-MVP features are actually used in the module body for the purpose of telemetry etc.

@RReverser
Member

I believe this issue can be closed as we have producers documented and supported by various toolchains today?

@tlively tlively closed this as completed Oct 21, 2020