Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extending the Name section for struct fields #193

Closed
jakobkummerow opened this issue Feb 8, 2021 · 7 comments · Fixed by #415
Closed

Extending the Name section for struct fields #193

jakobkummerow opened this issue Feb 8, 2021 · 7 comments · Fixed by #415

Comments

@jakobkummerow
Copy link
Contributor

For an improved debugging experience, it would be desirable to have debuggers show names for struct fields, so that developers can map objects fields they see in a Wasm debugger back to the class/etc member fields they defined in their source language. Extending the custom name section for this purpose seems straightforward: based on the extended-name-section proposal, I assume we will add:

id = 10, fieldnamesubsec ::= namesubsection_10 (indirectnamemap)

i.e. same structure as the "local names" subsection, just with different ID and meaning. Two-level index: type_index + field_index.

@RossTate
Copy link
Contributor

RossTate commented Feb 8, 2021

I understand the motivation, but how are you envisioning this working? Remember that this MVP is specifically designed so that types (and their fields) are identified with their structures. This means that Integer and all enum references in Java have the same compile-time and run-time type in the MVP, and all Java reference arrays have the same compile-time and run-time types. So you could easily have the same field of the same type being given multiple names within the same module, let alone across modules.

@jakobkummerow
Copy link
Contributor Author

It's up to toolchains to decide what to put in there. An approximate name is better than no name at all (or an auto-generated name like "field0").

Given our current prototype implementation (where automaton-minimization-and-hashing isn't implemented yet), a toolchain can simply emit distinct static types for distinct source types. Time will tell whether this will change... but even if it does: an approximate name, or partial naming information, is better than no names at all.

Also, it's an optional section, and all entries in it are optional too. If a toolchain doesn't want to bother with it (in general, or for a given struct/array type, or a given field in that type), it can just skip it. Code will still work, only the debugging experience will be degraded.

Also, the extended-name-section proposal already includes names for types (as a whole), and if we have that, it seems obvious that we also want names for the internal elements of these types.

This is 100% pragmatism. I'm working on allowing people (=toolchain authors, and toolchain-using application authors) to actually try this WasmGC thing that we're trying to build here. For that, having debugging information is immensely helpful. And before privately implementing some made-up behavior, I wanted to publicly post about it so the idea is documented.

@RossTate
Copy link
Contributor

RossTate commented Feb 8, 2021

Given our current prototype implementation (where automaton-minimization-and-hashing isn't implemented yet), a toolchain can simply emit distinct static types for distinct source types.

This was one of the suggestions I had in mind, but I wasn't sure if it would meet the pragmatic needs y'all had in hand as it would be much worse than even basic debuggers. Nonetheless, it might be the best option available. It's also worth noting that this relies on explicitly violating the expectation that generators will minimize the types used in modules.

@Horcrux7
Copy link

Horcrux7 commented Feb 8, 2021

In my Java compiler in the debug mode (which write the name section) every Java type has one related wasm type. There is a 1:1 mapping.

With the productive mode the types will be reduced to unique wasm types and no name section will be write. That I see no problem with writing field information.

kripken added a commit to WebAssembly/binaryen that referenced this issue Mar 1, 2021
Adds support for GC struct fields in the binary format, implementing
WebAssembly/gc#193

No extra tests needed, see the .fromBinary output which shows this working.

This also has a minor fix in the s-parser, we should not always add a name
to the map of index=>name - only if it exists. Without that fix, the binary
emitter would write out null strings.
@tlively
Copy link
Member

tlively commented May 18, 2021

@rossberg, this is where the struct field naming used in Binaryen was proposed. Can you elaborate on the issues you see with it?

@rossberg
Copy link
Member

rossberg commented May 19, 2021

@tlively, no, the name section is unrelated. What I was referring to was uses of duplicate symbolic field names I encountered in the J2CL example's .wat file, e.g., like this:

(type $s1 (struct (field $foo i32) (field $bar i32)))
(type $s2 (struct (field $bar i32) (field $baz i32)))

Here, $bar is bound in two different structs, with different values. The reference interpreter rejects duplicate bindings for the same symbolic field name.

Resolving such ambiguities can currently be made to work because field-related instructions annotate the struct type as well:

struct.get $s1 $bar

So by doing dependent lookup this becomes possible.

However, the plan of record is to get rid of the annotation $s1 here, in which case the $bar becomes syntactically ambiguous. It could not be resolved without type checking. So having dependent field name lookup is not gonna work moving forward.

To process the text files using dependent field names, I hacked the interpreter to rewrite field names $x used in the context of a type name $t to $t::$x. But that doesn't always work (e.g., when the type is unnamed).

@tlively
Copy link
Member

tlively commented May 19, 2021

Got it, I had forgotten or not known that the plan of record was to remove those type annotations. Assuming we don't want to have to track types while parsing the text format, I see that duplicate field names would be a problem. I'll work on getting this fixed in Binaryen.

tlively pushed a commit that referenced this issue Sep 8, 2023
This PR updates the spec to add additional subsections for type and field within the custom name section. struct.get and struct.set are also updated to indicate that their field indices are of type fieldidx, instead of u32. As a result, the metavariable representation was updated from i (integers) to y (indices).

Closes #193
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants