-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Text format: type definition abbreviations #333
Comments
What is |
|
Actually, strtype stands for "structured type", i.e., a compound type like structs, funcs, arrays, as opposed to scalar types like ints and floats. |
There is also the issue of using symbolic references to struct fields, as discussed here: #193 (comment). I don't believe Binaryen's parser currently allows symbolic field references, but I agree they would be nice to have. Since we ended up keeping the type annotations on |
We could, but personally, I don't find it worth the added complexity. Currently, conflicting field symbols are simply an error, so you prefix them to disambiguate, like in assembly and the good old C days. : ) |
We should consider allowing un-prefixed names to be a requirement to make debugging in browser devtools more feasible. Binaryen supports this today (although it doesn’t allow using field names symbolically in the text), so losing this capability would regress the debugging experience. As we get closer to productionizing WasmGC, debugging has become an important area of user concern. |
I'm confused, where does it support them then? |
Field names in type definitions are parsed and end up in the names section, but the parser doesn’t support using those names as field indices in instructions. |
Now that we've settled on keeping an immediate for both the |
I filed #361 to continue the discussion about whether we should have dependent field name lookup. As promised, here are some examples of the abbreviations I suggested allowing in the opening post. Most are probably uncontroversial. ;; maximal type definition
(rec (type (sub (struct (field i32) (field i64)))))
;; abbreviate out singleton rec group
(type (sub (struct (field i32) (field i64))))
;; abbreviate out declaration of zero supertypes
(type (struct (field i32) (field i64)))
;; combine field declarations
(type (struct (field i32 i64)))
;; abbreviate out 'field' -- note: no longer allowed
(type (struct i32 i64)) |
These all sound good to me and are already supported by the interpreter. The only exception is the last one, which I'm a bit skeptical about. It has no real parallel anywhere else so far, and it might become a debt when we later want to add more elements to struct declarations (e.g. descriptors or static fields like we discussed). |
Fair enough. I would be fine dropping the last one. |
@tlively Could you post an updated grammar for abbreviations? Your example Also, your (3 and 4th) abbreviations for |
I've updated the grammar in the opening post to remove the abbreviation that allows having a list of field types without |
Okay, the updated grammar LGTM. Your previous comment with examples though still seems to not match the grammar, and I just want to make sure I understand correctly. My understanding of the proposed grammar is that the keyword nesting always goes: 'rec', 'type', and then 'sub'. With 'rec' and 'sub' being possible to omit, but 'type' always being present. |
Yes, your understanding is correct and my examples were wrong (again). I fixed the examples now. |
Another issue here is the 'final' keyword. My understanding from This is sort of weird, because to define a final type definition that has no super type you need to declare it like:
And that makes the This makes me wonder if the 'final' keyword should be inverted so the absence of |
@eqrion, unfortunately, we cannot change the meaning of the existing text format, plus we want to treat preexisting type declarations as final (in order to keep call_indirect unaffected). That pretty much necessitates the current syntax and interpretation, which also more or less matches what happens in the binary format. Edit: After rereading your comment I realised that I did not address your actual suggestion. Yes, we could invert the keyword, but that would add a lot of noise to the vastly more common case. And it would be the opposite of how it is presented in all other languages. So I'm not sure we really want to do that. If it helps, my suggested reading of |
Can't this be shortened to |
The odd thing to me is that wrapping a plain type definition I agree though that final=false will probably be the common case for 'sub', so inverting it would be verbose. I'm not sure that's an issue though, the text format is already very verbose. And the advantage is that the absence of 'nofinal' or 'extensible' would syntactically match the plain type definition syntax we have. |
The current text syntax has been causing some confusion over here at Mozilla. In particular, we had confusion over why a particular type was not extensible and why an empty
In my opinion the current syntax conflates two independent concepts: finality and base types. It's not clear that I would propose the following syntax for type definitions instead:
You can see I suggest the term "open" for "non-final" - this naturally follows from familiar phrases like "open to extension", and is the terminology used by Kotlin (which is also final-by-default). I also suggest that With this scheme, types would remain final by default. It's true that this would add noise to GC type hierarchy definitions, but I think it's better to be slightly more explicit than to have finality flip-flopping back and forth as you add Curious for your thoughts. |
I had never heard of "open" used for this before and I don't think it's immediately clear what that means without further explanation. I'd be open to it, though, and at least there is precedent. Having the |
Agreed with @tlively, changing the syntactic structure would create a mismatch with abstract and binary syntax, which the text format is supposed to reflect. Not too excited about requiring |
In that case, I think it would still be an improvement to use So to be clear, that would look like:
In this world, an empty If others agreed with this, I would be happy to try and update the spec, interpreter, tests, etc. myself, with the disclaimer that it would be my first time doing such things 🙂 |
Oh interesting, I was thinking of having A PR updating everything for this change would be very welcome. Also note that it should still be valid to have this:
|
@bvisness, are you still planning to take a stab at implementing this? If you need any help or cannot prioritize this, let me know and I can help out. |
Yes, I was planning to do this roughly next week or so. If it's holding anything up, though, I'd be happy to jump on it sooner. Another thing I just realized - while I would prefer to have |
No, it's not that urgent, next week would be fine. We just need to make sure we finish up well before September. This change should be fine regarding back compat. Anyone depending on the text format would be using it as an input to Binaryen, and I can handle coordinating with users and updating Binaryen separately from this spec update. |
Since we write a lot of text format tests in Binaryen, we've implemented a number of abbreviations to make type definitions shorter. Here's the grammar I've implemented the new Binaryen parser that combines the text format used in this repo with the abbreviations used in Binaryen.
Comments and suggested edits welcome. Once it looks good, we should document it in the MVP doc. For the actual spec, we'll have to extract the abbreviations from the grammar and turn them into rewrite rules.
The text was updated successfully, but these errors were encountered: