Add a parser for the metadata CLI #9908
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #9908 +/- ##
==========================================
- Coverage 58.31% 58.29% -0.02%
==========================================
Files 612 614 +2
Lines 75112 75214 +102
==========================================
+ Hits 43798 43843 +45
- Misses 30761 30821 +60
+ Partials 553 550 -3
I absolutely love those railroad track diagrams!
Our variant of JSON is more liberal than the official JSON spec out of convenience. You can now have vertical linefeeds and \x?? in your JSON. You can also have a trailing comma.
I am kind of ambivalent about being JSON-like rather than a strict subset or superset of JSON (this notation is not a subset, because some things it allows are invalid JSON; nor is it a superset, because e.g. numbers are invalid metadata notation).
Finally, you can probably write {"key":"value""key2":"value2"}.
I think that this should be forbidden.
I have fuzz tested the parser for about 12 hours x 32 cores, because it will panic when the parser selects a path that causes the lexer to not advance. These are mostly caused by @@* ("pick 0 or more of this big subobject") directives; that is why an entirely empty KVs is handled by checking that strings.TrimSpace() == "".
I think that with the correct grammar, the parser should be able to run without that sort of special case; it feels like there’s something overlooked there.
With all that in mind, although we allow a bunch of weird stuff, we don't reject or misparse anything valid, so I'm happy.
I think that allowing weirdness is a sign of a bad grammar. This can be un-weird.
Of course, this is client-side, not server-side, so the risks are a lot less.
One final note: I think that the railroad track diagrams should specify where whitespace is allowed or forbidden.
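To make the subset/superset point concrete, here is a small sketch that runs the inputs mentioned above through Go's encoding/json, which follows the official JSON grammar. It does not touch the PR's parser; strictJSONAccepts is a hypothetical helper name used only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// strictJSONAccepts is a hypothetical helper (not part of the PR). It reports
// whether Go's encoding/json, which implements the official JSON grammar,
// considers the input valid.
func strictJSONAccepts(input string) bool {
	return json.Valid([]byte(input))
}

func main() {
	inputs := []string{
		`{"key":"value","key2":"value2"}`, // ordinary JSON
		`{"key":"value""key2":"value2"}`,  // missing comma between members
		`{"key":"value",}`,                // trailing comma
		`{"key":"\x41"}`,                  // \x?? escape
	}
	for _, in := range inputs {
		fmt.Printf("%-34s strict JSON accepts: %v\n", in, strictJSONAccepts(in))
	}
}
```

Only the first input is valid under strict JSON; the other three are the liberties discussed in this review.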
Most of the open feedback right now is focused on the grammar structure, and that all seems very design-focused to me. The design included a lot of thought, and I think that unless we see major issues, we should move forward with its implementation. If we want to iterate on and clarify aspects of the design, we can add that to our backlog to both document and expand the syntax. With all that in mind, can we move forward with the review here and get this first version out to users?
LGTM
020831a to 08a4c6a
… this helps) also document
14aa0ec to 221c569
This replaces an inflexible parser that's 1 line of code with a more complicated one that can parse any valid metadata. I changed the "replace" metadata op to replace in the CLI. set is short, but I can never remember it.
The grammar looks like this:
add and edit take a KV. replace takes a KVs.
Double-quoted strings are the same as Go's, except that \U12345678 syntax isn't supported because the spec doesn't explain how it works and I don't know how it works.
Single-quoted strings are very simple; they can't have any escapes in them.
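As a rough illustration of the two string forms described above, the sketch below leans on strconv.Unquote for Go's double-quoted escape rules and rejects \U up front. This is not the PR's lexer: unquoteDouble and unquoteSingle are hypothetical names, and rejecting an embedded single quote is an assumption that follows from single-quoted strings having no escapes.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// unquoteDouble is a hypothetical helper: it decodes a double-quoted literal
// using Go's escape rules, but refuses \U escapes.
func unquoteDouble(lit string) (string, error) {
	if len(lit) < 2 || lit[0] != '"' || lit[len(lit)-1] != '"' {
		return "", errors.New("not a double-quoted string")
	}
	// Walk escapes pairwise so that `\\U` (an escaped backslash followed by a
	// plain U) is not mistaken for a \U escape.
	for i := 1; i < len(lit)-1; i++ {
		if lit[i] != '\\' {
			continue
		}
		if lit[i+1] == 'U' {
			return "", errors.New(`\U escapes are not supported`)
		}
		i++ // skip the character that the backslash escapes
	}
	return strconv.Unquote(lit)
}

// unquoteSingle is a hypothetical helper: the body is taken verbatim, with no
// escape processing. Assumption: a quote inside the body is an error, since
// there is no way to escape it.
func unquoteSingle(lit string) (string, error) {
	if len(lit) < 2 || lit[0] != '\'' || lit[len(lit)-1] != '\'' {
		return "", errors.New("not a single-quoted string")
	}
	body := lit[1 : len(lit)-1]
	if strings.Contains(body, "'") {
		return "", errors.New("single-quoted strings cannot contain a quote")
	}
	return body, nil
}

func main() {
	fmt.Println(unquoteDouble(`"tab:\t hex:\x41"`))
	fmt.Println(unquoteDouble(`"\U0001F600"`))
	fmt.Println(unquoteSingle(`'kept \t as-is'`))
}
```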
Our variant of JSON is more liberal than the official JSON spec out of convenience. You can now have vertical linefeeds and \x?? in your JSON. You can also have a trailing comma. Finally, you can probably write {"key":"value""key2":"value2"}.
I have fuzz tested the parser for about 12 hours x 32 cores, because it will panic when the parser selects a path that causes the lexer to not advance. These are mostly caused by @@* ("pick 0 or more of this big subobject") directives; that is why an entirely empty KVs is handled by checking that strings.TrimSpace() == "". Allowing empty KVs in the grammar is hard to get right without lookahead (which makes error messages bad); lookahead is especially bad for this grammar because how many tokens you need to look ahead depends on the exact content of double-quoted strings (because (escape | double quoted chars)* is the content of the string and different strings have different numbers of tokens). So we avoid lookahead entirely and deal with requiring some valid data to parse. The JSON parser is also not the ideal implementation, which is why quirks like omitting commas work. Frankly, JSON made its grammar harder to implement than it needed to be in order to make it more restrictive than it needed to be. With all that in mind, although we allow a bunch of weird stuff, we don't reject or misparse anything valid, so I'm happy.
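The empty-input special case can be sketched roughly as follows. Everything here is illustrative: the KV type, parseKVs, and parseNonEmptyKVs are hypothetical names, and the whitespace-separated key=value syntax is an assumption standing in for the real grammar. The only point is the shape of the guard: blank input is answered before the parser runs, so the inner parser can insist on at least one pair.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// KV is a hypothetical key/value pair type used only for this sketch.
type KV struct{ Key, Value string }

// parseNonEmptyKVs stands in for the grammar-based parser. Assumption: pairs
// are whitespace-separated key=value tokens. It requires at least one pair.
func parseNonEmptyKVs(input string) ([]KV, error) {
	var kvs []KV
	for _, field := range strings.Fields(input) {
		key, value, ok := strings.Cut(field, "=")
		if !ok {
			return nil, fmt.Errorf("expected key=value, got %q", field)
		}
		kvs = append(kvs, KV{Key: key, Value: value})
	}
	if len(kvs) == 0 {
		return nil, errors.New("expected at least one key/value pair")
	}
	return kvs, nil
}

// parseKVs handles the entirely empty case up front, so the inner parser
// never has to match zero pairs.
func parseKVs(input string) ([]KV, error) {
	if strings.TrimSpace(input) == "" {
		return nil, nil
	}
	return parseNonEmptyKVs(input)
}

func main() {
	fmt.Println(parseKVs("  \n\t"))
	fmt.Println(parseKVs("owner=alice env=prod"))
}
```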