Drop the Literal quoted flag from the data model #443
aphillips left a comment:
Good start. Some wording tweaks suggested.
> Both _quoted_ and _unquoted_ values are represented by `Literal`,
> as the use or lack of quotation is a presentation detail
> which has no effect on the meaning of the _literal_.
> The `value` of `Literal` is the "cooked" value (i.e. escape sequences are processed).
I would avoid the jargon-ish "cooked" here and I would note the non-inclusion of the quotes (where they exist and as you did for reserved's sigils elsewhere)
Suggested change:
- The `value` of `Literal` is the "cooked" value (i.e. escape sequences are processed).
+ The `value` of `Literal` does not include surrounding quotes (where present)
+ and replaces `quoted-escape` sequences with the unescaped character.
I'd like for this to be considered in a separate issue or PR, for two reasons:
- The data model uses the "raw" and "cooked" terms also with respect to `Text` and `Reserved`, which ought to be updated simultaneously. I would rather keep that outside the scope of this rather focused PR.
- At the moment, the data model is described and explained via equivalences with the MF2 syntax. If there is a desire to describe it as an explicit result of parsing the syntax, as suggested by the term "replaces" here, that's a much bigger change that ought to be accompanied by some additional documentation.
If "raw" and "cooked" are terms, define them as terms (in a terminology section that needs to be added) and link them.
A different way of saying this would be to go more Unicode-like:
Suggested change:
- The `value` of `Literal` is the "cooked" value (i.e. escape sequences are processed).
+ The `value` of `Literal` is the code point sequence contained by the _literal_,
+ with external syntax (such as quotes) removed
+ and escape sequences resolved to the characters that they represent.
This would apply to any representation, not just MF2. For example, a JS string would replace `\u20ac` notation with `€` in the `Literal`.
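To make the proposed wording concrete, here is a minimal sketch of how a `Literal` value could be derived from MF2 source text. The helper name `literalValue` is hypothetical, and the sketch assumes that quoted literals are delimited by `|` and that the only escape sequences are `\\` and `\|`; the actual grammar may differ.

```typescript
// Hypothetical helper, not part of the MF2 spec or any library.
// Assumption: quoted literals look like |...| and the only escapes
// inside them are "\\" (backslash) and "\|" (pipe).
function literalValue(source: string): string {
  const quoted =
    source.length >= 2 &&
    source.charAt(0) === "|" &&
    source.charAt(source.length - 1) === "|";

  // Unquoted literals are taken as-is in this sketch.
  if (!quoted) return source;

  // Remove the external syntax (the quotes), then resolve each escape
  // sequence to the character it represents.
  return source.slice(1, -1).replace(/\\([\\|])/g, "$1");
}
```

With this, `|hello|` and `hello` both yield the value `hello`, which is the point of not tracking the quotes.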
> If there is a desire to describe it as an explicit result of parsing the syntax, as suggested by the term "replaces" here, that's a much bigger change that ought to be accompanied by some additional documentation.
I think we should stipulate that the data model representation can round-trip any MF2 string without loss of information. Doing so would canonicalize the representation, including syntax (non-literal) whitespace and the presence or absence of quotes around literals, so the resulting round-trip string might not be a character-by-character match to the original.
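A rough sketch of what such a canonicalizing round trip could look like for a single literal. Everything here is illustrative: `parseLiteral` and `serializeLiteral` are hypothetical names, the `|...|` quoting and the `\\` / `\|` escapes are assumptions about the syntax, and the rule for when a value may go unquoted is deliberately simplified.

```typescript
// Hypothetical sketch; names and the simplified grammar are assumptions.
interface Literal {
  type: "literal";
  value: string;
}

// Parse one literal token: strip |...| quotes and resolve \\ and \| escapes.
function parseLiteral(source: string): Literal {
  const quoted =
    source.length >= 2 &&
    source.charAt(0) === "|" &&
    source.charAt(source.length - 1) === "|";
  const value = quoted
    ? source.slice(1, -1).replace(/\\([\\|])/g, "$1")
    : source;
  return { type: "literal", value };
}

// Serialize canonically: quote only when the value needs it.
// Simplification: only ASCII letters, digits, "_" and "-" may go unquoted.
function serializeLiteral(literal: Literal): string {
  if (/^[A-Za-z0-9_-]+$/.test(literal.value)) return literal.value;
  // Re-escape backslashes and pipes before wrapping in quotes.
  return "|" + literal.value.replace(/[\\|]/g, "\\$&") + "|";
}
```

In this sketch, `|hello|` parses and serializes back as `hello`: the value survives, but the source text is not reproduced character by character.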
Co-authored-by: Addison Phillips <addisonI18N@gmail.com>
ryzokuken left a comment:
In the absence of this flag in the data model, do we still need to make this explicit distinction between "quoted" and "unquoted" values?
> `Literal` represents all literal values, both quoted and unquoted.
> The presence or absence of quotes is not preserved by the data model.
If this is implied by the absence of the flag, then perhaps it's more confusing to leave it in as opposed to just dropping these lines.
I think it's good to include, to clarify that the mapping of these potentially separately representable syntax rules into a single data model interface is wholly intentional.
At the moment, the data model includes a boolean `quoted` property on the `Literal` construct. This should be dropped, as it's not meant to affect anything during the runtime.

As the data model is extensible by implementations, this does allow for an implementation to add the field back in as a private extension, should it have a need to track this information.