Serialization of the outermost nonterminal as an attribute should be a grammar error #47

Closed
cmsmcq opened this issue Feb 22, 2022 · 7 comments
Labels: bug, specification

Comments

Contributor

cmsmcq commented Feb 22, 2022

Consider the grammar

@S: 'a'.

The current spec says the sentence "a" should be serialized as <S>a</S> despite the @ marker, since otherwise we won't have XML output.

But in other cases which seem parallel (i.e. they involve a conflict between the rule that the output of an ixml processor is XML and the rule that the marks on nonterminals are to be obeyed), the spec -- or what I take to be consensus in the group -- says (sometimes not quite explicitly) that there is an error in the grammar:

  • serialization of a nonterminal with a non-XML name
  • serialization of a non-XML character
  • multiple attributes of the same name on a given element

All these cases should be treated on the same footing. They can all be grammar errors, or they can all be situations for which recovery is allowed or prescribed. Their common characteristic is that they involve a conflict among the principles:

  • ixml output is XML.
  • ixml output obeys the structural constraints implied by the grammar and the marks in the grammar (and will be schema-valid against an appropriately constructed schema).
  • ixml output contains every character of the input not marked in the grammar as hidden.

I think these principles are all good ones and should be preserved. Since any recovery from the error will involve violating at least one of them, I do not think error recovery should be prescribed or allowed as conformant behavior.

Concrete proposal: in the section "Serialization", delete

If the root node is marked as an attribute, that marking is ignored.

and insert

If the root node is marked as an attribute, a dynamic error should be signaled.

This assumes that at some point we are going to insert a definition of dynamic errors and say something about how they are signaled.
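To make the proposal concrete, here is a minimal Python sketch of a serializer that signals a dynamic error instead of ignoring the mark. Everything in it is an assumption made for illustration: the `Node` class, the `DynamicError` exception, and the function names are hypothetical and come neither from the spec nor from any ixml processor. Terminals are modelled as plain strings, and "the root node" is read as the outermost nonterminal that actually gets serialized once hidden nodes are flattened away.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape, quoteattr


class DynamicError(Exception):
    """Hypothetical error: the parse tree cannot be serialized as XML."""


@dataclass
class Node:
    name: str
    mark: str = "^"  # "^" element, "@" attribute, "-" hidden
    children: list = field(default_factory=list)  # Node or str (terminal text)


def text_of(node):
    """Concatenate all terminal text below a node (used as attribute value)."""
    return "".join(c if isinstance(c, str) else text_of(c) for c in node.children)


def outermost(node):
    """Flatten hidden nodes and return the items visible at this level."""
    if isinstance(node, str) or node.mark != "-":
        return [node]
    return [item for child in node.children for item in outermost(child)]


def serialize_element(node):
    attrs, content = [], []
    for child in node.children:
        for item in outermost(child):
            if isinstance(item, str):
                content.append(escape(item))
            elif item.mark == "@":
                attrs.append(f" {item.name}={quoteattr(text_of(item))}")
            else:
                content.append(serialize_element(item))
    return f"<{node.name}{''.join(attrs)}>{''.join(content)}</{node.name}>"


def serialize(root):
    top = outermost(root)
    # Assumption: anything other than a single outermost nonterminal
    # is also treated as an error, since it cannot be well-formed XML.
    if len(top) != 1 or isinstance(top[0], str):
        raise DynamicError("output must have exactly one root element")
    if top[0].mark == "@":
        raise DynamicError(
            f"outermost nonterminal {top[0].name!r} is marked as an attribute"
        )
    return serialize_element(top[0])
```

With this sketch, `serialize(Node("S", "^", ["a"]))` yields `<S>a</S>`, while the tree for `@S: 'a'.` raises `DynamicError` rather than silently dropping the `@`.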

Contributor

ndw commented Feb 22, 2022

👍

Member

spemberton commented Feb 22, 2022 via email

Member

spemberton commented Feb 22, 2022

I was wrong, we had not agreed to it, just raised it as an issue.
However, just referring to the root node is insufficient. Consider:

-root: a; b.
@ a: "a".
@ b: "b".

The proposed text

"The first rule in a grammar may not be marked as an attribute. If it is marked as hidden, all of its productions must produce exactly one non-hidden non-attribute nonterminal and no non-hidden terminals before or after that nonterminal."

attempts to cover such cases.
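As a rough illustration of how the proposed text could be checked statically, the Python sketch below applies it literally to the first rule of a grammar. The grammar representation (`rules` as an ordered dict of marks and alternatives) and the function name `check_first_rule` are hypothetical, invented for this example. The check looks only one level deep, so it does not follow hidden nonterminals into their own rules; the proposed wording does not say whether it should.

```python
def check_first_rule(rules):
    """Return an error message if the first rule violates the proposed
    restriction, or None if it passes.

    `rules` maps rule name -> (mark, alternatives), in grammar order.
    A mark is "^" (element), "@" (attribute) or "-" (hidden).
    Each alternative is a list of symbols:
      ("nt", ref_mark_or_None, name)  a nonterminal reference
      ("t", hidden, text)             a terminal; hidden is a bool
    """
    first = next(iter(rules))
    mark, alternatives = rules[first]
    if mark == "@":
        return f"first rule {first!r} is marked as an attribute"
    if mark != "-":
        return None
    for alt in alternatives:
        elements = 0
        for sym in alt:
            if sym[0] == "t":
                if not sym[1]:
                    return f"{first!r}: non-hidden terminal {sym[2]!r} at the top level"
            else:
                _, ref_mark, name = sym
                # A mark on the reference overrides the mark on the rule.
                if (ref_mark or rules[name][0]) == "^":
                    elements += 1
        if elements != 1:
            return f"{first!r}: an alternative yields {elements} element nonterminals, expected 1"
    return None


# The example grammar above: -root: a; b.  @a: "a".  @b: "b".
example = {
    "root": ("-", [[("nt", None, "a")], [("nt", None, "b")]]),
    "a": ("@", [[("t", False, "a")]]),
    "b": ("@", [[("t", False, "b")]]),
}
print(check_first_rule(example))
```

For the example grammar the function returns a message rather than `None`, because each alternative of the hidden first rule produces only an attribute and no element.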

@spemberton
Copy link
Member

spemberton commented Feb 22, 2022

It should be noted in passing that such a restriction doesn't reduce the power of a grammar, since

@ root: a, b, c.

can be rewritten

-dummy: ^root.
@ root: a, b, c.

Contributor Author

cmsmcq commented Feb 22, 2022

... However, just referring to the root node is insufficient. Consider:

-root: a; b.
@ a: "a".
@ b: "b".

I agree that referring to the start symbol does not suffice, but the proposed wording does not refer to the start symbol. In the wording change I made, I read the preceding sentence as having implicitly defined the term "the root node" as denoting the outermost nonterminal being serialized, not necessarily the start symbol of the grammar.

If there are other or better ways to express the change, that's fine.

Contributor Author

cmsmcq commented Feb 28, 2022

We discussed this on the call of 22 February 2022.

ACTION (20220222-03): Steven - Issue #47 - @root (Outermost effective nonterminal as attribute) - to be proposed in the spec as a dynamic error.

ACTION (20220222-04): Steven - simplify the spec to require the output of 'well-formed' XML. Does that have some subcodes in terms of the violation of well-formedness? (Issues #25, #18, #23, #31, #47.)

ndw added the bug label on Apr 3, 2022
ndw added this to the Version 1.0 milestone on Apr 3, 2022
Contributor

ndw commented Apr 5, 2022

Steven reports this is resolved.

ndw closed this as completed on Apr 5, 2022