You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Shortly after we agreed to adopt the insertions proposal using ^ for the mark (instead of + as originally proposed), I wrote:
[...] using “^” for insertions has another consequence: it leaves no mark that means “match and insert this”.
[Using the previous draft],
S: "(", "a", ")".
will match “(a)” producing, <S>(a)</S>. If I want, I can exlude some of those literals. Now:
S: -"(", "a", -")".
will match “(a)” producing, <S>a</S>. If I want, I can also be more explicit about matching those literals. This:
^S: ^"(", "a", ^")".
still matches “(a)” and produces, <S>(a)</S>.
But if we overload “^” for insertions, then the semantics of the previous rule change. It will only match “a” and it will produce <S>(a)</S>.
There’s no longer a mark that means “I want to be explicit that this literal should be matched and inserted.”
I’m not sure that’s a deal breaker, but it doesn’t sound easy to explain.
We can’t remove “^” on nonterminals, because we need it for this use case:
S: A, B.
-A: "a".
-B: "b"; ^A, "b".
It would be trivial to rewrite that grammar so that the “^” wasn’t needed, but that’s necessarily going to be the case for more complicated grammars.
There’s no corresponding use case for “^” on terminals-to-be-inserted so I don’t think we’ve lost any functionality, but I really don’t like the loss of symmetry.
I don’t think the observation that “^” looks a little bit like a proofreader’s “insertion mark” really helps the explanation very much. (If it was a proofer’s insertion mark, it would go below the insertion point not before it, so it’s kind of weak at best.)
I still think + or even the much maligned & would be better.
Michael agreed:
Any language in which the default behavior cannot be explicitly named has a gap and is suspect.
I liked "+" as a signal for insertion, since it's a conventional opposite of "-", and insertion is notionally opposite to suppression.
Michael went on to add (in a comment on the original insertions proposal issue):
Using "^" for insertion bothers me a great deal. It suggests that the case is parallel to that of "^" on nonterminals, which it is not, and it means that there is no way to make the default marking of a terminal explicit by supplying an explicit mark.
The mark "+" has neither of those objections and is a clear antonym to "-". I was not present for the call on which the change was agreed to, so I may have missed my chance to argue against the change from "+" to "^", but I think the change was a serious error.
The text of 12 April attempts to make the two cases parallel by saying that a nonterminal may be
inserted as an element with its children (^),
but I think this is a recipe for confusion. The difference between a nonterminal marked '-' and one marked '^' is not that one is omitted and one is inserted; it is that in one case, the string generated by the nonterminal is tagged as an element and in the other it is not tagged. The current wording seems to be unclear on the difference between tags and elements and needs to be cleaned up.
This actually has a practical consequence in my implementation because I make all marks explicit when I load a grammar. That avoids subsequently testing for "this mark or no mark". Leaving no explicit mark for "^" in the old sense of "match and insert this" means I have to "fake" an explicit mark character.
The text was updated successfully, but these errors were encountered:
Shortly after we agreed to adopt the insertions proposal using
^
for the mark (instead of+
as originally proposed), I wrote:Michael agreed:
Michael went on to add (in a comment on the original insertions proposal issue):
This actually has a practical consequence in my implementation because I make all marks explicit when I load a grammar. That avoids subsequently testing for "this mark or no mark". Leaving no explicit mark for "^" in the old sense of "match and insert this" means I have to "fake" an explicit mark character.
The text was updated successfully, but these errors were encountered: