From bbf688023eb5de1f9fdedb81ff4f2f48e3a6a23f Mon Sep 17 00:00:00 2001
From: Bruno Haible
Date: Fri, 15 Mar 2024 01:46:29 +0100
Subject: [PATCH] Disallow whitespace as the first character of a reserved-body in a reserved-statement.

In the 'reserved-statement' nonterminal, there is an ambiguity if there is
more than one whitespace character between the 'reserved-keyword' and the
first non-whitespace character of the 'reserved-body', because these
whitespace characters can be seen as part of the 's' nonterminal or as part
of the 'reserved-body' nonterminal.

According to the principles explained in #725 and the proposed resolution
of #721, a 'reserved-body' should not start with a whitespace character;
rather, such a whitespace character is meant to be interpreted as part of
the preceding 's' nonterminal.

Test case:
```
.regex /foo/{xyz}{{hello}}
```

This patch removes the ambiguity by disallowing whitespace as the first
character of a 'reserved-body' in a reserved-statement. It thus fixes the
first part of #721.

Details:
  - Other occurrences of 'reserved-body' (after a 'reserved-annotation' or
    'private-use-annotation') are not affected.
  - A new nonterminal 'reserved-body-part' is introduced, referenced twice.
  - A new nonterminal 'reserved-body-in-statement' is introduced, referenced
    once. Its purpose is to clarify that the two parts belong together.
  - A new nonterminal 'reserved-body-in-statement-start' is introduced, in
    order to follow the common *-start / *-part idiom.
---
 spec/message.abnf | 7 +++++--
 spec/syntax.md    | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/spec/message.abnf b/spec/message.abnf
index 8436fb9c99..c437c5e0cb 100644
--- a/spec/message.abnf
+++ b/spec/message.abnf
@@ -54,11 +54,13 @@ local = %s".local"
 match = %s".match"
 
 ; Reserve additional .keywords for use by future versions of this specification.
-reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression)
+reserved-statement = reserved-keyword [s reserved-body-in-statement] 1*([s] expression)
 ; Note that the following production is a simplification,
 ; as this rule MUST NOT be considered to match existing keywords
 ; (`.input`, `.local`, and `.match`).
 reserved-keyword = "." name
+reserved-body-in-statement = reserved-body-in-statement-start *([s] 1*reserved-body-part)
+reserved-body-in-statement-start = reserved-body-part
 
 ; Reserve additional sigils for use by future versions of this specification.
 reserved-annotation = reserved-annotation-start reserved-body
@@ -67,7 +69,8 @@ reserved-annotation-start = "!" / "%" / "*" / "+" / "<" / ">" / "?" / "~"
 ; Reserve sigils for private-use by implementations.
 private-use-annotation = private-start reserved-body
 private-start = "^" / "&"
-reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted))
+reserved-body = *([s] 1*reserved-body-part)
+reserved-body-part = reserved-char / reserved-escape / quoted
 
 ; Names and identifiers
 ; identifier matches https://www.w3.org/TR/REC-xml-names/#NT-QName
diff --git a/spec/syntax.md b/spec/syntax.md
index 3b4384c8e8..0cfd5b1452 100644
--- a/spec/syntax.md
+++ b/spec/syntax.md
@@ -222,8 +222,10 @@ a similarly wide range of content as _reserved annotations_,
 but it MUST end with one or more _expressions_.
 
 ```abnf
-reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression)
+reserved-statement = reserved-keyword [s reserved-body-in-statement] 1*([s] expression)
 reserved-keyword = "." name
+reserved-body-in-statement = reserved-body-in-statement-start *([s] 1*reserved-body-part)
+reserved-body-in-statement-start = reserved-body-part
 ```
 
 > [!Note]
@@ -656,7 +658,8 @@ unrecognized _reserved-annotations_ or _private-use-annotations_ have no meaning
 reserved-annotation = reserved-annotation-start reserved-body
 reserved-annotation-start = "!" / "%" / "*" / "+" / "<" / ">" / "?" / "~"
 
-reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted))
+reserved-body = *([s] 1*reserved-body-part)
+reserved-body-part = reserved-char / reserved-escape / quoted
 ```
 
 ## Markup
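
Note (not part of the patch): the ambiguity described in the commit message can be illustrated with a small counting sketch. It is a simplified model, not a parser for the grammar. It assumes that `s` matches one or more whitespace characters (the definition of `s` is not shown in this patch), and the function name `count_whitespace_splits` is hypothetical. It counts how many ways a run of whitespace after the 'reserved-keyword' can be divided between the outer `s` and a leading `[s]` inside the body.

```python
def count_whitespace_splits(n_ws: int, body_may_start_with_ws: bool) -> int:
    """Count the ways to divide a run of n_ws whitespace characters between
    the outer `s` of `[s reserved-body]` and an optional leading `[s]`
    inside the body.

    Assumption: `s` matches one or more whitespace characters.
    """
    if n_ws < 1:
        return 0  # the outer `s` needs at least one character

    count = 0
    for outer in range(1, n_ws + 1):
        inner = n_ws - outer  # whitespace left over for the body's leading [s]
        # Leftover whitespace is only acceptable if the body may start with it.
        if inner == 0 or body_may_start_with_ws:
            count += 1
    return count


# Old rule: reserved-body = *([s] 1*...) may begin with whitespace.
old_parses = count_whitespace_splits(3, body_may_start_with_ws=True)

# New rule: reserved-body-in-statement must begin with a reserved-body-part.
new_parses = count_whitespace_splits(3, body_may_start_with_ws=False)

print(old_parses, new_parses)
```

Under this model, a run of n whitespace characters admits n splits with the old rule and exactly one with the new rule, which is the sense in which the patch removes the ambiguity.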