Description
There is an ambiguity in
https://github.com/unicode-org/message-format-wg/blob/main/spec/syntax.md
and
https://github.com/unicode-org/message-format-wg/blob/main/spec/message.abnf
Inside an expression, the annotation may be followed by whitespace:
expression = literal-expression
           / variable-expression
           / annotation-expression
literal-expression = "{" [s] literal [s annotation] *(s attribute) [s] "}"
variable-expression = "{" [s] variable [s annotation] *(s attribute) [s] "}"
annotation-expression = "{" [s] annotation *(s attribute) [s] "}"
An annotation of type private-use-annotation or reserved-annotation ends with a reserved-body:
reserved-annotation = reserved-annotation-start reserved-body
private-use-annotation = private-start reserved-body
A reserved-body may end with U+3000, because U+3000 matches both s and reserved-char:
reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted))
reserved-char = content-char / "."
Examples (using \u escapes for legibility): The input strings
{ % foo bar \u3000\u3000 }
and
{ & foo bar \u3000\u3000 @x }
contain a reserved-body as part of the annotation in the expression.
How does the text after the annotation's start character get parsed? For the first example, there are 3 possibilities:
1. ' foo bar \u3000\u3000' parsed as reserved-body
   ' ' parsed as [s]
2. ' foo bar \u3000' parsed as reserved-body
   '\u3000 ' parsed as [s]
3. ' foo bar' parsed as reserved-body
   ' \u3000\u3000 ' parsed as [s]
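The three splits can be reproduced mechanically. The following is only a minimal sketch under simplifying assumptions, not a parser for message.abnf: it ignores reserved-escape and quoted, treats every character other than SP, HTAB, CR and LF as a reserved-char (so U+3000 counts as both s and reserved-char, as described above), and the names S_CHARS, is_reserved_char and candidate_splits are hypothetical.

```python
# Characters assumed to match the `s` rule (whitespace), including U+3000.
S_CHARS = {" ", "\t", "\r", "\n", "\u3000"}


def is_reserved_char(ch: str) -> bool:
    # Simplifying assumption: anything except ASCII whitespace may end a
    # reserved-body. U+3000 passes this test, which is the source of the
    # ambiguity.
    return ch not in {" ", "\t", "\r", "\n"}


def candidate_splits(rest: str) -> list[tuple[str, str]]:
    """Enumerate (reserved-body, trailing [s]) splits of the text that
    follows the annotation's start character, longest body first."""
    splits = []
    for i in range(len(rest), -1, -1):
        body, trailing = rest[:i], rest[i:]
        trailing_is_space = all(ch in S_CHARS for ch in trailing)
        # A non-empty reserved-body must end with a reserved-char,
        # not with optional whitespace.
        body_ends_ok = body == "" or is_reserved_char(body[-1])
        if trailing_is_space and body_ends_ok:
            splits.append((body, trailing))
    return splits


# Text between "%" and "}" in the first example.
for body, trailing in candidate_splits(" foo bar \u3000\u3000 "):
    print(ascii(body), "+", ascii(trailing))
```

Under these assumptions the loop prints exactly the three body/whitespace pairs listed above, which is why a rule such as "longest match" or "U+3000 is never a reserved-char" would be needed to pick one.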
It appears that the contents of the reserved-body are meant to appear as the body field of an UnsupportedStatement element in the data model (cf. https://github.com/unicode-org/message-format-wg/blob/main/spec/data-model/README.md ). Therefore it matters which of these 3 possibilities the parser chooses.
Please specify how this ambiguity should be resolved.