diff --git a/spec/01-lexical-syntax.md b/spec/01-lexical-syntax.md
index 3af89a807b1..d778c209142 100644
--- a/spec/01-lexical-syntax.md
+++ b/spec/01-lexical-syntax.md
@@ -42,6 +42,7 @@ classes (Unicode general category given in parentheses):
 1. Operator characters. These consist of all printable ASCII characters
    (`\u0020` - `\u007E`) that are in none of the sets above, mathematical
    symbols (`Sm`) and other symbols (`So`).
+1. [Bidirectional explicit formatting](https://www.unicode.org/reports/tr9/#Bidirectional_Character_Types) characters. The nine characters `\u202a - \u202e` and `\u2066 - \u2069`, inclusive. These are forbidden from appearing in character and string literals; in Scala 2.12 they may not appear in source files at all.
 
 ## Identifiers
 
@@ -403,12 +404,12 @@ members of type `Boolean`.
 ### Character Literals
 
 ```ebnf
-characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq) ‘'’
 ```
 
 A character literal is a single character enclosed in quotes.
 The character can be any Unicode character except the single quote
-delimiter or `\u000A` (LF) or `\u000D` (CR);
+delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
 or any Unicode character represented by either a
 [Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
 
@@ -426,12 +427,12 @@ which can also be written using the escape sequence `'\n'`.
 
 ```ebnf
 stringLiteral ::= ‘"’ {stringElement} ‘"’
-stringElement ::= charNoDoubleQuoteOrNewline | UnicodeEscape | charEscapeSeq
+stringElement ::= charNoDoubleQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq
 ```
 
 A string literal is a sequence of characters in double quotes.
 The characters can be any Unicode character except the double quote
-delimiter or `\u000A` (LF) or `\u000D` (CR);
+delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
 or any Unicode character represented by either a
 [Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).
 
@@ -449,13 +450,13 @@ The value of a string literal is an instance of class `String`.
 
 ```ebnf
 stringLiteral ::= ‘"""’ multiLineChars ‘"""’
-multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
+multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
 ```
 
 A multi-line string literal is a sequence of characters enclosed in
 triple quotes `""" ... """`. The sequence of characters is arbitrary,
 except that it may contain three or more consecutive quote characters
-only at the very end. Characters
+only at the very end; the only forbidden Unicode characters are the bidirectional formatting ones. Characters
 must not necessarily be printable; newlines or other control
 characters are also permitted. Unicode escapes work as everywhere else, but none
 of the escape sequences [here](#escape-sequences) are interpreted.
diff --git a/spec/13-syntax-summary.md b/spec/13-syntax-summary.md
index 0e844bf2af2..0602549837d 100644
--- a/spec/13-syntax-summary.md
+++ b/spec/13-syntax-summary.md
@@ -31,6 +31,7 @@ opchar ::= // printableChar not matched by (whiteSpace | upper | lower
                      // letter | digit | paren | delim | opchar | Unicode_Sm | Unicode_So)
 printableChar    ::= // all characters in [\u0020, \u007F] inclusive
 charEscapeSeq    ::= ‘\’ (‘b’ | ‘t’ | ‘n’ | ‘f’ | ‘r’ | ‘"’ | ‘'’ | ‘\’)
+bidiFormatting   ::= // all characters in [\u202a, \u202e] and [\u2066, \u2069], inclusive
 
 op               ::= opchar {opchar}
 varid            ::= lower idrest
@@ -57,14 +58,14 @@ floatType ::= ‘F’ | ‘f’ | ‘D’ | ‘d’
 
 booleanLiteral   ::= ‘true’ | ‘false’
 
-characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq) ‘'’
 
 stringLiteral    ::= ‘"’ {stringElement} ‘"’
                    | ‘"""’ multiLineChars ‘"""’
-stringElement    ::= charNoDoubleQuoteOrNewline
+stringElement    ::= charNoDoubleQuoteOrNewlineOrBidiFormatting
                    | UnicodeEscape
                    | charEscapeSeq
-multiLineChars   ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
+multiLineChars   ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
 
 symbolLiteral    ::= ‘'’ plainid
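
Reviewer note: the character set named by `bidiFormatting` above can be sketched as a simple predicate. This is a minimal illustration in Scala, not the scanner change itself; `BidiCheck`, `isBidiFormatting`, and `firstBidiIndex` are hypothetical names that appear nowhere in this patch. The ranges are written as integer code points so that the sketch does not itself place a formatting character in a literal.

```scala
// Hypothetical helper for illustration only; not part of the patch
// and not an API of the Scala compiler.
object BidiCheck {
  // The nine explicit formatting characters named in the spec text:
  // U+202A..U+202E and U+2066..U+2069, both ranges inclusive.
  def isBidiFormatting(c: Char): Boolean =
    (c >= 0x202a && c <= 0x202e) || (c >= 0x2066 && c <= 0x2069)

  // Index of the first bidirectional formatting character in the body
  // of a literal, or None if the body contains none.
  def firstBidiIndex(body: String): Option[Int] = {
    val i = body.indexWhere(isBidiFormatting)
    if (i >= 0) Some(i) else None
  }
}
```

Under these assumptions, a checker would reject any character or string literal body for which `firstBidiIndex` returns `Some(_)`; a body of ordinary characters gives `None`.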