SI-5086 clean up EBNF

- sequences of block statements were wrong
  (btw, note that BlockStat matches the empty sequence of tokens)
- lexical syntax was pretty messy: clarified, removed extraneous backslashes
1 parent 32e0943, commit 5135bae5a7c2d156dd55dfc0aabf8b41d393f4a2; adriaanm committed Mar 12, 2014
Showing with 78 additions and 68 deletions.
  1. +12 −13 03-lexical-syntax.md
  2. +43 −42 08-expressions.md
  3. +23 −13 15-scala-syntax-summary.md
03-lexical-syntax.md
@@ -408,8 +408,7 @@ members of type `Boolean`.
### Character Literals
```
-characterLiteral ::= ‘'’ printableChar ‘'’
- | ‘'’ charEscapeSeq ‘'’
+characterLiteral ::= ‘'’ (printableChar | charEscapeSeq) ‘'’
```
A character literal is a single character enclosed in quotes.
@@ -432,7 +431,7 @@ the octal escape `'\12'` ([see here](#escape-sequences)).
### String Literals
```
-stringLiteral ::= ‘\"’ {stringElement} ‘\"’
+stringLiteral ::= ‘"’ {stringElement} ‘"’
stringElement ::= printableCharNoDoubleQuote | charEscapeSeq
```
@@ -510,16 +509,16 @@ Because there is a predefined
The following escape sequences are recognized in character and string literals.
-| | | | |
-|-------|----------|-----------------|------|
-| `\b` | `\u0008` | backspace | BS |
-| `\t` | `\u0009` | horizontal tab | HT |
-| `\n` | `\u000a` | linefeed | LF |
-| `\f` | `\u000c` | form feed | FF |
-| `\r` | `\u000d` | carriage return | CR |
-| `\"` | `\u0022` | double quote | " |
-| `\'` | `\u0027` | single quote | ' |
-| `\\` | `\u005c` | backslash | `\` |
+| charEscapeSeq | unicode | name | char |
+|---------------|----------|-----------------|--------|
+| `‘\‘ ‘b‘` | `\u0008` | backspace | `BS` |
+| `‘\‘ ‘t‘` | `\u0009` | horizontal tab | `HT` |
+| `‘\‘ ‘n‘` | `\u000a` | linefeed | `LF` |
+| `‘\‘ ‘f‘` | `\u000c` | form feed | `FF` |
+| `‘\‘ ‘r‘` | `\u000d` | carriage return | `CR` |
+| `‘\‘ ‘"‘` | `\u0022` | double quote | `"` |
+| `‘\‘ ‘'‘` | `\u0027` | single quote | `'` |
+| `‘\‘ ‘\‘` | `\u005c` | backslash | `\` |
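
The escape table above can be read as a lookup from the character after the backslash to a code point. Below is a minimal Python sketch of that reading; `CHAR_ESCAPES` and `decode_char_escapes` are hypothetical names chosen for illustration, and octal and Unicode escapes are deliberately left out.

```python
# Mapping copied from the charEscapeSeq table; the names are illustrative only.
CHAR_ESCAPES = {
    "b": "\u0008",   # backspace (BS)
    "t": "\u0009",   # horizontal tab (HT)
    "n": "\u000a",   # linefeed (LF)
    "f": "\u000c",   # form feed (FF)
    "r": "\u000d",   # carriage return (CR)
    '"': "\u0022",   # double quote
    "'": "\u0027",   # single quote
    "\\": "\u005c",  # backslash
}


def decode_char_escapes(text: str) -> str:
    """Replace each recognized `\\x` pair with the character from the table.

    Unrecognized pairs are copied through unchanged (an assumption of this
    sketch, not something the table specifies).
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in CHAR_ESCAPES:
            out.append(CHAR_ESCAPES[text[i + 1]])
            i += 2  # consume the backslash and the escape character
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Scanning left to right means that an escaped backslash followed by `n` stays a backslash and an `n`, rather than being read as a linefeed.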
A character with Unicode between 0 and 255 may also be represented by
08-expressions.md
@@ -1,44 +1,44 @@
# Expressions
```
- Expr ::= (Bindings | id | `_') `=>' Expr
- | Expr1
- Expr1 ::= `if' `(' Expr `)' {nl} Expr [[semi] `else' Expr]
- | `while' `(' Expr `)' {nl} Expr
- | `try' (`{' Block `}' | Expr) [`catch' `{' CaseClauses `}'] [`finally' Expr]
- | `do' Expr [semi] `while' `(' Expr ')'
- | `for' (`(' Enumerators `)' | `{' Enumerators `}') {nl} [`yield'] Expr
- | `throw' Expr
- | `return' [Expr]
- | [SimpleExpr `.'] id `=' Expr
- | SimpleExpr1 ArgumentExprs `=' Expr
- | PostfixExpr
- | PostfixExpr Ascription
- | PostfixExpr `match' `{' CaseClauses `}'
- PostfixExpr ::= InfixExpr [id [nl]]
- InfixExpr ::= PrefixExpr
- | InfixExpr id [nl] InfixExpr
- PrefixExpr ::= [`-' | `+' | `~' | `!'] SimpleExpr
- SimpleExpr ::= `new' (ClassTemplate | TemplateBody)
- | BlockExpr
- | SimpleExpr1 [`_']
- SimpleExpr1 ::= Literal
- | Path
- | `_'
- | `(' [Exprs] `)'
- | SimpleExpr `.' id s
- | SimpleExpr TypeArgs
- | SimpleExpr1 ArgumentExprs
- | XmlExpr
- Exprs ::= Expr {`,' Expr}
- BlockExpr ::= `{' CaseClauses `}'
- | `{' Block `}'
- Block ::= {BlockStat semi} [ResultExpr]
- ResultExpr ::= Expr1
- | (Bindings | ([`implicit'] id | `_') `:' CompoundType) `=>' Block
- Ascription ::= `:' InfixType
- | `:' Annotation {Annotation}
- | `:' `_' `*'
+Expr ::= (Bindings | id | `_') `=>' Expr
+ | Expr1
+Expr1 ::= `if' `(' Expr `)' {nl} Expr [[semi] `else' Expr]
+ | `while' `(' Expr `)' {nl} Expr
+ | `try' (`{' Block `}' | Expr) [`catch' `{' CaseClauses `}'] [`finally' Expr]
+ | `do' Expr [semi] `while' `(' Expr ')'
+ | `for' (`(' Enumerators `)' | `{' Enumerators `}') {nl} [`yield'] Expr
+ | `throw' Expr
+ | `return' [Expr]
+ | [SimpleExpr `.'] id `=' Expr
+ | SimpleExpr1 ArgumentExprs `=' Expr
+ | PostfixExpr
+ | PostfixExpr Ascription
+ | PostfixExpr `match' `{' CaseClauses `}'
+PostfixExpr ::= InfixExpr [id [nl]]
+InfixExpr ::= PrefixExpr
+ | InfixExpr id [nl] InfixExpr
+PrefixExpr ::= [`-' | `+' | `~' | `!'] SimpleExpr
+SimpleExpr ::= `new' (ClassTemplate | TemplateBody)
+ | BlockExpr
+ | SimpleExpr1 [`_']
+SimpleExpr1 ::= Literal
+ | Path
+ | `_'
+ | `(' [Exprs] `)'
+ | SimpleExpr `.' id s
+ | SimpleExpr TypeArgs
+ | SimpleExpr1 ArgumentExprs
+ | XmlExpr
+Exprs ::= Expr {`,' Expr}
+BlockExpr ::= ‘{’ CaseClauses ‘}’
+ | ‘{’ Block ‘}’
+Block ::= BlockStat {semi BlockStat} [ResultExpr]
+ResultExpr ::= Expr1
+ | (Bindings | ([`implicit'] id | `_') `:' CompoundType) `=>' Block
+Ascription ::= `:' InfixType
+ | `:' Annotation {Annotation}
+ | `:' `_' `*'
```
Expressions are composed of operators and operands. Expression forms are
@@ -558,8 +558,9 @@ where `anon\$X` is some freshly created name.
## Blocks
```
-BlockExpr ::= `{' Block `}'
-Block ::= {BlockStat semi} [ResultExpr]
+BlockExpr ::= ‘{’ CaseClauses ‘}’
+ | ‘{’ Block ‘}’
+Block ::= BlockStat {semi BlockStat} [ResultExpr]
```
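
To illustrate what the `Block` change means, here is a small Python sketch that models both productions as regular expressions over a toy token alphabet. The single-letter tokens and the helper `accepts` are assumptions made for this example; the real grammar is richer, since `Expr1` is both a `BlockStat` and a `ResultExpr`.

```python
import re

# Toy token alphabet (an assumption for illustration only):
#   "s" = a non-empty BlockStat, ";" = semi, "e" = a ResultExpr
# BlockStat also matches the empty token sequence, hence "s?".
OLD_BLOCK = re.compile(r"(?:s?;)*e?")    # {BlockStat semi} [ResultExpr]
NEW_BLOCK = re.compile(r"s?(?:;s?)*e?")  # BlockStat {semi BlockStat} [ResultExpr]


def accepts(grammar: re.Pattern, tokens: str) -> bool:
    """Return True if the whole token string is derivable from the production."""
    return grammar.fullmatch(tokens) is not None


for tokens in ["", "s;s;e", "s;s", "s"]:
    print(repr(tokens), accepts(OLD_BLOCK, tokens), accepts(NEW_BLOCK, tokens))
```

In this model the old production rejects `s;s`, a block whose last statement has no trailing `semi` and is not a result expression (for instance, a block ending in a definition). The new production accepts it because `semi` is treated as a separator.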
A block expression `{$s_1$; $\ldots$; $s_n$; $e\,$}` is
@@ -1319,10 +1320,10 @@ include at least the expressions of the following forms:
```
BlockStat ::= Import
- | {Annotation} [`implicit'] Def
+ | {Annotation} [‘implicit’ | ‘lazy’] Def
| {Annotation} {LocalModifier} TmplDef
| Expr1
- |
+ |
TemplateStat ::= Import
| {Annotation} {Modifier} Def
| {Annotation} {Modifier} Dcl
15-scala-syntax-summary.md
@@ -1,25 +1,36 @@
# Scala Syntax Summary
-<!-- TODO: introduce SeqPattern syntax -->
+The following descriptions of Scala tokens use literal characters `‘c’` when referring to the ASCII fragment `\u0000` – `\u007F`.
-The lexical syntax of Scala is given by the following grammar in EBNF
-form.
+_Unicode escapes_ are used to represent the Unicode character with the given hexadecimal code:
+
+```
+UnicodeEscape ::= ‘\‘ ‘u‘ {‘u‘} hexDigit hexDigit hexDigit hexDigit
+hexDigit ::= ‘0’ | … | ‘9’ | ‘A’ | … | ‘F’ | ‘a’ | … | ‘f’
+```
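
The `UnicodeEscape` production above translates directly into a regular expression. The following Python sketch shows one way to do that; `UNICODE_ESCAPE` and `decode_unicode_escapes` are hypothetical names, and the sketch ignores any context rules about backslashes that precede the escape.

```python
import re

# ‘\’ ‘u’ {‘u’} hexDigit hexDigit hexDigit hexDigit
UNICODE_ESCAPE = re.compile(r"\\u+([0-9A-Fa-f]{4})")


def decode_unicode_escapes(source: str) -> str:
    """Replace every match of the production with the character it denotes."""
    return UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)
```

The `u+` in the pattern covers the `‘u’ {‘u’}` part of the rule, so `\uuu0041` decodes the same way as `\u0041`.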
+
+The lexical syntax of Scala is given by the following grammar in EBNF form:
```
-upper ::= ‘A’ | … | ‘Z’ | ‘\$’ | ‘_’ // and Unicode category Lu
+whiteSpace ::= ‘\u0020’ | ‘\u0009’ | ‘\u000D’ | ‘\u000A’
+upper ::= ‘A’ | … | ‘Z’ | ‘$’ | ‘_’ // and Unicode category Lu
lower ::= ‘a’ | … | ‘z’ // and Unicode category Ll
letter ::= upper | lower // and Unicode categories Lo, Lt, Nl
digit ::= ‘0’ | … | ‘9’
-opchar ::= // “all other characters in \u0020-\u007F and Unicode
- // categories Sm, So except parentheses ([{}]) and periods”
+paren ::= ‘(’ | ‘)’ | ‘[’ | ‘]’ | ‘{’ | ‘}’
+delim ::= ‘`’ | ‘'’ | ‘"’ | ‘.’ | ‘;’ | ‘,’
+opchar ::= // printableChar not matched by (whiteSpace | upper | lower |
+ // letter | digit | paren | delim | opchar | Unicode_Sm | Unicode_So)
+printableChar ::= // all characters in [\u0020, \u007F] inclusive
+charEscapeSeq ::= ‘\‘ (‘b‘ | ‘t‘ | ‘n‘ | ‘f‘ | ‘r‘ | ‘"‘ | ‘'‘ | ‘\‘)
op ::= opchar {opchar}
varid ::= lower idrest
plainid ::= upper idrest
| varid
| op
id ::= plainid
- | ‘\`’ stringLiteral ‘\`’
+ | ‘`’ stringLiteral ‘`’
idrest ::= {letter | digit} [‘_’ op]
integerLiteral ::= (decimalNumeral | hexNumeral) [‘L’ | ‘l’]
@@ -38,18 +49,17 @@ floatType ::= ‘F’ | ‘f’ | ‘D’ | ‘d’
booleanLiteral ::= ‘true’ | ‘false’
-characterLiteral ::= ‘\'‘ printableChar ‘\'’
- | ‘\’ charEscapeSeq ‘\'’
+characterLiteral ::= ‘'’ (printableChar | charEscapeSeq) ‘'’
stringLiteral ::= ‘"’ {stringElement} ‘"’
| ‘"""’ multiLineChars ‘"""’
-stringElement ::= printableCharNoDoubleQuote
+stringElement ::= (printableChar except ‘"’)
| charEscapeSeq
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
symbolLiteral ::= ‘'’ plainid
-comment ::= ‘/*’ “any sequence of characters” ‘*/’
+comment ::= ‘/*’ “any sequence of characters; nested comments are allowed” ‘*/’
| ‘//’ “any sequence of characters up to end of line”
nl ::= $\mathit{“new line character”}$
@@ -141,7 +151,7 @@ grammar.
| [nl] BlockExpr
BlockExpr ::= ‘{’ CaseClauses ‘}’
| ‘{’ Block ‘}’
- Block ::= {BlockStat semi} [ResultExpr]
+ Block ::= BlockStat {semi BlockStat} [ResultExpr]
BlockStat ::= Import
| {Annotation} [‘implicit’ | ‘lazy’] Def
| {Annotation} {LocalModifier} TmplDef
@@ -210,7 +220,6 @@ grammar.
Annotation ::= ‘@’ SimpleType {ArgumentExprs}
ConstrAnnotation ::= ‘@’ SimpleType ArgumentExprs
- NameValuePair ::= ‘val’ id ‘=’ PrefixExpr
TemplateBody ::= [nl] ‘{’ [SelfType] TemplateStat {semi TemplateStat} ‘}’
TemplateStat ::= Import
@@ -287,6 +296,7 @@ grammar.
```
<!-- TODO add:
+SeqPattern ::= ...
SimplePattern ::= StableId [TypePatArgs] [‘(’ [SeqPatterns] ‘)’]
TypePatArgs ::= ‘[’ TypePatArg {‘,’ TypePatArg} ‘]’
