From 8bd0d5d6db61c3646bb9a6dab20a8672f187423e Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Sat, 20 Dec 2025 13:42:59 -0800
Subject: [PATCH 1/5] Fix PUNCTUATION alternation order

This fixes the order of PUNCTUATION given our interpretation of "first
match wins" for alternation. This moves the longer strings to the start
to prevent any prefix matching from happening (for example `.` instead
of `...`).
---
 src/tokens.md | 78 +++++++++++++++++++++++++--------------------------
 1 file changed, 39 insertions(+), 39 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index f34fcb92d6..dbbb0a6fa9 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -693,58 +693,58 @@ Punctuation tokens are used as operators, separators, and other parts of the gra
 r[lex.token.punct.syntax]
 ```grammar,lexer
 PUNCTUATION ->
-      `=`
-    | `<`
-    | `<=`
-    | `==`
+      `...`
+    | `..=`
+    | `<<=`
+    | `>>=`
     | `!=`
-    | `>=`
-    | `>`
+    | `%=`
     | `&&`
-    | `||`
-    | `!`
-    | `~`
-    | `+`
-    | `-`
-    | `*`
-    | `/`
-    | `%`
-    | `^`
-    | `&`
-    | `|`
-    | `<<`
-    | `>>`
+    | `&=`
+    | `*=`
     | `+=`
     | `-=`
-    | `*=`
-    | `/=`
-    | `%=`
-    | `^=`
-    | `&=`
-    | `|=`
-    | `<<=`
-    | `>>=`
-    | `@`
-    | `.`
+    | `->`
     | `..`
-    | `...`
-    | `..=`
-    | `,`
-    | `;`
-    | `:`
+    | `/=`
     | `::`
-    | `->`
     | `<-`
+    | `<<`
+    | `<=`
+    | `==`
     | `=>`
+    | `>=`
+    | `>>`
+    | `>`
+    | `^=`
+    | `|=`
+    | `||`
+    | `!`
     | `#`
     | `$`
+    | `%`
+    | `&`
+    | `(`
+    | `)`
+    | `*`
+    | `+`
+    | `,`
+    | `-`
+    | `.`
+    | `/`
+    | `:`
+    | `;`
+    | `<`
+    | `=`
     | `?`
-    | `{`
-    | `}`
+    | `@`
     | `[`
     | `]`
-    | `(`
-    | `)`
+    | `^`
+    | `{`
+    | `|`
+    | `}`
+    | `~`
 ```

 > [!NOTE]

From babb1efd2d59d3c971ad0acb24b2e419ad42cdd0 Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Sat, 20 Dec 2025 13:38:01 -0800
Subject: [PATCH 2/5] Fix LIFETIME_TOKEN alternation order

This fixes the order of LIFETIME_TOKEN (and LIFETIME_OR_LABEL) given our
interpretation of "first match wins" for alternation.
This moves RAW_LIFETIME to the start, otherwise a raw lifetime `'r#foo`
would be interpreted as LIFETIME_TOKEN (`'r`) PUNCTUATION (`#`)
IDENTIFIER_OR_KEYWORD (`foo`).
---
 src/tokens.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index dbbb0a6fa9..b747f3eadf 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -655,12 +655,12 @@ r[lex.token.life]
 r[lex.token.life.syntax]
 ```grammar,lexer
 LIFETIME_TOKEN ->
-      `'` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_
-    | RAW_LIFETIME
+      RAW_LIFETIME
+    | `'` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_

 LIFETIME_OR_LABEL ->
-      `'` NON_KEYWORD_IDENTIFIER _not immediately followed by `'`_
-    | RAW_LIFETIME
+      RAW_LIFETIME
+    | `'` NON_KEYWORD_IDENTIFIER _not immediately followed by `'`_

 RAW_LIFETIME ->
     `'r#` IDENTIFIER_OR_KEYWORD _not immediately followed by `'`_

From 4c936735d9be7458d6ba975b748b20f0641ccefe Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Sat, 20 Dec 2025 13:34:47 -0800
Subject: [PATCH 3/5] Fix FLOAT_LITERAL alternation order

This fixes the order of FLOAT_LITERAL given our interpretation of "first
match wins" for alternation. This moves the first rule (which matches
things like `3.`) to the end so that the other two rules have a chance
to match something. Otherwise, input like `3.14` would be FLOAT_LITERAL
(3.) INTEGER_LITERAL (14).
---
 src/tokens.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index b747f3eadf..eefb96ba15 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -556,9 +556,9 @@ r[lex.token.literal.float]
 r[lex.token.literal.float.syntax]
 ```grammar,lexer
 FLOAT_LITERAL ->
-      DEC_LITERAL `.` _not immediately followed by `.`, `_` or an XID_Start character_
+      DEC_LITERAL (`.` DEC_LITERAL)? FLOAT_EXPONENT SUFFIX?
     | DEC_LITERAL `.` DEC_LITERAL SUFFIX_NO_E?
-    | DEC_LITERAL (`.` DEC_LITERAL)? FLOAT_EXPONENT SUFFIX?
+    | DEC_LITERAL `.` _not immediately followed by `.`, `_` or an XID_Start character_

 FLOAT_EXPONENT ->
     (`e`|`E`) (`+`|`-`)? (DEC_DIGIT|`_`)* DEC_DIGIT (DEC_DIGIT|`_`)*

From f10d6b17b36766a0f32c860634b9b2400a81a74c Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Sat, 20 Dec 2025 12:12:39 -0800
Subject: [PATCH 4/5] Fix INTEGER_LITERAL alternation order

This fixes the order of INTEGER_LITERAL given our interpretation of
"first match wins" for alternation. This moves DEC_LITERAL to the end of
the list. Otherwise, an input of `0b1` would be interpreted as
INTEGER_LITERAL (0) IDENTIFIER_OR_KEYWORD (b) INTEGER_LITERAL (1) instead
of INTEGER_LITERAL (0b1).
---
 src/tokens.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tokens.md b/src/tokens.md
index eefb96ba15..9efc21912d 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -441,7 +441,7 @@ r[lex.token.literal.int]
 r[lex.token.literal.int.syntax]
 ```grammar,lexer
 INTEGER_LITERAL ->
-    ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) SUFFIX_NO_E?
+    ( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL | DEC_LITERAL ) SUFFIX_NO_E?

 DEC_LITERAL ->
     DEC_DIGIT (DEC_DIGIT|`_`)*

From 782cf078be828b6ed6f4b2cef6a9a7b407a41d39 Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Sat, 20 Dec 2025 12:08:12 -0800
Subject: [PATCH 5/5] Fix order of Token wrt integer and float

With our interpretation of "first match wins" with alternation, this
fixes a problem where INTEGER_LITERAL was incorrectly in front of
FLOAT_LITERAL. Otherwise, an input of `1.2` would be interpreted as
`INTEGER_LITERAL`, `PUNCTUATION`, `INTEGER_LITERAL` instead of
`FLOAT_LITERAL`.
---
 src/tokens.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tokens.md b/src/tokens.md
index 9efc21912d..3848839950 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -14,8 +14,8 @@ Token ->
     | RAW_BYTE_STRING_LITERAL
     | C_STRING_LITERAL
     | RAW_C_STRING_LITERAL
-    | INTEGER_LITERAL
     | FLOAT_LITERAL
+    | INTEGER_LITERAL
     | LIFETIME_TOKEN
     | PUNCTUATION
     | IDENTIFIER_OR_KEYWORD
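
Reviewer illustration (not part of the patches): the commit messages above all rely on the same "first match wins" reading of alternation, so a small sketch of that reading may help when checking the reordered rules. The helper names `first_match` and `tokenize` are hypothetical, and the sketch only models alternations of literal strings; it is not the Reference's grammar tooling or rustc's lexer.

```rust
/// Returns the first alternative, in listed order, that is a prefix of `input`.
/// This models "first match wins" for an alternation of literal strings.
/// (Hypothetical helper, for illustration only.)
fn first_match<'a>(alternatives: &[&'a str], input: &str) -> Option<&'a str> {
    alternatives
        .iter()
        .copied()
        .find(|alt| input.starts_with(*alt))
}

/// Repeatedly applies `first_match` and collects the tokens produced.
/// Stops early if no alternative matches the remaining input.
fn tokenize<'a>(alternatives: &[&'a str], mut input: &str) -> Vec<&'a str> {
    let mut tokens = Vec::new();
    while !input.is_empty() {
        match first_match(alternatives, input) {
            Some(tok) => {
                tokens.push(tok);
                // Advance past the matched literal.
                input = &input[tok.len()..];
            }
            None => break,
        }
    }
    tokens
}

fn main() {
    // Shorter alternative listed first: `.` shadows `..` and `...`.
    let prefix_first = [".", "..", "..."];
    // Longer alternatives listed first, as in the reordered PUNCTUATION rule.
    let longest_first = ["...", "..", "."];

    println!("{:?}", tokenize(&prefix_first, "..."));  // prints [".", ".", "."]
    println!("{:?}", tokenize(&longest_first, "...")); // prints ["..."]
}
```

Under this model, any alternative that is a strict prefix of a later one makes the later one unreachable. The reordering in patches 1 through 5 is aimed at that same condition for `.` vs `...`, `'r` vs `'r#foo`, `3.` vs `3.14`, `0` vs `0b1`, and `1` vs `1.2`.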