Escapes are broken #521
Comments
Related: in the REPL, `effekt> '\n' == '\t'` prints `true`.
Let's just please fix the lexer for strings and characters incl. escapes. 🙏
Another issue:
Currently, we choose 2, but it's inconsistent with what I presumed when making the highlighting (1, which is what Scala does), and with common expectations of programmers & LLMs.
In #622, I changed it to 1.
This gets us half-way there for escapes in Strings:
- Allow escapes in character literals, i.e. `'\n'` is a newline character
  - Note: This disallows `'\'` for \, but requires `'\\'`
  - Fixes `'\n' == '\t'` (cmp. #521)
- Check that character literals consist of only one codepoint
- Report wrong escape sequences in parser, but generate the error in lexer
  - this only happens for SOME of those
  - We should generate error tokens everywhere instead, different issue
- Escape ourselves in Lexer and build up a string
- Add additional escapes: `\$` (because of unquotes), `\u{...}`
- In chez, escape all characters in one function, use chez's unicode handling functions for characters beyond 3 octals
- Fixes #596
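To make the "escape ourselves in the lexer and build up a string" step and the one-codepoint check concrete, here is a minimal, language-agnostic sketch in Python. It is not the code from the PR: the names `SIMPLE_ESCAPES`, `EscapeError`, `unescape`, and `char_literal` are hypothetical, and the escape table beyond `\n`, `\t`, `\\`, and `\$` is an assumption. `\u{...}` is left out here to keep the sketch short.

```python
# Hypothetical sketch; none of these names come from the Effekt code base.
SIMPLE_ESCAPES = {
    "n": "\n",
    "t": "\t",
    "\\": "\\",
    "$": "$",    # `$` would otherwise start an unquote
    "r": "\r",   # assumption: not listed in the PR description
    '"': '"',    # assumption
    "'": "'",    # assumption
}


class EscapeError(ValueError):
    """Raised for malformed escapes or malformed character literals."""


def unescape(body):
    """Resolve escapes in the text between the quotes of a literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            # A lone backslash, as in '\', is rejected; '\\' is required.
            raise EscapeError("dangling backslash at end of literal")
        nxt = body[i + 1]
        if nxt not in SIMPLE_ESCAPES:
            raise EscapeError("unknown escape sequence \\" + nxt)
        out.append(SIMPLE_ESCAPES[nxt])
        i += 2
    return "".join(out)


def char_literal(body):
    """Unescape a character literal and check it is exactly one codepoint."""
    value = unescape(body)
    if len(value) != 1:  # Python strings are sequences of codepoints
        raise EscapeError("character literal must contain exactly one codepoint")
    return value
```

With this shape, `char_literal("\\n")` and `char_literal("\\t")` yield different characters, while a body consisting of a single backslash raises `EscapeError` instead of silently producing a character.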
Currently, unicode escapes themselves are valid characters, which is somewhat confusing. I'd expect such an escape to have to be written as `'\u0a00'` instead of `\u0a00`.
I'd propose only allowing unicode escapes inside of string or char literals.
Here's the relevant code:
- `effekt/effekt/shared/src/main/scala/effekt/Lexer.scala`, line 584 in d98362e
- `effekt/effekt/shared/src/main/scala/effekt/Lexer.scala`, lines 448 to 469 in d98362e
Related to this: we allow wildly nonstandard escapes in string literals ("anything goes"):
- `effekt/effekt/shared/src/main/scala/effekt/Lexer.scala`, lines 408 to 414 in d98362e
This requires some design, but I'd prefer to agree on a common format of an escape, for example Zig has `\u{NNNNNN}` as "hexadecimal Unicode scalar value UTF-8 encoded (1 or more digits)".
I like it because it allows multiple digits (subsuming the need for separate `\xNN` and `\uNNNN` and `\uNNNNNN` and whatever else) and it is easy to parse (`<escape> ::= '\u{' <hex>+ '}'`).
I think it's also fine to invest a little bit into lexing escapes properly by ourselves :)
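To illustrate how little machinery the `<escape> ::= '\u{' <hex>+ '}'` format needs, here is a rough sketch in Python (language-agnostic, not a proposal for the actual Scala lexer). The names `UNICODE_ESCAPE` and `decode_unicode_escape` are hypothetical, and rejecting surrogates and values above `0x10FFFF` is my reading of "Unicode scalar value", not something decided in this thread.

```python
import re

# <escape> ::= '\u{' <hex>+ '}'
UNICODE_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]+)\}")


def decode_unicode_escape(text, pos=0):
    """Decode one brace-delimited unicode escape starting at `pos`.

    Returns the decoded character and the position just after the escape.
    """
    m = UNICODE_ESCAPE.match(text, pos)
    if m is None:
        raise ValueError("expected a unicode escape at position %d" % pos)
    value = int(m.group(1), 16)
    # Assumption: only Unicode scalar values are accepted, so surrogates
    # and anything above U+10FFFF are errors.
    if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %X" % value)
    return chr(value), m.end()
```

The same function covers what would otherwise be separate `\xNN`, `\uNNNN`, and `\uNNNNNN` forms, since the closing brace delimits the digits regardless of how many there are.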