Inconsistent whitespace definitions in string literals and language itself #60209
Labels
A-frontend
Area: frontend (errors, parsing and HIR)
A-parser
Area: The parsing of Rust source code to an AST.
A-unicode
Area: Unicode
C-bug
Category: This is a bug.
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
The lexer uses the Pattern_White_Space Unicode property when skipping over trivia. However, when it processes string literals with escaped newlines, it skips only ASCII whitespace:
rust/src/libsyntax/parse/mod.rs
Line 379 in fe0a415
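To make the mismatch concrete, here is a minimal sketch of the two whitespace definitions side by side. It is not the compiler's actual code: the function names `is_pattern_white_space`, `is_ascii_whitespace_only`, and `skip_leading` are hypothetical, and the ASCII set (space, tab, LF, CR) is an assumption about what the string-literal path skips. The Pattern_White_Space set is the one defined by Unicode.

```rust
/// Code points that carry the Unicode `Pattern_White_Space` property.
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'..='\u{000D}' // tab, line feed, vertical tab, form feed, carriage return
            | '\u{0020}' // space
            | '\u{0085}' // next line
            | '\u{200E}' // left-to-right mark
            | '\u{200F}' // right-to-left mark
            | '\u{2028}' // line separator
            | '\u{2029}' // paragraph separator
    )
}

/// Hypothetical stand-in for the ASCII-only check used after an escaped
/// newline in a string literal (assumed set: space, tab, LF, CR).
fn is_ascii_whitespace_only(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// Drops leading characters for as long as `is_ws` accepts them.
fn skip_leading(s: &str, is_ws: fn(char) -> bool) -> &str {
    s.trim_start_matches(is_ws)
}
```

Under these assumptions, `skip_leading("\u{200F}  b", is_pattern_white_space)` consumes the U+200F and the spaces, while the same call with `is_ascii_whitespace_only` stops at the U+200F and leaves the input untouched. That is the same split the issue describes between program text and string continuations.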
Here's an example program showing that U+200F is ignored in program text but not in a string literal:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=ec59778d31dde69f29f1095aff2c9b66
Here's the text of the program in Debug format, to make whitespace slightly more visible