Fix highlighting of byte escape sequences #15303
Currently non-UTF8 escape sequences in byte strings and any escape sequences in byte literals are ignored.
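As a minimal illustration of the literals this description refers to, the sketch below uses only standard Rust syntax. The function names are made up for this example and do not come from the PR.

```rust
/// A byte string may contain `\xNN` escapes above 0x7F, which are not valid
/// UTF-8 on their own. Here: one such byte followed by a newline escape.
fn byte_string_example() -> &'static [u8] {
    b"\xFF\n"
}

/// Byte literals accept escapes such as `\n`, `\\` and `\xNN` as well;
/// each literal evaluates to a single `u8`.
fn byte_literal_examples() -> [u8; 3] {
    [b'\n', b'\\', b'\xFF']
}
```

Every backslash sequence above is an escape that a highlighter would be expected to mark.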
crates/syntax/src/ast/token_ext.rs
// XXX: `Mode::CStr` is not supported by `unescape_literal` of ra-ap-rustc_lexer yet.
// Here we pretend it to be a byte string.
const MODE: Mode = Mode::ByteStr;
It seems unescape_literal() not supporting C strings is intentional. There's rustc_lexer::unescape::unescape_c_string() for C strings instead. See this comment for why this is the case.
I'm starting to doubt whether IsString::escaped_char_ranges() is the right abstraction, as unescape_literal() and unescape_c_string() take closures of different signatures 🤔 But it's out of scope for this PR to come up with something to replace it, I guess.
Can you override <CString as IsString>::escaped_char_ranges() so that it uses unescape_c_string()? Since it's only used for highlighting, where the actual unescaped bytes aren't relevant, we can discard CStrUnit and pass e.g. Ok(' ') to the callback for the time being. A comment would be nice too!
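The suggestion above can be sketched roughly as follows. This is a self-contained toy under stated assumptions, not the code from the PR: CStrUnit and EscapeError are local stand-ins, toy_unescape_c_string is a hypothetical miniature scanner (it is not the real unescape_c_string), and the free function escaped_char_ranges only mimics the shape of the trait method under discussion.

```rust
use std::ops::Range;

/// Local stand-in for the unit a C-string unescaper reports (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CStrUnit {
    Byte(u8),
    Char(char),
}

/// Local stand-in error type (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
struct EscapeError;

/// Toy unescaper, NOT the real `unescape_c_string`: it understands only
/// `\n`, `\\` and `\xNN`, and reports every other escape as an error.
fn toy_unescape_c_string<F>(src: &str, callback: &mut F)
where
    F: FnMut(Range<usize>, Result<CStrUnit, EscapeError>),
{
    let mut i = 0;
    while i < src.len() {
        let rest = &src[i..];
        let first = rest.chars().next().unwrap(); // `rest` is non-empty here
        if first != '\\' {
            // Ordinary character: report it with its own byte range.
            callback(i..i + first.len_utf8(), Ok(CStrUnit::Char(first)));
            i += first.len_utf8();
            continue;
        }
        let after = &rest[1..]; // text following the backslash
        let (len, unit) = match after.chars().next() {
            Some('n') => (2, Ok(CStrUnit::Char('\n'))),
            Some('\\') => (2, Ok(CStrUnit::Char('\\'))),
            Some('x') => {
                // Expect exactly two ASCII hex digits after `\x`.
                let byte = after
                    .get(1..3)
                    .filter(|hex| hex.bytes().all(|b| b.is_ascii_hexdigit()))
                    .and_then(|hex| u8::from_str_radix(hex, 16).ok());
                match byte {
                    Some(b) => (4, Ok(CStrUnit::Byte(b))),
                    None => (2, Err(EscapeError)),
                }
            }
            Some(other) => (1 + other.len_utf8(), Err(EscapeError)),
            None => (1, Err(EscapeError)), // lone trailing backslash
        };
        callback(i..i + len, unit);
        i += len;
    }
}

/// Adapter in the spirit of the review comment: highlighting needs only the
/// range and whether the escape is valid, so the unit is discarded and a
/// placeholder `' '` is passed on instead.
fn escaped_char_ranges(
    src: &str,
    cb: &mut dyn FnMut(Range<usize>, Result<char, EscapeError>),
) {
    toy_unescape_c_string(src, &mut |range, unit| {
        cb(range, unit.map(|_| ' '));
    });
}
```

The point of the design is that callers expecting a char-based callback keep working unchanged, at the cost of losing the actual unescaped value, which is acceptable only as long as no caller reads it.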
> I'm starting to doubt whether IsString::escaped_char_ranges() is the right abstraction as unescape_literal() and unescape_c_string() take a closure of different signature 🤔

I think it's "more correct" to use CStrUnit, a.k.a. Either<char, u8>, for all these functions. But the name CStrUnit is specific to C strings and not really suitable.

> But it's out of scope for this PR to come up with something to replace it I guess.

I agree.

> Can you override <CString as IsString>::escaped_char_ranges() so that it uses unescape_c_string()? Since it's only used for highlighting where the actual unescaped bytes aren't relevant, we can discard CStrUnit and pass e.g. Ok(' ') to the callback for the time being. A comment would be nice too!

I only found the format specifier parser, which makes use of the range information as well as the unescaped string content. But it takes ast::String, so it should not be affected by a placeholder ' '.
The change is pushed now.
I'm also thinking about highlighting erroneous escape sequences in a special color, instead of leaving them the same as literals. But there is only a similar
I think that's reasonable (I actually thought something like that would be cool while reviewing!). Do you want to implement it yourself, either in this PR or in a separate PR?
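One way to picture the idea is the rough sketch below. The Tag enum and tag_escapes function are hypothetical names chosen for illustration, and the input is assumed to be a list of per-character ranges paired with an unescape result, similar in shape to what an escaped_char_ranges-style callback would collect.

```rust
use std::ops::Range;

/// Hypothetical highlight tags for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    Escape,
    InvalidEscape,
}

/// Keeps only the pieces whose source text starts with a backslash and tags
/// them according to whether unescaping succeeded.
fn tag_escapes<E>(
    src: &str,
    pieces: &[(Range<usize>, Result<char, E>)],
) -> Vec<(Range<usize>, Tag)> {
    pieces
        .iter()
        // `get` avoids a panic if a range is out of bounds or not on a
        // character boundary; such pieces are simply skipped.
        .filter(|(range, _)| {
            src.get(range.clone())
                .map_or(false, |text| text.starts_with('\\'))
        })
        .map(|(range, res)| {
            let tag = if res.is_ok() { Tag::Escape } else { Tag::InvalidEscape };
            (range.clone(), tag)
        })
        .collect()
}
```

Which color each tag ends up with would still be decided by the editor theme, not by this classification step.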
Implemented. I'm not sure how to provide a default color (like, red?) for different editors. I cannot find related color settings for |
Sorry for the delay. The implementation looks good! Do we also want to highlight invalid escape sequences in As for the styling, it seems we have little control over it. Looks like we can define some fallback TextMate scopes, but seeing we don't provide it for |
I don't think it will help much since
The link is broken. And I'm not familiar with TextMate, so I'd prefer to skip it for now.
Makes sense. Could you add a comment explaining that rationale? r=me with that, thanks!
🤦‍♂️ My bad, fixed. @bors delegate+
☀️ Test successful - checks-actions