Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust: "different styles between \r and \n" at end of line comments #34

Closed
rdipardo opened this issue Oct 13, 2021 · 3 comments
Closed
Labels
rust Caused by the rust lexer

Comments

@rdipardo
Copy link
Contributor

While trying out a patch for #33, TestLexers found this additional problem:

Lexing rust\Issue33.rs
C:\Users\Rob\source\git\scintilla-contib\lexilla\test\examples\rust\Issue33.rs:1: different styles between \r and \n at 21: 4, 0

C:\Users\Rob\source\git\scintilla-contib\lexilla\test\examples\rust\Issue33.rs:1: has different styles with \n versus \r\n line ends

SciTE (5.1.4) visually confirmed premature style switching after line comments in a file saved with Windows EOLs:

LexRust-EOL-splitting

This patch attempts to correct the EOL splitting: LexRust-Fix-EOL-splitting-at-comment-ends.zip

It would be ideal to have an omnibus lexer test (e.g. "AllStyles.rs"), but I leave that for someone who knows the language better.

@nyamatongwe
Copy link
Member

That loop appears to have two jobs:

  1. find the place to stop styling as comment
  2. set the line state to mark this line as a comment for when resuming from the next line.

Line state is retrieved near the top of Lex but set at various points when features are recognized.

The current code calls SetLineState at the '\r' in a '\r\n' (Windows) line end sequence but never for a '\n' (Unix) line end. Since this code is dead for '\n' we can assume the author intended that code to have some effect so was likely using '\r\n'. With the patch, SetLineState is not called for either '\n' or '\r\n'. While the line state code is a bit uncertain, it would be best to call it both for '\r\n' and '\n' so could probably be moved after the loop.

The loop is incrementing pos up to the start of the line end characters which is provided by styler.LineEnd() so the loop could be replaced with pos = style.LineEnd() although this should be checked. LineEnd is also safe if this lexer were ever to be extended to allow Unicode line ends (NEL, LS, PS).

@rdipardo
Copy link
Contributor Author

A further observation: I don't know any modern system that writes files with old Mac EOLs (\r), but a user remains free to choose them, which at the moment completely breaks the line-end-dependent styles:

unpatched-CR

@rdipardo
Copy link
Contributor Author

the loop could be replaced with pos = style.LineEnd() although this should be checked.

Done and done: LexRust-Prefer-LineEnd-to-While-Loop.zip

A second patch adds new tests; the *.properties file is a duplicate of the one used on #33, except comment folding is also on.

Some of the colours in these SciTE captures were altered for better contrast:

patched-CRLF-win32

patched-LF-win32

patched-CR-win32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rust Caused by the rust lexer
Projects
None yet
Development

No branches or pull requests

2 participants