Do not trim whitespace when part of strings or regexes #198164

333fred · 2023-11-14T01:18:02Z

Fixes #195010. If the token to be removed is part of a string or a regex, then we do not want to remove the character, as it is very likely semantically important.
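As a rough illustration of the idea (a minimal sketch, not the code in this PR), trimming can walk backwards from the end of the line and stop as soon as it reaches whitespace that belongs to a string or regex token. The `Token` shape, the `TokenKind` names, and both helper functions are hypothetical stand-ins for the editor's own token metadata.

```typescript
// Hypothetical token kinds; the real editor keeps its own token metadata.
type TokenKind = "other" | "comment" | "string" | "regex";

interface Token {
  start: number; // inclusive character offset within the line
  end: number;   // exclusive character offset within the line
  kind: TokenKind;
}

// Returns the kind of the token covering `offset`, or "other" if none does.
function kindAt(tokens: Token[], offset: number): TokenKind {
  for (const t of tokens) {
    if (offset >= t.start && offset < t.end) {
      return t.kind;
    }
  }
  return "other";
}

// Removes trailing spaces and tabs, but stops at the first whitespace
// character (scanning from the end) that lies inside a string or regex.
function trimTrailingWhitespace(line: string, tokens: Token[]): string {
  let end = line.length;
  while (end > 0) {
    const ch = line.charAt(end - 1);
    if (ch !== " " && ch !== "\t") {
      break;
    }
    const kind = kindAt(tokens, end - 1);
    if (kind === "string" || kind === "regex") {
      break;
    }
    end--;
  }
  return line.slice(0, end);
}
```

Under these assumptions, whitespace that follows a closed string is still removed, while whitespace at the end of the first line of a multiline string is kept.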

Before I mark this as ready to review, I need some help with how to approach testing this. Presumably, tests need to go in trimTrailingWhitespaceCommand.test.ts, but I'm going to need a language in the unit test suite with standard token types. Is there an example of this that doesn't create an entire language provider just for the test? Every example of testTextModel.createTextModel that I can see that has a language makes an entire custom one just for the test; if this is the only approach, I could use someone to point out what things I'll need to mock to test this. I would also like to add some integration tests, particularly around whitespace after the end of a string, to ensure that it works with real tokenization as well.

The other open question I have is whether this behavior should be on by default, or if we need to update the trim whitespace option to have a "trim except in strings and regexes" option, as this behavior may not be desired by some?

333fred · 2023-11-14T01:18:48Z

@hediet, since you're the assignee on #195010, would you be able to comment on my questions above?

coreyward · 2023-11-15T16:32:11Z

Not sure how viable this is, but I actually prefer trailing whitespace in multiline strings to be trimmed most of the time (in JS), but in some cases it's not desirable (specifically for tests). Perhaps there is a way to alter this behavior just for test files?

333fred · 2023-11-15T20:00:17Z

@hediet or @alexdima, thoughts on my questions? Given the comment from @coreyward on just this draft PR, I'm definitely leaning towards "needs an option switch". Personally, I do not want whitespace trimming in a multiline string, ever, in any file, regardless of whether it's a test file or not. It could be a regex component, or part of a piece of markdown where trailing whitespace matters.

hediet · 2023-11-17T09:38:39Z

I think this should be behind an opt-in flag.
Also, I think when this flag is enabled, trimming whitespace should wait on the tokenization to complete.

Address initial feedback:

1. Add a new setting to control whether to trim whitespace in multiline strings/regexes. This setting is then piped through to everywhere that is calling trim whitespace, which is file save editing, the trim whitespace command, and notebooks.
2. Force line tokenization to complete when the setting is enabled so that trim is accurate.

333fred · 2023-11-17T23:30:05Z

@hediet I think I've plumbed through a setting for this correctly, but I could still use some pointers on how to test this appropriately.

333fred · 2023-11-27T17:10:58Z

Talked with @hediet offline. With the end of year coming up there isn't a lot of bandwidth to look at a change like this at the moment, given the potential costs of tokenization for the file. I'll ping again in January.

hediet · 2024-01-08T07:24:37Z

For testing, please see existing unit tests for tokenization. Basically you should create a fake language with a static tokenizer where you set tokens manually. Then look for tests that test TrimTrailingWhitespaceCommand (or other commands) and try to run the command on the text model with the fake language.
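The testing approach described above could look roughly like the following. This is a self-contained sketch under stated assumptions: `StaticTokenizer`, `Span`, and `runTrimCommand` are hypothetical stand-ins, not VS Code's test utilities or the real TrimTrailingWhitespaceCommand. The point is only that the "language" returns tokens set by hand, so the test controls exactly which characters count as part of a string.

```typescript
type Kind = "other" | "string" | "regex";

interface Span {
  start: number; // inclusive offset within the line
  end: number;   // exclusive offset within the line
  kind: Kind;
}

// Fake "language": tokens are supplied manually per line, never computed.
class StaticTokenizer {
  private readonly spansByLine: Span[][];

  constructor(spansByLine: Span[][]) {
    this.spansByLine = spansByLine;
  }

  kindAt(lineIndex: number, offset: number): Kind {
    const spans = this.spansByLine[lineIndex] ?? [];
    const hit = spans.find((s) => offset >= s.start && offset < s.end);
    return hit ? hit.kind : "other";
  }
}

// Stand-in for running the trim command over every line of a text model.
function runTrimCommand(lines: string[], tokenizer: StaticTokenizer): string[] {
  return lines.map((line, i) => {
    let end = line.length;
    while (end > 0) {
      const ch = line.charAt(end - 1);
      const isWhitespace = ch === " " || ch === "\t";
      if (!isWhitespace || tokenizer.kindAt(i, end - 1) !== "other") {
        break;
      }
      end--;
    }
    return line.slice(0, end);
  });
}

// A two-line template string: the first line ends inside the string,
// the second line closes it and is followed by ordinary trailing spaces.
const modelLines = ["const s = `a  ", "b`;  "];
const fakeLanguage = new StaticTokenizer([
  [{ start: 10, end: 14, kind: "string" }],
  [{ start: 0, end: 2, kind: "string" }],
]);
const trimmedLines = runTrimCommand(modelLines, fakeLanguage);
```

With these hand-set tokens, the first line should come back unchanged and the second should lose only the spaces after the semicolon.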

alexdima

Thank you!

* Do not trim whitespace when part of strings or regexes

  Fixes microsoft#195010. If the token to be removed is part of a string or a regex, then we do not want to remove the character, as it is very likely semantically important.

* Address initial feedback:
  1. Add a new setting to control whether to trim whitespace in multiline strings/regexes. This setting is then piped through to everywhere that is calling trim whitespace, which is file save editing, the trim whitespace command, and notebooks.
  2. Force line tokenization to complete when the setting is enabled so that trim is accurate.

* Don't force tokenization; instead, check to see if there are tokens for a given line and do not format if there are none.

* Look for syntactical tokens

* Fix compilation errors

* Add a test

---------

Co-authored-by: Alex Dima <alexdima@microsoft.com>
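The combination of an option flag and the "do not format if there are no tokens" rule from the commit message above might be sketched as follows. The option name `trimInRegexesAndStrings`, the per-character `Kind[]` representation, and `trimLine` are all assumptions made for illustration; the flag is assumed here to mean "trim everywhere" when true.

```typescript
type Kind = "other" | "string" | "regex";

interface TrimOptions {
  // Hypothetical option name; true is assumed to mean "trim everywhere".
  trimInRegexesAndStrings: boolean;
}

// `kinds` holds one token kind per character of `line`, or undefined when
// the line has not been tokenized yet.
function trimLine(line: string, kinds: Kind[] | undefined, options: TrimOptions): string {
  if (options.trimInRegexesAndStrings) {
    // Token information is irrelevant: trim unconditionally.
    return line.replace(/[ \t]+$/, "");
  }
  if (kinds === undefined) {
    // No tokens available for this line: leave it untouched rather than guess.
    return line;
  }
  let end = line.length;
  while (end > 0) {
    const ch = line.charAt(end - 1);
    if (ch !== " " && ch !== "\t") {
      break;
    }
    const kind = kinds[end - 1] ?? "other";
    if (kind !== "other") {
      break;
    }
    end--;
  }
  return line.slice(0, end);
}
```

Leaving untokenized lines alone trades completeness for safety: some trailing whitespace may survive, but nothing inside a string or regex is removed on the basis of missing information.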

333fred · 2024-03-18T18:11:34Z

@alexdima thanks for picking this up! Really appreciate it, I've been very busy with other things and was unlikely to get back to this for another few months.

Do not trim whitespace when part of strings or regexes

dbe54d6

VSCodeTriageBot assigned alexdima Nov 14, 2023

alexdima requested changes Dec 6, 2023

src/vs/editor/common/commands/trimTrailingWhitespaceCommand.ts

src/vs/workbench/contrib/files/browser/files.contribution.ts

Don't force tokenization; instead, check to see if there are tokens for a given line and do not format if there are none.

2aa1e70

alexdima added 4 commits March 15, 2024 21:31

Merge remote-tracking branch 'origin/main' into pr/333fred/198164

600a67e

Look for syntactical tokens

baa1c18

Fix compilation errors

4b7a641

Add a test

a2df484

alexdima marked this pull request as ready for review March 15, 2024 22:29

alexdima approved these changes Mar 15, 2024

alexdima added this to the March 2024 milestone Mar 15, 2024

alexdima enabled auto-merge (squash) March 15, 2024 22:30

Tyriar approved these changes Mar 16, 2024

alexdima merged commit c7578a3 into microsoft:main Mar 16, 2024
6 checks passed

333fred deleted the trimwhitespace-strings-and-regexes branch March 18, 2024 18:11
