jslint: remove whitespaces using simple regex #8674

Merged
merged 2 commits into from Aug 7, 2014

Projects

None yet

5 participants

@MartinMa
Contributor
MartinMa commented Aug 6, 2014

Small speed improvement to the default lintjs extension. Replacing obsolete for loop with a simple regex to remove whitespaces, if a line contains only whitespace.

@MartinMa MartinMa jslint: remove whitespaces using simple regex
Replacing obsolete for loop with a simple regex to remove whitespaces, if a line contains only whitespace.
67bdd4b
@ingorichter ingorichter self-assigned this Aug 6, 2014
@ingorichter ingorichter and 4 others commented on an outdated diff Aug 6, 2014
src/extensions/default/JSLint/main.js
@@ -72,15 +72,7 @@ define(function (require, exports, module) {
*/
function lintOneFile(text, fullPath) {
// If a line contains only whitespace, remove the whitespace
- // This should be doable with a regexp: text.replace(/\r[\x20|\t]+\r/g, "\r\r");,
- // but that doesn't work.
- var i, arr = text.split("\n");
- for (i = 0; i < arr.length; i++) {
- if (!arr[i].match(/\S/)) {
- arr[i] = "";
- }
- }
- text = arr.join("\n");
+ text = text.replace(/^[\x20\t\f]+$/gm, "");
@ingorichter
ingorichter Aug 6, 2014 Contributor

What is the \f token for?

@MartinMa
MartinMa Aug 6, 2014 Contributor

\f stands for form feed. I don't know if it is necessary.

Only the characters "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage return" (U+000D), and "form feed" (U+000C) can occur in whitespace.

As per http://www.w3.org/TR/css3-selectors/#whitespace

Maybe \f is catched by $. I'm checking it later ...

edit A line containing spaces only, but with \f at the end isn't matched if you omit the regex token for it. Form feed (sometimes called page break character) is kind of a weird character as you basically can't easily enter it directly using Brackets. I'd keep it in the regex though, for the sake of completeness. Should I remove it?

@peterflynn
peterflynn Aug 6, 2014 Member

I think just using /^[ \t]+$/gm should be fine. If there are other weird whitespace-like chars in the file, we probably want to let JSLint flag it as an error anyway...

@peterflynn
peterflynn Aug 6, 2014 Member

(Fwiw, I think the only problem with the original regex in the comment is that it was using \r instead of \n -- all editor text in Brackets is normalized to \n. The new regex is cleaner anyway, though)

@peterflynn
peterflynn Aug 6, 2014 Member

@redmunds I think the issue there is that \s matches \n chars too, so we lose the ability to scope the regexp to just one line. E.g. "foo\nbar\nbaz".match(/^bar\sbaz$/gm) --> "bar\nbaz". So what we really want is \s except for \n, but I don't think there's a syntax in JS regexps to express that.

Since this is really just intended to screen out the blank-line indent that Brackets automatically generates, it seems safe enough to only look at spaces & tabs though, since we never insert anything else.

@redmunds
redmunds Aug 6, 2014 Contributor

RegExp should only span lines if you use the m flag.

@MarcelGerber
MarcelGerber Aug 6, 2014 Member

@peterflynn You can use /^[^\S\n]+$/g to have \s w/o \n. See http://regex101.com/r/uN6dC3/1

@peterflynn
peterflynn Aug 7, 2014 Member

@redmunds But this one does use the m flag, by design since it relies on ^ and $ to mark line boundaries.

@MarcelGerber's Cool, that seems like a viable option too. So @MartinMa I think either /^[ \t]+$/gm or /^[^\S\n]+$/gm is fine -- personally I don't have a strong opinion between the two.

@redmunds
redmunds Aug 7, 2014 Contributor

Maybe the trick here is to use minimum match length (the default is maximum) by changing + to +? like /^\s+?$/gm

@MarcelGerber
MarcelGerber Aug 7, 2014 Member

@peterflynn redmunds' RegEx works too (I even tested it) and is the shortest here.

@redmunds You don't need that extra stuff.

@MartinMa MartinMa updated regex to catch spaces and tabs only
seems to be cleaner, also other "weird whitespace characters" in an isolated line will be flagged as an error by jslint
986eda7
@MartinMa
Contributor
MartinMa commented Aug 7, 2014

@peterflynn @MarcelGerber I updated the regex to /^[ \t]+$/gm because I think it is more straightforward and easier to read than the other variant. Also, I like the idea that other weird whitespace-like chars in an isolated line will be flagged as an error by jslint.

@ingorichter
Contributor

@MartinMa Thank you.

@ingorichter ingorichter merged commit f30febb into adobe:master Aug 7, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment