-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update: Fix no-useless-escape
false negative in regexes (fixes #7424)
#7425
Conversation
@not-an-aardvark, thanks for your PR! By analyzing the history of the files in this pull request, we identified @onurtemizkan, @vitorbal and @kaicataldo to be potential reviewers. |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment about a complex arrow function, otherwise LGTM.
* characters to be valid in general, and filter out '-' characters that appear in the middle of a | ||
* character class. | ||
*/ | ||
.filter((charInfo, index, array) => charInfo.text !== "-" || !charInfo.inCharClass || index === 0 || index === array.length - 1 || !array[index - 1].inCharClass || !array[index + 1].inCharClass) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see this extracted to a function.
* @param {Set} setB The second set | ||
* @returns {Set} The union of the two sets | ||
*/ | ||
function union(setA, setB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so cool.. I never knew abt yield*
.
Had to read up on couple of things to understand this but so awesome. 💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a super clever way to merge Sets, i like it 💯
} else { | ||
return; | ||
} | ||
parseRegExp(node.raw.slice(1, -1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I found a problem: .slice(1, -1)
will be incorrect if the regex has flags, since the trailing slash won't be at index -1.
f5b4021
to
3f4a7ca
Compare
LGTM |
@platinumazure Separated the complex behavior into its own function. The number of linting errors this causes on the existing ESLint codebase is a bit concerning 😕 |
* @returns {Set} The union of the two sets | ||
*/ | ||
function union(setA, setB) { | ||
return new Set(function *() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not function* (
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the generator-star-spacing
rule configured to enforce this spacing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, interesting choice
* @param {Set} setB The second set | ||
* @returns {Set} The union of the two sets | ||
*/ | ||
function union(setA, setB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a super clever way to merge Sets, i like it 💯
CI is failing due to this line and this line. Both of these seem to be valid linting errors, but why are they only getting reported on AppVeyor? Travis didn't report any issues, and I can't reproduce the linting error when running edit: Actually, it doesn't seem to report any linting errors in |
LGTM |
@@ -105,7 +105,7 @@ module.exports = { | |||
|
|||
const tokenAfterArrowBody = sourceCode.getTokenAfter(arrowBody); | |||
|
|||
if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[(\[\/`+-]/.test(tokenAfterArrowBody.value)) { | |||
if (tokenAfterArrowBody && tokenAfterArrowBody.type === "Punctuator" && /^[([\/`+-]/.test(tokenAfterArrowBody.value)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the \/
in this example also be flagged by the rule? It's in a character class but I don't know if the slash token is consumed greedily by the tokenizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, looks like you're right; I had thought /
always needed to be escaped, but apparently it only needs to be escaped in character classes.
Sidenote: This implies that /[/]/
is a valid regex, but oddly, /[/]/.toString()
returns /[\/]/
(i.e. it auto-escapes the slash, at least in Chrome). It doesn't seem to do this for other escaped characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for extracting that function.
I left one question, but I'm half certain of the answer (and that it would require no changes).
LGTM |
1 similar comment
LGTM |
1e32241
to
6d9bea0
Compare
LGTM |
New open questions on the best approach for this rule (should we put this behind an option, should we enhance with exception lists, etc.). Possible large ecosystem impact.
As we find new problems, there are concerns about ecosystem impact. The rule is correct in reporting all the new useless escapes, but there may be some cases where users might want to intentionally allow (or prefer) some useless escapes for readability/maintainability. So we may need to consider carefully how best to release this. |
I think we should do the following:
|
} | ||
} | ||
return { | ||
charList: state.charList.concat({text: char, escaped: state.escapeNextChar, inCharClass: state.inCharClass, index}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to create an array instance for each character? It looks to generate many useless copies.
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me.
But there is a few matter about unnecessary escapes.
"\\", | ||
".", | ||
"-", | ||
"^", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The escape of ^
in character class is unnecessary except the 1st character.
/[\^a]/; // this escape is necessary; it becomes `not a` if the `\` is removed.
/[a\^]/; // this escape is unnecessary; the `^` is a character.
"]", | ||
"|", | ||
"(", | ||
")", | ||
"b", | ||
"B", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The escape of B
in character class is unnecessary.
/[\B]/; // this escape is unnecessary; the `B` is a character.
On the other hand, the escape of \b
is necessary. \b
in character class is meaning a backspace character. (\b
outside of character class is a word boundary)
LGTM |
35ec446
to
4611d81
Compare
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, awesome!
TSC Summary: TSC question: Should we merge this PR for the upcoming release and consider the opt-out option separately? If not, how should we handle this fix? |
TSC Resolution: Merge as is. Opt-out option could be considered in the future. |
@eslint/eslint-tsc @not-an-aardvark Thanks very much for taking the time to carefully deliberate this. |
What is the purpose of this pull request? (put an "X" next to item)
[ ] Documentation update
[x] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[ ] Add autofixing to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:
See #7424
What changes did you make? (Give an overview)
This updates
no-useless-escape
to verify whether a character needs to be escaped based on its position in a regular expression. Previously,no-useless-escape
had a list of escapable characters for regexes, and it would never report cases where any of them were escaped. However, some characters only need to be escaped if they appear in a character class, and other characters only need to be escaped if they appear outside of a character class. This PR updates the rule to parse the regular expression to determine which characters are in character classes, and report the characters if they are escaped inside a character class and only need to be escaped outside a class (or vice versa).Is there anything you'd like reviewers to focus on?
We should make sure that the
REGEX_GENERAL_ESCAPES
,REGEX_CHARCLASS_ESCAPES
, andREGEX_NON_CHARCLASS_ESCAPES
lists are correct.