`prefer-regex-literals` suggestions #15029

Yash-Singh1 · 2021-09-04T22:01:25Z

What rule do you want to change?

prefer-regex-literals

Does this change cause the rule to produce more or fewer warnings?

No

How will the change be implemented? (New option, new default behavior, etc.)?

Adds fixer

Please provide some example code that this change will affect:

new RegExp('\\.')

What does the rule currently do for this code?

Reports error

What will the rule do after it's changed?

Fix it to:

/\./

The text was updated successfully, but these errors were encountered:

nzakas · 2021-09-08T00:50:36Z

Seems possible to do.

Are you willing to submit a pull request?

Yash-Singh1 · 2021-09-08T00:51:21Z

Sure, I will give this a shot.

mdjermanovic · 2021-09-08T12:28:50Z

It would be a useful enhancement.

@Yash-Singh1 any ideas on how this could be implemented?

We generally suggest that contributors wait for the issue to be accepted before working on it, but in this case I think it would be good to evaluate this enhancement on a prototype of the solution.

This is a nontrivial code transformation that should be aware of all differences between the syntax of regular expression literals and the syntax of RegExp constructor input patterns.

For example:

new RegExp("\n");

It's a valid RegExp input pattern consisting of a single code point for \n, but in regular expression literals it would be invalid:

/
/

One of the possible solutions is to use the built-in RegExp.prototype.toString. By the specification, it should be safe:

The returned String has the form of a RegularExpressionLiteral that evaluates to another RegExp object with the same behaviour as this object.

[...new RegExp("\n").toString()]; // ['/', '\', 'n', '/']

Some downsides of this solution would be:

It depends on the execution environment - if the environment doesn't support syntax used in the given regular expression, we can't transform.
It doesn't preserve syntactic choices from the original string literal:

[...new RegExp("\t").toString()]; // ['/', '\t', '/'] - this would be /    / in the source code, valid but unusual

[...new RegExp("\u1234").toString()]; // ['/', 'ሴ', '/']

nzakas · 2021-09-08T19:52:37Z

I think we can simplify this proposal such that we eliminate any regular expressions that would be problematic. Maybe to start we just say anything with \t, \u, \n, etc. won't be autofixed?

lydell · 2021-09-08T20:29:12Z

/ also sometimes needs to be escaped.

Another thing: For example RegExp("+") would throw a runtime error. Converting that to /+/ would now be a syntax error. Is it ok for an autofix to introduce syntax errors?

Similarly, RegExp("*") -> /*/ would start a block comment, which – in the worst case – could be closed by some other */ sequence and still produce “valid” code. Confusing!

So the autofix might need to know regex syntax. Just thinking out loud.

We also need to be careful so that 5/RegExp("a") isn’t converted to 5//a/. That’s a single-line comment now.

nzakas · 2021-09-09T00:41:46Z

@lydell good points!

Given all the edge cases we’ve identified, so we feel like it’s worth autofixing at this point?

Yash-Singh1 · 2021-09-09T03:09:39Z

Sorry for the late reply, I was working on the changes and doing a bit of research on how to implement this. So far I got the basic functionality written, I just need to work on the edge cases.

So the autofix might need to know regex syntax.

Yes, there should be some form of validation for the regular expression. Since ESLint already depends on regexpp and it seems to have a validator function. There is an option on the ecmaVersion:

https://github.com/mysticatea/regexpp/blob/master/src/validator.ts#L137

And I think we can pass parserOptions.ecmaVersion to the function and validate the regular expression string.

We also need to be careful so that 5/RegExp("a") isn’t converted to 5//a/. That’s a single-line comment now.

Yes, that will ensure checking for division and not autofixing when there is division. But should your example: 5/RegExp("abc") autofix to 5/RegExp(/abc/) or 5/(/abc/)?

lydell · 2021-09-09T06:00:35Z

But should your example: 5/RegExp("abc") autofix to 5/RegExp(/a/) or (/a/)?

Or 5/ /a/ (space)

ljharb · 2021-09-09T06:01:57Z

Hopefully it'd autofix to 5/(/abc/), since /a/ doesn't match the original pattern.

lydell · 2021-09-09T06:06:20Z

That was an obvious typo to me. Of course the discussion is about which solution to use for the “division plus regex should not become a singleline comment” edge case.

ljharb · 2021-09-09T06:10:51Z

Right - but what would be the point of converting RegExp('abc') to RegExp(/abc/)? The RegExp constructor is redundant there. I think it'd convert to /abc/, and only when necessary, would be parenthesized (like when it's the divisor, or when the next line has an identifier that might be interpreted as flags, or something)

mdjermanovic · 2021-09-09T08:55:58Z

We can make a list of safe preceding tokens, e.g., = and ( will cover most common cases. We could add to the list other tokens for which we are sure that scanning cannot interpret our regex literal as something else. In all other cases we can wrap the literal in (), it doesn't have to be perfect.

As for the right side, I think the only problem is adjacent words like in. This can be handled by adding a space.

mdjermanovic · 2021-09-09T10:26:01Z

I think this feature should be suggestion, rather than autofix. Often there are multiple valid patterns with the same behavior, so if user doesn't like the suggested one, they can opt to not apply it. The suggested fix should still produce valid and equivalent code, but it doesn't necessarily have to be the preferred variant, since we might not always be able to figure out which one is it.

Here's a babel repl with an example of this transformation (babel-plugin-transform-regexp-constructors). /ሴ/ is a valid replacement for RegExp("\u1234"), but probably not the preferred one.

Also, writing regex patterns in string literals is generally error-prone. For example, RegExp("a\.b") is a common mistake reported by no-useless-escape. If we autofix this to /a.b/, the error will be hidden.

Yash-Singh1 · 2021-09-09T23:33:56Z

A few more edge cases: /abc/*something would start a block comment. /abc//something would start a single-line comment.

mdjermanovic · 2021-09-10T08:34:41Z

A few more edge cases: /abc/*something would start a block comment. /abc//something would start a single-line comment.

I think this can happen only if the leftmost / was already interpreted as division, which depends only on the code before it, so the adjacent *something or /something on the right side are not causing the problem.

mdjermanovic · 2021-09-10T09:58:32Z

I'm not too concerned about the part where we are inserting /foo/ in the place of RegExp("foo") in the source code. We can make a list of tokens that cannot be at the end of expressions (that's where / would be seen as division) and check if the token before RegExp("foo") is in the list. If it is, then it's safe to insert /foo/ without parentheses, and astUtils.canTokensBeAdjacent can tell us if it needs spaces on both ends. If it isn't in the list, we'll insert (/foo/). This check doesn't have to be exhaustive (I believe only =, ( and , would cover the majority of cases where regular expressions appear in a source code), the worst case is that we'll sometimes insert a pair of parentheses that aren't really necessary.

I'm more concerned about how to get /foo/ from RegExp("foo"). We have a string value of the pattern and a raw string literal, but neither of them can be used directly as the content between /.../

mdjermanovic · 2021-09-10T10:22:14Z

Another thing: For example RegExp("+") would throw a runtime error. Converting that to /+/ would now be a syntax error. Is it ok for an autofix to introduce syntax errors?

That wouldn't be ok, RegExp("+") can be in a dead code or wrapped in try-catch, while /+/ breaks parsing and thus the whole script (or at least it should by the spec, in V8 it's a runtime error). We should only fix valid regular expressions.

Yash-Singh1 · 2021-09-12T18:22:04Z

Should RegExp(1) autofix to /1/? Currently, the prefer-regex-literals rule only detects strings.

ljharb · 2021-09-12T18:40:06Z

Yes, I’d think so.

Yash-Singh1 · 2021-09-13T00:32:48Z

I'm not too concerned about the part where we are inserting /foo/ in the place of RegExp("foo") in the source code. We can make a list of tokens that cannot be at the end of expressions (that's where / would be seen as division) and check if the token before RegExp("foo") is in the list. If it is, then it's safe to insert /foo/ without parentheses
...
If it isn't in the list, we'll insert (/foo/). This check doesn't have to be exhaustive (I believe only =, ( and , would cover the majority of cases where regular expressions appear in a source code), the worst case is that we'll sometimes insert a pair of parentheses that aren't really necessary.

I am just curious, instead of whitelisting tokens, why can't we disallow the / token instead, because that would cover more edge case tokens.

, and astUtils.canTokensBeAdjacent can tell us if it needs spaces on both ends.

Can you help me with an example where we would require space on both ends?

mdjermanovic · 2021-09-13T07:58:16Z

Should RegExp(1) autofix to /1/? Currently, the prefer-regex-literals rule only detects strings.

No, if it isn't reported then it shouldn't be fixed. We could discuss in a separate issue whether or not this should be reported.

mdjermanovic · 2021-09-13T08:09:51Z

, and astUtils.canTokensBeAdjacent can tell us if it needs spaces on both ends.

Can you help me with an example where we would require space on both ends?

a/RegExp("foo")in b would need spaces on both ends: a/ /foo/ in b

Without spaces, it would be a//foo/in b - which is a followed by a line comment. Because of the adjacent / on the left side, we have to add a space on the left, and then it becomes a/ /foo/in b, but this would be interpreted as a regular expression /foo/in ("SyntaxError: Invalid regular expression flags"), so we have to add a space on the right side too.

mdjermanovic · 2021-09-13T08:36:35Z

I am just curious, instead of whitelisting tokens, why can't we disallow the / token instead, because that would cover more edge case tokens.

/ can be in the list of allowed tokens, it just needs a space.

Disallowed preceding tokens should be ), literals, identifiers etc., for example:

x = y
RegExp("foo").test(x) ? bar() : baz()

When autofixed to /foo/, this becomes:

x = y
/foo/.test(x) ? bar() : baz()

This is interpreted as x = y / foo...

Now I realize that parentheses wouldn't solve the problem as they would be interpreted as function call, so it's probably best to not autofix the RegExp if it isn't preceded by one of the allowed tokens (=, (, ;, ,, ...).

Fixes eslint#15029

github-actions · 2021-11-30T22:34:01Z

Oops! It looks like we lost track of this issue. What do we want to do here? This issue will auto-close in 7 days without an update.

ljharb · 2021-11-30T22:53:46Z

There's an open PR: #15077

nzakas · 2021-12-01T01:23:44Z

@mdjermanovic seems like you were driving this. How should we proceed?

mdjermanovic · 2021-12-01T10:39:01Z

By the proof of concept #15077, this looks feasible and safe to me. I support this change, and I'm willing to champion it.

nzakas · 2021-12-02T01:19:14Z

I’ll 👍

* New: Autofix support to prefer-regex-literals Fixes #15029 * Use canTokensBeAdjacent * Fix for NULL * Switch to validatePattern and validateFlags * Fix for unicode * Apply a few suggestions from code review * Fix: Double Escaping? * Tests and fixes for no-unicode regexp * New: Drop usage of getStaticValue * Fix: Remove whitespace changes, fix jsdoc type, and convert to suggestion for unexpectedRegExp * New: More test cases for . * Remove meta.docs.suggestion Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com> * Fix linting * Don't fix NULL * Remove redundant wrapping suggestions for now * String.raw can have problematic chars * Remove fixable * Fix messed up char increase * Apply suggestion from code review Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com> * chore: use characterNode.raw instead of characterNode.value * chore: do a bit of simplification of onCharacterEnter * Apply suggestions from code review Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com> * chore: more changes following code review * chore: Use reliable way of testing if spacing needed Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com> * diff msg for suggestion than main warning * chore: stricter testing * Apply suggestions from code review Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com> Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com>

Yash-Singh1 added enhancement This change enhances an existing feature of ESLint rule Relates to ESLint's core rules triage An ESLint team member will look at this issue soon labels Sep 4, 2021

eslint-github-bot bot added this to Needs Triage in Triage Sep 4, 2021

nzakas removed the triage An ESLint team member will look at this issue soon label Sep 8, 2021

nzakas moved this from Needs Triage to Feedback Needed in Triage Sep 8, 2021

Yash-Singh1 changed the title ~~prefer-regex-literal autofix~~ prefer-regex-literals autofix Sep 9, 2021

Yash-Singh1 added a commit to Yash-Singh1/eslint that referenced this issue Sep 17, 2021

Add autofix support to prefer-regex-literals

04a3fab

Fixes eslint#15029

Yash-Singh1 added a commit to Yash-Singh1/eslint that referenced this issue Sep 17, 2021

New: Autofix support to prefer-regex-literals

d6d044a

Fixes eslint#15029

Yash-Singh1 mentioned this issue Sep 17, 2021

feat: Suggestions support for prefer-regex-literals #15077

Merged

1 task

Yash-Singh1 added a commit to Yash-Singh1/eslint that referenced this issue Sep 17, 2021

New: Autofix support to prefer-regex-literals

f3fb852

Fixes eslint#15029

Yash-Singh1 added a commit to Yash-Singh1/eslint that referenced this issue Sep 17, 2021

New: Autofix support to prefer-regex-literals

bac91cb

Fixes eslint#15029

Yash-Singh1 changed the title ~~prefer-regex-literals autofix~~ prefer-regex-literals suggestions Oct 1, 2021

github-actions bot added the Stale label Nov 30, 2021

mdjermanovic removed the Stale label Dec 1, 2021

mdjermanovic self-assigned this Dec 1, 2021

nzakas added the accepted There is consensus among the team that this change meets the criteria for inclusion label Dec 2, 2021

nzakas moved this from Feedback Needed to Ready to Implement in Triage Dec 2, 2021

mdjermanovic moved this from Ready to Implement to Pull Request Opened in Triage Dec 2, 2021

mdjermanovic closed this as completed in #15077 Dec 17, 2021

Triage automation moved this from Pull Request Opened to Complete Dec 17, 2021

eslint-github-bot bot locked and limited conversation to collaborators Jun 16, 2022

eslint-github-bot bot added the archived due to age This issue has been archived; please open a new issue for any further discussion label Jun 16, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`prefer-regex-literals` suggestions #15029

`prefer-regex-literals` suggestions #15029

Yash-Singh1 commented Sep 4, 2021 •

edited

nzakas commented Sep 8, 2021

Yash-Singh1 commented Sep 8, 2021

mdjermanovic commented Sep 8, 2021

nzakas commented Sep 8, 2021

lydell commented Sep 8, 2021 •

edited

nzakas commented Sep 9, 2021

Yash-Singh1 commented Sep 9, 2021 •

edited

lydell commented Sep 9, 2021

ljharb commented Sep 9, 2021

lydell commented Sep 9, 2021

ljharb commented Sep 9, 2021 •

edited

mdjermanovic commented Sep 9, 2021

mdjermanovic commented Sep 9, 2021

Yash-Singh1 commented Sep 9, 2021 •

edited

mdjermanovic commented Sep 10, 2021

mdjermanovic commented Sep 10, 2021

mdjermanovic commented Sep 10, 2021

Yash-Singh1 commented Sep 12, 2021

ljharb commented Sep 12, 2021

Yash-Singh1 commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

github-actions bot commented Nov 30, 2021

ljharb commented Nov 30, 2021

nzakas commented Dec 1, 2021

mdjermanovic commented Dec 1, 2021

nzakas commented Dec 2, 2021

prefer-regex-literals suggestions #15029

prefer-regex-literals suggestions #15029

Comments

Yash-Singh1 commented Sep 4, 2021 • edited

nzakas commented Sep 8, 2021

Yash-Singh1 commented Sep 8, 2021

mdjermanovic commented Sep 8, 2021

nzakas commented Sep 8, 2021

lydell commented Sep 8, 2021 • edited

nzakas commented Sep 9, 2021

Yash-Singh1 commented Sep 9, 2021 • edited

lydell commented Sep 9, 2021

ljharb commented Sep 9, 2021

lydell commented Sep 9, 2021

ljharb commented Sep 9, 2021 • edited

mdjermanovic commented Sep 9, 2021

mdjermanovic commented Sep 9, 2021

Yash-Singh1 commented Sep 9, 2021 • edited

mdjermanovic commented Sep 10, 2021

mdjermanovic commented Sep 10, 2021

mdjermanovic commented Sep 10, 2021

Yash-Singh1 commented Sep 12, 2021

ljharb commented Sep 12, 2021

Yash-Singh1 commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

mdjermanovic commented Sep 13, 2021

github-actions bot commented Nov 30, 2021

ljharb commented Nov 30, 2021

nzakas commented Dec 1, 2021

mdjermanovic commented Dec 1, 2021

nzakas commented Dec 2, 2021

`prefer-regex-literals` suggestions #15029

`prefer-regex-literals` suggestions #15029

Yash-Singh1 commented Sep 4, 2021 •

edited

lydell commented Sep 8, 2021 •

edited

Yash-Singh1 commented Sep 9, 2021 •

edited

ljharb commented Sep 9, 2021 •

edited

Yash-Singh1 commented Sep 9, 2021 •

edited