Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply escape-case to regex literal and remove transformation of \c escape on string literal #294

Merged
merged 7 commits into from
Jun 10, 2019

Conversation

JLHwung
Copy link
Contributor

@JLHwung JLHwung commented May 2, 2019

This PR addresses issue and feature request in #273. I've made separate commit on this.

Although the owner suggested using https://github.com/mysticatea/regexpp, I decide not to use it for performance considerations -- There would be many regular expression in a mature codebase and we can not afford the extra regular expression parse + replace cycle only to find whether there is any escapable sequence.

The original escapeWithLowercase pattern should work good enough on regular expression under most situation. I have minimally modified the code to support slightly difference escape syntax between string literal and RegExp literal/objects.

Fixes #273

`\c` is not related to control characters in strings.
@futpib
Copy link
Contributor

futpib commented May 3, 2019

A slightly off-topic nitpick.

Although the owner suggested using mysticatea/regexpp, I decide not to use it for performance considerations -- There would be many regular expression in a mature codebase and we can not afford the extra regular expression parse + replace cycle only to find whether there is any escapable sequence.

Would it really be that slow? We could eventually cache the result and reuse it in other regexp-related rules (or maybe even across projects). Even if it were substantially slower, we may value correctness and maintainability over speed, especially for a linter (unless it was really slow).

Also a mature codebase with a lot of unique regular expressions sounds like a nightmare.

I'm not saying this PR should not be accepted, only that the argument against regexpp is not convincing.

@MrHen
Copy link
Contributor

MrHen commented May 3, 2019

I agree with @futpib. Performance issues should be addressed once they have been confirmed. Does the slower approach add 0.5 seconds for every 1,000 regex statements? 1 minute for every 10 regex statements?

@JLHwung
Copy link
Contributor Author

JLHwung commented May 4, 2019

we may value correctness and maintainability over speed, especially for a linter (unless it was really slow).

Make sense, I can spare some time to try on the regexpp approach.

@JLHwung
Copy link
Contributor Author

JLHwung commented May 8, 2019

@futpib Upgraded the branch. The new lowercase detection approach is

  1. use regexpp to parse regular expression literal
  2. tap on onCharacterLeave handler, check if it is the first occurrence of escaped lowercase sequence
  3. report if invalid

The fixer approach is

  1. use regexpp to parse regular expression literal
  2. tap on onCharacterLeave handler, check if it is the first occurrence of lowercase sequence, record the position of escaped sequence in the regular expression
  3. replace the original escaped sequence with the one applied with the string escape fixer
  4. return the replaced string as the fixed regular expression literal

@sindresorhus
Copy link
Owner

@JLHwung Can you give the PR a proper title that describes what it fixes?

@sindresorhus
Copy link
Owner

I pushed some minor formatting tweaks: 38f91bb

const matches = node.raw.match(escapePatternWithLowercase);

if (matches && matches[2].slice(1).match(hasLowercaseCharacter)) {
escapeNodePosition = [node.start, node.end];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot use .start and .end. See: #272

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is different to fixer.replaceTextRange([node.start, node.end]) as the node here is actually the ASTNode from regexpp, while other node is the AST node from eslint parsers.

As start and end is official properties of a ASTNode in regexpp, I think we can leave it as-is, but I can add a comment here as a reminder.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. I missed that. It's ok to leave it. My mistake.

@@ -30,7 +81,18 @@ const create = context => {
context.report({
node,
message,
fix: fixer => fixer.replaceTextRange([node.start, node.end], fix(node.raw))
fix: fixer => fixer.replaceTextRange([node.start, node.end], fix(node.raw, escapeWithLowercase))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use replaceText(node instead of replaceTextRange[node.start, node.end]

Find the `[start, end]` position of the lowercase escape sequence in a regular expression literal ASTNode.

@param {string} value - String representation of a literal ASTNode.
@returns {number[] | null} The `[start, end]` pair if found, or null if not.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer returning undefined instead of null.

@JLHwung JLHwung changed the title Fix 273 Apply escape-case to regex literal and remove transformation of \c escape on string literal May 22, 2019
@JLHwung
Copy link
Contributor Author

JLHwung commented May 23, 2019

Thanks for the review comments. I am sorry that I am taking vacation off my laptop for a couple days. I will come back and revise on the next week. Thank you.

@JLHwung
Copy link
Contributor Author

JLHwung commented May 30, 2019

Regards to the feedback, I have pushed fixes and they are ready to review.

@sindresorhus sindresorhus merged commit 79748e1 into sindresorhus:master Jun 10, 2019
@sindresorhus
Copy link
Owner

Thanks for working on this, @JLHwung 🙌

@JLHwung JLHwung deleted the fix-273 branch June 10, 2019 15:23
@fisker fisker mentioned this pull request Feb 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply escape-case to regex literal sequences
4 participants