Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: no-misleading-character-class granular errors #17515

Conversation

JoshuaKGoldberg
Copy link
Contributor

@JoshuaKGoldberg JoshuaKGoldberg commented Aug 30, 2023

Prerequisites checklist

What is the purpose of this pull request? (put an "X" next to an item)

[ ] Documentation update
[ ] Bug fix (template)
[ ] New rule (template)
[x] Changes an existing rule (template)
[ ] Add autofix to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:

What rule do you want to change?

no-misleading-character-class

What change do you want to make (place an "X" next to just one item)?

[x] Generate more warnings
[ ] Generate fewer warnings
[ ] Implement autofix
[ ] Implement suggestions

How will the change be implemented (place an "X" next to just one item)?

[ ] A new option
[x] A new default behavior
[ ] Other

Please provide some example code that this change will affect:

var r = /[👶🏻]/

What does the rule currently do for this code?

Currently it gives one complaint for the two matched surrogate pairs.

What will the rule do after it's changed?

Give one complaint per surrogate pair.

Latest playground of cases

Fixes #16682.

@netlify
Copy link

netlify bot commented Aug 30, 2023

Deploy Preview for docs-eslint canceled.

Name Link
🔨 Latest commit b838339
🔍 Latest deploy log https://app.netlify.com/sites/docs-eslint/deploys/659b10d530f9340008b04a37

@eslint-github-bot eslint-github-bot bot added the feature This change adds a new feature to ESLint label Aug 30, 2023
@JoshuaKGoldberg JoshuaKGoldberg marked this pull request as ready for review August 30, 2023 18:22
@JoshuaKGoldberg JoshuaKGoldberg requested a review from a team as a code owner August 30, 2023 18:22
@mdjermanovic mdjermanovic added rule Relates to ESLint's core rules accepted There is consensus among the team that this change meets the criteria for inclusion labels Aug 30, 2023
@JoshuaKGoldberg JoshuaKGoldberg marked this pull request as draft August 31, 2023 14:16
@JoshuaKGoldberg JoshuaKGoldberg marked this pull request as ready for review September 5, 2023 02:32
JoshuaKGoldberg and others added 2 commits September 5, 2023 13:46
Co-authored-by: Francesco Trotta <github@fasttime.org>
@fasttime
Copy link
Member

fasttime commented Sep 6, 2023

Thanks for this PR @JoshuaKGoldberg! Reconsidering the original intent of the rule I'm not sure if we should try to report the exact position of every invalid/misleading character in all possible situations.

The problem is that sometimes there is just no one-to-one correspondence between a positions in the source code and a resulting character in the regex pattern. Even when a correspondence exists, it is not always trivial to calculate just looking at the AST.

The following are examples where the reported start or end positions are off somehow in the new implementation (the patterns should be all equivalent IINM, but it doesn't even matter here):

Screen Shot 2023-09-06 at 07 23 02
Playground link

To improve this situation I imagine that we could:

  • Check if the first argument of RegExp is a simple string, like a string literal or a template literal without interpolations.
  • If the first argument is not a simple string, do not report granular problems.
  • If the first argument is a simple string, figure out how to match characters in the regexp pattern to ranges in the source code (not sure what's the best way to do that). If no bijective mapping can be found, do not report granular problems.
  • When granular problems are reported, use the start and end positions in the source code as determined by the previously generated mapping.

Maybe you have an idea that works better. I would also like to hear what @mdjermanovic thinks.

@fasttime
Copy link
Member

fasttime commented Sep 6, 2023

Also noticed that sometimes the rule suggests to add the u flag where there is already one, for example with new RegExp("[\\ud83d\\u{dc4d}]", "u"); (playground link). This is an existing issue not related to this PR.

@mdjermanovic
Copy link
Member

Also noticed that sometimes the rule suggests to add the u flag where there is already one, for example with new RegExp("[\\ud83d\\u{dc4d}]", "u"); (playground link). This is an existing issue not related to this PR.

Good catch! Can you open an issue for this bug?

@mdjermanovic
Copy link
Member

  • Check if the first argument of RegExp is a simple string, like a string literal or a template literal without interpolations.
  • If the first argument is not a simple string, do not report granular problems.
  • If the first argument is a simple string, figure out how to match characters in the regexp pattern to ranges in the source code (not sure what's the best way to do that). If no bijective mapping can be found, do not report granular problems.
  • When granular problems are reported, use the start and end positions in the source code as determined by the previously generated mapping.

I think that reporting exact positions should be limited to cases where it's a simple string/template literal and the raw text is 1-1 with the pattern. The logic could be the same as in #16544 , where we decided to provide suggestions (which need exact positions to fix the code) only in those cases. Here's related discussion on that PR: #16544 (comment).

@fasttime
Copy link
Member

fasttime commented Sep 6, 2023

Also noticed that sometimes the rule suggests to add the u flag where there is already one, for example with new RegExp("[\\ud83d\\u{dc4d}]", "u"); (playground link). This is an existing issue not related to this PR.

Good catch! Can you open an issue for this bug?

Done: #17537

@github-actions
Copy link

Hi everyone, it looks like we lost track of this pull request. Please review and see what the next steps are. This pull request will auto-close in 7 days without an update.

@github-actions github-actions bot added the Stale label Sep 16, 2023
@JoshuaKGoldberg
Copy link
Contributor Author

Ah, I'd been waiting to see conversation resolve and missed that it had - will pick this back up. Thanks!

@Rec0iL99 Rec0iL99 removed the Stale label Sep 18, 2023
@nzakas
Copy link
Member

nzakas commented Sep 20, 2023

@JoshuaKGoldberg just a heads up that there's a merge conflict.

@github-actions
Copy link

Hi everyone, it looks like we lost track of this pull request. Please review and see what the next steps are. This pull request will auto-close in 7 days without an update.

@github-actions github-actions bot added the Stale label Sep 30, 2023
@mdjermanovic
Copy link
Member

There are merge conflicts in the test file due to 70a686b.

Copy link

Hi everyone, it looks like we lost track of this pull request. Please review and see what the next steps are. This pull request will auto-close in 7 days without an update.

@nzakas
Copy link
Member

nzakas commented Dec 28, 2023

Sorry, we are deep in v9 planning and focusing on things that are blockers for v9. We will get back around to this!

In the meantime, can you please rebase to get the latest updates?

Copy link
Member

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for all your work on this.

Waiting for @mdjermanovic to verify his review comments.

Comment on lines +338 to +346
if (kind === "zwj" && locs.length > 1) {
context.report({
loc: {
start: locs[0].start,
end: locs[1].end
},
messageId: kind,
suggest
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't show all if there are more of these in the pattern, but we can fix that in a follow-up.

@mdjermanovic
Copy link
Member

I'll just close-reopen for a moment to see if there's anything that should be updated for CI to pass.

@mdjermanovic mdjermanovic reopened this Jan 7, 2024
Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mdjermanovic mdjermanovic merged commit 287c4b7 into eslint:main Jan 7, 2024
15 checks passed
@JoshuaKGoldberg JoshuaKGoldberg deleted the no-misleading-character-class-granular-errors branch January 7, 2024 23:18
@JoshuaKGoldberg
Copy link
Contributor Author

Thank you for all the help! I know this was a slog to review on your end, and really appreciate it. And I learned a lot about regular expressions as a result. ❤️

Frodo tearfully saying 'it's done' in front of flowing magma

bmish added a commit to bmish/eslint that referenced this pull request Jan 16, 2024
* main:
  chore: Split Docs CI from core CI (eslint#17897)
  fix: Ensure config keys are printed for config errors (eslint#17980)
  chore: delete relative-module-resolver.js (eslint#17981)
  docs: migration guide entry for `no-inner-declarations` (eslint#17977)
  docs: Update README
  feat: maintain latest ecma version in ESLint (eslint#17958)
  feat!: no-inner-declaration new default behaviour and option (eslint#17885)
  feat: add `no-useless-assignment` rule (eslint#17625)
  fix: `no-misleading-character-class` edge cases with granular errors (eslint#17970)
  feat: `no-misleading-character-class` granular errors (eslint#17515)
  docs: fix number of code-path events on custom rules page (eslint#17969)
  docs: reorder entries in v9 migration guide (eslint#17967)
  fix!: handle `--output-file` for empty output when saving to disk (eslint#17957)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted There is consensus among the team that this change meets the criteria for inclusion feature This change adds a new feature to ESLint rule Relates to ESLint's core rules
Projects
Status: Complete
5 participants