feat: require-unicode-regexp support v flag #17402

Merged
merged 13 commits into from Jul 28, 2023
127 changes: 126 additions & 1 deletion docs/src/rules/require-unicode-regexp.md
@@ -21,7 +21,37 @@ RegExp `u` flag has two effects:

The `u` flag disables the recovering logic Annex B defined. As a result, you can find errors early. This is similar to [the strict mode](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Strict_mode).
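
As a rough illustration of that stricter parsing, the sketch below compiles the same pattern text with and without the `u` flag. The helper name `compiles` is hypothetical; it only wraps the standard `RegExp` constructor in a `try`/`catch`.

```js
// Hypothetical helper: reports whether `pattern` compiles with the given flags.
function compiles(pattern, flags) {
    try {
        new RegExp(pattern, flags);
        return true;
    } catch (error) {
        return false;
    }
}

// A lone "{" is tolerated by the Annex B grammar but rejected in Unicode mode.
const braceLegacy = compiles("a{", "");   // true
const braceUnicode = compiles("a{", "u"); // false (SyntaxError)

// "\-" outside a character class is an escape that only Annex B permits.
const escapeLegacy = compiles("\\-", "");   // true
const escapeUnicode = compiles("\\-", "u"); // false (SyntaxError)
```

Using the constructor instead of regex literals keeps the file itself parseable, since an invalid literal would fail before any code runs.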

Therefore, the `u` flag lets us work better with regular expressions.
The RegExp `v` flag, introduced in ECMAScript 2024, is a superset of the `u` flag, and offers two more features:

1. **Unicode properties of strings**

With the Unicode property escape, you can match properties of strings (sequences of several code points), not only properties of single code points.

```js
const re = /^\p{RGI_Emoji}$/v;

// Match an emoji that consists of just 1 code point:
re.test('⚽'); // '\u26BD'
// → true ✅

// Match an emoji that consists of multiple code points:
re.test('👨🏾‍⚕️'); // '\u{1F468}\u{1F3FE}\u200D\u2695\uFE0F'
// → true ✅
```

2. **Set notation**

It allows for set operations between character classes.

```js
const re = /[\p{White_Space}&&\p{ASCII}]/v;
re.test('\n'); // → true
re.test('\u2028'); // → false
```
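
Besides intersection (`&&`), set notation also supports subtraction (`--`) and nested character classes. The following is a minimal sketch, assuming a runtime that may or may not support the `v` flag yet; `supportsVFlag` and `isAsciiNonDigit` are hypothetical helper names, not part of ESLint or the proposal.

```js
// Hypothetical helper: detects whether this engine accepts the `v` flag.
function supportsVFlag() {
    try {
        new RegExp("", "v");
        return true;
    } catch {
        return false;
    }
}

// Hypothetical helper: true if `ch` is a single ASCII character that is not a digit.
// Returns null when the engine does not support the `v` flag.
function isAsciiNonDigit(ch) {
    if (!supportsVFlag()) {
        return null;
    }

    // Subtraction: everything in \p{ASCII} except the nested class [0-9].
    const re = new RegExp("^[\\p{ASCII}--[0-9]]$", "v");

    return re.test(ch);
}
```

The pattern is built with the `RegExp` constructor so that engines without `v` support fail at the feature check instead of at parse time.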

Please see <https://github.com/tc39/proposal-regexp-v-flag> and <https://v8.dev/features/regexp-v-flag> for more details.
Review comment (Member):

These links should go in a "Further Reading" section at the bottom of the page instead of here. See the semi rule as an example: https://eslint.org/docs/latest/rules/semi#further-reading


Therefore, the `u` and `v` flags let us work better with regular expressions.

## Rule Details

@@ -54,6 +84,101 @@ const b = /bbb/giu
const c = new RegExp("ccc", "u")
const d = new RegExp("ddd", "giu")

const e = /aaa/v
const f = /bbb/giv
const g = new RegExp("ccc", "v")
const h = new RegExp("ddd", "giv")

// This rule ignores RegExp calls if the flags could not be evaluated to a static value.
function i(flags) {
    return new RegExp("eee", flags)
}
```

:::

## Options

This rule has one object option.

### `requiredUnicodeFlag`

This option can be set to `"u"` or `"v"`. By default, it is unset, and the rule accepts either the `u` or the `v` flag.
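
For instance, a project that wants to standardize on the `v` flag might configure the rule roughly as sketched below. The surrounding `config` object is only illustrative; the option name is the one this PR introduces.

```js
// Illustrative ESLint configuration object (it would typically be exported from a config file).
const config = {
    rules: {
        // Report any RegExp whose statically known flags lack "v".
        "require-unicode-regexp": ["error", { requiredUnicodeFlag: "v" }]
    }
};
```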

#### `u`

Examples of **incorrect** code for this rule with `requiredUnicodeFlag` set to `u`:

::: incorrect

```js
/*eslint require-unicode-regexp: ["error", { "requiredUnicodeFlag": "u" }] */

const a = /aaa/
const b = /bbb/gi
const c = new RegExp("ccc")
const d = new RegExp("ddd", "gi")
const e = /aaa/v
const f = /bbb/giv
const g = new RegExp("ccc", "v")
const h = new RegExp("ddd", "giv")
```

:::

Examples of **correct** code for this rule with `requiredUnicodeFlag` set to `u`:

::: correct

```js
/*eslint require-unicode-regexp: ["error", { "requiredUnicodeFlag": "u" }] */

const a = /aaa/u
const b = /bbb/giu
const c = new RegExp("ccc", "u")
const d = new RegExp("ddd", "giu")

// This rule ignores RegExp calls if the flags could not be evaluated to a static value.
function f(flags) {
    return new RegExp("eee", flags)
}
```

:::

#### `v`

Examples of **incorrect** code for this rule with `requiredUnicodeFlag` set to `v`:

::: incorrect

```js
/*eslint require-unicode-regexp: ["error", { "requiredUnicodeFlag": "v" }] */

const a = /aaa/
const b = /bbb/gi
const c = new RegExp("ccc")
const d = new RegExp("ddd", "gi")
const e = /aaa/u
const f = /bbb/giu
const g = new RegExp("ccc", "u")
const h = new RegExp("ddd", "giu")
```

:::

Examples of **correct** code for this rule with `requiredUnicodeFlag` set to `v`:

::: correct

```js
/*eslint require-unicode-regexp: ["error", { "requiredUnicodeFlag": "v" }] */

const a = /aaa/v
const b = /bbb/giv
const c = new RegExp("ccc", "v")
const d = new RegExp("ddd", "giv")

// This rule ignores RegExp calls if the flags could not be evaluated to a static value.
function f(flags) {
    return new RegExp("eee", flags)
156 changes: 115 additions & 41 deletions lib/rules/require-unicode-regexp.js
@@ -28,7 +28,7 @@ module.exports = {
type: "suggestion",

docs: {
description: "Enforce the use of `u` flag on RegExp",
description: "Enforce the use of `u` or `v` flag on RegExp",
recommended: false,
url: "https://eslint.org/docs/latest/rules/require-unicode-regexp"
},
@@ -37,36 +37,102 @@ module.exports = {

messages: {
addUFlag: "Add the 'u' flag.",
requireUFlag: "Use the 'u' flag."
requireUFlag: "Use the 'u' flag.",
addVFlag: "Add the 'v' flag.",
requireVFlag: "Use the 'v' flag."
},

schema: []
schema: [{
type: "object",
properties: {
requiredUnicodeFlag: {
type: "string",
enum: ["u", "v"]
}
},
additionalProperties: false,
description: "The required flag. If this is not specified, this rule allows both 'u' and 'v' flags."
}]
},

create(context) {

const sourceCode = context.sourceCode;

const config = context.options[0] || {};
const requiredUnicodeFlag = config.requiredUnicodeFlag || null;

const isRequiredUFlagOnly = requiredUnicodeFlag === "u";
const isRequiredVFlagOnly = requiredUnicodeFlag === "v";

/**
* Checks whether the given flags are missing the required Unicode flag.
* @param {string} flags RegExp flags
* @returns {{ shouldReportForUFlag: boolean, shouldReportForVFlag: boolean }} Which missing flag, if any, should be reported.
*/
function shouldReport(flags) {
const includesUFlag = flags.includes("u");
const includesVFlag = flags.includes("v");

const shouldReportForUFlag =
(isRequiredUFlagOnly && !includesUFlag) ||
(!isRequiredUFlagOnly && !isRequiredVFlagOnly && !includesUFlag && !includesVFlag);
const shouldReportForVFlag = isRequiredVFlagOnly && !includesVFlag;

return { shouldReportForUFlag, shouldReportForVFlag };
}

return {
"Literal[regex]"(node) {
const flags = node.regex.flags || "";

if (!flags.includes("u")) {
const { shouldReportForUFlag, shouldReportForVFlag } = shouldReport(flags);

/**
* Reports a RegExp literal without unicode flag.
* @param {boolean} forVFlag `true` if the required flag is 'v'.
* @returns {void}
*/
function reportRegExpLiteral(forVFlag) {
context.report({
messageId: "requireUFlag",
messageId: forVFlag ? "requireVFlag" : "requireUFlag",
node,
suggest: isValidWithUnicodeFlag(context.languageOptions.ecmaVersion, node.regex.pattern)
? [
{
fix(fixer) {
return fixer.insertTextAfter(node, "u");

// /test/gimuy -> /test/gimvy, /test/gimvy -> /test/gimuy
if ((forVFlag && flags.includes("u")) || (!forVFlag && flags.includes("v"))) {
const searchValue = forVFlag ? "u" : "v";
const replaceValue = forVFlag ? "v" : "u";

const newFlags = flags.replace(searchValue, replaceValue);

/**
* /test/gimyu
* ^^^^^
*/
const rangeForFlags = [node.range[1] - flags.length, node.range[1]];

return fixer.replaceTextRange(rangeForFlags, newFlags);
}

// /test/g -> /test/gu
return fixer.insertTextAfter(node, forVFlag ? "v" : "u");
},
messageId: "addUFlag"
messageId: forVFlag ? "addVFlag" : "addUFlag"
}
]
: null
});
}

if (shouldReportForUFlag) {
reportRegExpLiteral(/* forVFlag */ false);
} else if (shouldReportForVFlag) {
reportRegExpLiteral(/* forVFlag */ true);
}
},

Program(node) {
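
To make the hunk above easier to follow, here is a standalone sketch of the same reporting decision and of the flag rewrite suggested for regex literals. It restates the logic outside ESLint: `decide` and `rewriteLiteralFlags` are hypothetical names, and the option is passed as a plain argument instead of being read from `context.options`.

```js
// Returns the flag to report as missing ("u" or "v"), or null when nothing should be reported.
function decide(flags, requiredUnicodeFlag = null) {
    const hasU = flags.includes("u");
    const hasV = flags.includes("v");

    if (requiredUnicodeFlag === "u") {
        return hasU ? null : "u";
    }
    if (requiredUnicodeFlag === "v") {
        return hasV ? null : "v";
    }

    // No option set: either Unicode flag satisfies the rule; otherwise ask for "u".
    return hasU || hasV ? null : "u";
}

// Computes the new flags text for a literal: swap the other Unicode flag in place
// if it is present, otherwise append the target flag at the end.
function rewriteLiteralFlags(flags, targetFlag) {
    const otherFlag = targetFlag === "v" ? "u" : "v";

    return flags.includes(otherFlag)
        ? flags.replace(otherFlag, targetFlag)
        : flags + targetFlag;
}

const defaultCase = decide("gi");                        // "u"
const swapped = rewriteLiteralFlags("gimuy", "v");       // "gimvy"
const appended = rewriteLiteralFlags("g", "u");          // "gu"
```

Swapping in place matters because `u` and `v` cannot be combined on one regular expression, so simply appending the required flag would produce an invalid literal.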
@@ -85,42 +151,50 @@ module.exports = {
const pattern = getStringIfConstant(patternNode, scope);
const flags = getStringIfConstant(flagsNode, scope);

if (!flagsNode || (typeof flags === "string" && !flags.includes("u"))) {
context.report({
messageId: "requireUFlag",
node: refNode,
suggest: typeof pattern === "string" && isValidWithUnicodeFlag(context.languageOptions.ecmaVersion, pattern)
? [
{
fix(fixer) {
if (flagsNode) {
if ((flagsNode.type === "Literal" && typeof flagsNode.value === "string") || flagsNode.type === "TemplateLiteral") {
const flagsNodeText = sourceCode.getText(flagsNode);

return fixer.replaceText(flagsNode, [
flagsNodeText.slice(0, flagsNodeText.length - 1),
flagsNodeText.slice(flagsNodeText.length - 1)
].join("u"));
if (!flagsNode || (typeof flags === "string")) {
const { shouldReportForUFlag, shouldReportForVFlag } = shouldReport(flags || "");

if (shouldReportForUFlag || shouldReportForVFlag) {
const requireMessageId = shouldReportForVFlag ? "requireVFlag" : "requireUFlag";
const addMessageId = shouldReportForVFlag ? "addVFlag" : "addUFlag";
const flag = shouldReportForVFlag ? "v" : "u";

context.report({
messageId: requireMessageId,
node: refNode,
suggest: typeof pattern === "string" && isValidWithUnicodeFlag(context.languageOptions.ecmaVersion, pattern)
? [
{
fix(fixer) {
if (flagsNode) {
if ((flagsNode.type === "Literal" && typeof flagsNode.value === "string") || flagsNode.type === "TemplateLiteral") {
const flagsNodeText = sourceCode.getText(flagsNode);

return fixer.replaceText(flagsNode, [
flagsNodeText.slice(0, flagsNodeText.length - 1),
flagsNodeText.slice(flagsNodeText.length - 1)
].join(flag));
}

// We intentionally don't suggest concatenating + "u" to non-literals
return null;
}

// We intentionally don't suggest concatenating + "u" to non-literals
return null;
}

const penultimateToken = sourceCode.getLastToken(refNode, { skip: 1 }); // skip closing parenthesis

return fixer.insertTextAfter(
penultimateToken,
astUtils.isCommaToken(penultimateToken)
? ' "u",'
: ', "u"'
);
},
messageId: "addUFlag"
}
]
: null
});
const penultimateToken = sourceCode.getLastToken(refNode, { skip: 1 }); // skip closing parenthesis

return fixer.insertTextAfter(
penultimateToken,
astUtils.isCommaToken(penultimateToken)
? ` "${flag}",`
: `, "${flag}"`
);
},
messageId: addMessageId
}
]
: null
});
}
}
}
}
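
For `RegExp` constructor calls, the suggestion in this hunk edits the source text of the flags argument, not the evaluated flags string. A minimal sketch of that text edit follows, with a hypothetical helper name and plain strings standing in for the fixer API.

```js
// Inserts `flag` just before the closing quote or backtick of the flags argument's source text.
function addFlagToFlagsText(flagsNodeText, flag) {
    return [
        flagsNodeText.slice(0, flagsNodeText.length - 1),
        flagsNodeText.slice(flagsNodeText.length - 1)
    ].join(flag);
}

const doubleQuoted = addFlagToFlagsText('"gi"', "u"); // "giu" (still wrapped in double quotes)
const templateText = addFlagToFlagsText("`g`", "v");  // `gv` (still wrapped in backticks)
const emptyFlags = addFlagToFlagsText('""', "u");     // "u"
```

Note that this sketch, like the hunk as shown, only appends: a flags string that already contains the other Unicode flag would end up containing both.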