Implement 're' shorthand with single-escape policy
DmitrySoshnikov committed Apr 24, 2017
1 parent 2c4ef19 commit 4f05044
Showing 9 changed files with 217 additions and 50 deletions.
101 changes: 73 additions & 28 deletions README.md
@@ -11,8 +11,9 @@ Enables modern RegExp features in JavaScript.
- [Named capturing groups](#named-capturing-groups)
- [Extended x-flag](#extended-x-flag)
- [Plugin options](#plugin-options)
- [`includeRuntime` option](#includeruntime-option)
- [`features` option](#features-option)
- [`useRe` option](#usere-option)
- [`useRuntime` option](#useruntime-option)
- [Usage](#usage)
- [Via `.babelrc`](#via-babelrc)
- [Via CLI](#via-cli)
@@ -99,18 +100,85 @@ new RegExp(`
Translated into:

```js
new RegExp('(\\d{4})-(\\d{2})-(\\d{2})', '');
/(\d{4})-(\d{2})-(\d{2})/;
```

## Plugin options

The plugin supports the following options.

### `includeRuntime` option
### `features` option

This option allows choosing which specific transformations to apply. Available features are:

- `dotAll`
- `namedCapturingGroups`
- `xFlag`

which can be specified as an extra object for the plugin:

```json
{
"plugins": ["transform-modern-regexp", {
"features": [
"namedCapturingGroups",
"xFlag"
]
}]
}
```

> NOTE: if omitted, all features are used by default.
### `useRe` option

This option enables a convenient `re` shorthand, which allows using multiline regexes with _single escape for meta-characters_ (just like in regular expression literals).

Taking the example of the date regexp using the standard `RegExp` constructor:

```js
new RegExp(`
# A regular expression for date.
(?<year>\\d{4})- # year part of a date
(?<month>\\d{2})- # month part of a date
(?<day>\\d{2}) # day part of a date
`, 'x');
```

we see inconvenient double-escaping of `\\d` (and similarly for other meta-characters). The `re` shorthand allows using single escaping:

> NOTE: the `includeRuntime` option is not implemented yet. Track [issue #3](https://github.com/DmitrySoshnikov/babel-plugin-transform-modern-regexp/issues/3) for details.
```js
re`/
# A regular expression for date.
(?<year>\d{4})- # year part of a date
(?<month>\d{2})- # month part of a date
(?<day>\d{2}) # day part of a date
/x`;
```

As we can see, `re` accepts a regexp in the _literal notation_, which unifies the usage format.

In both cases it's translated to a simple regexp literal, so there is no runtime overhead:

> NOTE: `includeRuntime` is not required: if e.g. named groups are used mostly for readability, the `includeRuntime` can be omitted. If you need to access actual group names on the matched results, the runtime support should be used.
```js
/(\d{4})-(\d{2})-(\d{2})/
```

> NOTE: `re` supports only template literals without expressions. Be careful also with `/${4}/`: this is treated as a template literal expression, and should be written as `/\${4}/` instead.

> NOTE: `\\1` backreferences should still be escaped with _double backslashes_, because template literals do not allow `\1` (it is treated as an octal escape).
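
The two notes above can be illustrated with plain `String.raw`, which exposes the same raw template text that a tag such as `re` would see. This is only a sketch: `normalizeBackrefs` is a hypothetical helper name, and its replacement mirrors the backreference handling described for this commit rather than any public API of the plugin.

```js
// Hypothetical helper: collapse a double-escaped backreference (`\\1`)
// in raw template text into a single-escaped one (`\1`).
function normalizeBackrefs(raw) {
  return raw.replace(/\\\\(\d+)/g, '\\$1');
}

// Raw text keeps both backslashes of the `\\1` backreference.
const raw = String.raw`/(?<name>x)\\1/`;
const normalized = normalizeBackrefs(raw);

// `\${4}` keeps the `$` literal instead of starting a `${...}` expression.
const dollars = String.raw`/\${4}/`;

console.log(raw);        // /(?<name>x)\\1/
console.log(normalized); // /(?<name>x)\1/
console.log(dollars);    // /\${4}/
```

Meta-characters such as `\d` need no extra escaping in the raw text; only numeric backreferences and `${` need the special care shown here.
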
### `useRuntime` option

> NOTE: the `useRuntime` option is not implemented yet. Track [issue #3](https://github.com/DmitrySoshnikov/babel-plugin-transform-modern-regexp/issues/3) for details.

> NOTE: `useRuntime` is not required: if e.g. named groups are used mostly for readability, `useRuntime` can be omitted. If you need to access actual group names on the matched results, the runtime support should be used.

This option enables usage of a supporting runtime for the transformed regexes. The `RegExpTree` class is a thin wrapper on top of a native regexp, and has an identical API.

@@ -140,29 +208,6 @@ const result = re.exec('2017-04-17');
console.log(result.groups.year); // 2017
```

### `features` option

This option allows choosing which specific transformations to apply. Available features are:

- `dotAll`
- `namedCapturingGroups`
- `xFlag`

which can be specified as an extra object for the plugin:

```json
{
"plugins": ["transform-modern-regexp", {
"features": [
"namedCapturingGroups",
"xFlag"
]
}]
}
```

> NOTE: if omitted, all features are used by default.
## Usage

### Via `.babelrc`
2 changes: 1 addition & 1 deletion __tests__/fixtures/integration/expected-subset.js
@@ -1 +1 @@
const re = new RegExp('(.)+\\1\\1', 'su');
const re = /(.)+\1\1/su;
2 changes: 1 addition & 1 deletion __tests__/fixtures/integration/expected.js
@@ -1 +1 @@
const re = new RegExp('([\\0-\\u{10FFFF}])+\\1\\1', 'u');
const re = /([\0-\u{10FFFF}])+\1\1/u;
3 changes: 3 additions & 0 deletions __tests__/fixtures/re/expected.js
@@ -0,0 +1,3 @@
const dateRe = /(\d{4})-(\d{2})-(\d{2})/;

const otherRe = /(x)\1\1/;
10 changes: 10 additions & 0 deletions __tests__/fixtures/re/input.js
@@ -0,0 +1,10 @@
const dateRe = re`/
# A regular expression for date.
(?<year>\d{4})- # year part of a date
(?<month>\d{2})- # month part of a date
(?<day>\d{2}) # day part of a date
/x`;

const otherRe = re`/(?<name>x)\\1\k<name>/`;
3 changes: 3 additions & 0 deletions __tests__/fixtures/re/options.json
@@ -0,0 +1,3 @@
{
"useRe": true
}
2 changes: 1 addition & 1 deletion __tests__/fixtures/x-flag/expected.js
@@ -1 +1 @@
const re = new RegExp('(\\d{4})-(\\d{2})-(\\d{2})', '');
const re = /(\d{4})-(\d{2})-(\d{2})/;
10 changes: 9 additions & 1 deletion __tests__/modern-regexp-test.js
@@ -19,9 +19,17 @@ describe('modern-regexp-test', () => {
const fixtureDir = path.join(fixturesDir, caseName);
const inputPath = path.join(fixtureDir, 'input.js');

let options = {};

const optionsFile = path.join(fixtureDir, 'options.json');

if (fs.existsSync(optionsFile)) {
options = require(optionsFile);
}

const actual = transformFileSync(inputPath, {
'plugins': [
plugin
[plugin, options]
]
}).code;

134 changes: 116 additions & 18 deletions index.js
@@ -29,7 +29,20 @@ const regexpTree = require('regexp-tree');
*
* /(\d{4})-(\d{2})-(\d{2})/
*
* Note: if `includeRuntime` option is passed, this is translated into:
* ------------------------------------------------------------------
* 1. The `features` option.
*
* The `features` option allows specifying specific regexp features
* to be applied. Available are:
*
* - `dotAll` - enables handling of `s` flag
* - namedCapturingGroups - enables handling of named groups
* - xFlag - enables handling of `x` flag
*
* ------------------------------------------------------------------
* 2. The `useRuntime` option.
*
* Note: if `useRuntime` option is passed, this is translated into:
*
* const RegExpTree = require('regexp-tree-runtime');
*
@@ -52,53 +65,129 @@ const regexpTree = require('regexp-tree');
* In case of using runtime, it should be included as a dependency in your
* package.json.
*
* If group names are used mostly for readability, `includeRuntime` may be
* If group names are used mostly for readability, `useRuntime` may be
* omitted.
*
* ------------------------------------------------------------------
* 3. The `re` shorthand (`useRe` option)
*
* The `useRe` option enables usage of the re`...` pattern. This handles
* the global `re` function, where regular expressions can be used with
* single escaping.
*
* Using simple `RegExp` (note double escape `\\d` as per JS strings):
*
* new RegExp(`
*
* (?<year>\\d{2})-
* (?<month>\\d{2})-
* (?<day>\\d{2})
*
* `, 'x');
*
* vs. using `re` (note single escape for `\d`):
*
* re`/
*
* (?<year>\d{2})-
* (?<month>\d{2})-
* (?<day>\d{2})
*
* /x`
*/
module.exports = ({types: t}) => {

/**
* Creates a `RegExpLiteral` node.
*/
function toRegExpLiteral(raw) {
const slashIndex = raw.lastIndexOf('/');

const pattern = raw.slice(1, slashIndex);
// Flags start after the closing slash, so skip the slash itself.
const flags = raw.slice(slashIndex + 1);

const re = t.regExpLiteral(
pattern,
flags,
);

re.extra = {
raw,
};

return re;
}

return {
pre(state) {
if (state.opts.includeRuntime) {
throw new Error(`includeRuntime is not implemented yet.`);
if (state.opts.useRuntime) {
throw new Error(`useRuntime is not implemented yet.`);
}
},

visitor: {

// Handle `/foo/i`.
/**
* Handle `/foo/i`.
*/
RegExpLiteral({node}, state) {
Object.assign(node, getTranslatedData(node.extra.raw, state));
},

// Handle `new RegExp('foo', 'i')`.
NewExpression({node}, state) {
/**
* Handle re`/<body>/<flags>` pattern.
* Translate to `/doubleEscape(<body>)/<flags>`
*/
TaggedTemplateExpression(path, state) {
const {node} = path;

if (!state.opts.useRe || !isReTemplate(node)) {
return;
}

let re = node.quasi.quasis[0].value.raw;

// Handle \\\\1 -> \\1. In templates \\1 should be used instead of
// \1 since \1 is treated as an octal number, which is not allowed
// in template strings.
re = re.replace(/\\\\(\d+)/g, '\\$1');

path.replaceWith(toRegExpLiteral(re));
},

/**
* Handle `new RegExp(<body>, <flags>)`.
*
* Translate to /<body>/<flags>
*/
NewExpression(path, state) {
const {node} = path;

if (!isNewRegExp(node)) {
return;
}

let origPattern;
let pattern;

if (node.arguments[0].type === 'StringLiteral') {
origPattern = node.arguments[0].value;
pattern = node.arguments[0].value;
} else if (node.arguments[0].type === 'TemplateLiteral') {
origPattern = node.arguments[0].quasis[0].value.cooked;
pattern = node.arguments[0].quasis[0].value.cooked;
}

let origFlags = '';
let flags = '';

if (node.arguments[1]) {
if (node.arguments[1].type === 'StringLiteral') {
origFlags = node.arguments[1].value;
flags = node.arguments[1].value;
} else if (node.arguments[1].type === 'TemplateLiteral') {
origFlags = node.arguments[1].quasis[0].value.cooked;
flags = node.arguments[1].quasis[0].value.cooked;
}
}

const origRe = `/${origPattern}/${origFlags}`;
const {pattern, flags} = getTranslatedData(origRe, state);
const re = `/${pattern}/${flags}`;

node.arguments[0] = t.stringLiteral(pattern);
node.arguments[1] = t.stringLiteral(flags);
path.replaceWith(toRegExpLiteral(re));
}
},
};
@@ -136,4 +225,13 @@ function isNewRegExp(node) {
node.arguments[0].quasis.length === 1)
)
);
}
}

function isReTemplate(node) {
return (
node.tag.type === 'Identifier' &&
node.tag.name === 're' &&
node.quasi.type === 'TemplateLiteral' &&
node.quasi.quasis.length === 1
);
}
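
For reference, splitting a literal-notation string such as `/body/flags` into its pattern and flags can be sketched as below. `splitRegExpLiteral` is a hypothetical name used only for illustration, not a function from this commit. The sketch assumes the flags are everything after the last slash (hence `slashIndex + 1`) and that flags never contain `/`.

```js
// Hypothetical helper: split "/body/flags" into its two parts.
function splitRegExpLiteral(raw) {
  // The last slash is the closing delimiter of the literal.
  const slashIndex = raw.lastIndexOf('/');

  return {
    // Skip the opening slash, stop before the closing one.
    pattern: raw.slice(1, slashIndex),
    // Everything after the closing slash is the flag string.
    flags: raw.slice(slashIndex + 1),
  };
}

const {pattern, flags} = splitRegExpLiteral('/(\\d{4})-(\\d{2})/i');
const re = new RegExp(pattern, flags);

console.log(pattern);            // (\d{4})-(\d{2})
console.log(flags);              // i
console.log(re.test('2017-04')); // true
```

Using `lastIndexOf` rather than `indexOf` keeps slashes inside the body (for example in a character class) as part of the pattern.
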
