regexpu’s core functionality, i.e. `rewritePattern(pattern, flag, options)`, which enables rewriting regular expressions that make use of the ES6 `u` flag into equivalent ES5-compatible regular expression patterns.
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
data
scripts
tests
.editorconfig
.gitattributes
.gitignore
.travis.yml
LICENSE-MIT.txt
README.md
demo.js
package.json
property-escapes.md
rewrite-pattern.js

README.md

regexpu-core Build status Code coverage status

regexpu is a source code transpiler that enables the use of ES6 Unicode regular expressions in JavaScript-of-today (ES5).

regexpu-core contains regexpu’s core functionality, i.e. rewritePattern(pattern, flag), which enables rewriting regular expressions that make use of the ES6 u flag into equivalent ES5-compatible regular expression patterns.

Installation

To use regexpu-core programmatically, install it as a dependency via npm:

npm install regexpu-core --save

Then, require it:

const rewritePattern = require('regexpu-core');

API

This module exports a single function named rewritePattern.

rewritePattern(pattern, flags, options)

This function takes a string that represents a regular expression pattern as well as a string representing its flags, and returns an ES5-compatible version of the pattern.

rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'

rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'u');
// → '(?:[a-z]|\\uD834[\\uDF06-\\uDF08])'

rewritePattern('[\\u{1D306}-\\u{1D308}a-z]', 'ui');
// → '(?:[a-z\\u017F\\u212A]|\\uD834[\\uDF06-\\uDF08])'

regexpu-core can rewrite non-ES6 regular expressions too, which is useful to demonstrate how their behavior changes once the u and i flags are added:

// In ES5, the dot operator only matches BMP symbols:
rewritePattern('foo.bar');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])bar'

// But with the ES6 `u` flag, it matches astral symbols too:
rewritePattern('foo.bar', 'u');
// → 'foo(?:[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uD7FF\\uDC00-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF])bar'

The optional options argument recognizes the following properties:

dotAllFlag (default: false)

Setting this option to true enables experimental support for the s (dotAll) flag.

rewritePattern('.');
// → '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]'

rewritePattern('.', '', {
  'dotAllFlag': true
});
// → '[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF]'

rewritePattern('.', 's', {
  'dotAllFlag': true
});
// → '[\\0-\\uFFFF]'

rewritePattern('.', 'su', {
  'dotAllFlag': true
});
// → '(?:[\\0-\\uD7FF\\uE000-\\uFFFF]|[\\uD800-\\uDBFF][\\uDC00-\\uDFFF]|[\\uD800-\\uDBFF](?![\\uDC00-\\uDFFF])|(?:[^\\uD800-\\uDBFF]|^)[\\uDC00-\\uDFFF])'

unicodePropertyEscape (default: false)

Setting this option to true enables experimental support for Unicode property escapes:

rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
  'unicodePropertyEscape': true
});
// → '(?:\\uD811[\\uDC00-\\uDE46])'

lookbehind (default: false)

Setting this option to true enables experimental support for lookbehind assertions.

rewritePattern('(?<=.)a', '', {
  'lookbehind': true
});
// → '(?<=[\\0-\\t\\x0B\\f\\x0E-\\u2027\\u202A-\\uFFFF])a'

namedGroup (default: false)

Setting this option to true enables experimental support for named capture groups.

rewritePattern('(?<name>.)\k<name>', '', {
  'namedGroups': true
});
// → '(.)\1'

onNamedGroup

This option is a function that gets called when a named capture group is found. It receives two parameters: the name of the group, and its index.

rewritePattern('(?<name>.)\k<name>', '', {
  'namedGroups': true,
  onNamedGroup(name, index) {
    console.log(name, index);
    // → 'name', 1
  }
});

useUnicodeFlag (default: false)

Setting this option to true enables the use of Unicode code point escapes of the form \u{…}. Note that in regular expressions, such escape sequences only work correctly when the ES6 u flag is set. Enabling this setting often results in more compact output, although there are cases (such as \p{Lu}) where it actually increases the output size.

rewritePattern('\\p{Script_Extensions=Anatolian_Hieroglyphs}', 'u', {
  'unicodePropertyEscape': true,
  'useUnicodeFlag': true
});
// → '[\\u{14400}-\\u{14646}]'

Author

twitter/mathias
Mathias Bynens

License

regexpu-core is available under the MIT license.