ECMAScript RegExp Match Indices
Latest commit 4981597 Oct 3, 2019


RegExp Match Indices for ECMAScript

ECMAScript RegExp Match Indices provide additional information about the start and end indices of captured substrings relative to the start of the input string.

A polyfill can be found in the regexp-match-indices package on NPM.

NOTE: This proposal was previously named "RegExp Match Array Offsets", but has been renamed to more accurately represent the current status of the proposal.

Status

Stage: 3

Champion: Ron Buckton (@rbuckton)

For detailed status of this proposal see TODO, below.

Authors

  • Ron Buckton (@rbuckton)

Motivations

Today, ECMAScript RegExp objects can provide information about a match when calling the exec method. The result is an Array containing the substrings that were matched, with additional properties for the input string (input), the index in the input at which the match was found (index), and a groups object containing the substrings for any named capture groups.
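
As a minimal sketch of what exec reports today, the snippet below collects the existing properties into one object. The pattern, the input string, and the summary variable are hypothetical and chosen only for illustration.

```javascript
// Hypothetical sample pattern and input, used only for illustration.
const re = /(?<year>\d{4})-(?<month>\d{2})/;
const input = "released 2019-10";
const m = re.exec(input);

// Everything exec exposes without match indices.
const summary = {
  matched: m[0],       // full matched substring: "2019-10"
  firstCapture: m[1],  // substring of capture group 1: "2019"
  index: m.index,      // start of the whole match: 9
  input: m.input,      // the original input string
  year: m.groups.year, // substring of the named group: "2019"
};

console.log(summary);
```

Only the start of the whole match is reported. Nothing in this result says where the month group begins or ends within the input.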

However, this information is not sufficient for several more advanced scenarios. For example, an ECMAScript implementation of TextMate Language syntax highlighting needs not only the index of the match but also the start and end indices of the individual capture groups.

As such, we propose the adoption of an additional indices property on the array result (the substrings array) of RegExp.prototype.exec(). This property would itself be an indices array containing a pair of start and end indices for each captured substring. The entry for any unmatched capture group would be undefined, like its corresponding element in the substrings array. In addition, the indices array would itself have a groups property containing the start and end indices for each named capture group.
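
The shape described above can be sketched as follows. This assumes an engine that implements the feature as it was later standardized in ES2022, where the d flag opts in to the indices property; this document's own examples do not use a flag. The pattern, the inputs, and the namedSpans helper are hypothetical.

```javascript
// Assumption: match indices as standardized in ES2022, where the `d` flag
// is required for `indices` to be populated on the result of exec.
const re = /(?<key>\w+)=(?<value>\w+)?/d;

// Hypothetical helper: lists the [start, end) span of every named group,
// marking groups that did not participate in the match.
function namedSpans(match) {
  return Object.entries(match.indices.groups).map(([name, range]) =>
    range === undefined
      ? { name, matched: false }
      : { name, matched: true, start: range[0], end: range[1] }
  );
}

const full = namedSpans(re.exec("  mode=fast"));
const partial = namedSpans(re.exec("x="));

console.log(full);
// [ { name: "key", matched: true, start: 2, end: 6 },
//   { name: "value", matched: true, start: 7, end: 11 } ]
console.log(partial);
// [ { name: "key", matched: true, start: 0, end: 1 },
//   { name: "value", matched: false } ]
```

The end index is exclusive, so each pair can be passed directly to String.prototype.slice.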

Prior Art

Examples

const re1 = /a+(?<Z>z)?/;

// indices are relative to start of the input string:
const s1 = "xaaaz";
const m1 = re1.exec(s1);
m1.indices[0][0] === 1;
m1.indices[0][1] === 5;
s1.slice(...m1.indices[0]) === "aaaz";

m1.indices[1][0] === 4;
m1.indices[1][1] === 5;
s1.slice(...m1.indices[1]) === "z";

m1.indices.groups["Z"][0] === 4;
m1.indices.groups["Z"][1] === 5;
s1.slice(...m1.indices.groups["Z"]) === "z";

// capture groups that are not matched return `undefined`:
const m2 = re1.exec("xaaay");
m2.indices[1] === undefined;
m2.indices.groups["Z"] === undefined;
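
The matched and unmatched cases above can be combined into one small helper, in the spirit of the highlighting scenario from the Motivations section. This is a sketch rather than part of the proposal: bracketGroup is a hypothetical name, it assumes the pattern declares at least one named group, and it assumes the d flag that engines implementing the ES2022 version of this feature require before indices is populated.

```javascript
// Assumption: the `d` flag (ES2022) is needed for `indices` to be populated.
const re1 = /a+(?<Z>z)?/d;

// Hypothetical helper: wraps the span of the named group `name` in square
// brackets. Returns the input unchanged when the pattern does not match or
// when the group did not participate in the match.
function bracketGroup(re, input, name) {
  const m = re.exec(input);
  if (m === null) return input;
  const range = m.indices.groups[name];
  if (range === undefined) return input;
  const [start, end] = range;
  return (
    input.slice(0, start) + "[" + input.slice(start, end) + "]" + input.slice(end)
  );
}

console.log(bracketGroup(re1, "xaaaz", "Z")); // "xaaa[z]"
console.log(bracketGroup(re1, "xaaay", "Z")); // "xaaay"
```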

TODO

The following is a high-level list of tasks to progress through each stage of the TC39 proposal process:

Stage 1 Entrance Criteria

  • Identified a "champion" who will advance the addition.
  • Prose outlining the problem or need and the general shape of a solution.
  • Illustrative examples of usage.
  • High-level API.

Stage 2 Entrance Criteria

Stage 3 Entrance Criteria

Stage 4 Entrance Criteria

  • Test262 acceptance tests have been written for mainline usage scenarios and merged.
  • Two compatible implementations which pass the acceptance tests:
    • V8 (tracking bug) — Implemented in v7.8 behind the --harmony-regexp-match-indices flag
    • SpiderMonkey (tracking bug) — Not yet implemented.
    • JavaScriptCore (tracking bug) — Not yet implemented.
  • A pull request has been sent to tc39/ecma262 with the integrated spec text.
  • The ECMAScript editor has signed off on the pull request.