New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Web compatibility risk of specified RegExp lastIndex semantics #625

Closed
littledan opened this Issue Jun 29, 2016 · 13 comments

Comments

Projects
None yet
6 participants
@littledan
Member

littledan commented Jun 29, 2016

I tried to ship a more spec-compliant implementation of RegExps in Chrome, but a user reported a web compat issue at https://bugs.chromium.org/p/chromium/issues/detail?id=624318

Seems like, historically, browsers did not throw for this test case:

x = /a/
Object.freeze(x)
"b".match(x)

However, ES2015 and ES5 are fairly clear that when a match isn't found, then a strict-mode write is done to set lastIndex to 0. This is tested by test262 https://github.com/tc39/test262/blob/master/test/built-ins/RegExp/prototype/exec/y-fail-lastindex-no-write.js .

Has anyone else tried shipping these semantics? I'm inclined to revert to legacy semantics for now until I can quantify the size of the breakage and know if it's OK.

[Sidebar: One argument for permanent "legacy" semantics is that it enhances the utility of frozen RegExps, but on the other hand, RegExps are already somewhat limited when frozen (e.g., in global mode, they don't make much sense).]

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jun 30, 2016

Member

I don't think "legacy semantics" is the right term here? Perhaps legacy ES5 implementation bug would be a better characterization Do all major browsers actually have this bug?

Some background, Object.freeze, strict mode, "strict writes", and the use of a strict write when updating lastIndex were all introduced by the ES5 spec. So, prior to ES5 there was no way to "freeze" lastIndex. Any code using Object.freeze could not be legacy code. If ES5 implementations allowed RegExp built-ins to silently update (or try to update) a non-writable, non-configurable lastIndex property, the implementations were buggy.

What is the buggy semantics you want and is this only match you are talking about or does the bug occur everywhere lastIndex is updated? Allowing the value of a frozen lastIndex to be modified would violate one of the fundamental invariants and would be a security hole. Silently ignoring such a write would be a contract violation that may hides bugs (because lastIndex is not updated in the manner that the spec. requires.).

Given the ES3 semantics of RegExp algorithms that update lastIndex, any code that is trying to freeze an RegExp is probably ill-conceived and likely buggy. I don't think we should memorialize such bugs.

We have talked in the past about developing a functional RegExp form that didn't use any mutable internal state such as lastIndex. That still seems like it would be a good direction to pursue.

Member

allenwb commented Jun 30, 2016

I don't think "legacy semantics" is the right term here? Perhaps legacy ES5 implementation bug would be a better characterization Do all major browsers actually have this bug?

Some background, Object.freeze, strict mode, "strict writes", and the use of a strict write when updating lastIndex were all introduced by the ES5 spec. So, prior to ES5 there was no way to "freeze" lastIndex. Any code using Object.freeze could not be legacy code. If ES5 implementations allowed RegExp built-ins to silently update (or try to update) a non-writable, non-configurable lastIndex property, the implementations were buggy.

What is the buggy semantics you want and is this only match you are talking about or does the bug occur everywhere lastIndex is updated? Allowing the value of a frozen lastIndex to be modified would violate one of the fundamental invariants and would be a security hole. Silently ignoring such a write would be a contract violation that may hides bugs (because lastIndex is not updated in the manner that the spec. requires.).

Given the ES3 semantics of RegExp algorithms that update lastIndex, any code that is trying to freeze an RegExp is probably ill-conceived and likely buggy. I don't think we should memorialize such bugs.

We have talked in the past about developing a functional RegExp form that didn't use any mutable internal state such as lastIndex. That still seems like it would be a good direction to pursue.

@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jun 30, 2016

Contributor

Seems like, historically, browsers did not throw for this test case:

x = /a/
Object.freeze(x)
"b".match(x)

Note that Firefox and Safari do throw when the value of x.lastIndex will attempted to be set to a different value, e.g.:

"ca".match(Object.freeze(/a/g))

According to my tests, Edge 13 is buggy in its own way: although it allows to freeze a regexp (and report that regexp as frozen through Object.isFrozen), it leaves the lastIndex property of that regexp as writable.

Contributor

claudepache commented Jun 30, 2016

Seems like, historically, browsers did not throw for this test case:

x = /a/
Object.freeze(x)
"b".match(x)

Note that Firefox and Safari do throw when the value of x.lastIndex will attempted to be set to a different value, e.g.:

"ca".match(Object.freeze(/a/g))

According to my tests, Edge 13 is buggy in its own way: although it allows to freeze a regexp (and report that regexp as frozen through Object.isFrozen), it leaves the lastIndex property of that regexp as writable.

@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jun 30, 2016

Contributor

Don’t be confused by my last comment: the issue seems more related to non-global regexps for which the lastIndex property was historically ignored (neither read, nor updated) in some cases (including String#match).

Contributor

claudepache commented Jun 30, 2016

Don’t be confused by my last comment: the issue seems more related to non-global regexps for which the lastIndex property was historically ignored (neither read, nor updated) in some cases (including String#match).

@littledan

This comment has been minimized.

Show comment
Hide comment
@littledan

littledan Jun 30, 2016

Member

For V8'S part, our RegExp implementation executes largely based on sloppy mode JS, so we allow a bunch of failed writes. There is no information leak--the object is really frozen, just non-throwing due to the sloppy mode write.

Personally, I like the Firefox/Safari semantics, if we need to do something non-throwing for this particular case: just pass false to theone relevant Set call, right?

Member

littledan commented Jun 30, 2016

For V8'S part, our RegExp implementation executes largely based on sloppy mode JS, so we allow a bunch of failed writes. There is no information leak--the object is really frozen, just non-throwing due to the sloppy mode write.

Personally, I like the Firefox/Safari semantics, if we need to do something non-throwing for this particular case: just pass false to theone relevant Set call, right?

@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jun 30, 2016

Contributor

In order to remove some confusion from my comment. In reality, Firefox and Safari always throw when attempting to set the lastIndex property of a frozen regexp to some value, even when the value is unchanged. However:

  • For Firefox you should test it on Aurora or Nightly (the new implementation of String#match is not yet in stable release).
  • Safari (Technology Preview) seems to incorrectly ignore (neither read nor update) the lastIndex property of non-global RegExp.
Contributor

claudepache commented Jun 30, 2016

In order to remove some confusion from my comment. In reality, Firefox and Safari always throw when attempting to set the lastIndex property of a frozen regexp to some value, even when the value is unchanged. However:

  • For Firefox you should test it on Aurora or Nightly (the new implementation of String#match is not yet in stable release).
  • Safari (Technology Preview) seems to incorrectly ignore (neither read nor update) the lastIndex property of non-global RegExp.
@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jun 30, 2016

Contributor

See PR #627 for a possible fix.

Contributor

claudepache commented Jun 30, 2016

See PR #627 for a possible fix.

@erights

This comment has been minimized.

Show comment
Hide comment
@erights

erights Jun 30, 2016

I like the #627 idea. When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.

erights commented Jun 30, 2016

I like the #627 idea. When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jun 30, 2016

Member

@erights

When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.

Note that the use of a RegExp instance can be distant from the original provider of the instance (ie, it may have been past through multiple layers of function parameters, value containers, etc). The actual usage site may well be dependent upon use of the updated lastIndex value and completely unprepared to deal with the fact that a "frozen" instance was passed to it. Ignoring property updates in that case may silently produce looping or erroneous results. On the other hand, throwing on such updates is more likely to call attention that this is a malformed program that needs to be repaired. BTW, I believe that was the original motivation for using strict puts in these situations (actual its the motivation for even having strict puts).

Member

allenwb commented Jun 30, 2016

@erights

When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.

Note that the use of a RegExp instance can be distant from the original provider of the instance (ie, it may have been past through multiple layers of function parameters, value containers, etc). The actual usage site may well be dependent upon use of the updated lastIndex value and completely unprepared to deal with the fact that a "frozen" instance was passed to it. Ignoring property updates in that case may silently produce looping or erroneous results. On the other hand, throwing on such updates is more likely to call attention that this is a malformed program that needs to be repaired. BTW, I believe that was the original motivation for using strict puts in these situations (actual its the motivation for even having strict puts).

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jun 30, 2016

Member

Before we rush into changing things, can we step back a bit and survey the actual situations?

First, concerning the various places that may try to updates to non-writable, non-configurable lastIndex RegExp properties. Are there any cases, not in conformance to the ES5/ES6 specs that are implemented with the same interoperable but non-conforming behavior across all major browser platforms. It isn't clear to me from Claude's comments that any such fully interoperable bugs exist. Can we verify? Are there test that cover these cases? If there aren't such universally implemented bugs, then we don't have a web compatibility issue that needs remediation. We just have buggy implementations that are not interoperable in some rare situations. In this case, the behavior specified in the standards should apply and implementations should fix their bugs.

Second, it would be a non-breaking spec. change to make the updates to lastIndex be non-strict puts. But that doesn't mean it is a good idea. I've already argued why it might not be. Regardless, it would be a normative change that needs to be justified on its on merit rather than as an accommodation to a buggy implementation.

I assume that somebody will bring this issue and the supporting data to the next TC39 meeting.

Member

allenwb commented Jun 30, 2016

Before we rush into changing things, can we step back a bit and survey the actual situations?

First, concerning the various places that may try to updates to non-writable, non-configurable lastIndex RegExp properties. Are there any cases, not in conformance to the ES5/ES6 specs that are implemented with the same interoperable but non-conforming behavior across all major browser platforms. It isn't clear to me from Claude's comments that any such fully interoperable bugs exist. Can we verify? Are there test that cover these cases? If there aren't such universally implemented bugs, then we don't have a web compatibility issue that needs remediation. We just have buggy implementations that are not interoperable in some rare situations. In this case, the behavior specified in the standards should apply and implementations should fix their bugs.

Second, it would be a non-breaking spec. change to make the updates to lastIndex be non-strict puts. But that doesn't mean it is a good idea. I've already argued why it might not be. Regardless, it would be a normative change that needs to be justified on its on merit rather than as an accommodation to a buggy implementation.

I assume that somebody will bring this issue and the supporting data to the next TC39 meeting.

@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jun 30, 2016

Contributor

@allenwb According to my testings, given

var rx = Object.freeze(/a/)

the following expressions used to consistently not throw in browsers for various reasons:

"b".match(rx)
"b".search(rx)
  • in Firefox and Safari because they forget to update the lastIndex property to 0 (according to tests in #627 (comment) — although there is still some mystery for me on how Safari TP implements RegExp.prototype.@@search);
  • in Edge, because it lies about freezing (the lastIndex property remains writable);
  • in Chrome ≤ 50, apparently because it ignored the failed write to lastIndex.

However, the following expressions do throw in Firefox (but not in Safari or Edge for the same reasons as above):

rx.exec("b")
rx.test("b")
Contributor

claudepache commented Jun 30, 2016

@allenwb According to my testings, given

var rx = Object.freeze(/a/)

the following expressions used to consistently not throw in browsers for various reasons:

"b".match(rx)
"b".search(rx)
  • in Firefox and Safari because they forget to update the lastIndex property to 0 (according to tests in #627 (comment) — although there is still some mystery for me on how Safari TP implements RegExp.prototype.@@search);
  • in Edge, because it lies about freezing (the lastIndex property remains writable);
  • in Chrome ≤ 50, apparently because it ignored the failed write to lastIndex.

However, the following expressions do throw in Firefox (but not in Safari or Edge for the same reasons as above):

rx.exec("b")
rx.test("b")
@littledan

This comment has been minimized.

Show comment
Hide comment
@littledan

littledan Jul 1, 2016

Member

@allenwb I won't be able to collect much data beyond the user bug report for a number of months; that's just how long it takes. This issue might not make it to the next meeting and result in a PR that is ready to merge because of that, and that's OK with me; it just means that Chrome might not ship the standard semantics in new versions due to the issue until we get this worked out, and other implementers might want to be similarly cautious.

Member

littledan commented Jul 1, 2016

@allenwb I won't be able to collect much data beyond the user bug report for a number of months; that's just how long it takes. This issue might not make it to the next meeting and result in a PR that is ready to merge because of that, and that's OK with me; it just means that Chrome might not ship the standard semantics in new versions due to the issue until we get this worked out, and other implementers might want to be similarly cautious.

@ljharb

This comment has been minimized.

Show comment
Hide comment
@ljharb

ljharb Mar 21, 2018

Member

@littledan were we able to get data on this one?

Member

ljharb commented Mar 21, 2018

@littledan were we able to get data on this one?

@littledan

This comment has been minimized.

Show comment
Hide comment
@littledan

littledan Apr 4, 2018

Member

I believe we ended up settling on this as the web compat fix: #627

Member

littledan commented Apr 4, 2018

I believe we ended up settling on this as the web compat fix: #627

@littledan littledan closed this Apr 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment