Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upWeb compatibility risk of specified RegExp lastIndex semantics #625
Comments
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
allenwb
Jun 30, 2016
Member
I don't think "legacy semantics" is the right term here? Perhaps legacy ES5 implementation bug would be a better characterization Do all major browsers actually have this bug?
Some background, Object.freeze, strict mode, "strict writes", and the use of a strict write when updating lastIndex were all introduced by the ES5 spec. So, prior to ES5 there was no way to "freeze" lastIndex. Any code using Object.freeze could not be legacy code. If ES5 implementations allowed RegExp built-ins to silently update (or try to update) a non-writable, non-configurable lastIndex property, the implementations were buggy.
What is the buggy semantics you want and is this only match you are talking about or does the bug occur everywhere lastIndex is updated? Allowing the value of a frozen lastIndex to be modified would violate one of the fundamental invariants and would be a security hole. Silently ignoring such a write would be a contract violation that may hides bugs (because lastIndex is not updated in the manner that the spec. requires.).
Given the ES3 semantics of RegExp algorithms that update lastIndex, any code that is trying to freeze an RegExp is probably ill-conceived and likely buggy. I don't think we should memorialize such bugs.
We have talked in the past about developing a functional RegExp form that didn't use any mutable internal state such as lastIndex. That still seems like it would be a good direction to pursue.
|
I don't think "legacy semantics" is the right term here? Perhaps legacy ES5 implementation bug would be a better characterization Do all major browsers actually have this bug? Some background, What is the buggy semantics you want and is this only Given the ES3 semantics of RegExp algorithms that update We have talked in the past about developing a functional RegExp form that didn't use any mutable internal state such as |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
claudepache
Jun 30, 2016
Contributor
Seems like, historically, browsers did not throw for this test case:
x = /a/ Object.freeze(x) "b".match(x)
Note that Firefox and Safari do throw when the value of x.lastIndex will attempted to be set to a different value, e.g.:
"ca".match(Object.freeze(/a/g))According to my tests, Edge 13 is buggy in its own way: although it allows to freeze a regexp (and report that regexp as frozen through Object.isFrozen), it leaves the lastIndex property of that regexp as writable.
Note that Firefox and Safari do throw when the value of "ca".match(Object.freeze(/a/g))According to my tests, Edge 13 is buggy in its own way: although it allows to freeze a regexp (and report that regexp as frozen through |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
claudepache
Jun 30, 2016
Contributor
Don’t be confused by my last comment: the issue seems more related to non-global regexps for which the lastIndex property was historically ignored (neither read, nor updated) in some cases (including String#match).
|
Don’t be confused by my last comment: the issue seems more related to non-global regexps for which the lastIndex property was historically ignored (neither read, nor updated) in some cases (including String#match). |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Jun 30, 2016
Member
For V8'S part, our RegExp implementation executes largely based on sloppy mode JS, so we allow a bunch of failed writes. There is no information leak--the object is really frozen, just non-throwing due to the sloppy mode write.
Personally, I like the Firefox/Safari semantics, if we need to do something non-throwing for this particular case: just pass false to theone relevant Set call, right?
|
For V8'S part, our RegExp implementation executes largely based on sloppy mode JS, so we allow a bunch of failed writes. There is no information leak--the object is really frozen, just non-throwing due to the sloppy mode write. Personally, I like the Firefox/Safari semantics, if we need to do something non-throwing for this particular case: just pass |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
claudepache
Jun 30, 2016
Contributor
In order to remove some confusion from my comment. In reality, Firefox and Safari always throw when attempting to set the lastIndex property of a frozen regexp to some value, even when the value is unchanged. However:
- For Firefox you should test it on Aurora or Nightly (the new implementation of
String#matchis not yet in stable release). - Safari (Technology Preview) seems to incorrectly ignore (neither read nor update) the
lastIndexproperty of non-global RegExp.
|
In order to remove some confusion from my comment. In reality, Firefox and Safari always throw when attempting to set the lastIndex property of a frozen regexp to some value, even when the value is unchanged. However:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
See PR #627 for a possible fix. |
claudepache
referenced this issue
Jun 30, 2016
Merged
Disregard lastIndex for non-global non-sticky regexps #627
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Jun 30, 2016
I like the #627 idea. When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.
erights
commented
Jun 30, 2016
|
I like the #627 idea. When the property update is useless any, better not to do it than to cope with a meaningless failure to do it. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
allenwb
Jun 30, 2016
Member
When the property update is useless any, better not to do it than to cope with a meaningless failure to do it.
Note that the use of a RegExp instance can be distant from the original provider of the instance (ie, it may have been past through multiple layers of function parameters, value containers, etc). The actual usage site may well be dependent upon use of the updated lastIndex value and completely unprepared to deal with the fact that a "frozen" instance was passed to it. Ignoring property updates in that case may silently produce looping or erroneous results. On the other hand, throwing on such updates is more likely to call attention that this is a malformed program that needs to be repaired. BTW, I believe that was the original motivation for using strict puts in these situations (actual its the motivation for even having strict puts).
Note that the use of a RegExp instance can be distant from the original provider of the instance (ie, it may have been past through multiple layers of function parameters, value containers, etc). The actual usage site may well be dependent upon use of the updated |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
allenwb
Jun 30, 2016
Member
Before we rush into changing things, can we step back a bit and survey the actual situations?
First, concerning the various places that may try to updates to non-writable, non-configurable lastIndex RegExp properties. Are there any cases, not in conformance to the ES5/ES6 specs that are implemented with the same interoperable but non-conforming behavior across all major browser platforms. It isn't clear to me from Claude's comments that any such fully interoperable bugs exist. Can we verify? Are there test that cover these cases? If there aren't such universally implemented bugs, then we don't have a web compatibility issue that needs remediation. We just have buggy implementations that are not interoperable in some rare situations. In this case, the behavior specified in the standards should apply and implementations should fix their bugs.
Second, it would be a non-breaking spec. change to make the updates to lastIndex be non-strict puts. But that doesn't mean it is a good idea. I've already argued why it might not be. Regardless, it would be a normative change that needs to be justified on its on merit rather than as an accommodation to a buggy implementation.
I assume that somebody will bring this issue and the supporting data to the next TC39 meeting.
|
Before we rush into changing things, can we step back a bit and survey the actual situations? First, concerning the various places that may try to updates to non-writable, non-configurable Second, it would be a non-breaking spec. change to make the updates to I assume that somebody will bring this issue and the supporting data to the next TC39 meeting. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
claudepache
Jun 30, 2016
Contributor
@allenwb According to my testings, given
var rx = Object.freeze(/a/)the following expressions used to consistently not throw in browsers for various reasons:
"b".match(rx)
"b".search(rx)- in Firefox and Safari because they forget to update the lastIndex property to 0 (according to tests in #627 (comment) — although there is still some mystery for me on how Safari TP implements RegExp.prototype.@@search);
- in Edge, because it lies about freezing (the lastIndex property remains writable);
- in Chrome ≤ 50, apparently because it ignored the failed write to lastIndex.
However, the following expressions do throw in Firefox (but not in Safari or Edge for the same reasons as above):
rx.exec("b")
rx.test("b")|
@allenwb According to my testings, given var rx = Object.freeze(/a/)the following expressions used to consistently not throw in browsers for various reasons: "b".match(rx)
"b".search(rx)
However, the following expressions do throw in Firefox (but not in Safari or Edge for the same reasons as above): rx.exec("b")
rx.test("b") |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Jul 1, 2016
Member
@allenwb I won't be able to collect much data beyond the user bug report for a number of months; that's just how long it takes. This issue might not make it to the next meeting and result in a PR that is ready to merge because of that, and that's OK with me; it just means that Chrome might not ship the standard semantics in new versions due to the issue until we get this worked out, and other implementers might want to be similarly cautious.
|
@allenwb I won't be able to collect much data beyond the user bug report for a number of months; that's just how long it takes. This issue might not make it to the next meeting and result in a PR that is ready to merge because of that, and that's OK with me; it just means that Chrome might not ship the standard semantics in new versions due to the issue until we get this worked out, and other implementers might want to be similarly cautious. |
domenic
added
the
web reality
label
Jul 28, 2016
domenic
referenced this issue
Jul 28, 2016
Closed
Question: Intended semantics for setting lastIndex? #477
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
@littledan were we able to get data on this one? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
I believe we ended up settling on this as the web compat fix: #627 |
littledan commentedJun 29, 2016
I tried to ship a more spec-compliant implementation of RegExps in Chrome, but a user reported a web compat issue at https://bugs.chromium.org/p/chromium/issues/detail?id=624318
Seems like, historically, browsers did not throw for this test case:
However, ES2015 and ES5 are fairly clear that when a match isn't found, then a strict-mode write is done to set lastIndex to 0. This is tested by test262 https://github.com/tc39/test262/blob/master/test/built-ins/RegExp/prototype/exec/y-fail-lastindex-no-write.js .
Has anyone else tried shipping these semantics? I'm inclined to revert to legacy semantics for now until I can quantify the size of the breakage and know if it's OK.
[Sidebar: One argument for permanent "legacy" semantics is that it enhances the utility of frozen RegExps, but on the other hand, RegExps are already somewhat limited when frozen (e.g., in global mode, they don't make much sense).]