New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A.p.join on cyclic arrays does not reflect web reality #289

Open
rossberg opened this Issue Jan 15, 2016 · 14 comments

Comments

Projects
None yet
6 participants
@rossberg
Member

rossberg commented Jan 15, 2016

Browsers have special measures for dealing with cyclic arrays in the A.p.join operator (and by extension, A.p.toString). Chrome, IE, Firefox, & Safari all behave as follows:

a = [1,,2]
a[1] = a
a.toString() === "1,,2"

That is, they detect the cycle and replace it with the empty string. This is not reflected by the spec, which would require toString to diverge.

Should browsers change, or the spec?

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jan 15, 2016

Member

I believe this was discussed during ES5 development and at the time Brendan took the position that this was a browser specific hack that should be in the spec (sorry, Brendan, if my recollection is wrong).

I suspect that at this point in time TC39 would more likely take the position that we need to specify web reality. Either in the main body of the spec or in Annex B.

I good first step would be for somebody to write some spec language that describes what browsers (or at least one browser) actually does.

Member

allenwb commented Jan 15, 2016

I believe this was discussed during ES5 development and at the time Brendan took the position that this was a browser specific hack that should be in the spec (sorry, Brendan, if my recollection is wrong).

I suspect that at this point in time TC39 would more likely take the position that we need to specify web reality. Either in the main body of the spec or in Annex B.

I good first step would be for somebody to write some spec language that describes what browsers (or at least one browser) actually does.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

It is annoying to spec because it requires plumbing a seen set through ToString :/

Member

bterlson commented Jan 15, 2016

It is annoying to spec because it requires plumbing a seen set through ToString :/

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

Actually that way will not work as the seen set has to be plumbed through a toString invocation which is user-visible. I'm not sure how to do this.

Member

bterlson commented Jan 15, 2016

Actually that way will not work as the seen set has to be plumbed through a toString invocation which is user-visible. I'm not sure how to do this.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

Chakra's implementation is to essentially keep a seen list in the current Realm. I suppose that would work here as well.

Member

bterlson commented Jan 15, 2016

Chakra's implementation is to essentially keep a seen list in the current Realm. I suppose that would work here as well.

@anba

This comment has been minimized.

Show comment
Hide comment
@anba

anba Jan 15, 2016

Contributor

https://gist.github.com/anba/4a154bd7143bf2bab3ef, except the differences between JSC+SM compared to Chakra+V8 when the object is added to the set.

Contributor

anba commented Jan 15, 2016

https://gist.github.com/anba/4a154bd7143bf2bab3ef, except the differences between JSC+SM compared to Chakra+V8 when the object is added to the set.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

@anba looks solid at first glance. What is the JSC+SM difference?

Member

bterlson commented Jan 15, 2016

@anba looks solid at first glance. What is the JSC+SM difference?

@anba

This comment has been minimized.

Show comment
Hide comment
@anba

anba Jan 15, 2016

Contributor

JavaScriptCore and SpiderMonkey check for cycles right after calling ToObject(this):
https://github.com/WebKit/webkit/blob/7920183734b8510146db5aded800cbd67eee1d84/Source/JavaScriptCore/runtime/ArrayPrototype.cpp#L487
https://github.com/mozilla/gecko-dev/blob/137b0dc6d16ce7065a40079d4b8027ec138d4987/js/src/jsarray.cpp#L1099

Chakra checks for cycles after calling ToString(separator) and additionally performs extra proxy checks:
https://github.com/Microsoft/ChakraCore/blob/d8aa1dd25637e749dab720e5c5732535cb913d79/lib/Runtime/Library/JavascriptArray.cpp#L4100

V8 checks for cycles after calling ToString(separator) and additionally performs extra steps for single-element arrays:
https://github.com/v8/v8/blob/25532be593bd80c9e7b3fac4ee50491159902e93/src/js/array.js#L183
https://github.com/v8/v8/blob/25532be593bd80c9e7b3fac4ee50491159902e93/src/js/array.js#L458

(Similar differences are present for Array.prototype.toLocaleString.)

Contributor

anba commented Jan 15, 2016

JavaScriptCore and SpiderMonkey check for cycles right after calling ToObject(this):
https://github.com/WebKit/webkit/blob/7920183734b8510146db5aded800cbd67eee1d84/Source/JavaScriptCore/runtime/ArrayPrototype.cpp#L487
https://github.com/mozilla/gecko-dev/blob/137b0dc6d16ce7065a40079d4b8027ec138d4987/js/src/jsarray.cpp#L1099

Chakra checks for cycles after calling ToString(separator) and additionally performs extra proxy checks:
https://github.com/Microsoft/ChakraCore/blob/d8aa1dd25637e749dab720e5c5732535cb913d79/lib/Runtime/Library/JavascriptArray.cpp#L4100

V8 checks for cycles after calling ToString(separator) and additionally performs extra steps for single-element arrays:
https://github.com/v8/v8/blob/25532be593bd80c9e7b3fac4ee50491159902e93/src/js/array.js#L183
https://github.com/v8/v8/blob/25532be593bd80c9e7b3fac4ee50491159902e93/src/js/array.js#L458

(Similar differences are present for Array.prototype.toLocaleString.)

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

@allenwb raised some issues in the gist that I will respond to here :)

Regarding how realms behave, essentially the realm that join belongs to keeps track of any array objects its seen, whether they are from another realm or the same realm. This means that you will actually join the same array twice if you can manage to get that array to be joined by join functions from different realms. This is actually interoperable behavior!

var a = [1,,2];
var $child = $.createRealm();
$child.evalInNewScript('var b = [3,,4]');
var b = $child.global.b;
a[1] = b;
b[1] = a;
print($child.global.Array.prototype.join.apply(a));

spidermonkey

1,3,1,,2,4,2

chakra

1,3,1,,2,4,2

node

1,3,1,,2,4,2

Also, absolutely code can get invoked in the middle, and this is expected. When executing Array.prototype.join, and an element has a custom toString or is a proxy or whatever, and that function causes Array.prototype.join to be applied to an array instance that is already being joined further up the stack, an empty string is returned. This is also expected and interoperable.

var a = [1,,2];
a[1] = {
  toString() {
    return '"' + a.toString() + '"';
  }
}

print(a.toString());

d8

1,"",2

chakra

1,"",2

spidermonkey

1,"",2

node

1,"",2

Member

bterlson commented Jan 15, 2016

@allenwb raised some issues in the gist that I will respond to here :)

Regarding how realms behave, essentially the realm that join belongs to keeps track of any array objects its seen, whether they are from another realm or the same realm. This means that you will actually join the same array twice if you can manage to get that array to be joined by join functions from different realms. This is actually interoperable behavior!

var a = [1,,2];
var $child = $.createRealm();
$child.evalInNewScript('var b = [3,,4]');
var b = $child.global.b;
a[1] = b;
b[1] = a;
print($child.global.Array.prototype.join.apply(a));

spidermonkey

1,3,1,,2,4,2

chakra

1,3,1,,2,4,2

node

1,3,1,,2,4,2

Also, absolutely code can get invoked in the middle, and this is expected. When executing Array.prototype.join, and an element has a custom toString or is a proxy or whatever, and that function causes Array.prototype.join to be applied to an array instance that is already being joined further up the stack, an empty string is returned. This is also expected and interoperable.

var a = [1,,2];
a[1] = {
  toString() {
    return '"' + a.toString() + '"';
  }
}

print(a.toString());

d8

1,"",2

chakra

1,"",2

spidermonkey

1,"",2

node

1,"",2

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jan 15, 2016

Member

@bterison
(oops, I actually intended to respond here rather than to the gist)

WRT the xrealm behavior: Did you try it with three references to the x-realm array? I would think that each inner x-realm join call would exit with an empty cycle list in its own realm as it has no way of knowing that it is being called by a toString/join from another realm and hence needs to preserve the list.

WRT the second issue (reentrency) the problem is that the a proxy or whatever might might start a completely unrelated toString/join that is completely unrelated to an toString/join that is pending on the stack. If the unrelated toString/join coincidentally contains a reference to sometime in the global cycle list it is poisoned by the pending operation

I think we could do something more rational in each of these cases and I double that there is user code that depends upon the current behaviors in cases like these.

Member

allenwb commented Jan 15, 2016

@bterison
(oops, I actually intended to respond here rather than to the gist)

WRT the xrealm behavior: Did you try it with three references to the x-realm array? I would think that each inner x-realm join call would exit with an empty cycle list in its own realm as it has no way of knowing that it is being called by a toString/join from another realm and hence needs to preserve the list.

WRT the second issue (reentrency) the problem is that the a proxy or whatever might might start a completely unrelated toString/join that is completely unrelated to an toString/join that is pending on the stack. If the unrelated toString/join coincidentally contains a reference to sometime in the global cycle list it is poisoned by the pending operation

I think we could do something more rational in each of these cases and I double that there is user code that depends upon the current behaviors in cases like these.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 15, 2016

Member

Did you try it with three references to the x-realm array?

Can you give me an example? I'm not sure I follow.

the problem is that the a proxy or whatever might might start a completely unrelated toString/join that is completely unrelated to an toString/join that is pending on the stack.

I can buy that this is a problem in the sense that it seems bad, but I'm not sure of a way out of it without introducing observable extra parameters to toString. It is also how implementations behave today. Seems reasonable to spec this as a starting point at least? (Assuming it works for the case you suggest above).

Member

bterlson commented Jan 15, 2016

Did you try it with three references to the x-realm array?

Can you give me an example? I'm not sure I follow.

the problem is that the a proxy or whatever might might start a completely unrelated toString/join that is completely unrelated to an toString/join that is pending on the stack.

I can buy that this is a problem in the sense that it seems bad, but I'm not sure of a way out of it without introducing observable extra parameters to toString. It is also how implementations behave today. Seems reasonable to spec this as a starting point at least? (Assuming it works for the case you suggest above).

@rossberg

This comment has been minimized.

Show comment
Hide comment
@rossberg

rossberg Jan 18, 2016

Member

One set per realm is equivalent to associating each (instance of the) join method with its own seen list, as a sort of private state of the function object. I think that makes sense, as far as sense goes for this kind of thing. Maybe it can even be specced as a special [[property]] the join function creates and maintains on itself?

Member

rossberg commented Jan 18, 2016

One set per realm is equivalent to associating each (instance of the) join method with its own seen list, as a sort of private state of the function object. I think that makes sense, as far as sense goes for this kind of thing. Maybe it can even be specced as a special [[property]] the join function creates and maintains on itself?

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Jan 18, 2016

Member

It's primarily the reentrancy issue that I'm concern about. I'm pretty sure that this could be fixed at the specification level (either by adding a an optional second argument to join for the cycle set or by specifying join in terms of an abstract operation that recurs with an cycle set argument. (and using IsArray to determine whether to recur on the abstract operator or to just toString).

In either case, this has been unspecified for a very long time. I don't see any need to rush to a problematic solution that ignores the reentrancy issue and just copies a poor implementation decision of the past. I suspect we can do better while still maintaining adequate backwards compatibility.

Member

allenwb commented Jan 18, 2016

It's primarily the reentrancy issue that I'm concern about. I'm pretty sure that this could be fixed at the specification level (either by adding a an optional second argument to join for the cycle set or by specifying join in terms of an abstract operation that recurs with an cycle set argument. (and using IsArray to determine whether to recur on the abstract operator or to just toString).

In either case, this has been unspecified for a very long time. I don't see any need to rush to a problematic solution that ignores the reentrancy issue and just copies a poor implementation decision of the past. I suspect we can do better while still maintaining adequate backwards compatibility.

@claudepache

This comment has been minimized.

Show comment
Hide comment
@claudepache

claudepache Jan 19, 2016

Contributor

@allenwb I have hard time to find a plausible scenario where a reentrant call to A.p.join with a same this argument would not highly risk to provoke an infinite cycle, unless some function involved in the cycle takes an explicit measure to prevent that.

Contributor

claudepache commented Jan 19, 2016

@allenwb I have hard time to find a plausible scenario where a reentrant call to A.p.join with a same this argument would not highly risk to provoke an infinite cycle, unless some function involved in the cycle takes an explicit measure to prevent that.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Jan 19, 2016

Member

@allenwb I would normally agree, although this was something we implemented because we found the web depended on it. I would support standardizing what is currently required to run the web and we can discuss improvements (eg. to handle any reentrancy problems). What's the downside with this approach?

Member

bterlson commented Jan 19, 2016

@allenwb I would normally agree, although this was something we implemented because we found the web depended on it. I would support standardizing what is currently required to run the web and we can discuss improvements (eg. to handle any reentrancy problems). What's the downside with this approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment