Normative: A more precise Array.prototype.sort #1585

Open: wants to merge 7 commits into base: master

@szuend

commented Jun 14, 2019

tl;dr This PR intends to nail down some parts of Array.prototype.sort to reduce the number of cases that result in an implementation-defined sort order.

Overview

The Array.prototype.sort procedure that this PR proposes can be summarized as follows:

  1. Collect all existing, non-undefined values in the range of [0, [[length]]) into a temporary list using [[Get]] (let the length of this list be n). undefineds are counted and not added to this temporary list (let this count be m).
  2. Sort this temporary list using an implementation-specific sort algorithm.
  3. Write back n sorted values using [[Set]].
  4. Write back m undefineds.
  5. Perform [[Delete]] on integer-indexed properties in the range of [n + m, [[length]]) as holes are moved to the end of the sorting range.
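
The five steps above can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not the spec text: `sortSketch` is a hypothetical name, the `in` operator stands in for HasProperty, `obj.length` is assumed to already be a valid length, and the built-in sort of a temporary array stands in for the implementation-specific algorithm.

```javascript
// Hypothetical sketch of the proposed procedure; the PR's spec text is authoritative.
function sortSketch(obj, comparefn) {
  'use strict'; // failed writes and deletes throw, similar to Set(..., true) and DeletePropertyOrThrow
  const len = obj.length; // assumption: already a valid array length
  const items = [];
  let undefinedCount = 0;

  // Step 1: collect existing, non-undefined values via [[Get]]; count undefineds.
  for (let k = 0; k < len; k++) {
    if (k in obj) {
      const value = obj[k];
      if (value === undefined) undefinedCount++;
      else items.push(value);
    }
  }

  // Step 2: sort the temporary list (the algorithm itself stays implementation-specific).
  items.sort(comparefn);

  // Step 3: write back the n sorted values via [[Set]].
  let index = 0;
  for (const value of items) obj[index++] = value;

  // Step 4: write back the m undefineds.
  for (let i = 0; i < undefinedCount; i++) obj[index++] = undefined;

  // Step 5: delete the remaining indices so that holes end up at the end.
  for (; index < len; index++) delete obj[index];

  return obj;
}
```

All reads happen before the first write, and all writes happen before the first delete, which is what makes the observable order independent of the sorting algorithm in this sketch.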

Advantages

The main advantage is that the following cases no longer result in an implementation-defined sort order:

  • sparse objects with elements on the prototype chain (e.g. an array with a hole at index 3 when Object.prototype[3] = 42)
  • objects whose [[Get]], [[Set]] behavior for integer-indexed properties inside the sorting range is not the "ordinary" implementation (i.e. a getter or setter or a proxy).
  • non-extensible objects (see next paragraph)
  • objects with non-configurable or non-writable properties inside the sorting range (see next paragraph)

The reason for this is that [[Get]], [[Set]] and [[Delete]] are now called in a well-defined order, regardless of the chosen sorting algorithm of any engine. (Previously, non-extensible objects and non-configurable/non-writable properties could result in exceptions at any point in time during sorting, depending on the implementation.)
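
As a rough illustration of that last point, the following sketch sorts an array with a read-only element and records when the exception occurs. The function name `sortWithReadOnlyElement` is made up for this example, and the described ordering assumes an engine that behaves as proposed in this PR.

```javascript
// Hypothetical demo: a read-only element makes the write-back phase throw.
function sortWithReadOnlyElement() {
  const events = [];
  const array = [3, 1, 2];
  // Make index 1 non-writable; the other attributes stay unchanged.
  Object.defineProperty(array, 1, { value: 1, writable: false });

  let error = null;
  try {
    array.sort((a, b) => {
      events.push('compare');
      return a - b;
    });
  } catch (e) {
    error = e;
    events.push('throw');
  }
  return { events, error, array };
}
```

Under the proposed semantics, every 'compare' entry is recorded before the 'throw' entry, because the failing [[Set]] can only happen during write-back.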

Additionally, ToString operations or comparison functions that throw now leave the object as-is (modulo side-effects and changes caused by [[Get]]).
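
A minimal sketch of this behaviour, assuming an engine that sorts a temporary copy as proposed here (the helper name `sortWithThrowingComparator` is hypothetical):

```javascript
// Hypothetical demo: the comparison function throws on its third call.
function sortWithThrowingComparator() {
  const array = [5, 4, 3, 2, 1];
  const before = array.slice(); // snapshot for comparison
  let calls = 0;
  let caught = null;
  try {
    array.sort((a, b) => {
      if (++calls === 3) throw new Error('stop');
      return a - b;
    });
  } catch (e) {
    caught = e;
  }
  // With the proposed semantics nothing has been written back yet,
  // so `array` is expected to still equal `before`.
  return { before, after: array, caught };
}
```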

Web compatibility

We believe this change to be web-compatible, given that it has been (for the most part) shipping since Chrome 74, with the remaining bits shipping since Chrome 76.

  • Starting with V8 v7.4 (Chrome 74), V8 copies non-undefined values into a temporary list for sorting, as described in this PR. However, compacting all values at the start of the sorting range (removing holes) still happened on the object itself and was observable.

  • As of V8 v7.6 (Chrome version 76.0.3806.0) V8's implementation of Array.prototype.sort fully behaves as described by this PR.

Memory concerns

Implementers might be concerned about the added O(n) memory requirements of Array.prototype.sort. The first version in V8 that applied Array.prototype.sort to a temporary copy landed with Chrome 74. Worried about the impact on memory consumption (especially on mobile), we analysed memory usage of some 1000 websites and did not see any change in peak memory consumption.

Variants

This is only an initial draft of how Array.prototype.sort could be specified in a more precise manner. A few variations come to mind:

  • Instead of counting undefineds, they could be collected as well and passed to SortCompare. This would require SortCompare to be left as-is and handle undefineds appropriately.
  • The spec could go further. In the case of an undefined comparison function, the initial collection phase could call ToString directly. This means the temporary list would contain pairs consisting of the original value and the result of ToString for the respective value. This would further reduce the cases of implementation-defined sort order.
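
The second variant could look roughly like the sketch below. It is not part of this PR; `sortByPrecomputedStrings` is a hypothetical helper, and it assumes the input is the already collected temporary list (dense, without undefineds).

```javascript
// Hypothetical sketch of the "ToString during collection" variant.
function sortByPrecomputedStrings(values) {
  // Collection phase: ToString runs exactly once per value (template literals apply ToString).
  const pairs = values.map((value) => ({ value, key: `${value}` }));

  // Sorting phase: compare the cached strings by code units, never calling ToString again.
  pairs.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

  return pairs.map((pair) => pair.value);
}
```

Because user code (toString, valueOf) only runs during collection, the number and order of those calls no longer depends on how many comparisons the sorting algorithm performs.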

Ref. #302.
@mathiasbynens @ajklein

spec.html Outdated
<p>The sort order is also implementation-defined if _obj_ is sparse and any of the following conditions are true:</p>
<ul>
<li>
IsExtensible(_obj_) is *false*.

@szuend

szuend Jun 14, 2019

Author

This condition can safely be removed because [[Set]] and [[Delete]] are now called in a well-defined order. Additionally, if any ToString or comparison function tries to add properties to the object, the sort order would still be the same across implementations, as no values have been written back yet.

spec.html Outdated
IsExtensible(_obj_) is *false*.
</li>
<li>
Any integer index property of _obj_ whose name is a nonnegative integer less than _len_ is a data property whose [[Configurable]] attribute is *false*.

@szuend

szuend Jun 14, 2019

Author

Same argument as above.

spec.html Outdated
<p>The sort order is also implementation-defined if any of the following conditions are true:</p>
<ul>
<li>
If _obj_ is an exotic object (including Proxy exotic objects) whose behaviour for [[Get]], [[Set]], [[Delete]], and [[GetOwnProperty]] is not the ordinary object implementation of these internal methods.

@szuend

szuend Jun 14, 2019

Author

Same as above, these are invoked in the same order, independent of the concrete sorting algorithm used.

spec.html Outdated
<emu-alg>
1. Perform an implementation-dependent sequence of calls to the [[Get]] and [[Set]] internal methods of _obj_, to the DeletePropertyOrThrow and HasOwnProperty abstract operation with _obj_ as the first argument, and to SortCompare (described below), such that:
* The property key argument for each call to [[Get]], [[Set]], HasOwnProperty, or DeletePropertyOrThrow is the string representation of a nonnegative integer less than _len_.
* The arguments for calls to SortCompare are values returned by a previous call to the [[Get]] internal method, unless the properties accessed by those previous calls did not exist according to HasOwnProperty. If both prospective arguments to SortCompare correspond to non-existent properties, use *+0* instead of calling SortCompare. If only the first prospective argument is non-existent use +1. If only the second prospective argument is non-existent use -1.

@szuend

szuend Jun 14, 2019

Author

The new algorithm above might need some clarification. That is, SortCompare should only be called with values that were retrieved by [[Get]] and are part of items.

spec.html Outdated
1. Perform an implementation-dependent sequence of calls to the [[Get]] and [[Set]] internal methods of _obj_, to the DeletePropertyOrThrow and HasOwnProperty abstract operation with _obj_ as the first argument, and to SortCompare (described below), such that:
* The property key argument for each call to [[Get]], [[Set]], HasOwnProperty, or DeletePropertyOrThrow is the string representation of a nonnegative integer less than _len_.
* The arguments for calls to SortCompare are values returned by a previous call to the [[Get]] internal method, unless the properties accessed by those previous calls did not exist according to HasOwnProperty. If both prospective arguments to SortCompare correspond to non-existent properties, use *+0* instead of calling SortCompare. If only the first prospective argument is non-existent use +1. If only the second prospective argument is non-existent use -1.
* If _obj_ is not sparse then DeletePropertyOrThrow must not be called.

@szuend

szuend Jun 14, 2019

Author

This is implicitly the case, as HasProperty(i) for any index i in the range of [0, [[length]]) always returns true for a non-sparse object, and thus |items| + number of undefineds should be equal to [[length]].

@@ -32650,9 +32638,6 @@ <h1>Array.prototype.sort ( _comparefn_ )</h1>
<h1>Runtime Semantics: SortCompare ( _x_, _y_ )</h1>
<p>The SortCompare abstract operation is called with two arguments _x_ and _y_. It also has access to the _comparefn_ argument passed to the current invocation of the `sort` method. The following steps are taken:</p>
<emu-alg>
1. If _x_ and _y_ are both *undefined*, return *+0*.
1. If _x_ is *undefined*, return 1.
1. If _y_ is *undefined*, return -1.

@szuend

szuend Jun 14, 2019

Author

undefineds are never part of items, and thus, x and y can never be undefined.
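
With the three undefined checks removed, the remaining steps of SortCompare can be sketched as follows. This is a hedged illustration only: `sortCompareSketch` is a hypothetical name, and the comparison function is passed as a parameter here, whereas the spec operation reads it from the current sort invocation.

```javascript
// Hypothetical sketch of SortCompare without the undefined checks.
function sortCompareSketch(x, y, comparefn) {
  if (comparefn !== undefined) {
    const v = +comparefn(x, y); // unary plus applies ToNumber
    return Number.isNaN(v) ? 0 : v; // NaN is treated as +0
  }
  // Default ordering: compare the ToString results by code units.
  const xString = `${x}`;
  const yString = `${y}`;
  if (xString < yString) return -1;
  if (yString < xString) return 1;
  return 0;
}
```

For example, without a comparison function 10 orders before 9 in this sketch, because "10" is smaller than "9" as a string.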

@mathiasbynens

Member

commented Jun 14, 2019

To make this easier to follow, Simon left inline comments on the diff. We've also prepared a slide deck that walks through the proposed algorithm in a more visual way, which I'll present at the upcoming TC39 meeting in July.

One question people might have is: to what extent does this PR match implementation reality? We mentioned that V8 implements Array#sort as described in the PR, but what do other engines do? Below are some test cases.

test-1.js

Here's the example we walk through in the slide deck. We're sorting an array that contains some undefined values and some holes, one of which results in Object.prototype[2] being read.

All engines produce the same result here.

Code

function log(array) {
  let buf = '';
  for (let index = 0; index < array.length; index++) {
    if (array.hasOwnProperty(index)) {
      buf += String(array[index]);
    } else {
      buf += 'hole';
    }
    if (index < array.length - 1) buf += ',';
  }
  print(buf);
}

/* */

Object.prototype[2] = 4;
const array = [undefined, 3, /*hole*/, 2, undefined, /*hole*/, 1];
array.sort();
log(array);

Output

$ eshost -s test.js
#### Chakra, JavaScriptCore, SpiderMonkey, V8, XS
1,2,3,4,undefined,undefined,hole

test-2.js

A more interesting version of the same example, with an added getter/setter pair named 2 on Object.prototype. When we hit the hole at index 2, we hit the accessor.

Except for JavaScriptCore, all engines have the same behavior: the getter is called once, the setter is called once with value 3, and then the sorted result is logged. JavaScriptCore instead hits the accessor multiple times, and eventually throws a TypeError (which seems like a bug regardless of this proposed change).

Code

function log(array) {
  let buf = '';
  for (let index = 0; index < array.length; index++) {
    if (array.hasOwnProperty(index)) {
      buf += String(array[index]);
    } else {
      buf += 'hole';
    }
    if (index < array.length - 1) buf += ',';
  }
  print(buf);
}

/* */

Object.defineProperty(Object.prototype, '2', {
  get() { print('get'); return 4; },
  set(v) { print(`set with ${v}`); }
});
const array = [undefined, 3, /*hole*/, 2, undefined, /*hole*/, 1];
array.sort();
log(array);

Output

$ eshost -s test.js
#### Chakra, SpiderMonkey, V8, XS
get
set with 3
1,2,hole,4,undefined,undefined,hole

#### JavaScriptCore
get
set with 2
get
get
set with [object Object]
get
TypeError: undefined is not an object

test-3.js

This example uses a custom comparison function which throws in the middle of sorting (on the third call). The proposed change doesn't modify the original array until the very end, and so in this case, the original array remains unchanged.

Once again, this matches existing implementations, except for JavaScriptCore, which mutates the original array until the exception is thrown.

Code

function log(array) {
  let buf = '';
  for (let index = 0; index < array.length; index++) {
    if (array.hasOwnProperty(index)) {
      buf += String(array[index]);
    } else {
      buf += 'hole';
    }
    if (index < array.length - 1) buf += ',';
  }
  print(buf);
}

/* */

Object.defineProperty(Object.prototype, '2', {
  get() { print('get'); return 4; },
  set(v) { print(`set with ${v}`); }
});
const array = [undefined, 3, /*hole*/, 2, undefined, /*hole*/, 1];
let count = 0;
try {
  array.sort((a, b) => {
    if (++count === 3) {
      throw new Error('lolwat');
    }
    return b - a;
  });
} catch (exception) {
  print(exception);
}
log(array);

Output

$ eshost -s test.js
#### Chakra, SpiderMonkey, V8, XS
get
Error: lolwat
undefined,3,hole,2,undefined,hole,1

#### JavaScriptCore
get
set with 2
get
get
set with 4
get
Error: lolwat
3,4,hole,1,undefined,undefined,hole
@ljharb
Member

left a comment

I’d be very interested in seeing the toString step included as well, to reduce variance further.

@jmdyck
Collaborator

left a comment

Since %TypedArray%.prototype.sort is defined roughly as a diff against Array.prototype.sort, it may need collateral changes.

For instance, this sentence would no longer apply?:

The implementation-defined sort order condition for exotic objects is not applied by %TypedArray%.prototype.sort.

Also, its reference to "the entry steps in Array.prototype.sort" would be a bit vaguer, since those steps no longer constitute a distinct <emu-alg> of their own. We could just leave it as is and trust that people will figure it out, or change it to something like "the first three steps in Array.prototype.sort", but then that's less robust to future change (e.g. if A.p.sort were to insert some Asserts at the start).

@bakkot

Contributor

commented Jul 24, 2019

I'd like to see more investigation of the behavior of existing engines in edge cases. Off the top of my head:

  • accessors on the array itself
  • accessors on a different object, where the array's prototype has been manually changed to that object
  • non-writeable or non-configurable properties on the array
  • non-writeable or non-configurable properties on Object.prototype in a position where there is a hole in the array
  • what happens when values in the array are mutated as a side effect of an accessor or the comparison function or the toString of an array element
  • what happens when holes in the array are created or removed as a side effect of an accessor or the comparison function or the toString of an array element
  • what happens when the length of the array is mutated as a side effect of an accessor or the comparison function or the toString of an array element
  • what happens for arrays of length 0 or 1, especially when there is a hole or an accessor pair on the array at index 0, or when there is a side effect in the comparison function
  • which proxy traps are triggered when invoking .sort on a proxy, and in what order

I am sure there are more cases to consider, too.

@anba

Collaborator

commented Jul 24, 2019

I'd like to see more investigation of the behavior of existing engines in edge cases.

SpiderMonkey always first collects all elements, stores them in a temporary object, and then performs the sort in that temporary object:

The elements in the sorted object are then written back with [[Set]] semantics and holes are deleted through [[Delete]]:

@szuend szuend force-pushed the szuend:array-sort branch from daad2a2 to 2c1431a Sep 19, 2019

@szuend

Author

commented Sep 19, 2019

I'd like to see more investigation of the behavior of existing engines in edge cases.

I implemented some of these here. All tests were run using eshost -s <file>. The output of each test case can be found here:

Accessor on the array itself (accessors-on-array.js)

#### chakra, spidermonkey, v8
get [2]
get [3]
set [2] with d
set [3] with undefined
get [2]
get [3]
a,c,d,undefined,undefined,undefined,undefined,hole

#### javascriptcore
get [2]
get [3]
set [2] with d
set [3] with undefined
get [2]
get [2]
set [2] with d
get [2]
get [3]
a,c,d,undefined,undefined,undefined,undefined,hole

#### xs
get [2]
get [3]
set [2] with d
TypeError: ?.set: cannot coerce undefined to object

Accessors on the array's prototype object with a hole (accessors-on-object-proto.js)

#### chakra, spidermonkey, v8, xs
get [2]
set [2] with c
a,b,hole,d,undefined,undefined,undefined,hole

#### javascriptcore
get [2]
set [2] with a
get [2]
get [2]
set [2] with c
a,b,hole,d,undefined,undefined,undefined,hole

Non-writable or non-configurable elements on the array or its prototype
non-configurable-element.js

#### chakra

TypeError: Object doesn't support this action

#### javascriptcore

TypeError: Unable to delete property.

#### spidermonkey

TypeError: property 7 is non-configurable and can't be deleted

#### v8

TypeError: Cannot delete property '6' of [object Array]

#### xs

TypeError: Array.prototype.sort: delete 6: not configurable

non-configurable-proto-hole.js

#### chakra, javascriptcore, spidermonkey, v8, xs
a,b,c,d,foo,undefined,undefined,hole

non-writeable-element.js

#### chakra

TypeError: Object doesn't support this action

#### javascriptcore

TypeError: Attempted to assign to readonly property.

#### spidermonkey

TypeError: 1 is read-only

#### v8

TypeError: Cannot assign to read only property '1' of object '[object Array]'

#### xs

TypeError: Array.prototype.sort: C: xsSet 0: not writable

non-writeable-proto-hole.js

#### chakra

TypeError: Object doesn't support this action

#### javascriptcore

TypeError: Attempted to assign to readonly property.

#### spidermonkey

TypeError: 2 is read-only

#### v8

TypeError: Cannot assign to read only property '2' of object '[object Array]'

#### xs

TypeError: Array.prototype.sort: C: xsSet 0: not writable

Accessors mutate an element
accessor-sets-predecessor.js

#### chakra, spidermonkey, v8, xs
a,b,c,d,undefined,undefined,undefined,hole
a,foobar,c,d,undefined,undefined,undefined,hole

#### javascriptcore
a,b,c,d,undefined,undefined,undefined,hole
a,foobar,d,foobar,undefined,undefined,undefined,hole

accessor-sets-successor.js

#### chakra, spidermonkey, v8, xs
a,c,d,foobar,undefined,undefined,undefined,hole
a,b,c,d,undefined,undefined,undefined,hole

#### javascriptcore
a,c,foobar,foobar,undefined,undefined,undefined,hole
a,b,c,d,undefined,undefined,undefined,hole

Accessors delete an element
accessor-deletes-predecessor.js

#### chakra, spidermonkey, v8, xs
a,b,c,d,undefined,undefined,undefined,hole
a,hole,c,d,undefined,undefined,undefined,hole

#### javascriptcore
a,b,c,d,undefined,undefined,undefined,hole
a,hole,d,undefined,undefined,undefined,undefined,hole

accessor-deletes-successor.js

#### chakra, javascriptcore, spidermonkey, v8, xs
a,c,d,hole,undefined,undefined,hole,hole
a,b,c,d,undefined,undefined,undefined,hole

Accessors add/remove elements
accessor-adds-two-elements.js

#### chakra, spidermonkey, v8, xs
a,b,c,d,undefined,undefined,undefined,hole,foo,bar,foo,bar
a,b,c,d,undefined,undefined,undefined,hole,foo,bar

#### javascriptcore
a,b,c,d,undefined,undefined,undefined,hole,foo,bar,foo,bar,foo,bar,foo,bar
a,b,c,d,undefined,undefined,undefined,hole,foo,bar,foo,bar

accessor-removes-two-elements.js

#### chakra, spidermonkey, v8, xs
b,c,undefined,undefined
a,b,c,d,undefined,undefined,undefined

#### javascriptcore
b,c,undefined,undefined
a,b,c,undefined

Accessors modify .length
accessor-decreases-length.js

#### chakra, spidermonkey, v8, xs
b,c,undefined,undefined
a,b,c,d,undefined,undefined,undefined

#### javascriptcore
b,c,undefined,undefined
a,b,c,undefined

accessor-increases-length.js

#### chakra, spidermonkey, v8, xs
a,b,c,d,undefined,undefined,undefined,hole,hole,hole,hole,hole
a,b,c,d,undefined,undefined,undefined,hole,hole,hole

#### javascriptcore
a,b,c,d,undefined,undefined,undefined,hole,hole,hole,hole,hole,hole,hole,hole,hole
a,b,c,d,undefined,undefined,undefined,hole,hole,hole,hole,hole

0-length and 1-length arrays (short-arrays.js)

#### chakra
0-length array:
get [0]
set [0] with undefined
get [0]
undefined
1-length array:
get [0]
set [0] with bar
get [0]
bar,undefined

#### javascriptcore
0-length array:
get [0]
undefined
1-length array:
get [0]
get [0]
set [0] with bar
get [0]
get [0]
set [0] with bar
get [0]
bar,undefined

#### spidermonkey, v8
0-length array:
get [0]
undefined
1-length array:
get [0]
set [0] with bar
get [0]
bar,undefined

#### xs
0-length array:
get [0]
undefined
1-length array:
TypeError: Array.prototype.sort: Cannot coerce to string

Sort on a proxy with a backing array (proxy.js)

#### chakra
get ['length']
get ['0']
get ['1']
get ['2']
get ['3']
get ['4']
get ['5']
get ['6']
get ['7']
set ['0'] = a
set ['1'] = b
set ['2'] = c
set ['3'] = d
set ['4'] = undefined
set ['5'] = undefined
set ['6'] = undefined
set ['7'] = undefined

#### javascriptcore
get ['length']
get ['length']
get ['0']
has ['0']
get ['0']
has ['1']
get ['1']
set ['0'] = c
has ['2']
has ['3']
get ['3']
set ['1'] = b
has ['4']
get ['4']
has ['5']
has ['6']
get ['6']
set ['2'] = a
has ['7']
get ['7']
set ['3'] = d
set ['4'] = undefined
set ['5'] = undefined
delete ['6']
delete ['7']
get ['0']
get ['0']
get ['1']
get ['1']
get ['2']
get ['2']
get ['3']
get ['3']
set ['0'] = a
set ['1'] = b
set ['2'] = c
set ['3'] = d

#### spidermonkey
get ['length']
has ['0']
get ['0']
has ['1']
get ['1']
has ['2']
has ['3']
get ['3']
has ['4']
get ['4']
has ['5']
has ['6']
get ['6']
has ['7']
get ['7']
set ['0'] = a
set ['1'] = b
set ['2'] = c
set ['3'] = d
set ['4'] = undefined
set ['5'] = undefined
delete ['7']
delete ['6']

#### v8, xs
get ['length']
has ['0']
get ['0']
has ['1']
get ['1']
has ['2']
has ['3']
get ['3']
has ['4']
get ['4']
has ['5']
has ['6']
get ['6']
has ['7']
get ['7']
set ['0'] = a
set ['1'] = b
set ['2'] = c
set ['3'] = d
set ['4'] = undefined
set ['5'] = undefined
delete ['6']
delete ['7']

Sort an array with a proxy prototype (proxy-proto.js)

#### chakra
get ['2']
get ['5']
set ['2'] = c
set ['5'] = undefined

#### javascriptcore
has ['2']
has ['5']
set ['2'] = a
set ['5'] = undefined
get ['2']
get ['2']
set ['2'] = c

#### spidermonkey, v8, xs
has ['2']
has ['5']
set ['2'] = c
set ['5'] = undefined

The result is not that surprising given that v8 implements basically the same approach now as spidermonkey. Please note that in the proxy.js example, the only difference between spidermonkey and v8/xs is the order of [[Delete]] calls.
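
For readers who want to reproduce a trace like the proxy.js output, a logging proxy along the following lines can be used. This is a sketch and not the actual test file; `traceSort` and its log format are assumptions made for illustration.

```javascript
// Hypothetical helper: record the proxy traps that Array.prototype.sort triggers.
function traceSort(target) {
  const log = [];
  const proxy = new Proxy(target, {
    get(t, key, receiver) {
      log.push(`get ${String(key)}`);
      return Reflect.get(t, key, receiver);
    },
    has(t, key) {
      log.push(`has ${String(key)}`);
      return Reflect.has(t, key);
    },
    set(t, key, value, receiver) {
      log.push(`set ${String(key)} = ${String(value)}`);
      return Reflect.set(t, key, value, receiver);
    },
    deleteProperty(t, key) {
      log.push(`delete ${String(key)}`);
      return Reflect.deleteProperty(t, key);
    },
  });
  // Call sort with the proxy as receiver so every property access goes through the traps.
  Array.prototype.sort.call(proxy);
  return log;
}
```

On an engine that follows the proposed algorithm, calling `traceSort(['b', /*hole*/, undefined, 'a'])` should log every has/get entry before the first set entry, and every set entry before the first delete entry.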
