New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

assert: improve deepEqual perf for large input #12849

Closed
wants to merge 1 commit into
base: master
from

Conversation

@addaleax
Member

addaleax commented May 5, 2017

Use a Map instead of an array for checking previously found
cyclic references.

This reduces complexity for an array-of-objects case from
O(n²) to O(n·log n).

Case from the issue, new output:

2, 3
3, 0
5, 0
8, 4
12, 2
18, 1
27, 4
41, 3
62, 1
93, 3
140, 4
210, 10
315, 5
473, 11
710, 16
1065, 25
1598, 40
2397, 75
3596, 119
5394, 150
8091, 216
12137, 296
18206, 419
27309, 681
40964, 944
61446, 1439
92169, 2424

Fixes: #12842

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)
@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 5, 2017

Member

Also, I found some weird behaviour while doing this:

a = {};
assert.deepStrictEqual([a,a], [a,{}]);

This does not throw, but I think it should; do you agree? Would you consider changing it semver-major? If not, I could do it in this PR, otherwise I’ll just wait until after this to avoid conflicts.

Member

addaleax commented May 5, 2017

Also, I found some weird behaviour while doing this:

a = {};
assert.deepStrictEqual([a,a], [a,{}]);

This does not throw, but I think it should; do you agree? Would you consider changing it semver-major? If not, I could do it in this PR, otherwise I’ll just wait until after this to avoid conflicts.

@mscdex mscdex added the performance label May 5, 2017

@addaleax addaleax requested review from joyeecheung and Trott May 5, 2017

@cjihrig

cjihrig approved these changes May 5, 2017

LGTM if the CI passes.

Show outdated Hide outdated lib/assert.js
const actualIndex = memos.actual.indexOf(actual);
if (actualIndex !== -1) {
if (actualIndex === memos.expected.indexOf(expected)) {
memos = memos || {

This comment has been minimized.

@mscdex

mscdex May 5, 2017

Contributor

How about instead doing if (!memos) { memos = ... }?

@mscdex

mscdex May 5, 2017

Contributor

How about instead doing if (!memos) { memos = ... }?

This comment has been minimized.

@addaleax

addaleax May 5, 2017

Member

Sure, why not. Updated

@addaleax

addaleax May 5, 2017

Member

Sure, why not. Updated

@jasnell

jasnell approved these changes May 5, 2017

Show outdated Hide outdated lib/assert.js
if (actualIndex !== -1) {
if (actualIndex === memos.expected.indexOf(expected)) {
if (!memos) {
memors = {

This comment has been minimized.

@targos

targos May 5, 2017

Member

memos

@targos

targos May 5, 2017

Member

memos

This comment has been minimized.

@addaleax

addaleax May 5, 2017

Member

fixed, sorry. 😄

@addaleax

addaleax May 5, 2017

Member

fixed, sorry. 😄

Show outdated Hide outdated lib/assert.js
} else {
memos.actual.map.set(actual, memos.actual.position++);
}
if (memos.expected.map.get(expected) === undefined) {

This comment has been minimized.

@targos

targos May 5, 2017

Member

!memos.expected.map.has(expected)?

@targos

targos May 5, 2017

Member

!memos.expected.map.has(expected)?

@addaleax

This comment has been minimized.

Show comment
Hide comment
if (actualIndex === memos.expected.indexOf(expected)) {
if (!memos) {
memos = {
actual: { map: new Map(), position: 0 },

This comment has been minimized.

@joyeecheung

joyeecheung May 5, 2017

Member

How about a WeakMap?

@joyeecheung

joyeecheung May 5, 2017

Member

How about a WeakMap?

This comment has been minimized.

@addaleax

addaleax May 5, 2017

Member

@joyeecheung That thought occurred to me and I couldn’t really find a difference … would there be one? Is that going to be more performant?

@addaleax

addaleax May 5, 2017

Member

@joyeecheung That thought occurred to me and I couldn’t really find a difference … would there be one? Is that going to be more performant?

This comment has been minimized.

@joyeecheung

joyeecheung May 5, 2017

Member

@addaleax I think the GC pressure would be smaller after we leave the function because the references don't have be traversed?

@joyeecheung

joyeecheung May 5, 2017

Member

@addaleax I think the GC pressure would be smaller after we leave the function because the references don't have be traversed?

This comment has been minimized.

@addaleax

addaleax May 5, 2017

Member

@joyeecheung ok, updated :)

@addaleax

addaleax May 5, 2017

Member

@joyeecheung ok, updated :)

@refack

This comment has been minimized.

Show comment
Hide comment
@refack

refack May 5, 2017

Member

@addaleax I keep being surprised by your energy and dedication 🥇
You could also add an eager chack at https://github.com/nodejs/node/blob/master/lib/assert.js#L408 and only sort if first compare fails. Or if (!(a instanceof Array) || !(b instanceof Array))

Member

refack commented May 5, 2017

@addaleax I keep being surprised by your energy and dedication 🥇
You could also add an eager chack at https://github.com/nodejs/node/blob/master/lib/assert.js#L408 and only sort if first compare fails. Or if (!(a instanceof Array) || !(b instanceof Array))

@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 5, 2017

Member

@refack That seems like a completely different issue, though? :D

Member

addaleax commented May 5, 2017

@refack That seems like a completely different issue, though? :D

@refack

refack approved these changes May 5, 2017

@refack

This comment has been minimized.

Show comment
Hide comment
@refack

refack May 5, 2017

Member

It fit under the title "assert: improve deepEqual perf for large input". No need to sort keys of arrays. It is an O(n*log(n)) operation.

Member

refack commented May 5, 2017

It fit under the title "assert: improve deepEqual perf for large input". No need to sort keys of arrays. It is an O(n*log(n)) operation.

@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 5, 2017

Member

@refack Arrays can have extra non-indexed properties, too ;) This might not be as easy as you think. If you think you can improve the situation, feel free to experiment and see if you can get a PR together, and I’ll be happy to review.

Member

addaleax commented May 5, 2017

@refack Arrays can have extra non-indexed properties, too ;) This might not be as easy as you think. If you think you can improve the situation, feel free to experiment and see if you can get a PR together, and I’ll be happy to review.

@refack

This comment has been minimized.

Show comment
Hide comment
@refack

refack May 5, 2017

Member

Challenge accepted 🖖

Member

refack commented May 5, 2017

Challenge accepted 🖖

@Fishrock123

This comment has been minimized.

Show comment
Hide comment
@Fishrock123

Fishrock123 May 5, 2017

Member
a = {};
assert.deepStrictEqual([a,a], [a,{}]);

Imo, no. deepStrictEqual is supposed to make sure the props are strictly equal, not if the object is the same memory address.

Member

Fishrock123 commented May 5, 2017

a = {};
assert.deepStrictEqual([a,a], [a,{}]);

Imo, no. deepStrictEqual is supposed to make sure the props are strictly equal, not if the object is the same memory address.

@Fishrock123

CI again, since the last one only ran linux(?): https://ci.nodejs.org/job/node-test-pull-request/7899/

@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 5, 2017

Member

@joyeecheung Looks like doing this with WeakMaps is a bit more complicated than just switching the Map type, because at that part of the code it’s possible that one of the values is not a primitive (and it’s tricky to fix that because in for non-strict deepEqual a non-primitive and a primitive can compare equal).

New CI: https://ci.nodejs.org/job/node-test-commit/9684/

Member

addaleax commented May 5, 2017

@joyeecheung Looks like doing this with WeakMaps is a bit more complicated than just switching the Map type, because at that part of the code it’s possible that one of the values is not a primitive (and it’s tricky to fix that because in for non-strict deepEqual a non-primitive and a primitive can compare equal).

New CI: https://ci.nodejs.org/job/node-test-commit/9684/

@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 5, 2017

Member
 a = {};
assert.deepStrictEqual([a,a], [a,{}]);

Imo, no. deepStrictEqual is supposed to make sure the props are strictly equal, not if the object is the same memory address.

Sorry, huh – what do memory addresses have to do with this? It’s not about the left and the right side of assert.deepStrictEqual having different object identities, it’s about different kinds of cyclic structures comparing equal.

To have a more explicit example:

a = {}; b = {};
assert.deepStrictEqual({key1:a,key2:a},{key1:a,key2:b});

should throw imo, because obj1.key1 === obj1.key2 but obj2.key1 !== obj2.key2.

Member

addaleax commented May 5, 2017

 a = {};
assert.deepStrictEqual([a,a], [a,{}]);

Imo, no. deepStrictEqual is supposed to make sure the props are strictly equal, not if the object is the same memory address.

Sorry, huh – what do memory addresses have to do with this? It’s not about the left and the right side of assert.deepStrictEqual having different object identities, it’s about different kinds of cyclic structures comparing equal.

To have a more explicit example:

a = {}; b = {};
assert.deepStrictEqual({key1:a,key2:a},{key1:a,key2:b});

should throw imo, because obj1.key1 === obj1.key2 but obj2.key1 !== obj2.key2.

const actualPosition = memos.actual.map.get(actual);
if (actualPosition !== undefined) {
if (actualPosition === memos.expected.map.get(expected)) {

This comment has been minimized.

@TimothyGu

TimothyGu May 5, 2017

Member

Changing from Array.prototype.indexOf (which uses Strict Equality Comparison) to Map.prototype.get (which uses SameValueZero) can have subtle nuances that might result in incompatibilities, especially around NaN. Not sure if we need to care about it though.

@TimothyGu

TimothyGu May 5, 2017

Member

Changing from Array.prototype.indexOf (which uses Strict Equality Comparison) to Map.prototype.get (which uses SameValueZero) can have subtle nuances that might result in incompatibilities, especially around NaN. Not sure if we need to care about it though.

This comment has been minimized.

@addaleax

addaleax May 5, 2017

Member

I don’t think that actually affects the logic here – if it matters at all, it should only affect cases where non-primitives compare deep-equal to primitives, but that doesn’t seem to happen anyway (e.g. assert.deepEqual([0], 0) throws even though [0] == 0).

@addaleax

addaleax May 5, 2017

Member

I don’t think that actually affects the logic here – if it matters at all, it should only affect cases where non-primitives compare deep-equal to primitives, but that doesn’t seem to happen anyway (e.g. assert.deepEqual([0], 0) throws even though [0] == 0).

@Trott

This comment has been minimized.

Show comment
Hide comment
@Trott

Trott May 5, 2017

Member
a = {}; b = {};
assert.deepStrictEqual({key1:a,key2:a},{key1:a,key2:b});

One more reason to consider the implementation of deep*() in core a mistake and something that should have been left to userland.

IMO, that should not throw (so the current behavior is correct). The deep part of deepStrictEqual() is sorta kinda "hey, I'm recursive!" So it sees an object (or array, and ideally Map or Set too, basically any collection) as the value and is all like, "OK, let's go through this thing's values and see if they're all strict equal! No keys in both objects, they are deep strict equal!"

So, confusingly, the "strict" part doesn't apply to objects (or arrays or--ideally at least--Maps or Sets). It does mean "don't cast a string to a number" though. Whee!

But I do agree it's confusing and unfortunate.

Even if you disagree, I definitely wouldn't fix it in this PR because I predict a CITGM run will not be kind to the additional change.

Member

Trott commented May 5, 2017

a = {}; b = {};
assert.deepStrictEqual({key1:a,key2:a},{key1:a,key2:b});

One more reason to consider the implementation of deep*() in core a mistake and something that should have been left to userland.

IMO, that should not throw (so the current behavior is correct). The deep part of deepStrictEqual() is sorta kinda "hey, I'm recursive!" So it sees an object (or array, and ideally Map or Set too, basically any collection) as the value and is all like, "OK, let's go through this thing's values and see if they're all strict equal! No keys in both objects, they are deep strict equal!"

So, confusingly, the "strict" part doesn't apply to objects (or arrays or--ideally at least--Maps or Sets). It does mean "don't cast a string to a number" though. Whee!

But I do agree it's confusing and unfortunate.

Even if you disagree, I definitely wouldn't fix it in this PR because I predict a CITGM run will not be kind to the additional change.

@Trott

Trott approved these changes May 5, 2017 edited

Change looks good to me if CI is green and if benchmark/assert doesn't show any perf regressions. (Maybe add a benchmark to show the benefit of this change if existing benchmarks don't already show it?)

@refack

This comment has been minimized.

Show comment
Hide comment
@joyeecheung

This comment has been minimized.

Show comment
Hide comment
@joyeecheung

joyeecheung May 7, 2017

Member

@addaleax hmm wait, do we even need to put primitives into the memos? The memos are for cycles and is it even possible to make a cycle with primitives? (Or do we even care if they do since they are not compared recursively?)

Also FWIW I think the example above should not throw because objects are not primitives and as the docs and the comments stare, for non-primitives we don't compare them by using operators. Objects with the same properties(or no properties), prototypes and tags should be considered equal.

Member

joyeecheung commented May 7, 2017

@addaleax hmm wait, do we even need to put primitives into the memos? The memos are for cycles and is it even possible to make a cycle with primitives? (Or do we even care if they do since they are not compared recursively?)

Also FWIW I think the example above should not throw because objects are not primitives and as the docs and the comments stare, for non-primitives we don't compare them by using operators. Objects with the same properties(or no properties), prototypes and tags should be considered equal.

@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 7, 2017

Member

hmm wait, do we even need to put primitives into the memos? The memos are for cycles and is it even possible to make a cycle with primitives?

I don’t think so, for both questions – but I’m not sure it’s trivial to undo that without changing the logic here for really weird edge cases.

Member

addaleax commented May 7, 2017

hmm wait, do we even need to put primitives into the memos? The memos are for cycles and is it even possible to make a cycle with primitives?

I don’t think so, for both questions – but I’m not sure it’s trivial to undo that without changing the logic here for really weird edge cases.

assert: improve deepEqual perf for large input
Use a Map instead of an array for checking previously found
cyclic references.

This reduces complexity for an array-of-objects case from
O(n²) to O(n·log n).

Fixes: #12842
PR-URL: #12849
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@addaleax

This comment has been minimized.

Show comment
Hide comment
@addaleax

addaleax May 7, 2017

Member

Landed in 7e5f500

Member

addaleax commented May 7, 2017

Landed in 7e5f500

@addaleax addaleax closed this May 7, 2017

addaleax added a commit that referenced this pull request May 7, 2017

assert: improve deepEqual perf for large input
Use a Map instead of an array for checking previously found
cyclic references.

This reduces complexity for an array-of-objects case from
O(n²) to O(n·log n).

Fixes: #12842
PR-URL: #12849
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>

@addaleax addaleax deleted the addaleax:assert-deepequal-map branch May 7, 2017

anchnk added a commit to anchnk/node that referenced this pull request May 19, 2017

assert: improve deepEqual perf for large input
Use a Map instead of an array for checking previously found
cyclic references.

This reduces complexity for an array-of-objects case from
O(n²) to O(n·log n).

Fixes: nodejs#12842
PR-URL: nodejs#12849
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>

@jasnell jasnell referenced this pull request May 11, 2017

Closed

8.0.0 Release Proposal #12220

@gibfahn gibfahn referenced this pull request Jun 15, 2017

Closed

Auditing for 6.11.1 #230

2 of 3 tasks complete
@gibfahn

This comment has been minimized.

Show comment
Hide comment
@gibfahn

gibfahn Jun 20, 2017

Member

@addaleax if this is to be backported to v6.x it would need to be benchmarked on that branch. Feel free to backport or just change the label.

Should this be backported to v6.x-staging? If yes please follow the guide and raise a backport PR, if no let me know or add the dont-land-on label.

Member

gibfahn commented Jun 20, 2017

@addaleax if this is to be backported to v6.x it would need to be benchmarked on that branch. Feel free to backport or just change the label.

Should this be backported to v6.x-staging? If yes please follow the guide and raise a backport PR, if no let me know or add the dont-land-on label.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment