
events: Faster ascending array index #3976

Closed
alemures wants to merge 5 commits

Conversation

@alemures (Contributor) commented Nov 23, 2015

Faster iteration over array using an ascending index.

@cjihrig (Contributor) commented Nov 23, 2015

Do you have any benchmark results to show a speedup?

@alemures (Contributor, Author) commented Nov 23, 2015

Using a numeric array of 1,000,000 elements and executing the old and the new version of the arrayClone function 100 times each, I am getting the following results:

arrayCloneOld (descending index) -> 445ms
arrayCloneNew (ascending index) -> 390ms

This is an excellent post about this topic (tip number 3):
https://gamealchemist.wordpress.com/2013/05/01/lets-get-those-javascript-arrays-to-work-fast/
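For reference, the two loop variants being compared can be sketched roughly like this (a reconstruction for illustration; the exact code is in the PR diff):

```js
// Descending-index clone (the old approach), reconstructed for illustration.
function arrayCloneOld(arr, i) {
  var copy = new Array(i);
  while (i--)
    copy[i] = arr[i];
  return copy;
}

// Ascending-index clone (this PR's approach).
function arrayCloneNew(arr, length) {
  var copy = new Array(length);
  for (var i = 0; i < length; i++)
    copy[i] = arr[i];
  return copy;
}

console.log(arrayCloneNew([1, 2, 3], 3)); // [ 1, 2, 3 ]
```

Both produce a shallow copy; the claim is only that the ascending access pattern is friendlier to the engine.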

@mscdex (Contributor) commented Nov 23, 2015

I tested both implementations using 5 different arrays (in benchd using node v5.1.0) of sizes 100000, 10000, 1000, 100, and 10. Both implementations performed about the same for me and had ~15% variance after ~100 runs.

@mscdex added the events label Nov 23, 2015

@mscdex (Contributor) commented Nov 23, 2015

Ok I bumped up the number of runs and the variance drops to ~5.5-9% and the for loop seems to consistently be a little faster.

LGTM

@targos (Member) commented Nov 23, 2015

I personally always use slice to clone an array. Here are the numbers I get with v5.1 (array of 1,000,000 elements, 1000 runs):

  • current: 19412.949ms
  • this PR: 17655.677ms
  • slice: 5604.732ms

@mscdex (Contributor) commented Nov 23, 2015

Ok, then let's use slice().

@thefourtheye (Contributor) commented Nov 23, 2015

@targos What about a smaller number of events?

@targos (Member) commented Nov 23, 2015

Just testing for loop vs slice (number of runs inversely proportional to length):

| # elements | for loop  | slice     |
|-----------:|-----------|-----------|
| 1          | 2156.987ms | 9614.789ms |
| 10         | 1215.489ms | 1095.061ms |
| 100        | 1079.062ms | 155.996ms |
| 1000       | 1037.503ms | 99.311ms  |

@mscdex (Contributor) commented Nov 23, 2015

This sounds familiar, I remember implementing a similar thing where I had determined some arbitrary (could be a defined V8 constant) cutoff so that a loop was used for smaller lengths and slice() for larger lengths. I was thinking the cutoff was between 10 and 100 though. I guess that could have changed over time too...

@targos (Member) commented Nov 23, 2015

Comparison between current master (node2) and slice:

events/ee-add-remove.js n=250000: ./node: 1188400 ./node2: 1179000 ......................... 0.80%
events/ee-emit-multi-args.js n=2000000: ./node: 3998800 ./node2: 5674700 ................. -29.53%
events/ee-emit.js n=2000000: ./node: 4953600 ./node2: 7788400 ............................ -36.40%
events/ee-listener-count-on-prototype.js n=50000000: ./node: 438400000 ./node2: 453910000 . -3.42%
events/ee-listeners-many.js n=5000000: ./node: 6967700 ./node2: 3564800 ................... 95.46%
events/ee-listeners.js n=5000000: ./node: 7291300 ./node2: 20185000 ...................... -63.88%

Comparison between current master and for loop:

events/ee-add-remove.js n=250000: ./node: 1201500 ./node2: 1098600 ........................ 9.37%
events/ee-emit-multi-args.js n=2000000: ./node: 5997000 ./node2: 5778500 .................. 3.78%
events/ee-emit.js n=2000000: ./node: 8575100 ./node2: 8150900 ............................. 5.20%
events/ee-listener-count-on-prototype.js n=50000000: ./node: 433950000 ./node2: 411540000 . 5.44%
events/ee-listeners-many.js n=5000000: ./node: 3913700 ./node2: 3490700 .................. 12.12%
events/ee-listeners.js n=5000000: ./node: 22746000 ./node2: 20301000 ..................... 12.04%

I don't have any stats about it but I guess that the common case is to have a few listeners so this PR LGTM.

@mscdex (Contributor) commented Nov 23, 2015

I found the PR where I reverted back to a single algorithm. As much as I dislike using magic cutoff values like that, maybe it's worth re-adding...

@thefourtheye (Contributor) commented Nov 23, 2015

I am +0. Most of the time we will not have events in the order of 10Ks, I guess. Since the number of events normally ranges from 10s to 100s, based on the table shown above by targos, I tend to think that slice would be better.

Diff excerpt under review:

```js
while (i--)
function arrayClone(arr, length) {
  var copy = new Array(length);
  for (var i = 0; i < length; i++)
```

@Fishrock123 (Member), Nov 23, 2015
I'll bet v8 transforms these into assembly that is about the same.

@alemures (Contributor, Author) commented Nov 24, 2015

After a good amount of testing, I got these results with the new implementation:

| N elements | Executions | arrayCloneNew | arrayCloneOld | slice |
|-----------:|-----------:|--------------:|--------------:|------:|
| 1          | 1,000,000  | 16ms          | 23ms          | 93ms  |
| 20         | 1,000,000  | 58ms          | 60ms          | 110ms |
| 50         | 1,000,000  | 120ms         | 125ms         | 118ms |
| 75         | 1,000,000  | 135ms         | 190ms         | 135ms |
| 200        | 1,000,000  | 200ms         | 420ms         | 205ms |
| 1000       | 1,000,000  | 760ms         | 2000ms        | 750ms |

What do you think? The method is a bit longer, but it scales much better than using just slice or just the old while loop.

@jwueller (Contributor) commented Nov 24, 2015

Considering these benchmarks, this change seems reasonable. The speedup is measurable and changes to the source are not excessive. I normally do not think that we should optimize for engine implementation details like slice() in JavaScript userland, but this is used by a lot of people.

I say go ahead, but a comment explaining this specific implementation would be nice. It may not be obvious why this approach was chosen a few months from now.

Diff excerpt under review:

```js
function arrayClone(arr, length) {
  if (length === 1) {
    return [arr[0]];
  } else if (length < 50) {
```

@trevnorris (Contributor), Nov 30, 2015
No need for the else; you're returning early.

@trevnorris (Contributor), Nov 30, 2015
Also, add a code comment about why 50 was chosen. At first look the decision seems arbitrary.

@alemures (Author, Contributor), Nov 30, 2015

I like using else if in those cases so the code looks more compact in simple functions. Anyway, in order to add a comment on the second case, a plain if looks better.
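Putting the pieces of the diff together, the hybrid version under review looks roughly like this (a sketch for illustration; the cutoff of 50 was later revisited):

```js
function arrayClone(arr, length) {
  // Fast path: a single listener is by far the most common case.
  if (length === 1)
    return [arr[0]];
  if (length < 50) {
    // Below the cutoff, a plain copy loop beats the built-in.
    var copy = new Array(length);
    for (var i = 0; i < length; i++)
      copy[i] = arr[i];
    return copy;
  }
  // For larger arrays, slice() wins.
  return arr.slice();
}
```

The value 50 here is the magic cutoff being debated; it is engine-dependent and was measured empirically.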

@ex1st commented Dec 9, 2015

For me, concat is two times faster than slice for large arrays:

```js
return arr.concat();
```

@mscdex (Contributor) commented Dec 14, 2015

FWIW I was doing some more benchmarking on this today and from what I'm seeing now, the magic number seems to be more like 27 instead of 50 with the node master branch as of this writing.

@alemures (Contributor, Author) commented Dec 14, 2015

I ran this simple script and got the following results. The array elements are milliseconds, from an array of 25 elements (index 0) to an array of 74 elements (index 49), running each operation 200,000 times. The result is that the magic number is more or less 30 on node v5.2.0, so I think a number between 27 and 30 is better, as you say @mscdex.

Array of numbers:

slice [26,28,23,23,23,24,23,24,23,23,23,23,24,23,24,23,24,24,24,24,24,26,24,24,24,24,25,24,25,29,26,26,27,25,25,24,25,24,25,25,25,25,24,26,25,25,25,28,25,25]
loop  [20,18,19,19,20,19,20,21,24,25,24,26,27,26,26,28,27,28,28,30,29,30,30,31,32,32,32,33,37,36,37,36,37,37,36,36,39,38,38,38,39,39,40,40,40,41,41,43,43,43]

Array of strings:

slice [29,27,25,25,26,25,26,25,28,26,27,26,27,26,26,27,27,26,27,26,27,25,27,27,28,27,28,27,28,28,28,27,28,28,28,29,28,29,28,28,29,28,29,29,28,30,30,30,30,29]
loop  [25,23,24,24,25,24,28,28,26,29,29,29,31,32,33,31,33,34,34,36,37,37,38,38,39,40,39,40,41,42,42,44,44,45,46,46,46,47,49,47,46,49,51,51,50,51,53,52,52,53]

Here is the script:

```js
var mat = [];
var min = 25, max = 75;
var i, j;

for (i = min; i < max; i++) {
  mat[i - min] = [];
  for (j = 0; j < i; j++) {
    // mat[i - min][j] = i * j; // Numbers
    mat[i - min][j] = i * j + 'foo'; // Strings
  }
}

var times = 200000;

var t;
var resSlice = [], resLoop = [];
for (i = 0; i < mat.length; i++) {
  t = Date.now();
  for (j = 0; j < times; j++) {
    mat[i].slice();
  }
  resSlice.push(Date.now() - t);

  t = Date.now();
  for (j = 0; j < times; j++) {
    arrayClone(mat[i], mat[i].length);
  }
  resLoop.push(Date.now() - t);
}

console.log('slice', JSON.stringify(resSlice));
console.log('loop ', JSON.stringify(resLoop));

function arrayClone(arr, length) {
  var copy = new Array(length);
  for (var i = 0; i < length; i++)
    copy[i] = arr[i];
  return copy;
}
```

@ronkorving (Contributor) commented Jan 14, 2016

@alemures given the comment by @ex1st, could you add concat() to the benchmark results?

@ChALkeR (Member) commented Jan 14, 2016

Could you also try Array.from(array) and Array.of(...array), please?

Edit: nevermind, it looks like those are slow.

@alemures (Contributor, Author) commented Jan 16, 2016

The same as before, using the updated node version v5.4.1 and adding Array#concat(). In this case Array#concat() is slower than Array#slice(), so we still get the best performance using a mix of the loop and Array#slice(), as in the pull request. Should I go with 30 as the magic number, then?

@ChALkeR I added Array.from and Array.of to the benchmark, but it didn't finish within about 30 seconds, so they appear to be extremely slow in this particular case.

Array of numbers:

slice  [26,23,21,23,21,22,21,22,22,23,23,24,22,22,22,22,23,22,22,23,23,22,23,23,23,23,23,23,23,24,24,24,24,24,23,24,24,25,24,25,24,24,24,25,25,27,26,25,25,26]
loop   [19,16,17,18,18,19,19,20,21,23,24,25,25,24,25,25,26,27,28,28,28,29,28,29,30,30,31,31,32,31,33,32,33,33,35,35,35,35,36,37,37,38,38,39,39,40,40,41,41,41]
concat [32,28,32,28,29,28,30,29,31,31,31,30,30,30,30,30,30,29,30,29,30,29,31,30,30,30,30,30,31,30,31,30,31,31,31,31,31,31,31,33,32,31,31,31,31,33,32,31,31,32]

Array of strings:

slice  [28,25,25,23,24,24,24,25,25,23,25,24,24,24,25,23,24,24,25,24,25,25,25,25,25,25,25,25,26,25,25,25,26,25,26,25,27,25,26,25,27,26,27,26,27,26,26,26,26,26]
loop   [24,21,21,23,23,22,24,26,26,27,27,28,29,30,31,31,31,32,32,33,33,35,35,35,38,36,37,38,38,38,40,40,41,41,42,42,43,43,44,45,45,46,47,47,47,48,49,49,50,50]
concat [33,29,30,28,30,29,30,29,31,31,30,30,30,29,31,30,31,29,31,30,31,32,30,31,31,32,31,31,31,31,31,31,31,31,32,31,32,31,32,31,32,31,32,31,32,33,32,32,32,32]

Here is the script:

```js
var mat = [];
var min = 25, max = 75;
var i, j;

for (i = min; i < max; i++) {
  mat[i - min] = [];
  for (j = 0; j < i; j++) {
    // mat[i - min][j] = i * j; // Numbers
    mat[i - min][j] = i * j + 'foo'; // Strings
  }
}

var times = 200000;

var t;
var resSlice = [], resLoop = [], resConcat = [];
for (i = 0; i < mat.length; i++) {
  t = Date.now();
  for (j = 0; j < times; j++) {
    mat[i].slice();
  }
  resSlice.push(Date.now() - t);

  t = Date.now();
  for (j = 0; j < times; j++) {
    arrayClone(mat[i], mat[i].length);
  }
  resLoop.push(Date.now() - t);

  t = Date.now();
  for (j = 0; j < times; j++) {
    mat[i].concat();
  }
  resConcat.push(Date.now() - t);
}

console.log('slice', JSON.stringify(resSlice));
console.log('loop ', JSON.stringify(resLoop));
console.log('concat ', JSON.stringify(resConcat));

function arrayClone(arr, length) {
  if (length === 1) return [arr[0]];
  var copy = new Array(length);
  for (var i = 0; i < length; i++)
    copy[i] = arr[i];
  return copy;
}
```

@mscdex (Contributor) commented Jan 16, 2016

From what I'm seeing, 27 might be a better choice, but anything between 27 and 30 should probably work.

@jasnell (Member) commented Jan 18, 2016

Ewww, magic numbers are bleh, but there's obviously some gain here so I'm good with this. Just so it's clear why that number (27-30) was chosen, can you add a bit more detail in the comments and give a point-in-time statement, e.g. "As of v5.2, benchmarks showed..."? That will give us a reference point if we decide to revisit this later on.

Otherwise, LGTM

@mscdex (Contributor) commented Jan 18, 2016

I would probably reference at least the v8 version (in addition to the node version) in such a comment, as something like this is most likely tied to v8.

@alemures (Contributor, Author) commented Jan 18, 2016

Yes, that's not the most elegant code I've ever written; anyhow, the function is simple and short enough to be easy to understand and review in the future (once I add a few more comments, as you said). I see myself adding one or two listeners per EventEmitter object most of the time, so it is worth it.

Diff excerpt under review:

```js
  }

  return arr.slice();
}
```

@thefourtheye (Contributor), Jan 18, 2016

No newline at the end of file?

Commit: "After a discussion, 27 seems a better choice than 50 as the magic number. Improved comments and added missing new line at the end of file."

@ChALkeR (Member) commented Jan 24, 2016

Note that v8 version in master got bumped to 4.8 a few days ago: #4785.

Not sure if that affects any benchmarks, but perhaps it would be better to re-run them to be on the safe side. That should be easy to track, because this is not merged yet =).

@ChALkeR added the performance label Feb 16, 2016

@alemures (Contributor, Author) commented Mar 23, 2016

These are the updated results using the latest version of node on the master branch. I ran the same script on the same machine.

```js
{ http_parser: '2.6.2',
  node: '6.0.0-pre',
  v8: '4.9.385.27',
  uv: '1.8.0',
  zlib: '1.2.8',
  ares: '1.10.1-DEV',
  modules: '47',
  openssl: '1.0.2g' }
```

Array of numbers:

slice  [33,30,31,29,29,29,30,30,30,31,32,33,34,33,33,34,34,34,36,34,35,35,35,36,35,36,36,37,37,38,39,37,37,39,38,38,39,39,41,39,39,40,40,40,41,42,41,42,42,42]
loop   [20,17,17,18,18,18,19,21,21,22,23,25,27,25,25,25,26,27,27,27,28,29,29,29,30,30,31,31,32,32,33,33,33,34,34,35,35,36,37,37,37,38,37,39,39,39,40,40,41,41]
concat [45,45,42,42,43,45,43,44,44,44,44,44,45,45,46,48,47,48,48,49,49,52,49,50,50,50,49,50,51,51,51,52,52,52,52,52,53,53,53,53,54,54,55,54,55,59,55,55,55,56]

Array of strings:

slice  [40,40,36,37,38,38,38,40,39,40,40,41,41,42,42,42,43,42,44,45,44,45,45,46,46,47,47,48,48,48,49,49,49,50,50,50,51,52,52,53,52,53,55,55,54,55,56,56,57,57]
loop   [24,23,23,23,24,24,26,27,29,28,28,30,30,30,31,32,34,35,33,34,34,35,36,36,37,37,38,38,39,40,41,43,41,41,43,44,44,44,45,46,46,46,47,48,49,49,50,50,50,52]
concat [54,50,51,51,52,52,52,53,53,58,54,54,54,55,56,56,57,58,57,58,58,58,59,62,64,60,60,62,61,62,63,64,64,64,64,65,67,69,65,67,67,67,69,69,68,68,69,69,73,71]

Slower results for the built-in functions slice and concat? Does that make any sense?

@mscdex (Contributor) commented Mar 24, 2016

FWIW I tested the 3 methods with benchmark.js and I think the loop would be best for most cases:

misc/arrayclone.js method=slice size=2: 8,009,151 ops/sec ± 0.13% (97 runs sampled)
misc/arrayclone.js method=slice size=10: 6,161,806 ops/sec ± 0.79% (96 runs sampled)
misc/arrayclone.js method=slice size=100: 1,761,706 ops/sec ± 0.31% (95 runs sampled)
misc/arrayclone.js method=slice size=1000: 213,588 ops/sec ± 1.49% (95 runs sampled)
misc/arrayclone.js method=slice size=10000: 20,230 ops/sec ± 1.07% (85 runs sampled)
misc/arrayclone.js method=loop size=2: 24,590,635 ops/sec ± 0.32% (90 runs sampled)
misc/arrayclone.js method=loop size=10: 11,716,484 ops/sec ± 0.28% (91 runs sampled)
misc/arrayclone.js method=loop size=100: 1,668,301 ops/sec ± 0.38% (94 runs sampled)
misc/arrayclone.js method=loop size=1000: 172,419 ops/sec ± 0.63% (92 runs sampled)
misc/arrayclone.js method=loop size=10000: 15,829 ops/sec ± 1.24% (85 runs sampled)
misc/arrayclone.js method=concat size=2: 4,015,089 ops/sec ± 0.36% (89 runs sampled)
misc/arrayclone.js method=concat size=10: 3,508,154 ops/sec ± 0.30% (95 runs sampled)
misc/arrayclone.js method=concat size=100: 1,444,073 ops/sec ± 0.32% (96 runs sampled)
misc/arrayclone.js method=concat size=1000: 209,317 ops/sec ± 0.63% (93 runs sampled)
misc/arrayclone.js method=concat size=10000: 20,171 ops/sec ± 1.09% (86 runs sampled)

Each array was filled by having every even element be an array of functions, otherwise just set to a function.
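For anyone reproducing these numbers without benchmark.js, Node's process.hrtime gives higher-resolution timings than the Date.now() approach used in the scripts above; a minimal sketch (a hypothetical helper, not the harness used in this thread):

```js
// Minimal micro-benchmark helper using process.hrtime for
// nanosecond-resolution timing. Returns elapsed milliseconds.
function bench(label, fn, iterations) {
  var start = process.hrtime();
  for (var i = 0; i < iterations; i++)
    fn();
  var diff = process.hrtime(start); // [seconds, nanoseconds]
  var ms = diff[0] * 1e3 + diff[1] / 1e6;
  console.log(label + ': ' + ms.toFixed(2) + 'ms for ' + iterations + ' runs');
  return ms;
}

// Example: compare slice() against a copy loop on a 100-element array.
var arr = new Array(100);
for (var i = 0; i < arr.length; i++) arr[i] = i;

bench('slice', function() { arr.slice(); }, 100000);
bench('loop', function() {
  var copy = new Array(arr.length);
  for (var j = 0; j < arr.length; j++) copy[j] = arr[j];
}, 100000);
```

Unlike Date.now(), this is not affected by millisecond rounding, though it still lacks benchmark.js's statistical sampling.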

@mscdex (Contributor) commented Mar 24, 2016

After narrowing it down a bit more, it looks like the threshold on master (v8 4.9) is now ~70, at which point switching over to slice() is faster:

misc/arrayclone.js method=slice size=50: 2,880,092 ops/sec ± 0.42% (96 runs sampled)
misc/arrayclone.js method=slice size=60: 2,559,432 ops/sec ± 0.34% (97 runs sampled)
misc/arrayclone.js method=slice size=70: 2,299,314 ops/sec ± 0.31% (96 runs sampled)
misc/arrayclone.js method=slice size=80: 2,084,609 ops/sec ± 0.32% (93 runs sampled)
misc/arrayclone.js method=slice size=100: 1,753,448 ops/sec ± 0.59% (96 runs sampled)
misc/arrayclone.js method=loop size=50: 3,177,551 ops/sec ± 0.41% (95 runs sampled)
misc/arrayclone.js method=loop size=60: 2,678,230 ops/sec ± 0.54% (91 runs sampled)
misc/arrayclone.js method=loop size=70: 2,319,788 ops/sec ± 0.62% (91 runs sampled)
misc/arrayclone.js method=loop size=80: 2,032,337 ops/sec ± 0.62% (92 runs sampled)
misc/arrayclone.js method=loop size=100: 1,669,508 ops/sec ± 0.37% (95 runs sampled)
misc/arrayclone.js method=concat size=50: 2,136,718 ops/sec ± 0.38% (96 runs sampled)
misc/arrayclone.js method=concat size=60: 1,895,947 ops/sec ± 0.38% (94 runs sampled)
misc/arrayclone.js method=concat size=70: 1,792,314 ops/sec ± 0.34% (95 runs sampled)
misc/arrayclone.js method=concat size=80: 1,653,147 ops/sec ± 0.32% (97 runs sampled)
misc/arrayclone.js method=concat size=100: 1,447,514 ops/sec ± 0.30% (97 runs sampled)

@trevnorris (Contributor) commented Mar 27, 2016

Just curious if anyone has done analysis on how many events are being passed through in different scenarios. For example, with nextTick I've found there's a ~98% chance that 4 or fewer callbacks are queued and processed before the event loop continues.

Is there any chance the method will deopt because of the extra logic? Since the JIT will most likely have optimized for the common case when the other is hit.

@jasnell (Member) commented Apr 1, 2016

Quick aside: marking this as a don't land on v4 for the time being.

@estliberitas force-pushed the nodejs:master branch 2 times, most recently to c7066fb Apr 26, 2016

@Trott force-pushed the nodejs:master branch to c5ce7f4 Sep 21, 2016

@rvagg force-pushed the nodejs:master branch 2 times, most recently to 83c7a88 Oct 18, 2016

@jasnell added the stalled label Oct 28, 2016

@MylesBorins force-pushed the nodejs:master branch to 54fef67 Feb 1, 2017

@jasnell (Member) commented Feb 28, 2017

@nodejs/ctc ... is this something we want to pursue? If not, I recommend closing.

@alemures (Contributor, Author) commented Mar 1, 2017

I just noticed that a similar change was introduced by @bmeurer in the following commit:

f2f997a

Although this version still performs better for length == 1 (using an array literal) and for big arrays (using slice), it also adds more complexity.

@jasnell (Member) commented Mar 24, 2017

Closing due to lack of forward progress. We can reopen and revisit if necessary.

@jasnell closed this Mar 24, 2017
