Skip to content
This repository has been archived by the owner on Jun 26, 2020. It is now read-only.

I/6528: Faster paths concatenation #1835

Merged
merged 13 commits into from
Apr 14, 2020
Merged

I/6528: Faster paths concatenation #1835

merged 13 commits into from
Apr 14, 2020

Conversation

jodator
Copy link
Contributor

@jodator jodator commented Apr 2, 2020

Suggested merge commit message (convention)

Other: Use faster for loop instead of Array.concat() to merge Position paths. Closes ckeditor/ckeditor5#6528.


Additional information

@jodator jodator requested a review from Reinmar April 2, 2020 08:34
@coveralls
Copy link

coveralls commented Apr 2, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling e63694a on i/6528 into 670cd7b on master.

This is a performance enhancement. Offset is calculated many times in model operations. Getters should be As Fast As Possible thus calling external method that do too much is not desired here.
@jodator
Copy link
Contributor Author

jodator commented Apr 3, 2020

I'm not sure what does the coverage drop cause. It is in view.js file. The master is the same as base of this PR yet /pr has different result than /push .

I cannot restart the job either :/

// like mixed data types with arrays (e.g. [ 0 ].concat( 1, 2, [3, 4])) thus it probably check each argument's types and more.
// In Position's paths concatenation case there always be two arrays to merge so such general method is not needed.
function concatenatePaths( rootPath, path ) {
const newPath = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check whether creating new array with the desired length (`new Array(len)`) isn't better?

Creating an empty array and pushing items to it seems extremely counterintuitive. I understand that right now we can see that it's faster, but this is this type of optimization that may suddenly become slower. 

Also, on which browsers did you test it? Due to this being counterintuitive, I'd like to be 100% sure that it doesn't deoptimize this on one of the browsers. It's probably one of the places where having a manual test comparing two scenarios might be helpful in verifying this in the future too. Because it's not said that the situation won't change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check whether creating new array with the desired length (new Array(len)) isn't better?

I'll check. The same thing was pointed out by @mlewand . I think that in this case, the change might be negligible as both arrays have length of max few values. This is potentially helpful when concatenating large arrays.

The case that this might get slower will probably be a newer a case for Array.concat() as this method do too many things AFAICS (must check each parameter from args if it is an array, etc).

Responding also to spread operator. It is slower. I'll post the results below. Also, I'll add more comments in the code so one would remember why for and not [ ...root, ...path ].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I'll check this with a manual test and post the results.

Copy link
Contributor Author

@jodator jodator Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the results are

1. Proposed solution

2. Spread operator:

3. Ultra optimized array creation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Firefox (using medium sample because FF has short buffer and it way to slow there):

1. Proposed

2. Spread operator

It looks like there is no difference there but number of samples is different so it looks like the performance buffer was overflown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So 1 & 3 are basically the same so we can choose either.

Copy link
Member

@Reinmar Reinmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a manual test would be extremely helpful to check this on various browsers. It may be a trimmed down version of this case (a very simplified Position class) – it should print out on the console result of running a concat version and push version and a spread operator vesion.

Then, we should check this on all the browsers.

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2020

BTW, a spread operator seems as a better (more intuitive) solution. I'd like to see on the results how it looks because if it's not significantly slower, I'd feel safer with it :D

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2020

As for the CC – it's 100% for me, and it was 100% when the CI executed it... no idea why Coveralls shows that it dropped. I'll ignore it.

@jodator
Copy link
Contributor Author

jodator commented Apr 6, 2020

I've created a sample manual test for this case. The results are (Firefox left, Chrome right):

The "ultra-looop" is a test with Array( length ) optimization and not using push().

@jodator
Copy link
Contributor Author

jodator commented Apr 7, 2020

@Reinmar Things to discuss:

  1. Is the manual test needed? I see it's benefits for PR but such PR is too narrow IMO.
  2. Continuing on this. Maybe full Position class would be better with other methods to test (future thinking, so it might be too much also).
  3. I can see that "ultra" loop is quite superior in this synthetic tests in Chrome and meh in Firefox.

@Reinmar
Copy link
Member

Reinmar commented Apr 8, 2020

  • Is the manual test needed? I see it's benefits for PR but such PR is too narrow IMO.

It's needed for validating the decision right now and re-evaluating it later on.

  1. Continuing on this. Maybe full Position class would be better with other methods to test (future thinking, so it might be too much also).

The manual test that you created is testing unrealistic cases and may not be reliable. E.g. it creates an array and does not do anything with it later on. That means that the browser may recognise this code as having no side effects and simply remove it.

Maybe I'm too cautious (if there's such a thing), but I'd test:

  • a copy of the current position's constructor,
  • make sure that the code has some side effects – e.g. push something like position.path.length to window.somethingGlobal.

The idea is to make sure the browser really executes this code and that the code resembles its real form (the constructor does all the same things) so you know that e.g. another instruction in the same function doesn't cause some deopt (which is possible because of JIT).

3. I can see that "ultra" loop is quite superior in this synthetic tests in Chrome and meh in Firefox.

Yeah, there's no single solution that'd look stable on these browsers. Also, I'm curious about Safari too. But once we'll have a test, we'll be able to make a decision.

PS. Yes, microoptimizations are hard because you need to make sure that everything you test is realible.

@jodator
Copy link
Contributor Author

jodator commented Apr 9, 2020

So I've update the manual test with simple class instances and did some fiddling with the tests. I am sad as this tests is still too synthetic. The problem is that the times fluctuates (probably due to Garbage Collector) and the times are not conclusive on subsequent calls. Also tests give different result when the order is different.

Below is a comparison of two tests orders and subsequent calls.

Tests order ATests order B

@jodator
Copy link
Contributor Author

jodator commented Apr 9, 2020

OK. I've got the tests wrong. After some tweaks:

Test Order ATest Order B

@jodator
Copy link
Contributor Author

jodator commented Apr 9, 2020

So this is a result I've got with testing the positions - the for loop and "ultra" for loop have similar outcomes and the performance gain of using Array( length ) is IMO negligible.

@jodator jodator requested a review from Reinmar April 9, 2020 08:20
@Reinmar
Copy link
Member

Reinmar commented Apr 13, 2020

I've been doing some realistic tests:

  • Loading "load (semantic)" data: 3.95s -> 3.91s
  • Undoing removal of "load (semantic)" data: 2.04s -> 1.99s
  • Undoing a 3x43 table cell merge: 1.67s -> 1.14s

Interestingly, only table merge got significantly better. 

I noticed though that in most cases (and in some scenarios in 100% cases) we pass a root to the position constructor. The root has an empty path so we can optimize for this scenario which makes the array concatenation operator insignificant.

I pushed an improvement which reduced the undo time of a 7x43 merge: from 5.6s (original), via 3.9s (Jodi's) to 3.3s. 

I actually used the spread operator but reverting back to concat() would most likely not change much. In other words, it's always better to not do the thing rather than optimize the thing.

PS. Any significant optimization seems to be impossible here without finally rewriting the positions to be immutable and making sure that OT uses some optimized version of them.

@Reinmar
Copy link
Member

Reinmar commented Apr 13, 2020

PS. I used slice() because of https://jsben.ch/lO6C5.

@Reinmar
Copy link
Member

Reinmar commented Apr 13, 2020

At this point, I realized that we could use any concatenation method, and that makes the manual test that I asked you to create quite redundant. Sorry for that. I'd remove it now.

@@ -140,7 +143,7 @@ export default class Position {
* @type {Number}
*/
get offset() {
return last( this.path );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? I thought myself that last() is probably slow, but I didn't see any change in results.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need here for using an external library which involves loading an external library for this. Also for other path operations, we do not use libs for simple tasks in Position elsewhere.

The last helper does:

function last(array) {
  var length = array == null ? 0 : array.length;
  return length ? array[length - 1] : undefined;
}

and we should never have Position without this.path.

Calling external method will be slower. It is arguable if this is an important gain. Here I get 195.7ms (1.1%) down to 2.3 ms (0.0%). In a total of ~20s (long semantic + undo). So it is arguable if this change is necessary - I'm for this change.

Current:

Current with reverting to last.
Selection_307

TL;DR: Let's keep it - using last() makes no sense.

@Reinmar Reinmar merged commit bfc6c88 into master Apr 14, 2020
@Reinmar Reinmar deleted the i/6528 branch April 14, 2020 10:03
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants