refactor(utils): improve performance of copyEmptyArrayProps function #1816
Conversation
🦋 Changeset detected. Latest commit: 85daa2e. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
- const merged = {
-   ...newObj,
-   ...nextObject,
- }
This combined object was not needed as it is exactly what the accumulator does in the reduce.
+ const hashMapValue = value.reduce((acc, val) => {
+   acc[val.id] = val
+   return acc
+ }, {})
This is a small trick to avoid the usage of .find in a situation of an array of arrays (of arrays). The difference between both approaches was remarkable when debugging this locally:
With the hash map it was less than 100 ms.
With .find it was around 6 seconds.
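For context, a minimal sketch of the two lookup strategies (the item shapes below are made up for illustration; the real data comes from the compared entity versions):

// O(n^2): .find re-scans the whole array once per element.
const oldItems = [{ id: 'a', v: 1 }, { id: 'b', v: 2 }]
const newItems = [{ id: 'b', v: 3 }, { id: 'a', v: 4 }]
const slowPairs = newItems.map((item) => ({
  next: item,
  prev: oldItems.find((old) => old.id === item.id),
}))

// O(n): build the dictionary once, then each lookup is constant time.
const hashMapValue = oldItems.reduce((acc, val) => {
  acc[val.id] = val
  return acc
}, {})
const fastPairs = newItems.map((item) => ({
  next: item,
  prev: hashMapValue[item.id],
}))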
As a pattern I will name this trick a "dictionary"... you could even do this a little trickier:

const valueDict = Object.assign({}, ...value.map((x) => ({ [x.id]: x })))
fwiw the Object.assign method is a bit slower, probably because you are iterating over all of the values three times (once for the map, once for the spread, once for the assignment).
If it's a large array (100k elements) the reduce approach takes ~2ms whereas the Object.assign approach takes ~50ms.
// Create an array for testing.
const testArray = new Array(100_000).fill(null).map((_, index) => ({ id: index }));
// Store individual iteration times.
const reduceTimes = [];
const objectAssignTimes = [];
const testIterations = 1_000;
for (let i = 0; i < testIterations; i++) {
// Run reduce method.
const startTimeReduce = Date.now();
testArray.reduce((acc, element) => {
acc[element.id] = element;
return acc;
}, {});
reduceTimes.push(Date.now() - startTimeReduce);
// Run object assign method.
const startTimeObjectAssign = Date.now();
Object.assign({}, ...testArray.map((element) => ({ [element.id]: element })));
objectAssignTimes.push(Date.now() - startTimeObjectAssign);
}
// Log average iteration times.
console.log({ averageTimeReduce: reduceTimes.reduce((acc, val) => acc + val, 0) / testIterations });
console.log({
  averageTimeObjectAssign: objectAssignTimes.reduce((acc, val) => acc + val, 0) / testIterations,
});
Well... I just said trickier :-)
Thanks for the testing!
Another interesting approach would be if the Array.prototype.group proposal gets approved. It won't return exactly a dictionary, because the values will be arrays...
But it will read nicely:
array.group((x) => x.id)
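For reference, that proposal was later reworked and shipped as Object.groupBy (ES2024, so it needs a recent runtime); a small sketch of the grouped shape it produces:

// Object.groupBy groups values into arrays keyed by the callback result,
// hence "not exactly a dictionary": each value is an array, not a single item.
const value = [{ id: 'a' }, { id: 'b' }, { id: 'a' }]
const grouped = Object.groupBy(value, (x) => x.id)
// => { a: [{ id: 'a' }, { id: 'a' }], b: [{ id: 'b' }] }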
Thanks both for the input and the testing around this. It was really a nice learning experience and I think this improves the library for those cases where we see big numbers.
Just for the record, the case that we were checking was a customer from cimpress that had almost 13k addresses defined on it, and the size of the entity was almost 9MB 😨 💥
  )
+ /* eslint-disable no-param-reassign */
- newObj[key][i] = nestedObject
+ merged[key][i] = nestedObject
That means that we can just reassign this to the accumulator.
-   [key]: isNil(newObj[key]) ? [] : newObj[key],
- }
+ merged[key] = isNil(newObj[key]) ? [] : newObj[key]
+ return merged
Instead of using the approach of spread + assignment, we assign first and then return the merged object, to avoid the spread operator inside the .reduce.
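A schematic before/after of this change, using a placeholder entries array (the real function also handles nested arrays):

const entries = Object.entries({ a: 1, b: 2, c: 3 })

// Before: spreading the accumulator copies every existing key on each
// iteration, so building an n-key object costs O(n^2).
const before = entries.reduce(
  (merged, [key, value]) => ({
    ...merged,
    [key]: value,
  }),
  {}
)

// After: assign onto the accumulator and return it, so each iteration is O(1).
const after = entries.reduce((merged, [key, value]) => {
  merged[key] = value
  return merged
}, {})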
-   [key]: nestedObject,
- }
+ merged[key] = nestedObject
+ return merged
+ expect(old).toEqual(oldObj)
+ expect(fixedNewObj).toEqual(newObj)
+ expect(end - start).toBeLessThan(100)
+ })
Quick test to verify this. It uses performance from Node in order to time the function.
You can run this test on main to validate that there were performance issues with this scenario 👍
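A rough sketch of that kind of timing test (the import path and fixture builder below are assumptions, and it presumes the function returns the pair of fixed objects as the assertions above suggest; the actual spec lives in this PR):

import { performance } from 'perf_hooks'
import { copyEmptyArrayProps } from '../src/utils/common-actions' // hypothetical path

// Hypothetical fixture mirroring the reported case: an entity with ~13k addresses.
const buildCustomerWithAddresses = (n) => ({
  addresses: new Array(n).fill(null).map((_, i) => ({ id: `addr-${i}` })),
})

test('copies empty array props on huge entities quickly', () => {
  const oldObj = buildCustomerWithAddresses(13_000)
  const newObj = buildCustomerWithAddresses(13_000)

  const start = performance.now()
  const [old, fixedNewObj] = copyEmptyArrayProps(oldObj, newObj)
  const end = performance.now()

  expect(old).toEqual(oldObj)
  expect(fixedNewObj).toEqual(newObj)
  expect(end - start).toBeLessThan(100)
})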
Codecov Report
@@ Coverage Diff @@
## master #1816 +/- ##
=======================================
Coverage 94.64% 94.64%
=======================================
Files 141 141
Lines 4875 4877 +2
Branches 1332 1332
=======================================
+ Hits 4614 4616 +2
Misses 258 258
Partials 3 3
Force-pushed from f33aa97 to 582fbb5.
danrleyt left a comment
Thanks for the brilliant contribution; it will also help us immensely.
Summary

Refactors a little bit the way copyEmptyArrayProps is implemented in order to get better performance, especially for cases with a huge number of items in an array.

Description
Context: We (pangolins) make heavy use of this library in Audit Log. We were observing some customer comparisons that were taking a huge amount of time, and debugging that led us to spot the sync-actions library as the one causing the delay.
After some investigation, we found that the library was not really performant for cases when customers with a huge number of addresses were compared between versions. In our case it was a customer with almost 13k addresses in both the old and the new version.
In this PR we propose a solution by changing the implementation to remove the usage of spread operators inside a reduce in favour of attribute assignments, as well as the usage of a map instead of array methods, which usually led to the O(n^2) problem. More info
Todo
Type label for the PR