(#6096) - clean up and reduce memory usage of mapreduce #6096

Merged
merged 1 commit into from Jan 3, 2017

Projects

None yet

2 participants

@nolanlawson
Member

Testing indicates that Map/Set use less memory than {} so we should be preferring those.

I also took one large function-within-a-function and refactored it into smaller functions one level higher.

@nolanlawson
Member

Failing on coverage, needs investigation.

@nolanlawson
Member

Seems objectCollate and boolean comparison within collate() is not being used... very strange outcome. I'll investigate when I have time.

@nolanlawson nolanlawson (#6096) - clean up and reduce memory usage of mapreduce
6bfd675
@nolanlawson
Member
nolanlawson commented Jan 3, 2017 edited

This needed a bit more work because some code paths inside pouchdb-collate were not being hit anymore once I implemented my optimizations. So I decided we needed some new tests to make sure that the old code paths were being tested, so I added some new mapreduce tests.

Some of those code paths were still not being hit, though, and then I realized that we forgot to port over the pouchdb-collate tests when we moved them into PouchDB core. So I did that...

Then I found some code paths that were still not being hit (inside of pouchdb-collate), so after careful analysis I removed them.

So now this PR adds some fairly modest code simplifications to mapreduce, but mostly adds a lot of new tests. 😃

@@ -20,14 +20,11 @@ function collate(a, b) {
if ((ai - bi) !== 0) {
return ai - bi;
}
- if (a === null) {
- return 0;
- }
@nolanlawson
nolanlawson Jan 3, 2017 Member

this code path was only being hit if you did collate(undefined, null) and it turned out this was pointless after I fixed lastKey because we always coerce undefined to null everywhere else when emitted by map/reduce (as CouchDB itself does)

switch (typeof a) {
case 'number':
return a - b;
case 'boolean':
- return a === b ? 0 : (a < b ? -1 : 1);
+ return a < b ? -1 : 1;
@nolanlawson
nolanlawson Jan 3, 2017 Member

this code path was impossible to hit because if two booleans are identical, then collate will return 0 above during the a === b test

PouchDB.ajax = ajax;
PouchDB.utils = utils;
PouchDB.Errors = errors;
+PouchDB.collate = collate;
@nolanlawson
nolanlawson Jan 3, 2017 Member

need to export this in order to test in tests/unit/test.collate.js

@@ -0,0 +1,469 @@
+'use strict';
@nolanlawson
nolanlawson Jan 3, 2017 Member

this is ported over almost verbatim from pouchdb-collate/test

- var indexableKeysToKeyValues = docData.indexableKeysToKeyValues;
- var changes = docData.changes;
+ var indexableKeysToKeyValues = docData[0];
+ var changes = docData[1];
@nolanlawson
nolanlawson Jan 3, 2017 Member

using an array here is more efficient than {changes: ..., indexableKeysToKeyValues: ...}. the keys don't need to be spelled out explicitly because it's all internal code

+
+ function createIndexableKeysToKeyValues(mapResults) {
+ var indexableKeysToKeyValues = new Map();
+ var lastKey;
@nolanlawson
nolanlawson Jan 3, 2017 edited Member

It turned out I had forgotten about JavaScript variable hoisting, and this lastKey was non-undefined between doc comparisons (because it was all one gigantic function and so lastKey was hoisted to the top), meaning that we were doing needless collate() comparisons between documents. (The collate() comparison is only necessary to uniqify the indexable key for an emission if it's 1) the same map() for the same document and 2) it emitted the exact same key more than once.)

So by moving this to a separate function, we now re-initialize lastKey to undefined for every doc, and furthermore by checking i > 0, this is what led to those old pouchdb-collate code paths not being hit anymore.

So this means that from now on our indexable keys should be quite a bit simpler; no need to add the i value unless we truly are emitting the exact same key for the exact same document.

@daleharvey

Looks awesome, thanks

@daleharvey daleharvey merged commit 50f6ac1 into master Jan 3, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage remained the same at 100.0%
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment