Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to JS implementation of indexedDB.cmp(). #1412

Merged
merged 7 commits into from Oct 14, 2021
Merged

Switch to JS implementation of indexedDB.cmp(). #1412

merged 7 commits into from Oct 14, 2021

Conversation

dfahlander
Copy link
Collaborator

@dfahlander dfahlander commented Oct 14, 2021

This one is about 30 times faster than indexedDB.cmp()

  • it removes the global dependency of indexedDB for RangeSet,
    used by liveQuery().

Chrome:

Implementation Number of iterations Time taken (string) Time taken (number) Time taken (Uint8Array)
indexedDB.cmp() 1_000_000 550 ms 503 ms 790 ms
plain JS version 1_000_000 17.5 ms 13 ms 27 ms
factor faster 31 39 29

Firefox:

Implementation Number of iterations Time taken (string)
indexedDB.cmp() 1_000_000 370 ms
plain JS version 1_000_000 50 ms

Safari:

Implementation Number of iterations Time taken (string)
indexedDB.cmp() 1_000_000 110 ms
plain JS version 1_000_000 22 ms

Bundle sizes

Dexie version Minified Minified & gzipped
dexie@3.0.3 71.13 kb 22.1 kb
dexie@3.2.0-rc.2 (with liveQuery()) 78.3 kb 25.1 kb
dexie@3.2.0-... with also JS cmp 79 kb 25.3 kb

This one is about 30 times faster than indexedDB.cmp()
+ it removes the global dependency of indexedDB for RangeSet,
used by liveQuery().
comparing arrays and typed arrays of different sizes.
Ran unit tests on IE after doing this. All still pass.
Though one test on Dexie.Observable failed so I without digging to deep into the reason, I disabled that test for IE and legacy edge.
@dfahlander dfahlander merged commit 46a430f into master Oct 14, 2021
IdbKeyRange: typeof IDBKeyRange,
tmpTrans: IDBTransaction) : DBCore
{
const cmp = indexedDB.cmp.bind(indexedDB);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, i was very excited when i first read about those performance improvements, however when taking a look at the code i got some doubts.

  • Have to tried to perform those benchmarks without the use of Function.bind (?)
  • This JS version of cmp seems only faster for small amounts of data, and specially when the data isn't equal from the first few bytes (?)

Check this out:

b1=cryptHelper.getRandomValues(473)
b2=cryptHelper.getRandomValues(473)

a1 = () => indexedDB.cmp(b1,b2)
a2 = () => compareUint8Arrays(b1,b2)
> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 799.961181640625 ms

> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 5.804931640625 ms

All good here, with 473bytes of data and not equal.

b2=b1
> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 822.533203125 ms

> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 702.003173828125 ms

Still reasonable here when the data is equal, but...

b1=cryptHelper.getRandomValues(4730)
b2=b1

> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 1101.98681640625 ms

> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 8124.588134765625 ms

Thoughts? Not sure if i did overlooked something, those tests were made under Chrome 96 on Windows.

const bl = b.length;
const l = al < bl ? al : bl;
for (let i = 0; i < l; ++i) {
if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...as a follow up to my previous comment, here's an experiment with DataView

function compareUint8Arrays(a,b) {
    const dv1 = new DataView(a.buffer);
    const dv2 = new DataView(b.buffer);
    let idx = 0;
    let bytes = Math.min(dv1.byteLength, dv2.byteLength);
    if (bytes > 4) {
        let len = bytes >> 2;
        while (idx < len) {
            let b1 = dv1.getUint32(idx);
            let b2 = dv2.getUint32(idx);
            if (b1 !== b2) return b1 < b2 ? -1 : 1;
            ++idx;
        }
        bytes -= idx << 2;
    }
    if (bytes > 2) {
        let len = bytes >> 1;
        while (idx < len) {
            let b1 = dv1.getUint16(idx);
            let b2 = dv2.getUint16(idx);
            if (b1 !== b2) return b1 < b2 ? -1 : 1;
            ++idx;
        }
        bytes -= idx << 1;
    }
    while (idx < bytes) {
        let b1 = dv1.getUint8(idx);
        let b2 = dv2.getUint8(idx);
        if (b1 !== b2) return b1 < b2 ? -1 : 1;
        ++idx;
    }
    return dv1.byteLength === dv2.byteLength ? 0 : dv1.byteLength < dv2.byteLength ? -1 : 1;
}
b1=cryptHelper.getRandomValues(4730)
b2=b1

> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 1382.230224609375 ms

... 83% faster with 4KB of data (?) but still slower than indexedDB.cmp

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I get it right, the conclusion of all this is that:

  • JS based comparison still beats native one by far for numbers and strings (and probably Dates and reasonably short arrays)
  • The only situation where we can see native indexedDB.cmp beat the JS based one is when comparing large binaries that have an identical start sequence in their first 500 bytes.
  • Binary keys still perform much better with the JS based algorithm than the native one as long as they are reasonably short or have reasonable randomness.

Now, when we are comparing keys, we only do it for indexed keys and properties and these shall not be large anyway (see #1392) and there is almost never any good reason to index large binaries. A typical binary index represent either a primary key of maybe 16 (random) bytes, or digests that may be between 8-32 bytes and have the same randomness. There might be other use cases, like indexing paths of tree structures that may be longer, and having identical start sequences, but probably never that large. Indexing binaries larger than say 1500 bytes would create other performance issues anyway because it bloats the BTree in the database - which is the reason that most databases limit the permitted size of BTree indexes to 1700-3500 bytes depending on DB engine.

@dfahlander
Copy link
Collaborator Author

dfahlander commented Dec 18, 2021

See my review comments - real world indexed binary keys are not large and therefore the plain JS version outperforms indexedDB.cmp() by far in most cases anyway. There might be use cases for indexed binary keys up to around 1000 bytes at most but indexing larger binary keys than that is bad practice anyway (See #1392 (comment)). Also, the typical binary index is a random primary key or digest which by definition are rarely equal in leading bytes. Therefore I don't think it's worth it to optimize the code for the cost of increased bundle size and possible bugs.

@diegocr
Copy link

diegocr commented Dec 19, 2021

Thanks David, that's very helpful information. Also, i was apparently under the wrong impression this comparator was used for more stuff than just keys, so i do fully understand this now and i do agree about using the current JS version as-is.

fwiw, i just took a look at https://github.com/dexie/Dexie.js/search?q=cmp and found the following which you may want to move as well, as they're tied to the Where clauses we may get a little performance boost there as well (?)

this._descending = (a, b) => indexedDB.cmp(b, a);
this._max = (a, b) => indexedDB.cmp(a,b) > 0 ? a : b;
this._min = (a, b) => indexedDB.cmp(a,b) < 0 ? a : b;

return idb.cmp(a,b) === 0; // Works with all indexable types including binary keys.

Cheers!

@dfahlander
Copy link
Collaborator Author

Thanks! Good you saw that. I'll replace them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants