New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch to JS implementation of indexedDB.cmp(). #1412
Conversation
This one is about 30 times faster than indexedDB.cmp() + it removes the global dependency of indexedDB for RangeSet, used by liveQuery().
Comparing two arrays with different length: https://www.w3.org/TR/IndexedDB-3/#compare-two-keys
comparing arrays and typed arrays of different sizes.
Ran unit tests on IE after doing this. All still pass. Though one test on Dexie.Observable failed so I without digging to deep into the reason, I disabled that test for IE and legacy edge.
IdbKeyRange: typeof IDBKeyRange, | ||
tmpTrans: IDBTransaction) : DBCore | ||
{ | ||
const cmp = indexedDB.cmp.bind(indexedDB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello, i was very excited when i first read about those performance improvements, however when taking a look at the code i got some doubts.
- Have to tried to perform those benchmarks without the use of
Function.bind
(?) - This JS version of cmp seems only faster for small amounts of data, and specially when the data isn't equal from the first few bytes (?)
Check this out:
b1=cryptHelper.getRandomValues(473)
b2=cryptHelper.getRandomValues(473)
a1 = () => indexedDB.cmp(b1,b2)
a2 = () => compareUint8Arrays(b1,b2)
> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 799.961181640625 ms
> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 5.804931640625 ms
All good here, with 473bytes of data and not equal.
b2=b1
> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 822.533203125 ms
> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 702.003173828125 ms
Still reasonable here when the data is equal, but...
b1=cryptHelper.getRandomValues(4730)
b2=b1
> console.time('dd');for(let i = 1e6; i--; a1());console.timeEnd('dd');
< dd: 1101.98681640625 ms
> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 8124.588134765625 ms
Thoughts? Not sure if i did overlooked something, those tests were made under Chrome 96 on Windows.
const bl = b.length; | ||
const l = al < bl ? al : bl; | ||
for (let i = 0; i < l; ++i) { | ||
if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...as a follow up to my previous comment, here's an experiment with DataView
function compareUint8Arrays(a,b) {
const dv1 = new DataView(a.buffer);
const dv2 = new DataView(b.buffer);
let idx = 0;
let bytes = Math.min(dv1.byteLength, dv2.byteLength);
if (bytes > 4) {
let len = bytes >> 2;
while (idx < len) {
let b1 = dv1.getUint32(idx);
let b2 = dv2.getUint32(idx);
if (b1 !== b2) return b1 < b2 ? -1 : 1;
++idx;
}
bytes -= idx << 2;
}
if (bytes > 2) {
let len = bytes >> 1;
while (idx < len) {
let b1 = dv1.getUint16(idx);
let b2 = dv2.getUint16(idx);
if (b1 !== b2) return b1 < b2 ? -1 : 1;
++idx;
}
bytes -= idx << 1;
}
while (idx < bytes) {
let b1 = dv1.getUint8(idx);
let b2 = dv2.getUint8(idx);
if (b1 !== b2) return b1 < b2 ? -1 : 1;
++idx;
}
return dv1.byteLength === dv2.byteLength ? 0 : dv1.byteLength < dv2.byteLength ? -1 : 1;
}
b1=cryptHelper.getRandomValues(4730)
b2=b1
> console.time('dd');for(let i = 1e6; i--; a2());console.timeEnd('dd');
< dd: 1382.230224609375 ms
... 83% faster with 4KB of data (?) but still slower than indexedDB.cmp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I get it right, the conclusion of all this is that:
- JS based comparison still beats native one by far for numbers and strings (and probably Dates and reasonably short arrays)
- The only situation where we can see native indexedDB.cmp beat the JS based one is when comparing large binaries that have an identical start sequence in their first 500 bytes.
- Binary keys still perform much better with the JS based algorithm than the native one as long as they are reasonably short or have reasonable randomness.
Now, when we are comparing keys, we only do it for indexed keys and properties and these shall not be large anyway (see #1392) and there is almost never any good reason to index large binaries. A typical binary index represent either a primary key of maybe 16 (random) bytes, or digests that may be between 8-32 bytes and have the same randomness. There might be other use cases, like indexing paths of tree structures that may be longer, and having identical start sequences, but probably never that large. Indexing binaries larger than say 1500 bytes would create other performance issues anyway because it bloats the BTree in the database - which is the reason that most databases limit the permitted size of BTree indexes to 1700-3500 bytes depending on DB engine.
See my review comments - real world indexed binary keys are not large and therefore the plain JS version outperforms indexedDB.cmp() by far in most cases anyway. There might be use cases for indexed binary keys up to around 1000 bytes at most but indexing larger binary keys than that is bad practice anyway (See #1392 (comment)). Also, the typical binary index is a random primary key or digest which by definition are rarely equal in leading bytes. Therefore I don't think it's worth it to optimize the code for the cost of increased bundle size and possible bugs. |
Thanks David, that's very helpful information. Also, i was apparently under the wrong impression this comparator was used for more stuff than just keys, so i do fully understand this now and i do agree about using the current JS version as-is. fwiw, i just took a look at https://github.com/dexie/Dexie.js/search?q=cmp and found the following which you may want to move as well, as they're tied to the Dexie.js/src/classes/where-clause/where-clause-constructor.ts Lines 33 to 35 in fe682ef
Dexie.js/src/classes/table/table.ts Line 130 in e865e80
Cheers! |
Thanks! Good you saw that. I'll replace them. |
This one is about 30 times faster than indexedDB.cmp()
used by liveQuery().
Chrome:
Firefox:
Safari:
Bundle sizes