Benchmark against level #4

Closed · 3 tasks done
vweevers opened this issue Nov 9, 2021 · 11 comments
Labels: benchmark (Requires or pertains to benchmarking)

vweevers commented Nov 9, 2021

Compare:

  • level@7 against classic-level. To test the native part (now using encoding options instead of *asBuffer). Not expecting a change here.
  • level-mem + subleveldown against memory-level + sublevels, with json encoding. To test the JS part. Expecting a slight improvement here, though it might only surface in real-world apps (with GC pressure) rather than a synthetic benchmark.
  • Batching on level-mem against memory-level, as it removes two (or three, with sublevels) iterations of the batch array.

A quick benchmark of reads and writes is enough. It's just to check that performance is equal or better.
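
For reference, a quick harness along these lines would do. This is a hypothetical sketch (the bench and ops helpers are mine, not from the actual benchmark code); the db argument could be any of the stacks listed above:

const { MemoryLevel } = require('memory-level')

// Hypothetical quick benchmark: time n puts followed by n gets
async function bench (db, n = 100000) {
  await db.open()

  let start = process.hrtime.bigint()
  for (let i = 0; i < n; i++) {
    await db.put(String(i).padStart(8, '0'), { i })
  }
  console.log('put: %d ops/sec', ops(n, start))

  start = process.hrtime.bigint()
  for (let i = 0; i < n; i++) {
    await db.get(String(i).padStart(8, '0'))
  }
  console.log('get: %d ops/sec', ops(n, start))
}

function ops (n, start) {
  const seconds = Number(process.hrtime.bigint() - start) / 1e9
  return Math.round(n / seconds)
}

bench(new MemoryLevel({ valueEncoding: 'json' }))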

vweevers added this to To Do in Level (old board) Nov 9, 2021
vweevers added this to the 1.0.0 milestone Nov 9, 2021
vweevers added the benchmark label Nov 20, 2021
vweevers self-assigned this Nov 21, 2021
vweevers removed this from To Do in Level (old board) Dec 4, 2021
vweevers removed this from the 1.0.0 milestone Dec 29, 2021
@vweevers

Puts on level-mem + subleveldown versus memory-level + sublevels, with json encoding. Win.

[graph: put 1642345597518]

Puts on level-mem versus memory-level versus memory-level using strings internally. Double win.

[graph: put 1642346203027]
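
For context, the two stacks compared here look roughly like this (a sketch based on the respective READMEs, not the benchmark code itself):

// Old stack: level-mem wrapped by subleveldown
const level = require('level-mem')
const subleveldown = require('subleveldown')
const oldDb = level()
const oldSub = subleveldown(oldDb, 'example', { valueEncoding: 'json' })

// New stack: memory-level with built-in sublevels
const { MemoryLevel } = require('memory-level')
const newDb = new MemoryLevel()
const newSub = newDb.sublevel('example', { valueEncoding: 'json' })

// Same surface API in both cases, e.g.:
// await oldSub.put('key', { example: true })
// await newSub.put('key', { example: true })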

vweevers commented Jan 16, 2022

iterator.next() on level-mem versus memory-level, using json and utf8 valueEncodings. No difference (because the main cost is setImmediate).

[graph: iterate 1642346920934]

@vweevers

iterator.next() on level-mem versus iterator.nextv(1000) on memory-level. Not a fair benchmark, but the new nextv() API is an obvious win.

[graph: iterate 1642347677129]
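
To illustrate the difference in shape (the consumer loops are a sketch; next() resolves to an entry or undefined and nextv(size) to an array of entries, per the abstract-level API):

async function consumeWithNext (db) {
  const iterator = db.iterator()

  // next(): one call (and one async roundtrip) per entry
  let entry
  while ((entry = await iterator.next()) !== undefined) {
    const [key, value] = entry // process one entry
  }

  await iterator.close()
}

async function consumeWithNextv (db) {
  const iterator = db.iterator()

  // nextv(): one call per batch of up to 1000 entries
  let entries
  while ((entries = await iterator.nextv(1000)).length > 0) {
    for (const [key, value] of entries) {
      // process one entry
    }
  }

  await iterator.close()
}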

@vweevers

iterator.next() on level versus iterator.next() on classic-level. Slower. I reckon that's because I changed the structure of the cache (in short: [entry, entry, ..] instead of [key, value, key, value, ..]), which should make nextv() faster. That'll be difficult to compare fairly.

[graph: iterate 1642373843003]
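
Conceptually (a hypothetical sketch of the JS side; the real cache is filled from C++):

// leveldown: a flat cache with two slots per entry
// cache = ['key1', 'value1', 'key2', 'value2']
// so next() pops a key and a value separately

// classic-level: one [key, value] entry per slot
// cache = [['key1', 'value1'], ['key2', 'value2']]
// so nextv() can hand out many entries in one operation:
function nextvFromCache (cache, size) {
  return cache.splice(0, size)
}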

@vweevers

Batch puts on level-mem versus memory-level. Win.

[graph: batch-put 1643394415492]
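
For reference, the operations array in question (per the abstract-level batch API); per the first comment, the win comes from abstract-level iterating this array fewer times before it reaches the store:

db.batch([
  { type: 'put', key: 'a', value: '1' },
  { type: 'put', key: 'b', value: '2' },
  { type: 'del', key: 'c' }
]).then(() => {
  // committed atomically
})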

@vweevers

Gets on level-mem versus memory-level. Win.

[graph: get 1643397969100]

However, memory-level is slower when using a binary valueEncoding. That warrants a closer look.

vweevers commented Jan 28, 2022

> However, memory-level is slower when using a binary valueEncoding. That warrants a closer look.

It's not due to binary. It happens with any encoding when this code path is triggered:

if (options.keyEncoding !== keyFormat || options.valueEncoding !== valueFormat) {
  options = { ...options, keyEncoding: keyFormat, valueEncoding: valueFormat }
}

V8 has a performance issue with the spread operator when the properties are not present on the object being spread. The following "fixes" it:

// Assigning the properties first, so that they exist on the object being spread
options.keyEncoding = keyFormat
options.valueEncoding = valueFormat
options = { ...options, keyEncoding: keyFormat, valueEncoding: valueFormat }

As does using Object.assign() instead of spread:
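
Which would look like (a sketch):

options = Object.assign({}, options, { keyEncoding: keyFormat, valueEncoding: valueFormat })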

[graph: get 1643406383068]

Could switch to Object.assign(), but I still generally prefer the spread operator for being idiomatic (not being vulnerable to prototype pollution could be another argument, but I don't see how that would matter here).

@vweevers

The same get() performance regression exists on classic-level. Using Object.assign() would fix it.

[graph: get 1643410282728]

vweevers commented Jan 29, 2022

Quick-and-dirty benchmark of streams, comparing nextv() to next(). Ref Level/community#70 and Level/read-stream#2.

Unrelated to abstract-level, but it's a win.

classic-level | using nextv() | took 1775 ms, 563380 ops/sec
classic-level | using nextv() | took 1577 ms, 634115 ops/sec
classic-level | using nextv() | took 1549 ms, 645578 ops/sec
classic-level | using nextv() | took 1480 ms, 675676 ops/sec
classic-level | using nextv() | took 1572 ms, 636132 ops/sec
                                 avg 1591 ms

level         | using next()  | took 1766 ms, 566251 ops/sec
level         | using next()  | took 1776 ms, 563063 ops/sec
level         | using next()  | took 1737 ms, 575705 ops/sec
level         | using next()  | took 1711 ms, 584454 ops/sec
level         | using next()  | took 1729 ms, 578369 ops/sec
                                 avg 1744 ms

@vweevers

Did a better benchmark of streams. This one takes some explaining. In the graph legend below:

  • new-nextv is using a level-read-stream on a classic-level iterator using nextv(size)
  • old-next is level().createReadStream(), i.e. level-iterator-stream on a leveldown iterator using next()
  • new-next is using a level-read-stream on a classic-level iterator using next() (a temporary code path for fair benchmarking)
  • new-nextv-tweaked uses userland options to make sure that the byte-hwm is more than the expected byte size of a nextv(size) array, because otherwise classic-level would hit the byte-hwm first and return partially filled arrays
  • old-next-tweaked uses userland options to increase both byte-hwm and stream-hwm (manually creating a level-iterator-stream so that there's a way to specify both), in such a way that the byte-hwm is effectively ignored and we can compare the effect of merely increasing stream-hwm.

Where "byte-hwm" is the highWaterMark on the C++ side, measured in bytes. And "stream-hwm" is the highWaterMark of object-mode streams, measured in amount of entries.

That's about half of the explainer needed... In hindsight I wish I hadn't done the abstract-level and nextv() work in parallel. So please allow me to just skip to the conclusions (and later document how a user should tweak their options):

  • nextv(size) is faster than next() (compare new-nextv to old-next)
  • Though the refactorings needed to implement nextv(size) make next() slightly slower (compare old-next to new-next)
  • Both the old next() and the new nextv(size) can be tweaked through options, but nextv(size) can't be beaten.
  • Bottom line, streams became faster. I won't give any numbers, because there are too many factors.

TLDR: we're good. Most importantly, the performance characteristics of streams and iterators did not change, in the sense that an app using smaller or larger values than mine (I used 100 bytes) would not be hurt by upgrading to abstract-level or classic-level. That's because leveldown internally already had two highWaterMark mechanisms; classic-level merely "hoists" one of them up to streams. So if an app has extremely large values, we will not prefetch more items than before. If an app has small values, we will not prefetch fewer than before. And if an app is not using streams, iterators still prefetch (as you can see later when I finally push all the code).

[graph: stream 1643555843619]
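
As a sketch of such tweaking, assuming level-read-stream's EntryStream and that iterator options like classic-level's highWaterMarkBytes are forwarded to db.iterator() (the exact numbers are placeholders):

const { EntryStream } = require('level-read-stream')

const stream = new EntryStream(db, {
  // stream-hwm: entries buffered by the object-mode stream, which also
  // determines the size passed to iterator.nextv(size)
  highWaterMark: 1000,
  // byte-hwm: the byte limit applied on the C++ side when filling a batch.
  // Keep it above 1000 entries times the expected value size, so that
  // nextv(1000) returns fully filled arrays.
  highWaterMarkBytes: 1000 * 1024
})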

@vweevers

> and later document how a user should tweak their options

Done in Level/classic-level#1
