
replace FlatQueue with Heapify? #27

Closed
leeoniya opened this issue Mar 18, 2020 · 8 comments
Labels
question Further information is requested

Comments

@leeoniya

leeoniya commented Mar 18, 2020

hey @mourner,

this looks quite promising: https://github.com/luciopaiva/heapify

Flatbush already uses typed arrays, so there should be no new compat issues.

  • 3x faster initial build (assuming it can be used instead of a push loop), otherwise...
  • 2x faster pushes

looks like it requires knowing the queue capacity in advance, but since Flatbush is static anyhow, this should already be known?
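The fixed-capacity, typed-array approach that Heapify takes can be sketched roughly like this (a simplified illustration of the technique, not Heapify's actual code — class and method names here are hypothetical):

```javascript
// Simplified sketch of a fixed-capacity binary min-heap backed by typed
// arrays, the general approach Heapify takes. Capacity must be known up
// front — which a static index like Flatbush could supply at build time.
class TypedMinQueue {
  constructor(capacity) {
    this.ids = new Int32Array(capacity);          // item ids, 4 bytes each
    this.priorities = new Float64Array(capacity); // priorities, 8 bytes each
    this.length = 0;
  }
  push(id, priority) {
    let pos = this.length++;
    // bubble the hole up until the parent is no larger
    while (pos > 0) {
      const parent = (pos - 1) >> 1;
      if (this.priorities[parent] <= priority) break;
      this.ids[pos] = this.ids[parent];
      this.priorities[pos] = this.priorities[parent];
      pos = parent;
    }
    this.ids[pos] = id;
    this.priorities[pos] = priority;
  }
  pop() {
    const top = this.ids[0];
    const id = this.ids[--this.length];
    const priority = this.priorities[this.length];
    // sift the last item down from the root
    let pos = 0;
    const halfLength = this.length >> 1;
    while (pos < halfLength) {
      let child = (pos << 1) + 1;
      if (child + 1 < this.length &&
          this.priorities[child + 1] < this.priorities[child]) child++;
      if (this.priorities[child] >= priority) break;
      this.ids[pos] = this.ids[child];
      this.priorities[pos] = this.priorities[child];
      pos = child;
    }
    this.ids[pos] = id;
    this.priorities[pos] = priority;
    return top;
  }
}

const q = new TypedMinQueue(16);
q.push(3, 5.0);
q.push(7, 1.5);
q.push(1, 3.25);
console.log(q.pop(), q.pop(), q.pop()); // 7 1 3 (smallest priority first)
```

The typed arrays are allocated once at the stated capacity, which is where both the speed (no reallocation, no boxing) and the memory reservation come from.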

@mourner
Owner

mourner commented Mar 18, 2020

Nice find — indeed looks like a great library, I didn't know about it! Definitely worth a try, although there are several arguments for keeping FlatQueue:

  • Priority queue is only used for K nearest neighbor queries (not indexing or bbox search), and as far as I remember, queue operations are not the bottleneck there — it already works fast enough.
  • Knowing queue capacity in advance might be a problem — when you do kNN search, you can specify K as Infinity and do by distance or predicate instead, in which case you don't really know how many results to expect. Maybe you can still have a reasonable bound for the search queue, I'm not sure.
  • Although both libraries are tiny by today's standards, FlatQueue is a few times smaller than Heapify.

If you make a benchmark that compares the two specifically for Flatbush, let me know — curious to know the results!

@mourner added the question label Mar 18, 2020
@mourner
Owner

mourner commented Mar 18, 2020

@leeoniya tried this out in the heapify branch — benchmarks seem to show a ~15% kNN performance improvement, though this comes at a cost of reserving 4 + 8 bytes per node for the queue (~11.8MB for 1 million items), a 13% bigger bundle, and a dependency I have no control over. If we do accept this as a good tradeoff, I could probably update FlatQueue to have an option of using it with a fixed capacity and typed arrays, which should bring similar performance.

@mourner
Owner

mourner commented Mar 18, 2020

@leeoniya experimenting with this further, it looks like we can make FlatQueue as fast as or even a little faster than Heapify for the Flatbush kNN use case by not shrinking the queue arrays on pop/clear — see the PR at mourner/flatqueue#1. Meanwhile we retain the advantage of not allocating more memory than necessary. What do you think? V8 is exceptionally fast on regular arrays nowadays, so there's not much advantage to typed ones beyond memory footprint.
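The non-shrinking idea can be sketched like so (a simplified linear-scan queue for brevity, not FlatQueue's actual heap code — the point is only the separation of logical length from backing-array capacity):

```javascript
// Sketch of the "don't shrink on pop/clear" idea: pop() and clear() only
// move the logical `length` down, while the backing arrays keep their
// high-water-mark capacity, so repeated kNN queries reuse the same storage
// instead of reallocating. (Simplified; not FlatQueue's actual source.)
class NonShrinkingQueue {
  constructor() {
    this.ids = [];
    this.values = [];
    this.length = 0; // logical size, independent of ids.length
  }
  push(id, value) {
    this.ids[this.length] = id;       // overwrites stale slots in place
    this.values[this.length] = value;
    this.length++;
  }
  pop() { // remove and return the id with the smallest value (linear scan)
    let best = 0;
    for (let i = 1; i < this.length; i++) {
      if (this.values[i] < this.values[best]) best = i;
    }
    const id = this.ids[best];
    this.length--;
    this.ids[best] = this.ids[this.length];   // swap last into the hole
    this.values[best] = this.values[this.length];
    return id; // note: ids.length is unchanged — nothing shrinks
  }
  clear() { this.length = 0; } // backing arrays stay allocated
}

const q = new NonShrinkingQueue();
q.push(1, 0.5); q.push(2, 0.1); q.push(3, 0.9);
console.log(q.pop()); // 2 (smallest value)
q.clear();
console.log(q.length, q.ids.length); // 0 3 — capacity retained after clear
```

Because the arrays only ever grow to the largest size any single query needed, memory stays proportional to the actual working set rather than a pre-reserved worst case.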

@leeoniya
Author

leeoniya commented Mar 18, 2020

sweet!

> though this comes at a cost of reserving 4 + 8 bytes per node for the queue (~11.8MB for 1 million items)

odd. i would expect less mem use for typed

err, sorry. should not be writing on phone before coffee.

smaller mem footprint is always nice, but maybe not if it bloats or complicates the code. if that micro pr has the perf improvements, then perhaps it's good enough.

some additional observations:

FlatQueue's api seems to use .peekValue() and .peek() for priority. is this atypical for a priority queue where .peek() is usually for value (or key) and something like .peekPriority() is for priority? (this is how heapify does it and maybe more expected). i'm no expert though and my experience here is essentially zero.

i think that both Flatbush and FlatQueue can be made smaller by moving away from class and prototype structures and making more variables local within a closure (like uPlot). all of the this. prefixes will go away, and many variable names will get shortened when they're not exposed under this. there should be no perf impact because there are usually very few instances of FlatQueue and Flatbush created. the bundle size could easily drop by 30%.
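The closure/factory style being suggested looks roughly like this (a hypothetical sketch, not actual Flatbush or uPlot code) — inside the factory, every name is a plain local that a minifier can freely rename, whereas property names reached through `this` must survive minification:

```javascript
// Hypothetical sketch of the closure/factory style vs. a class.
// `items` and `pos` are locals here, so a minifier can shorten them to
// one-letter names; written as this._items / this._pos on a class, the
// property names would be preserved in the minified output.
function makeStack() {
  const items = []; // local — minifiable
  let pos = 0;      // local — minifiable
  return {
    push(v) { items[pos++] = v; },
    pop() { return items[--pos]; },
    size() { return pos; },
  };
}

const s = makeStack();
s.push('a'); s.push('b');
console.log(s.pop(), s.size()); // b 1
```

The tradeoff is per-instance closure allocation and, as mourner notes below, arguably less familiar code organization — which matters little here since only a handful of instances ever exist.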

@mourner
Owner

mourner commented Mar 18, 2020

> FlatQueue's api seems to use .peekValue() and .peek() for priority. is this atypical for a priority queue where .peek() is usually for value (or key) and something like .peekPriority() is for priority? (this is how heapify does it and maybe more expected). i'm no expert though and my experience here is essentially zero.

@leeoniya no, it's consistent with Heapify, just uses different terms. peek is the same in both, and peekValue corresponds to peekPriority.
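The correspondence can be shown with a minimal stand-in (a hypothetical demo object mirroring the naming described above, not either library's actual source):

```javascript
// Minimal stand-in illustrating the naming correspondence:
//   FlatQueue.peek()      ~ Heapify.peek()         → the item id
//   FlatQueue.peekValue() ~ Heapify.peekPriority() → its priority
// (Hypothetical toy implementation, not the source of either library.)
const queue = {
  ids: [], priorities: [],
  push(id, priority) {
    // keep arrays ascending by priority (toy O(n) insert)
    let i = 0;
    while (i < this.priorities.length && this.priorities[i] <= priority) i++;
    this.ids.splice(i, 0, id);
    this.priorities.splice(i, 0, priority);
  },
  peek() { return this.ids[0]; },                // same name in both libs
  peekValue() { return this.priorities[0]; },    // FlatQueue's name
  peekPriority() { return this.priorities[0]; }, // Heapify's name
};

queue.push(42, 7.5);
queue.push(13, 2.0);
console.log(queue.peek());      // 13 — id with the smallest priority
console.log(queue.peekValue()); // 2 — identical to peekPriority()
```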

> i think that both FlatBush and FlatQueue can be made smaller by moving away from class and prototype structures and making more variables local within a closure (like uPlot).

I'm not a fan of closures, and find class-based code organization much clearer and easier to read. Also, gzipping strips away most of the this properties overhead anyway.

@mourner
Owner

mourner commented Mar 18, 2020

Fixed in 7a76c8b by upgrading to a newer version of FlatQueue, and released as v3.2.1. The memory advantage here is that we don't have to reserve a lot of capacity beforehand, and only use the size needed (which we can't really predict). For the 1 million case in the benchmark, after all kinds of kNN queries, the flat queue holds only ~10k items on average (instead of >1 million).

@leeoniya
Author

> I'm not a fan of closures, and find class-based code organization much clearer and easier to read. Also, gzipping strips away most of the this properties overhead anyway.

yeah, those are definitely the trade-offs. though it's not just this itself, but everything that lives under this's namespace. for example,

this._indices[a]; this._boxes[this._pos++]

can't be mangled by a minifier and ships as-is, while the closure-local equivalent will compress to

x[i]; b[p++]

comparing just gzip size only accounts for network transfer, but people seem to forget that the whole source still has to be parsed after inflating. i'm sure the difference is not terribly significant for a lib this size, but still worth mentioning. i might make a closure-ified flatbush branch to see the real difference.

thanks for the discussion!

@leeoniya
Author

did some probing...

after de-prototyping/de-classing Flatbush & FlatQueue, and removing the throw sanity checks, i was able to get the minified/es5/CJS size down to 3.66 KB (from 5.91 KB).
