Performance improvement (addresses #15) #17

peteraisher · 2019-09-28T17:29:00Z

This update implements some lower-level functions to improve the performance of standard heap operations on SwiftPriorityQueue and adds performance tests for init, pop, push and remove. Hopefully the improvement is sufficient to close #15.

A naive implementation replacing sink and swim with versions that can be called from inside Array.withUnsafeMutableBufferPointer(_:) already gave promising performance for init, pop and push. (See forked master branch. This might represent a good compromise solution if the final code is deemed to be too complex.)

A more in-depth rewrite of the internal code (with inspiration from libcxx/include/algorithm) gave further performance improvements, although the resulting code is less "easy" to understand. In an effort to aid readability, the new code is extensively (some might say excessively!) marked up.

Building a SwiftPriorityQueue from an Array of 100,000 values went from 17.8 ms to 4.1 ms.
Emptying a SwiftPriorityQueue with 100,000 values using pop() went from 2.11 s to 0.36 s.
Calling remove(_:) on 100 random items from a SwiftPriorityQueue with 100,000 values went from 2.36 s to 0.277 s.

Building a SwiftPriorityQueue by calling push(_:) 100,000 times went from 118 ms to 149 ms. Not certain what happened there! Further investigation may be required, but in any case, the performance improvement in pop() more than compensates in normal usage where pop() and push(_:) are called equally often.

- new performance tests for heap creation and for pop

performance tests added for push and remove

refine and add code comments to `remove` implementation

Remove unused `sink` and `swim` functions. Avoid discouraged direct access to `description` and `debugDescription`.

Instead, add function `decrementAndReturn()` Make `extension` to `UnsafeMutablePointer` explicitly `internal` Clean internals of `__heapify(_:_:_:) `

davecom · 2019-09-28T19:23:35Z

Hi Peter,
Thanks for the extensive write up and contribution. This is all very interesting. It is in effect a total rewrite from scratch. It's interesting that the push() performance went down. I assume the gains here are mostly from reimplementation at a lower level rather than algorithmic changes (it's still a binary heap I presume?), so I would've expected everything to be improved.

My concern is that merging this would lead to new code that I don't completely understand and would therefore be hard more me to maintain going forward. It's going to a new level of complexity. SwiftPriorityQueue has been around for about 5 years and is used in many real world projects, so if I make the code hard on myself to maintain, the detriment of that could be greater than any performance gain. I'm wondering if there are some gains here that can be gleaned from more minor changes to the existing codebase? Is there an 80/20 rule here?

Why is your version faster? Is it because it is implemented at a lower level with pointers, or because it is using a different algorithm at some point?

Also, more immediately, can you put the performance tests into a separate pull request so I can try to see if I can combine some of the ideas from your version into the mainline version? Thanks in advance.

Thank you so much for all of the hard work here,
David

peteraisher · 2019-09-28T20:29:51Z

Hi,

My pleasure! It was quite fun to see how far it could be pushed! Yes, over the course of the day it gradually morphed into a total rewrite. Not sure what the issue was with push(_:) performance either. There are no algorithmic changes, it's still a binary heap, so I also would've expected improvements across the board. It's possible that the extra overhead from repeatedly entering the withUnsafeMutablePointer(_:) closure outweighs the benefits there. I didn't get round to doing proper profiling etc.

I fully understand your concern about maintainability. Luckily there are still pretty big gains to be had just from tweaking things to call withUnsafeMutablePointer(_:) instead of working with the Array. That gave similar performance gains (still a few times faster) for moderate extra complexity, so maybe this is the way you want to go. The code is here.

In the meantime, I will create a new PR with just the performance tests.

Peter Aisher added 6 commits September 28, 2019 01:32

- initial reimplementation of heap functions using pointers

755a3a5

- new performance tests for heap creation and for pop

all functions reimplemented

863f9d2

remove reimplemented with pointers

4fa83e8

performance tests added for push and remove

decorate functions as @inlinable

3c3c04d

refine and add code comments to `remove` implementation

Add documentation markup to helper functions.

ce61abe

Remove unused `sink` and `swim` functions. Avoid discouraged direct access to `description` and `debugDescription`.

Remove unnecessary operator definitions

f3249e3

Instead, add function `decrementAndReturn()` Make `extension` to `UnsafeMutablePointer` explicitly `internal` Clean internals of `__heapify(_:_:_:) `

peteraisher closed this Oct 16, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance improvement (addresses #15) #17

Performance improvement (addresses #15) #17

peteraisher commented Sep 28, 2019

davecom commented Sep 28, 2019

peteraisher commented Sep 28, 2019

Performance improvement (addresses #15) #17

Performance improvement (addresses #15) #17

Conversation

peteraisher commented Sep 28, 2019

davecom commented Sep 28, 2019

peteraisher commented Sep 28, 2019