The top-down make_heap implementation runs in O(n) #2

Morwenn opened this issue Aug 18, 2019 · 0 comments

Morwenn commented Aug 18, 2019

I had forgotten about it in the meantime, but I asked on /r/compsci whether the final make_heap ran in the anticipated time complexity. While the thread didn't give a clear answer for that specific algorithm, user _--__ proposed a proof that the top-down version of make_heap runs in O(n).

Below is the relevant part of the discussion:

_--__

While I'm not 100% sure you are doing it quite right (I haven't ironed out all the code details), it seems the principle you are trying to apply is as follows:

  • Initially, each element can be considered the root of a balanced binary tree of size 2^k - 1 (for some k). The intention is to turn each such tree into a poplar.
  • Critically, the other nodes of the tree occur earlier in the array than the root, so if we proceed sequentially through the array (creating poplars), whenever we get to an element it is actually the root of a semipoplar. The corresponding tree can therefore be transformed into a poplar with a simple sift call.
  • Calling sift on a semipoplar of size k takes O(log k) time (a sketch of such a sift follows this list).
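
For reference, here is a minimal sketch of what such a sift might look like for a max-poplar stored in [first, last) with its root in the last position. This is illustrative code only, not necessarily the library's exact implementation:

#include <algorithm>  // std::iter_swap

// Restore the poplar property of a semipoplar stored in [first, last),
// whose root is the element at last - 1.
template<typename Iterator>
void sift(Iterator first, Iterator last)
{
    auto size = last - first;
    if (size < 2) return;                  // a single element is a poplar

    auto root = last - 1;                  // root of the (semi)poplar
    auto half = size / 2;                  // size of each sub-poplar
    auto left_root = first + (half - 1);   // root of the left sub-poplar
    auto right_root = last - 2;            // root of the right sub-poplar

    // Find the biggest of the root and the roots of its two sub-poplars
    auto biggest = root;
    if (*biggest < *left_root) biggest = left_root;
    if (*biggest < *right_root) biggest = right_root;

    if (biggest != root) {
        std::iter_swap(biggest, root);
        // The sub-poplar whose root changed may now be a semipoplar
        if (biggest == left_root) sift(first, first + half);
        else sift(first + half, last - 1);
    }
}

Each recursive step halves the size of the range, which is where the O(log k) bound comes from.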

Note that although we "proceed sequentially" through the array, the two halves of the poplar being built are independent until we reach the root. In other words, the top-down make_poplar function

make_poplar(first, last) {
   if (last - first < 2) return           // a single element is already a poplar
   half = (last - first) / 2              // size of each sub-poplar
   make_poplar(first, first + half)       // left sub-poplar
   make_poplar(first + half, last - 1)    // right sub-poplar
   sift(first, last)                      // sift the root (at last - 1) down
}

is doing exactly the same thing (i.e. calling sift on the elements of the array in sequential order)!
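
One quick way to convince oneself of this equivalence is to trace which root each sift call targets. The small standalone program below (written for illustration, not part of the library) prints the root indices in the order the recursive version visits them, and they come out in increasing, i.e. sequential, order:

#include <cstddef>
#include <iostream>

// Trace the recursion on indices only: print the position of the root
// that each sift call would operate on.
void make_poplar_trace(std::size_t first, std::size_t last)
{
    if (last - first < 2) return;               // base case: already a poplar
    auto half = (last - first) / 2;
    make_poplar_trace(first, first + half);     // left sub-poplar
    make_poplar_trace(first + half, last - 1);  // right sub-poplar
    std::cout << "sift on root at index " << (last - 1) << '\n';
}

int main()
{
    // For a poplar of size 2^4 - 1 = 15, this prints the root indices
    // 2, 5, 6, 9, 12, 13, 14: strictly increasing, i.e. sequential.
    make_poplar_trace(0, 15);
}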

This presentation makes it much easier to calculate the running time: we can invoke the Master theorem. More precisely, if T(n) is the running time on an array of length n, then this recursive presentation tells us that T(n) ≤ 2T(n/2) + log(n). Since that recurrence falls into Case 1 of the Master theorem, it immediately follows that T(n) is O(n).
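
For completeness, the Case 1 claim can also be checked by hand by unrolling the recurrence; a sketch in LaTeX, with all logarithms base 2 and constant factors ignored:

\begin{aligned}
T(n) &\le \sum_{i=0}^{\log n} 2^i \log\frac{n}{2^i}
      && \text{level $i$ has $2^i$ subproblems of size $n/2^i$} \\
     &= \sum_{j=0}^{\log n} \frac{n}{2^j}\, j
      && \text{substituting } j = \log n - i \\
     &\le n \sum_{j \ge 0} \frac{j}{2^j} = 2n = O(n).
\end{aligned}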

Morwenn

Thanks, this does look like a good start towards proving the O(n) running time of make_heap, at least when there is a single poplar in the collection :)

Intuitively I'd think that the running time is O(n) no matter the number of poplars left once make_heap has ended: by definition there can be at most O(log n) such poplars, so there are at most log n top-level calls to make_poplar, each one on a smaller section of the original collection than the previous one. Every new poplar has a smaller depth than the previous poplar (except when the last two poplars have the same size), which means that sift has less work to do for every new top-level poplar.
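
To make that intuition concrete, here is a minimal sketch of such a top-down make_heap, assuming the recursive make_poplar from above and a greedy partition into poplars of non-increasing size (the helper names and the exact partition strategy are assumptions for illustration, not necessarily what the repository does):

// Partition [first, last) into poplars of non-increasing size
// (the last two may be equal) and build each one with make_poplar.
template<typename Iterator>
void make_heap(Iterator first, Iterator last)
{
    auto size = last - first;  // number of elements left to organize
    while (size > 0) {
        // Greedily pick the largest size of the form 2^k - 1 that fits
        decltype(size) poplar_size = 1;
        while (2 * poplar_size + 1 <= size) {
            poplar_size = 2 * poplar_size + 1;
        }
        make_poplar(first, first + poplar_size);  // build one poplar
        first += poplar_size;
        size -= poplar_size;
    }
}

Since the greedy choice always takes the largest poplar that fits, each new poplar is at most as large as the previous one and the sizes strictly shrink after at most two equal ones, so the loop runs O(log n) times.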

_--__

The argument goes through for each separate poplar: building a poplar is O(k), where k is the size of the part of the array covered by that poplar, so if your array is broken up into r poplars as n = k1 + k2 + ... + kr, then the running time will be O(k1) + O(k2) + ... + O(kr) = O(n). Basically, a single poplar is, in fact, the "worst case".

At some point I will likely have to integrate that into the README.
