The top-down make_heap implementation runs in O(n) #2

Morwenn opened this issue Aug 18, 2019 · 0 comments

Morwenn commented Aug 18, 2019

I had forgotten about it in the meantime, but I asked on /r/compsci whether the final make_heap ran in the anticipated time complexity. While the thread didn't give a clear answer for that specific algorithm, user _--__ proposed a proof that the top-down version of make_heap runs in O(n).

Below is the relevant part of the discussion:

_--__

While I'm not 100% sure you are doing it quite right (I haven't ironed out all the code details), it seems the principle you are trying to apply is as follows:

  • Initially, each element can be considered the root of a balanced binary tree of size 2^k - 1 (for some k). The intention is to turn each such tree into a poplar.
  • Critically, the other nodes of the tree occur earlier in the array than the root, so if we proceed sequentially through the array (creating poplars), whenever we get to an element it is actually the root of a semipoplar. The corresponding tree can therefore be transformed into a poplar with a simple sift call.
  • Calling sift on a semipoplar of size k takes O(log k) time (a sketch of such a sift follows this list).
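
For reference, here is a minimal sketch of what such a sift might look like for a max-poplar stored in [first, last) with its root in the last position. This is illustrative code only, not necessarily the library's exact implementation:

#include <algorithm>  // std::iter_swap

// Restore the poplar property of a semipoplar stored in [first, last),
// whose root is the element at last - 1.
template<typename Iterator>
void sift(Iterator first, Iterator last)
{
    auto size = last - first;
    if (size < 2) return;                  // a single element is a poplar

    auto root = last - 1;                  // root of the (semi)poplar
    auto half = size / 2;                  // size of each sub-poplar
    auto left_root = first + (half - 1);   // root of the left sub-poplar
    auto right_root = last - 2;            // root of the right sub-poplar

    // Find the biggest of the root and the roots of its two sub-poplars
    auto biggest = root;
    if (*biggest < *left_root) biggest = left_root;
    if (*biggest < *right_root) biggest = right_root;

    if (biggest != root) {
        std::iter_swap(biggest, root);
        // The sub-poplar whose root changed may now be a semipoplar
        if (biggest == left_root) sift(first, first + half);
        else sift(first + half, last - 1);
    }
}

Each recursive step halves the size of the range, which is where the O(log k) bound comes from.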

Note that although we "proceed sequentially" through the array, the two halves of the poplar being built are independent until we reach the root. In other words, the top-down make_poplar function

make_poplar(first, last) {
   if (last - first < 2) return           // a single element is already a poplar
   half = (last - first) / 2              // size of each sub-poplar
   make_poplar(first, first + half)       // left sub-poplar
   make_poplar(first + half, last - 1)    // right sub-poplar
   sift(first, last)                      // sift the root (at last - 1) down
}

is doing exactly the same thing (i.e. calling sift on the elements of the array in sequential order)!
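
One quick way to convince oneself of this equivalence is to trace which root each sift call targets. The small standalone program below (written for illustration, not part of the library) prints the root indices in the order the recursive version visits them, and they come out in increasing, i.e. sequential, order:

#include <cstddef>
#include <iostream>

// Trace the recursion on indices only: print the position of the root
// that each sift call would operate on.
void make_poplar_trace(std::size_t first, std::size_t last)
{
    if (last - first < 2) return;               // base case: already a poplar
    auto half = (last - first) / 2;
    make_poplar_trace(first, first + half);     // left sub-poplar
    make_poplar_trace(first + half, last - 1);  // right sub-poplar
    std::cout << "sift on root at index " << (last - 1) << '\n';
}

int main()
{
    // For a poplar of size 2^4 - 1 = 15, this prints the root indices
    // 2, 5, 6, 9, 12, 13, 14: strictly increasing, i.e. sequential.
    make_poplar_trace(0, 15);
}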

This presentation makes it much easier to calculate the running time: we can invoke the Master theorem. More precisely, if T(n) is the running time on an array of length n, then this recursive presentation tells us that T(n) ≤ 2T(n/2) + log(n). Since that recurrence falls into Case 1 of the Master theorem, it immediately follows that T(n) is O(n).
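
For completeness, the Case 1 claim can also be checked by hand by unrolling the recurrence; a sketch in LaTeX, with all logarithms base 2 and constant factors ignored:

\begin{aligned}
T(n) &\le \sum_{i=0}^{\log n} 2^i \log\frac{n}{2^i}
      && \text{level $i$ has $2^i$ subproblems of size $n/2^i$} \\
     &= \sum_{j=0}^{\log n} \frac{n}{2^j}\, j
      && \text{substituting } j = \log n - i \\
     &\le n \sum_{j \ge 0} \frac{j}{2^j} = 2n = O(n).
\end{aligned}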

Morwenn

Thanks, this does look like a good start towards proving the O(n) running time of make_heap, at least when there is a single poplar in the collection :)

Intuitively I'd think that the running time is O(n) no matter the number of poplars left once make_heap has ended: by definition there can be at most O(log n) such poplars, so there are at most log n top-level calls to make_poplar, each one on a smaller section of the original collection than the previous one. Every new poplar has a smaller depth than the previous poplar (except when the last two poplars have the same size), which means that sift has less work to do for every new top-level poplar.
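
To make that intuition concrete, here is a minimal sketch of such a top-down make_heap, assuming the recursive make_poplar from above and a greedy partition into poplars of non-increasing size (the helper names and the exact partition strategy are assumptions for illustration, not necessarily what the repository does):

// Partition [first, last) into poplars of non-increasing size
// (the last two may be equal) and build each one with make_poplar.
template<typename Iterator>
void make_heap(Iterator first, Iterator last)
{
    auto size = last - first;  // number of elements left to organize
    while (size > 0) {
        // Greedily pick the largest size of the form 2^k - 1 that fits
        decltype(size) poplar_size = 1;
        while (2 * poplar_size + 1 <= size) {
            poplar_size = 2 * poplar_size + 1;
        }
        make_poplar(first, first + poplar_size);  // build one poplar
        first += poplar_size;
        size -= poplar_size;
    }
}

Since the greedy choice always takes the largest poplar that fits, each new poplar is at most as large as the previous one and the sizes strictly shrink after at most two equal ones, so the loop runs O(log n) times.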

_--__

The argument goes through for each separate poplar: building a poplar is O(k), where k is the size of the part of the array covered by that poplar, so if your array is broken up into r poplars as n = k1 + k2 + ... + kr, then the running time will be O(k1) + O(k2) + ... + O(kr) = O(n). Basically, a single poplar is, in fact, the "worst case".

At some point I will likely have to integrate that into the README.
