Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hierarchical heaps code terminology update #87

Open
shwestrick opened this issue Dec 13, 2018 · 4 comments
Open

Hierarchical heaps code terminology update #87

shwestrick opened this issue Dec 13, 2018 · 4 comments

Comments

@shwestrick
Copy link
Collaborator

shwestrick commented Dec 13, 2018

The terminology in the hierarchical heaps code needs some updates. Here are some terms that I think should change, and what I would change them to. I will add more as I encounter them.


hierarchical heap ⟹ superheap
I think we better understand now that what we once called a "hierarchical heap" is actually a piece of the hierarchy with a very specific structure which deserves its own abstraction. Also, I find the term "hierarchical heap" confusing because it doesn't seem to line up with our terminology in papers and discussions:

  • We typically think of each node of the hierarchy as being a heap. In this interpretation, the hierarchy defines a parent/child relationship between heaps.
  • However, the term "hierarchical heap" suggests an interpretation where each heap of the hierarchy is a subtree of nodes, with the hierarchy providing a containment relationship between heaps. This is not the interpretation we usually use.

UPDATE: These objects could be called "pseudo-heaps"; see discussion.


level ⟹ level/depth
Confusingly, sometimes "level" refers to one layer of a superheap; other times, it refers to a depth in the hierarchy. These should be distinguished. I think "level" is a good term for a layer of the superheap, but we should call "depth" for what it is.

highest ⟹ shallowest
lowest ⟹ deepest
Terms such as "high" and "low" are confusing for depths, since a low depth is high in the tree, and a high depth is low in the tree. "Shallow" and "deep" are unambiguous.


locallyCollectibleSize (LCS) ⟹ ?
This currently refers to the size (in bytes) of the heaps which are local to a particular worker. I think this name means "the size of heaps which could be collected locally right now" but I find the name confusing. What is a better name?

locallyCollectibleHeapSize (LCHS) ⟹ ?
This currently refers to a value relative to the locallyCollectibleSize (LCS) which is used to trigger local collections. Confusingly, it is not a size of a heap. Collections are currently triggered when the ratio LCHS/LCS goes below a set ratio (set by default to 2.0). I would prefer for this quantity to actually be a threshold in bytes. For example, call it the "localCollectionThreshold (LCT)", and then the condition for triggering a local collection would simply be LCS ≥ LCT.

UPDATE: These terms no longer exist in the implementation.


min chunk size ⟹ block size
We currently have a command-line argument for the minimum chunk size, which is actually the block size used throughout the runtime. Blocks seem to be fundamental enough to how the runtime works to simply expose a "block size" command-line argument.

@shwestrick
Copy link
Collaborator Author

This is somewhat in progress in my fork.

@shwestrick
Copy link
Collaborator Author

Following the merge of unstable-cd changes into master with 810aaf9, the situation has changed a little bit:

  1. The name for heaps probably still should be changed, but "superheap" no longer seems to be the right term.
  2. The level/depth confusion has been mostly cleared up
  3. LCS and LCHS no longer exist. Replaced with new GC policy based upon bytesAllocatedSinceLastCollection and bytesSurvivedLastCollection in GC_thread
  4. min-chunk-size ⟹ block-size still needs to be done.

@shwestrick
Copy link
Collaborator Author

(0ec7cce) min-chunk-size ⟹ block-size is done.

@shwestrick
Copy link
Collaborator Author

@typerSniper and I have settled on the term pseudo-heap for the struct HM_HierarchicalHeap objects.

I like this name, because pseudo-heaps are not heaps themselves, but rather artifacts of the heap implementation strategy:

  • A single "heap" (from the perspective of the theoretical heap hierarchy) can actually be made up of multiple pseudo-heaps, distributed across processors.
  • A pseudo-heap might not contain objects itself, but instead might defer to some other "representative" pseudo-heap. This indirection makes it possible to do a heap merge quickly, in only constant time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant