Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexing Histories #27

Open
willdumm opened this issue Oct 12, 2022 · 1 comment
Open

Indexing Histories #27

willdumm opened this issue Oct 12, 2022 · 1 comment

Comments

@willdumm
Copy link
Contributor

willdumm commented Oct 12, 2022

In order to do efficient sampling without replacement, or trivial uniform sampling, or iterate through histories in a small DAG, it would be useful to have integer indexing. For large hDAGs these integers may need to be very large, but we're already using arbitrary precision integers from boost for tree counting.

One way to implement a system of integer indices, with integers in [0, N) where N is the number of histories in the DAG, is the following recursive routine, where the number of subtrees below each node can be computed ahead-of-time with SubtreeWeight<TreeCount>.ComputeWeightBelow()

Tree is a collection of edges, which will describe the tree corresponding to the index.

GetSubtree(node, subindex):

  • For child clade c of node:
    • let cladentrees be the sum of the number of subtrees below each child node of c.
    • let cladeindex = subindex mod cladentrees
    • redefine subindex = subindex / cladentrees
    • For edge descending from c:
      • let nsubtrees be the number of subtrees below the child node of edge.
      • if cladeindex >= nsubtrees:
        • redefine cladeindex = cladeindex - nsubtrees
      • else:
        • add edge to Tree
        • do GetSubtree(edge.child, cladeindex)
        • break from inner loop

Given an integer index, the tree corresponding to that index can be computed by setting Tree empty and doing GetSubtree(UA_node, index).

By default, this allows a wraparound index, so every integer maps to a tree. By manually restricting to [0, N) we end up with bijective indexing.

There's a python implementation here:
https://github.com/matsengrp/historydag/blob/88db496bb6420adf85fce78a861e28ab74031694/historydag/dag.py#L202

Credit to @clarisw for figuring this out, and for the python implementation.

@willdumm willdumm added the enhancement New feature or request label Oct 12, 2022
@willdumm
Copy link
Contributor Author

Some leading ideas for inspiration are described on this issue, but not the actual algorithm:
matsengrp/historydag#3

@willdumm willdumm added proposed feature and removed enhancement New feature or request labels Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant