To sample efficiently without replacement, to make uniform sampling trivial, or to iterate through the histories in a small DAG, it would be useful to index histories by integers. For large hDAGs these integers may need to be very large, but we already use arbitrary-precision integers from Boost for tree counting.
One way to implement a system of integer indices, with integers in [0, N) where N is the number of histories in the DAG, is the following recursive routine. The number of subtrees below each node can be computed ahead of time with `SubtreeWeight<TreeCount>.ComputeWeightBelow()`.
`Tree` is a collection of edges, which will describe the tree corresponding to the index.
`GetSubtree(node, subindex)`:
- For each child clade `c` of `node`:
  - let `cladentrees` be the sum of the number of subtrees below each child node of `c`.
  - let `cladeindex = subindex mod cladentrees`
  - redefine `subindex = subindex / cladentrees` (integer division)
  - For each `edge` descending from `c`:
    - let `nsubtrees` be the number of subtrees below the child node of `edge`.
    - if `cladeindex >= nsubtrees`:
      - redefine `cladeindex = cladeindex - nsubtrees`
    - else:
      - add `edge` to `Tree`
      - do `GetSubtree(edge.child, cladeindex)`
      - break from the inner loop
Given an integer index, the tree corresponding to that index can be computed by setting `Tree` to empty and calling `GetSubtree(UA_node, index)`.
By default, this allows a wraparound index, so every integer maps to a tree. Manually restricting the index to [0, N) gives a bijection between indices and histories.
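For illustration, here is a minimal Python sketch of the routine above on a toy DAG. It is not the linked historydag implementation: the dictionary layout and the names `dag`, `count_below`, `get_subtree`, and `history` are assumptions made for this example, and `count_below` only stands in for the precomputed subtree counts.

```python
from functools import lru_cache
from math import prod

# Hypothetical toy DAG: each node maps to its child clades, and each clade
# lists the alternative child nodes (one edge per child). Leaves have no clades.
dag = {
    "UA": [["r1", "r2"]],
    "r1": [["A"], ["bc1", "bc2"]],
    "r2": [["ab"], ["C"]],
    "bc1": [["B"], ["C"]],
    "bc2": [["B"], ["C"]],
    "ab": [["A"], ["B"]],
    "A": [], "B": [], "C": [],
}

@lru_cache(maxsize=None)
def count_below(node):
    """Number of subtrees below node: product over clades of the sum over children."""
    return prod(sum(count_below(child) for child in clade) for clade in dag[node])

def get_subtree(node, subindex, tree):
    """Append to tree the edges of the subtree below node selected by subindex."""
    for clade in dag[node]:
        clade_ntrees = sum(count_below(child) for child in clade)
        clade_index = subindex % clade_ntrees
        subindex //= clade_ntrees  # integer division
        for child in clade:
            nsubtrees = count_below(child)
            if clade_index >= nsubtrees:
                clade_index -= nsubtrees
            else:
                tree.append((node, child))
                get_subtree(child, clade_index, tree)
                break

def history(index):
    """Return the list of edges of the history with the given index."""
    tree = []
    get_subtree("UA", index, tree)
    return tree

n_histories = count_below("UA")
for i in range(n_histories):
    print(i, history(i))
```

In this toy DAG the indices in `range(n_histories)` each select a different history, and an index of `n_histories` wraps around to the same history as index 0. Python integers are arbitrary precision, so the same sketch would work unchanged for very large counts.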
There's a Python implementation here:
https://github.com/matsengrp/historydag/blob/88db496bb6420adf85fce78a861e28ab74031694/historydag/dag.py#L202
Credit to @clarisw for figuring this out, and for the python implementation.