New test added to reveal a bug involving encoding splits on pruned trees.

If the first taxon is not in the tree when the splits are encoded,
the mask in NormalizedBitmaskDict will be wrong. Because the mask does
not have the 1-bit (the lowest bit) set, it will be complemented. ANDing
that mask with any split's bitmask will then produce a key of 0 in the
NormalizedBitmaskDict. Thus there will be only one key in the dict, and
bitmask-dependent operations (such as clade summarization and tree-to-tree
distances) will be incorrect.
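
The failure mode described above can be illustrated with a small standalone sketch. This is a simplified model, not DendroPy's code: the `normalize` and `split_keys` helpers and the bit assignment (a=1, b=2, c=4, d=8, e=16) are assumptions made for illustration, chosen so that a bitmask lacking the lowest bit is complemented, as the message describes.

```python
ALL_TAXA = 0b11111  # assumed bit assignment: a=1, b=2, c=4, d=8, e=16


def normalize(bits, mask):
    """Hypothetical normalization: complement bitmasks whose lowest bit is unset."""
    if not bits & 1:
        bits = ~bits
    return bits & mask


def split_keys(leaf_bits, splits):
    """Derive a mask from the tree's leaf set, then AND it with every split."""
    mask = normalize(leaf_bits, ALL_TAXA)
    return mask, set(s & mask for s in splits)


# Tree (a,b,c,(d,e)): the first taxon is present, so the mask stays intact.
full_mask, full_keys = split_keys(
    0b11111, [0b00001, 0b00010, 0b00100, 0b01000, 0b10000, 0b11000])

# Tree (b,c,(d,e)) after pruning 'a': the leaf-set bitmask lacks the 1-bit,
# so it is complemented and no longer overlaps any split in the tree.
pruned_mask, pruned_keys = split_keys(
    0b11110, [0b00010, 0b00100, 0b01000, 0b10000, 0b11000])

print(bin(full_mask), len(full_keys))
print(bin(pruned_mask), pruned_keys)
```

In this model every split of the pruned tree maps to the same key, which is why a dictionary keyed on the masked values ends up with a single entry.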
mtholder committed Jul 3, 2013
1 parent c6e6a8c commit 7d4b3a5
Showing 1 changed file with 19 additions and 0 deletions.
19 changes: 19 additions & 0 deletions dendropy/test/test_splits_on_incomplete_leaf_sets.py
@@ -57,6 +57,25 @@ def testUnrooted(self):
    def testRooted(self):
        self.check("Rooted", "incomplete_leaves_rooted")

    def testPrunedThenEncoding(self):
        from cStringIO import StringIO
        FIX = False
        inp = StringIO('''(a,b,c,(d,e));
(b,d,(c,e));''')
        first, second = dendropy.TreeList.get_from_stream(inp, schema='newick')
        # prune tree 1 to have the same leaf set as tree 2.
        #   this removes the first taxon in the taxon list "A"
        retain_list = set([node.taxon for node in second.leaf_nodes()])
        exclude_list = [node for node in first.leaf_nodes() if node.taxon not in retain_list]
        for nd in exclude_list:
            first.prune_subtree(nd)
        # the trees are now (b,c,(d,e)) and (b,d,(c,e)) so the symmetric diff is 2
        if FIX:
            dendropy.treesplit.encode_splits(first, lowest_relevant_bit=2)
            dendropy.treesplit.encode_splits(second, lowest_relevant_bit=2)
        self.assertEquals(2, first.symmetric_difference(second))


if __name__ == "__main__":
    unittest.main()
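
The expected value of 2 in the new test can be checked by hand. The sketch below uses plain Python sets and no DendroPy calls; the `canonical` helper is a hypothetical name introduced here, and the trees are treated as unrooted, so each split is stored as the side that excludes the smallest leaf label.

```python
def canonical(side, leaves):
    """Return the side of a bipartition that excludes the smallest leaf label."""
    side = frozenset(side)
    if min(leaves) in side:
        return frozenset(leaves) - side
    return side


leaves = {"b", "c", "d", "e"}

# Nontrivial splits only; the trivial (single-leaf) splits are shared by
# both trees and cancel out of the symmetric difference.
first_splits = {canonical({"d", "e"}, leaves)}   # (b,c,(d,e))
second_splits = {canonical({"c", "e"}, leaves)}  # (b,d,(c,e))

# Symmetric difference: splits found in exactly one of the two trees.
distance = len(first_splits ^ second_splits)
print(distance)
```

Each tree contributes one split that the other lacks, giving the distance of 2 that the assertion expects.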
