-
Notifications
You must be signed in to change notification settings - Fork 79
Add a Tree.generate_star method #934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Leaving as a draft until #930 is merged, so I can uncomment a test for |
|
📖 Docs for this PR can be previewed here |
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks @hyanwong. These will be really handy, I think. The main feedback is we shouldn't have special cases, and try to make simple rules about the number of nodes and edges returned hold for all n >= 0.
I agree it's nice to have all these have a common prefix. I'm not so sure about "create_" though; perhaps "make_"?
I'm not sure how useful the empty one is to be honest, it's already a one-liner, and I don't think the equivalances you're stating will hold once the special cases at 1 and 0 have been removed.
python/tests/test_topology.py
Outdated
| """ | ||
|
|
||
| def test_star_equivalent(self): | ||
| for extra_params in [{}]: # , {'span':2.5}]: # Add after #930 is merged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's only one extra param, so loop seems unnecessary and makes the code less readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just uncommented the bit with "span" in, so it will test with with the default, and with a span now.
python/tskit/trees.py
Outdated
| flags=np.full(n, NODE_IS_SAMPLE, dtype=np.uint32), | ||
| time=np.zeros(n), | ||
| ) | ||
| if n > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should make this exception. We should always return a tree with num_leaves + 1 nodes and num_leaves edges. (which holds true for all num_leaves >= 0, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to match unrank, @jeromekelleher . For num_leaves=1 that produces a TS with a single node and no edges. I think this is the right behaviour. We don't really want a unary node as the root, do we - it would be removed by simplify() if we were to run that on the TS anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Maybe we should just raise a ValueError if n < 2 for now, and avoid this discussion. It's not particularly interesting what the semantics are in these degenerate cases and we can always fix them later if it does become important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, could do, I suppose.
python/tskit/trees.py
Outdated
| @staticmethod | ||
| def create_star(n, *, span=1, branch_length=1, record_provenance=True): | ||
| """ | ||
| Create a single :class:<Tree> and associated tree sequence where the ``n`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete "and associated tree sequence" - see the note below.
python/tskit/trees.py
Outdated
| ) | ||
| if n & (n - 1) != 0: | ||
| raise ValueError("n must be an integer power of 2") | ||
| if n > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't like the special case. We shouldn't need it here, right?
python/tskit/trees.py
Outdated
| raise ValueError("n must be an integer power of 2") | ||
| if n > 1: | ||
| merge_nodes = list(range(n)) | ||
| while len(merge_nodes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(merge_nodes) > 0
Or
I think it's useful so that we can test e.g. |
Generating an empty ts is pretty trivial: |
That's not an obvious calling convention to people unfamiliar with tables. E.g. I might hope to get an empty ("null") TS by simply calling
It never struck me as lacking before. But I do think we have somewhat overlooked at the behaviour of various tskit functions for "null" tree sequences & trees, so this was a first step in doing something about that. |
The semantics around object initialisation are extremely tricky because of the interaction with the C API and Python C API. I'd rather avoid this can of worms for now. |
7078b48 to
abe3920
Compare
|
So it's not obvious to me how to create semi-balanced trees that match |
901139a to
b1914e0
Compare
jeromekelleher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks @hyanwong. Just need to fix up the changelog and we're good to merge.
b1914e0 to
ecaa210
Compare
Description
Add
Tree.create_star,Tree.create_balanced, andTree.create_emptymethods, as discussed on Slack.The
Tree.create_emptymethod is perhaps unlikely to be widely used, but I think is useful for completeness, and will also come in handy for testing (for example, I only just noticed that we currently bomb out with a non-obvious message when trying to plot an empty TS).Happy to make these class methods if that is preferred.