Tree distance between two samples #2627
How about this? I don't think we want a tree sequence version, it's too complicated to define.
Gah, I didn't even know about that. Thanks, and sorry for the noise.
I think maybe a note in the existing
Oh, hang on a second. Should there be an option to return the actual length, rather than the edge count?
That's why it was called "path_length" (something we wanted for the balance metrics): it is the number of steps you take along the path from one node to the other. The actual length is very easy to calculate using the tmrca of the two nodes. I guess you could call this
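To make the distinction between the two quantities concrete, here is a minimal pure-Python sketch on a toy tree. The `parent` and `time` dictionaries and the function names `edge_count` and `branch_distance` are hypothetical, invented for illustration; this is not tskit code, just one way to spell out "number of steps" versus "length via the tmrca".

```python
# Toy tree stored as a child -> parent map plus node times.
# (Hypothetical data for illustration; these are not tskit objects.)
# Leaves 0..3 sit at time 0; node 6 is the root.
parent = {0: 4, 1: 4, 2: 5, 3: 6, 4: 5, 5: 6}
time = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 2.5, 6: 4.0}


def ancestors(u):
    """Return the nodes on the path from u up to the root, u included."""
    path = [u]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def mrca(u, v):
    """Most recent common ancestor of u and v."""
    seen = set(ancestors(u))
    for node in ancestors(v):
        if node in seen:
            return node
    raise ValueError("nodes do not share an ancestor")


def edge_count(u, v):
    """Number of edges (steps) on the path between u and v."""
    a = mrca(u, v)
    return ancestors(u).index(a) + ancestors(v).index(a)


def branch_distance(u, v):
    """Sum of branch lengths on the path, computed from the MRCA time."""
    t = time[mrca(u, v)]
    return (t - time[u]) + (t - time[v])


print(edge_count(0, 2), branch_distance(0, 2))  # 3 5.0
```

Here samples 0 and 2 are three edges apart, but the branch length between them is 5.0, i.e. twice the time of their common ancestor because both samples are at time 0.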
Yes, path_distance or "branch_distance", I guess. I'll look up what it's called in the phylogenetic literature.
I think the fact that I misunderstood the method, and that you misunderstood what I meant, is a good argument for implementing something like
SGTM. Could even have something like
When writing the tskit introduction for phylogeneticists, I was reminded of something that has caught people out before: it is relatively common to want to know the length of the path in a tree connecting two samples (or indeed multiple pairs of samples). This can be done using `ts.diversity(mode="branch", windows="trees")` or `ts.divergence(mode="branch", windows="trees")`, but that's not terribly obvious to a phylogeneticist. For this rather simple measure, is it perhaps worth having a function that is essentially an alias to one of these methods?

I guess something like `ts.path_length(sample_a, sample_B)` or (if we want a `Tree` version) `tree.path_length(A, B)`. The docs could then say this is a simple alias for the `divergence` method with specific settings, which would give people an easier way in to finding out about the power of the `mode=branch` stats methods.

Perhaps we don't want to add to an already weighty API, though. Thoughts, e.g. @benjeffery? Better suggestions?
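For illustration, the kind of alias I have in mind could be as thin as the sketch below. The name `branch_path_length` is hypothetical and not part of tskit; the only real call is `divergence`, and I am assuming `ts` is a `tskit.TreeSequence` and that passing two single-sample sets gives the pairwise value for each tree.

```python
def branch_path_length(ts, a, b):
    """Hypothetical helper: branch-length distance between samples a and b.

    Assumes `ts` is a tskit.TreeSequence (or any object with a compatible
    `divergence` method). Each sample is wrapped in its own one-element
    sample set, and windows="trees" asks for one value per tree.
    """
    return ts.divergence(
        sample_sets=[[a], [b]],
        mode="branch",
        windows="trees",
    )
```

With a real tree sequence, `branch_path_length(ts, 0, 1)` should give one number per tree. For samples at time 0 I would expect each value to be twice the TMRCA in that tree, but that is worth checking before it goes into any docs.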