I don't understand those Dendroscope network distances #13

Open
lutteropp opened this issue Nov 29, 2020 · 13 comments
Comments

@lutteropp
Owner

lutteropp commented Nov 29, 2020

Why do trees with (unrooted) RF-distance of zero have non-zero Dendroscope distances? Does it have to do with the rooting somehow?

See the screenshot from a results.csv file, obtained (in different settings) on a simulated 4-taxon tree:
[Image: Screenshot from 2020-11-29 22-36-35]

I also attached the entire CSV file
small_tree_results.csv.txt

In general, I still need to figure out how to interpret those topological distances computed by Dendroscope. Like... which number is good? What's the theoretical maximum distance? Can we convert them into relative values somehow?

@stamatak
Collaborator

stamatak commented Nov 30, 2020 via email

@celinescornavacca
Collaborator

why do trees with (unrooted) RF-distance of zero have non-zero Dendroscope distances?

Because Dendroscope works with rooted trees, everything is defined on rooted trees (rooted RF distances are defined on clusters instead of bipartitions).
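To illustrate the answer above, here is a minimal sketch (not Dendroscope's implementation) comparing two rootings of the same unrooted 4-taxon topology. The nested-tuple tree encoding and the helper names `leaves`, `clusters`, and `splits` are assumptions made for this example only. The two trees share all nontrivial bipartitions but differ in their nontrivial clusters.

```python
def leaves(tree):
    """Return the leaf set of a tree given as nested tuples of taxon names."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Nontrivial clusters: leaf sets below internal nodes, excluding
    single leaves and the full taxon set at the root."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        if 1 < len(below) < len(taxa):
            found.add(below)
        return below

    visit(tree)
    return found


def splits(tree):
    """Nontrivial bipartitions of the unrooted version of the tree."""
    taxa = leaves(tree)
    result = set()
    for cluster in clusters(tree):
        other = taxa - cluster
        if len(other) >= 2:  # skip splits that cut off a single taxon
            result.add(frozenset([cluster, other]))
    return result


# Two rootings of the same unrooted topology AB|CD.
balanced = (("A", "B"), ("C", "D"))
caterpillar = ((("A", "B"), "C"), "D")

same_unrooted = splits(balanced) == splits(caterpillar)
differing_clusters = clusters(balanced) ^ clusters(caterpillar)

print(same_unrooted)            # True: unrooted RF distance is zero
print(len(differing_clusters))  # 2: {C,D} and {A,B,C} are not shared
```

The clusters `{C, D}` and `{A, B, C}` each appear in only one of the two trees, so any cluster-based distance is non-zero even though the bipartition sets are identical.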

Like... which number is good?

The distances describe different things. I attach a chapter from the book I wrote with Daniel and Regula that describes them.
The hardwired distance is less interesting for us, and the softwired one is the easier to interpret, but please read the chapter and share your opinion too.

comparing.pdf

What's the theoretical maximum distance?

I do not think a result exists for general networks; maybe there is something for some restricted topological classes of networks. But it will be of no use here.

Can we convert them into relative values somehow?

No, I do not think so, see above.

@lutteropp
Owner Author

lutteropp commented Feb 10, 2021

Thanks! Is it the book "Phylogenetic Networks: Concepts, Algorithms and Applications"?

I only got to read the PDF now, and I don't know what a hardwired vs. a softwired cluster is. Or a cluster... Hoping to find the definitions in another chapter of the book.

@lutteropp
Owner Author

I found the book online, trying to speed-read relevant-looking parts of it.

@lutteropp
Owner Author

lutteropp commented Feb 15, 2021

[Copy from Slack message, to have this here as well]

I have just figured out that we can easily plot relative versions (in the range [0.0, 1.0]) of all topological network distances. Looking at the definitions in @celines network book, they are all of the form:
(|symmetric difference between A and B|) divided by 2.
-> We just need to change them to (|symmetric difference between A and B|) divided by (|A union B|), and we get relative distances. These will make nicer plots.
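The proposed normalization can be sketched as follows. This is a minimal illustration, assuming A and B are already available as Python sets of clusters (each cluster a `frozenset` of taxon names); the function names `absolute_distance` and `relative_distance` are hypothetical and not part of Dendroscope or this repository.

```python
def absolute_distance(a, b):
    """Form described above: |symmetric difference of A and B| / 2."""
    return len(a ^ b) / 2


def relative_distance(a, b):
    """Proposed relative form: |symmetric difference| / |A union B|.
    Returns 0.0 when both sets are empty (an assumption for this sketch)."""
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


# Nontrivial clusters of two rooted 4-taxon trees.
a = {frozenset("AB"), frozenset("CD")}
b = {frozenset("AB"), frozenset("ABC")}

abs_dist = absolute_distance(a, b)
rel_dist = relative_distance(a, b)

print(abs_dist)  # 2 differing clusters, divided by 2
print(rel_dist)  # 2 differing clusters out of 3 in the union
```

Since the symmetric difference is always a subset of the union, the relative value cannot exceed 1.0.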

@lutteropp
Owner Author

@celinescornavacca Does this approach make sense? I am assuming that we can have two networks (on the same set of taxa) which have zero clusters in common.

@lutteropp
Owner Author

Of course, the trivial clusters will always be in common. Which means we will never get a distance score of 1.0 by applying this trick. Is this a problem?

@lutteropp
Owner Author

I believe it is not a problem, because with relative RF-distance it is kinda the same issue...

@lutteropp
Owner Author

Do we need to exclude the trivial bipartitions/clusters when computing the network distances? I tried finding the definition for relative RF distance to check how it is done there, but I only found the definitions for absolute RF distance online... :-/

@stamatak
Collaborator

stamatak commented Feb 16, 2021 via email

@lutteropp
Owner Author

Thanks @stamatak! So we need to diverge from the distance definitions in Celine's network book: we will explicitly discard the trivial bipartitions/clusters/whatever in our own distance implementations.
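A small sketch of what discarding the trivial clusters changes, under the assumption that "trivial" means the single-taxon clusters plus the full taxon set. The helper names `strip_trivial` and `relative_distance` are hypothetical. With the trivial clusters kept, two trees with no nontrivial cluster in common still score below 1.0; with them removed, the score reaches 1.0.

```python
def relative_distance(a, b):
    """|symmetric difference| / |union|, or 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


def strip_trivial(clusters, taxa):
    """Drop single-taxon clusters and the cluster containing all taxa
    (assumed here to be the trivial ones)."""
    taxa = frozenset(taxa)
    return {c for c in clusters if 1 < len(c) < len(taxa)}


taxa = "ABCD"
trivial = {frozenset(t) for t in taxa} | {frozenset(taxa)}

# Two rooted trees on A, B, C, D with no nontrivial cluster in common.
a_all = trivial | {frozenset("AB"), frozenset("CD")}
b_all = trivial | {frozenset("AC"), frozenset("BD")}

with_trivial = relative_distance(a_all, b_all)
without_trivial = relative_distance(
    strip_trivial(a_all, taxa), strip_trivial(b_all, taxa)
)

print(with_trivial)     # 4 differing clusters out of 9 in the union
print(without_trivial)  # 4 differing clusters out of 4 in the union
```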

@stamatak
Collaborator

stamatak commented Feb 16, 2021 via email

@celinescornavacca
Collaborator

celinescornavacca commented Feb 16, 2021 via email
