Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

whales tree #45

Open
drabosky opened this issue Mar 28, 2017 · 16 comments
Open

whales tree #45

drabosky opened this issue Mar 28, 2017 · 16 comments

Comments

@drabosky
Copy link
Contributor

Whales tree now fails ultrametric test in ape. Pretty sure this didn't happen previously. It's definitely a tiny numerical issue and easily fixed, but we should fix tree by adding trivial branch length to tips to ensure it passes ape ultrametric check.

@josephwb
Copy link
Contributor

I wonder if something is up with ape. I am looking at the C++ code and am confused.

Anyway, whales passes our phyx ultrametricity test, as well as that of phylobase:

require(phylobase);
phy <- as(whales, "phylo4");
isUltrametric(phy);
[1] TRUE

@drabosky
Copy link
Contributor Author

It seems like the auto tol settings are not doing what they used to. There's a numerical differential in root-to-tip distances of 4e-6, but now causes tree to fail checks. This is an issue because other packages (like diversitree) use is.ultrametric and now reject this (and other) datasets.

Tree first fails with:
is.ultrametric(whales, tol = 1e-8)
all other manual tol > 1e-8 passes the check.

@drabosky
Copy link
Contributor Author

default tol = .Machine$double.eps^0.5 = 1.490116e-08
anything different about this?

@josephwb
Copy link
Contributor

Here is the function from apev3.5 (note same tol):

is.ultrametric.OLD <- function(phy, tol = .Machine$double.eps^0.5)
{
    if (!inherits(phy, "phylo"))
        stop('object "phy" is not of class "phylo"')
    if (is.null(phy$edge.length))
        stop("the tree has no branch lengths")

    phy <- reorder(phy)
    n <- length(phy$tip.label)
    e1 <- phy$edge[, 1]
    e2 <- phy$edge[, 2]
    EL <- phy$edge.length

    ## xx: vecteur donnant la distance d'un noeud
    ##     ou d'un tip a partir de la racine
    xx <- numeric(n + phy$Nnode)

    ## the following must start at the root and follow the
    ## edges contiguously; so the tree must be either in cladewise
    ## order (or in pruningwise but the for loop must start from
    ## the bottom of the edge matrix)

    for (i in seq_len(length(e1)))
        xx[e2[i]] <- xx[e1[i]] + EL[i]

    isTRUE(all.equal.numeric(var(xx[1:n]), 0, tolerance = tol))
}

and whales passes:

is.ultrametric.OLD(whales);
[1] TRUE

@josephwb
Copy link
Contributor

It seems the old function is available in apev4:

ape:::.is.ultrametric_ape(whales, tol=.Machine$double.eps^0.5, option=2, n=length(whales$tip.label));
[1] TRUE

The important arg is option. It factors into this switch:

    crit <- switch(option, {
        mn <- min(xx.tip)
        mx <- max(xx.tip)
        (mx - mn)/mx
    }, var(xx.tip))

Setting option to 2 checks whether the variance of the root-to-tip paths are < tol. If option is set to 1 it fails:

ape:::.is.ultrametric_ape(whales, tol=.Machine$double.eps^0.5, option=1, n=length(whales$tip.label));
[1] FALSE

So, for the time being you could wrap ape:::.is.ultrametric_ape to get the old behaviour while you figure things out.

@jonchang
Copy link
Contributor

Right, but I think the issue is that other people will be reading in the whales tree and be surprised when their (non-bammtools) packages report that the tree isn't ultrametric.

I wrote a small routine to try to get the tip lengths to cooperate but it's incredibly slow and I'm not convinced that it's actually going to work.

library(ape)

phy <- read.tree("whaletree.tre")

compute_tip_lens <- function(phy) {
    e1 <- phy$edge[, 1]
    e2 <- phy$edge[, 2]
    EL <- phy$edge.length
    xx <- numeric(n + phy$Nnode)
    for (i in seq_len(length(e1))) xx[e2[i]] <- xx[e1[i]] + EL[i]
    xx.tip <- xx[1:n]
}

compute_ultrametric_stat <- function(xx.tip) {
    mn <- min(xx.tip)
    mx <- max(xx.tip)
    (mx - mn)/mx
}

is_ultrametric <- function(phy, tol = .Machine$double.eps^0.5) {
    xx.tip <- compute_tip_lens(phy)
    isTRUE(all.equal.numeric(compute_ultrametric_stat(xx.tip), 0, tolerance = tol))
}

ctr <- 0
while (!is_ultrametric(phy)) {
    tiplen <- compute_tip_lens(phy)
    idx <- which.min(tiplen)
    phy$edge.length[idx] <- phy$edge.length[idx] - .Machine$double.eps
    ctr <- ctr + 1
    if (ctr %% 1000 == 0) {
        cat(ctr, compute_ultrametric_stat(tiplen), fill = T)
    }
}

@drabosky
Copy link
Contributor Author

Yeah, the incompatibilities in other packages will be a problem. I ran into this issue with final analyses in our FiSSE paper (non bammtools packages) and had to include a function check_and_fix_ultrametric to pre-process trees before using in diversitree etc:
https://github.com/macroevolution/fisse/blob/master/run_fisse/traitDependent_functions.R#L3-L29

@drabosky
Copy link
Contributor Author

I'm going to send a link to this thread to Emmanuel

@emmanuelparadis
Copy link

Hi guys,

Yes is.ultrametric() has changed with ape 4. This is explained in the help page (and also why the default has changed). Maybe you want to use is.ultrametric(phy, option = 2)?

@dwbapst
Copy link

dwbapst commented Mar 28, 2017

Ah, the new ((max - min/max)) default is invariant to the linear scale of the branch lengths. Well, that makes sense, I guess.

...So, why are supposedly ultrametric trees failing under the new default?

@drabosky
Copy link
Contributor Author

the default check seems to be overly sensitive relative to trees produced by BEAST, TreePL, etc at least w a few trees I've tried which I can imagine could cause problems for some users who will just use the defaults. Looks like Rich FitzJohn just made some changes to diversitree on Github on account of this but I haven't looked deeper to see if the package is now ape 4.0 compatible.

@josephwb
Copy link
Contributor

Doh, looked like none of use thought to RTFM! :bowtie:

@dwbapst
Copy link

dwbapst commented Mar 29, 2017

FYI, this may be relevant, if you haven't seen it already: Liam has just posted a new function for phytools that will force a tree to be ultrametric, so to pass is.ultrametric. Apparently a key issue is trees read from files with low numerical precision.

http://blog.phytools.org/2017/03/forceultrametric-method-for-ultrametric.html

@emmanuelparadis
Copy link

Thanks Dave! I've just posted a comment on Liam's blog.

@FabianRoger
Copy link

FabianRoger commented Mar 30, 2017

Hej,

I just wanted to let you know that the change seems to have broken my code, too. I for instance calculate phylogenetic diversity with the entropart package which contains a function Preprocess.Tree that in turn internal calls ape::as.hclust.phylo which calls is.ultrametric which now fails. I checked and the tree is ultrametric if I set the tol to tol=.Machine$double.eps^0.31 but not for lower tolerances. It works also with option = 2, but again it's a function that calls a function that calls is.ultrametric so i would have to redefine the function to make it work. I saw the suggestion from Liam, but the nnls option takes very long to calculate for my tree (16S tree with 8187 tips).

Update: for what it's worth, @drabosky function works in a reasonable time but it's not an ideal solution...

@richfitz
Copy link

the default check seems to be overly sensitive relative to trees produced by BEAST, TreePL, etc at least w a few trees I've tried which I can imagine could cause problems for some users who will just use the defaults. Looks like Rich FitzJohn just made some changes to diversitree on Github on account of this but I haven't looked deeper to see if the package is now ape 4.0 compatible.

Diversitree is in "appease CRAN only" maintenance mode and this my changes were pretty much to stop the function running on some tests (that aren't run on CRAN anyway). Thanks for everyone for finding the actual issue 😀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants