(Some) large trees fail in ape 5.5 #14

Closed
erikrikarddaniel opened this issue May 26, 2021 · 5 comments

@erikrikarddaniel

After updating from ape 5.4 to 5.5, the read.tree function fails on large trees (e.g. the attached one) with the following message:

numbers of left and right parentheses in Newick string not equal

I have checked the tree, and parentheses are balanced.

Some time ago, I reduced a tree that failed and at some point read.tree managed to read it, which is why I think it has to do with size.

Downgrading ape to 5.4 solves the issue.

gtdbtk.ar122.classify.tree.gz

@emmanuelparadis
Owner

Yes, your tree is correct, but ape has difficulty handling it because some quoted labels contain a semicolon, which read.tree interprets as the end of the tree. Handling quoted labels is not trivial: the code in ape 5.4 had problems with some other trees (see issue #1). There is a fix in ape 5.5-1 but it also fails with your tree.
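
To illustrate the failure mode described above, here is a minimal base-R sketch. It is not ape's code: `count_structural` is a hypothetical helper and the tree string is made up for the example. It shows that splitting on every semicolon cuts a tree inside a quoted label and leaves a piece with unbalanced parentheses, whereas a scan that skips single-quoted labels counts the structural characters correctly.

```r
# Hypothetical helper, NOT part of ape: count the structural characters of a
# Newick string while ignoring everything inside single-quoted labels.
count_structural <- function(newick) {
  chars <- strsplit(newick, "")[[1]]
  in_quote <- FALSE
  counts <- c("(" = 0L, ")" = 0L, ";" = 0L)
  for (ch in chars) {
    if (ch == "'") {
      in_quote <- !in_quote            # every single quote toggles the state
    } else if (!in_quote && ch %in% names(counts)) {
      counts[ch] <- counts[ch] + 1L    # count only outside quoted labels
    }
  }
  counts
}

# Made-up tree whose second tip label contains ";" and "(".
x <- "(a:1,'b;(c':1);"

# Naive approach: treat every ";" as the end of a tree.
pieces <- strsplit(x, ";", fixed = TRUE)[[1]]
pieces                 # two fragments, each with unbalanced parentheses

# Quote-aware scan: one "(", one ")", one terminating ";".
count_structural(x)
```

The toggle relies on the assumption that a quoted label never contains a single quote itself.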

I'm working on a solution that will work for all trees.

@erikrikarddaniel
Author

Thank you! I have personal experience with funny characters in names, so you have my full understanding! :-) I'll continue with 5.4 for the time being.

/D

@emmanuelparadis
Owner

I found a solution that should please everybody. I pushed it here, so you can install version 5.5-1. If you prefer to test it without re-installing the whole package, you can do these three commands from R:

source("https://raw.githubusercontent.com/emmanuelparadis/ape/master/R/read.tree.R")
.treeBuild <- ape:::.treeBuild
.cladoBuild <- ape:::.cladoBuild

The last two lines are because these two functions are not exported by ape.

I'll check the code again later.

@erikrikarddaniel
Author

OK. I tried to install 5.5-1 with devtools::install_version() but it wasn't found. I suppose you're still working on it and haven't released it.

After executing your code, with 5.4 loaded, read.tree still works.

In any case, 5.4 works fine for me, so I'll wait until there's an official 5.5-1 released.

Thanks for looking at this so quickly!

/D

@emmanuelparadis
Owner

emmanuelparadis commented May 28, 2021

I made another small modification which should fix the last problems. Here's an example to illustrate the behaviour of read.tree; the file tr.tre is:

(a  b:1,'; : (
)':1)'[:;(")]':1;

This is read as:

> tr <- read.tree("tr.tre")
> tr

Phylogenetic tree with 2 tips and 1 internal nodes.

Tip labels:
  ab, '; : ()'
Node labels:
  '[:;(")]'

Rooted; includes branch lengths.

> tr$tip.label
[1] "ab"       "'; : ()'"

> tr$node.label
[1] "'[:;(\")]'"

To summarize:

  • all line breaks are ignored;
  • spaces (and TABs) in non-quoted labels are ignored;
  • labels within (single) quotes can have all characters except single quotes.

For the moment, double quotes (or other fancy quotes) are not treated as label delimiters.
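
The three rules summarized above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the implementation in read.tree: `normalize_newick` is a hypothetical name, and the input is the content of tr.tre typed in as two lines.

```r
# Hypothetical helper, NOT ape's implementation: apply the three rules to the
# lines of a Newick file.
normalize_newick <- function(lines) {
  # Rule 1: joining the lines without a separator drops all line breaks.
  chars <- strsplit(paste(lines, collapse = ""), "")[[1]]
  in_quote <- FALSE
  keep <- logical(length(chars))
  for (i in seq_along(chars)) {
    if (chars[i] == "'") in_quote <- !in_quote
    # Rule 3: inside single quotes every character is kept.
    # Rule 2: outside quotes, spaces and TABs are dropped.
    keep[i] <- in_quote || !(chars[i] %in% c(" ", "\t"))
  }
  paste(chars[keep], collapse = "")
}

# The two lines of tr.tre from the example above.
tr_lines <- c("(a  b:1,'; : (", ")':1)'[:;(\")]':1;")
normalize_newick(tr_lines)
```

On this input the sketch returns the one-line string (ab:1,'; : ()':1)'[:;(")]':1; which agrees with the tip labels ab and '; : ()' shown in the output above.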
