Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IQtree uses an overly stringent constraint tree parsers #226

Open
corneliusroemer opened this issue Jun 10, 2024 · 2 comments
Open

IQtree uses an overly stringent constraint tree parsers #226

corneliusroemer opened this issue Jun 10, 2024 · 2 comments

Comments

@corneliusroemer
Copy link

corneliusroemer commented Jun 10, 2024

For my SARS-CoV-2 Nextclade trees I make extensive use of constraint trees (see e.g. previous issues #102, #144, . The following are some thoughts on making constraint tree validation throw a more helpful error message.

I was struggling to debug the error ERROR: ERROR: Taxon in constraint tree does not appear in full tree:

iqtree2 -ntmax 4 -s builds/BA.2.86/masked_masked_without_recombinants-delim.fasta -m GTR -ninit 2 -n 2 -me 0.05 -nt AUTO -redo -czb -g c.nwk -ninit 1 -n 1 > builds/BA.2.86/masked_masked_without_recombinants-delim.iqtree.log
ERROR: ERROR: Taxon  in constraint tree does not appear in full tree
ERROR: Bad constraint tree (see above)

As my constraint tree has around a thousand external and internal nodes the error message is hard to action. It seems that I trigger an edge case that isn't considered in the code, as there's a double space after Taxon in ERROR: ERROR: Taxon in constraint tree does not appear in full tree, it looks like in my case something is empty that shouldn't be empty.

I figured out a minimal reproducible example of a constraint tree that triggers this error:

(
  JN.1.18.5,
  (
    LS.1
  )something
);

When looking at it like this it's obvious: there's an unnecessary paren around LS.1, I should remove the ( ... )something. But as far as I know this is still valid newick.

In any case, interestingly, you throw a helpful error message when I get rid of the something.

Using this:

(
  JN.1.18.5,
  (
    LS.1
  )
);

throws ERROR: Redundant double-bracket ‘((…))’ with closing bracket ending at (line 5 column 1)

So there are two things I suggest:

a) throw a more helpful error message, similar to the redundancy one, in case of an internal node that is redundant and labelled (not redundant and unlabelled, this case you've got covered)
b) consider allowing redundant internal nodes. All newick definitions/grammars I've come across do not forbid redundancy. E.g. https://phylipweb.github.io/phylip/newick_doc.html or https://en.wikipedia.org/wiki/Newick_format

Note: I'm using IQ-TREE multicore version 2.3.4 COVID-edition for MacOS Intel 64-bit built May 23 2024

corneliusroemer added a commit to corneliusroemer/nwk-formatter that referenced this issue Jun 10, 2024
@corneliusroemer
Copy link
Author

I've added a rule to catch this issue early in my Newick formatting/linting utility: https://github.com/corneliusroemer/nwk-formatter (which helps me explain why certain constraint trees fail with IQtree)

@bqminh
Copy link
Collaborator

bqminh commented Jun 11, 2024

I see, thanks for pointing this out. Can you give us some simple input files to help debug this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants