Taxon name error #11

Closed
hansir8 opened this issue Nov 6, 2020 · 9 comments

Comments

@hansir8

hansir8 commented Nov 6, 2020

Dear Bui Quang Minh:
I am getting an error from IQ-TREE 2 (IQ-TREE multicore version 2.1.2 COVID-edition for Linux 64-bit built Oct 22 2020). Here is part of the log:
--------->
Rate parameters: A-C: 2.09932 A-G: 3.94693 A-T: 1.47380 C-G: 1.38623 C-T: 5.05624 G-T: 1
Base frequencies: A: 0.280 C: 0.189 G: 0.208 T: 0.323
Site proportion and rates: (0.085,0.021) (0.048,0.194) (0.028,0.194) (0.042,0.199) (0.076,0.4
Parameters optimization took 99 rounds (1005.335 sec)
Computing ML distances based on estimated model parameters...
Computing ML distances took 0.072589 sec (of wall-clock time) 1.404860 sec(of CPU time)
Computing RapidNJ tree took 0.012983 sec (of wall-clock time) 0.263654 sec (of CPU time)
ERROR: Alignment sequence AY278489|SARS-CoV_GD01|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence AY390556|SARS-CoV_GZ02|Guangzhou|Betacoronavirus does not appear in
ERROR: Alignment sequence AY485277|SARS-CoV_Sino1-11|Betacoronavirus does not appear in the tr
ERROR: Alignment sequence AY508724|SARS-CoV_NS-1|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence KT444582|WIV16|Yunnan|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence KY417146|Rs4231|Yunnan|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence KY417151|Rs7327|Yunnan|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence KY417152|Rs9401|Yunnan|Betacoronavirus does not appear in the tree
ERROR: Alignment sequence MK211376|BtRs-BetaCoV/YN2018B|Yunnan|Betacoronavirus does not appear
(KJ473815|BtRs-BetaCoV/GX2013|Guangxi|Betacoronavirus,JX993988|Yunnan2011|Betacoronavirus,((((
ERROR: Tree taxa and alignment sequence do not match (see above)
----------->

The error concerns taxon names. However, the same dataset runs correctly in IQ-TREE version 1.6.12. I look forward to your reply. Thanks.
Sincerely
Zhenzhi Han

@bqminh
Member

bqminh commented Dec 3, 2020 via email

@hansir8
Author

hansir8 commented Dec 4, 2020

Dear Bui Quang Minh:
Unfortunately, I get the same error with IQ-TREE 2 (multicore version 2.1.0 for Linux 64-bit built Jul 18 2020).

Thanks.
Sincerely
Zhenzhi Han

@bqminh
Member

bqminh commented Dec 4, 2020 via email

@hansir8
Author

hansir8 commented Dec 5, 2020

I tried the older version you pointed me to and got a different error with IQ-TREE 2 (IQ-TREE multicore version 2.0.8 for Linux 64-bit built Jul 9 2020).
------->
Creating fast initial parsimony tree by random order stepwise addition...
ERROR: phylotreepars.cpp:1053: virtual int PhyloTree::computeParsimonyTree(const char *, Alignment *, int *): Assertion `leafNum == 3' failed.
ERROR: STACK TRACE FOR DEBUGGING:
ERROR:
ERROR: *** IQ-TREE CRASHES WITH SIGNAL ABORTED
ERROR: *** For bug report please send to developers:
ERROR: *** Log file: 3.fas.log
ERROR: *** Alignment files (if possible)
-------->
Versions before 2.0.7 are available only as source code. Could you send me compiled binaries of those versions, so that I can check whether the error also occurs in older versions of IQ-TREE 2?
Thanks.
Sincerely
Zhenzhi Han

@bqminh
Member

bqminh commented Dec 5, 2020 via email

@hansir8
Author

hansir8 commented Dec 10, 2020

Dear Bui Quang Minh:
I have tested five versions, 2.0.4 through 2.0.8, and found that the error first appears in version 2.0.8. Earlier versions run normally on this dataset. Please check the error messages from the newer versions described above.

Thanks.
Sincerely
Zhenzhi Han

@iqtree
Collaborator

iqtree commented Dec 10, 2020 via email

@andersgs

I have run into the same issue. Specifying --keep-ident seems to fix things. My guess is that the comparison between alignment and the tree is being done with the processed alignment.
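
To illustrate the kind of mismatch guessed at above, here is a minimal Python sketch. It is not IQ-TREE code: the function names, the deduplication rule, and the toy taxa are all hypothetical. It only shows that a name check fails when the tree and the alignment being compared come from different preprocessing stages (one with identical sequences dropped, one without), and passes when both sides come from the same stage. Which side is which inside IQ-TREE is not established in this thread.

```python
def names_missing_from_tree(tree_taxa, alignment_names):
    """Return the alignment names that do not appear in the tree, sorted."""
    return sorted(set(alignment_names) - set(tree_taxa))


def drop_identical(sequences):
    """Keep the first name for each distinct sequence (hypothetical dedup step)."""
    first_name_for = {}
    for name, seq in sequences.items():
        first_name_for.setdefault(seq, name)
    return list(first_name_for.values())


# Toy data: names and sequences are made up for illustration only.
sequences = {
    "taxonA": "ACGT",
    "taxonB": "ACGT",  # identical to taxonA
    "taxonC": "ACGA",
    "taxonD": "TCGA",
}

full_names = list(sequences)
processed_names = drop_identical(sequences)

# Assume the tree was built from the deduplicated (processed) alignment.
tree_taxa = processed_names

# Comparing against the other stage reports a missing taxon ...
missing_vs_full = names_missing_from_tree(tree_taxa, full_names)
# ... while comparing against the same stage reports none.
missing_vs_processed = names_missing_from_tree(tree_taxa, processed_names)

print(missing_vs_full)
print(missing_vs_processed)
```

Under this assumption, keeping identical sequences (as --keep-ident is reported above to do) would make both stages contain the same names, which is consistent with the workaround helping.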

tamaramagdr pushed a commit that referenced this issue Oct 7, 2021
1. computeMLDistances no longer writes a distance file (it was
   usually written *again* in computeBioNJ; see change #2).
2. runTreeConstruction can no longer assume that the distance
   file has been written by computeMLDistances, so (if
   iqtree->computeBioNJ has not been called) it must write it,
   even if params.user_file was false, via a call to
   iqtree->printDistanceFile.
3. PhyloTree now has a num_packets member (which tracks, how
   many packets to divide work into: it can be the same as
   num_threads, but is generally more; at present by a factor
   of 2).  Member functions such as getBufferPartialLhSize
   must allocate per packet rather than per thread.  See in
   particular changes #9, #10 and #11.
4. Removed a little commented-out code from PhyloTree.cpp
   (And moved for-loop iteration variables that could've
    been in-loop, but weren't in-loop, in lots of places).
   (Likewise in phylotreesse.cpp).
5. Removed redundant assignments to nullptr (particularly in
   PhyloTree::deleteAllPartialLh); these aren't needed now
   because aligned_free sets the pointer to nullptr for you.
6. Client code that set IQTree::num_threads directly now does
   so via setNumThreads (e.g. in phylotesting.cpp)
   (Also in PhyloTree::optimizePatternRates) (because
   setNumThreads also sets num_packets).  For now,
   num_packets is set to 2*num_threads (see change #9).
7. Removed dead pointer adjustments in the "any size" case in
   PhyloTree::computePartialParsimonyFastSIMD.  These had been
   left over from before that member function was vectorised
   (The pointers are recalculated at the start of the next
   iteration of the loop, so adjusting them is a waste of time).
   (Hopefully the compiler was optimizing the adjustments away).
8. Fully unrolled the size 4 case in productVecMat
   (In phylokernelnew.h).
9. computeBounds chooses sizes for blocks of work
   (Based on the number of packets of work as well as
   the number of threads to be allocated).  For now, it is
   assumed that the number of packets of work is divisible
   by the number of threads.
10. PhyloTree::computeTraversalInfo calculates buffer sizes
    required in terms of num_packets rather than num_threads.
11. #pragma omp parallel for ... and corresponding for loops
    are now for packets of work not threads.
    (a) PhyloTree::computeTraversalInfo
    (b) PhyloTree::computeLikelihoodDervGenericSIMD (*)
        (Two separate #pragma omp parallel for blocks)
    (c) PhyloTree::computeLikelihoodBranchGenericSIMD (*)
    (d) PhyloTree::computeLikelihoodFromBufferGenericSIMD (*)
    (e) PhyloTree::computeLikelihoodDervMixlenGenericSIMD (*)
    (f) PhyloTree::computeNonrevLikelihoodDervGenericSIMD (*)
        (Two separate #pragma omp parallel for blocks)
    (g) PhyloTree::computeNonrevLikelihoodBranchGenericSIMD (*)
        (Two separate #pragma omp parallel for blocks)

    The ones marked with (*) now use reductions (aimed at
    double) where possible, rather than #omp critical section.
    I've got rid of the private(pin,i,c) stuff by declaring
    those variables local to the loops that use them.
    (This means doing horizontal_add per-packet rather than
    after all the packets are processed).
    They all use dynamic (rather than static) scheduling.
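
The packet idea in changes #3, #6 and #9 of the commit message above can be sketched in Python. This is only an illustration under stated assumptions, not the IQ-TREE implementation: `packet_bounds` and `packets_per_thread` are hypothetical names, the factor of 2 is taken from the commit message, and the sketch does not attempt to model any alignment or vector-size constraints the real computeBounds may have. It splits a range of work items into num_packets contiguous blocks and then combines one partial sum per packet, which is the shape of a per-packet reduction.

```python
def packet_bounds(n_items, num_threads, packets_per_thread=2):
    """Split [0, n_items) into num_threads * packets_per_thread contiguous blocks.

    Returns a list of num_packets + 1 boundaries; packet p covers
    bounds[p] .. bounds[p + 1] (end exclusive).
    """
    num_packets = num_threads * packets_per_thread
    base, extra = divmod(n_items, num_packets)
    bounds = [0]
    for p in range(num_packets):
        # The first `extra` packets take one additional item each.
        bounds.append(bounds[-1] + base + (1 if p < extra else 0))
    return bounds


values = list(range(10))             # stand-in for per-pattern contributions
bounds = packet_bounds(len(values), num_threads=2)

# One partial result per packet, combined at the end (a reduction).
partial_sums = [sum(values[lo:hi]) for lo, hi in zip(bounds, bounds[1:])]
total = sum(partial_sums)

print(bounds)
print(partial_sums, total)
```

Having more packets than threads gives a dynamic scheduler smaller units to hand out, so a thread that finishes early can pick up another packet instead of idling.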
tamaramagdr pushed a commit that referenced this issue Oct 7, 2021
was necessary (see #2 through #8 and particularly #5 below), and also
drafted some additional "progress-reporting" (see #9 through #11):
1. If -mlnj-only is found on the command-line, Params::compute_ml_tree_only
   will be set to true (in parseArg(), in utils/tools.cpp).
2. initializeParams doesn't call computeInitialTree if compute_ml_tree_only
   is set to true.
3. You can't set the root of a tree (if you don't yet have one), a bit later
   in the same function (and also in IQTree::initSettings).
4. Added PhyloTree::ensureNumberOfThreadsIsSet (and updated repetitive code
   that was doing what it does, in several other places).  This forced some
   updates in other files, such as main/phylotesting.cpp.
5. Added PhyloTree::ensureModelParametersAreSet (as the same steps need to
   be carried out somewhat later if there isn't an initial tree before
   ML distances are calculated). It returns a tree string.
6. In runTreeConstruction, when compute_ml_tree_only is set, negative
   branches are resolved, and #4 and #5 are called only AFTER the tree
   has been constructed.
7. In IQTree::initCandidateTreeSet the tree mightn't be a parsimony tree
   (I think if you've combined -nt AUTO and --mlnj-only) as such, but there
   will be *a* tree.  The list of cases wasn't exhaustive any more.
8. Added a distanceFileWritten member variable and a getDistanceFileWritten
   member function to PhyloTree.
9. (This and the following changes are progress reporting changes).
   Added member functions for progress reporting to PhyloTree:
   (a) initProgress  (pushes where you are on a stack, and starts
                      reporting progress, if there's now one level
                      of progress reporting on the stack)
   (b) trackProgress (bumps up progress if progress stack depth is: 1)
   (c) hideProgress  (called before you write log messages to cut)
   (d) showProgress  (called again after)
   (e) doneProgress  (pops, and stops reporting progress, if the last
                      level of progress reporting was just popped)
   The supporting member variables are progressStackDepth and progress.
9. IQTree::optimizeNNI uses the functions added in change #9 to report
   progress. (The problem here is that MAXSTEPS is a rather "high" guess:
   for n sequences it is ~2n, when the best guess for how many iterations
   there will be, with parallel NNIs, is on the order of ~p, where p is the
   worst-case "tip-to-tip" path length of the tree, probably a lot less.)
10.PhyloTree::testNumThreads also uses the functions added in change #9 to
   report how many threads it has tried (though, for now, it badly
   over-reports how long it thinks it will take, because it thinks it will
   do max_procs iterations and each will take as long as the last, but,
   really, it'll do max_procs/2, or so, and they go faster and faster as
   there are more threads in use in later steps: one more each step).
11.PhyloTree::optimizeAllBranches reports progress (via the functions
   added in change#9).  Normally it reports progress during parameter
   optimisation (because I haven't written "higher-level" progress
   reporting for that yet).

There are some potential issues though:
1. The special-case code for dealing with "+I+G" rates doesn't yet have
   a counterpart when compute_ml_tree_only is set (in runTreeConstruction).
2. Likewise, the code for when (params.lmap_num_quartets >= 0)
   (No counterpart when compute_ml_tree_only is set, yet) (this too
    is in runTreeConstruction).
   (I haven't figured out how to test the "counterpart" versions of those
    yet, which is why I haven't written them)
3. If you pass -nt AUTO I'm not sure how many threads the NJ (or whatever)
   step will use (I think it's all of them), and the ML distance
   calculations also "use all the threads" (because the thread count's not
   set when that code runs either).  Both parallelise... well... but I'm
   not so sure it's a good idea that it hogs all the CPU cores like that.
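
The nested progress reporting described in change #9 of the commit message above (initProgress, trackProgress, hideProgress, showProgress, doneProgress) can be sketched as a small Python class. This is a hypothetical stand-in, not the PhyloTree code: the class name, the `total`, `done`, `reporting` and `hidden` fields, and the error check are assumptions. Only the rule that progress is counted at stack depth 1, and that reporting starts on the first push and stops on the last pop, is taken from the commit message.

```python
class ProgressStack:
    """Hypothetical stand-in for the progress members described above."""

    def __init__(self):
        self.depth = 0          # plays the role of progressStackDepth
        self.done = 0           # work units counted at the outermost level
        self.total = 0          # expected work units at the outermost level
        self.reporting = False  # True while at least one level is active
        self.hidden = False     # True while log messages are being written

    def init_progress(self, total):
        self.depth += 1
        if self.depth == 1:     # only the outermost level starts reporting
            self.total, self.done = total, 0
            self.reporting = True

    def track_progress(self, amount=1):
        if self.depth == 1:     # nested levels do not bump the counter
            self.done += amount

    def hide_progress(self):
        self.hidden = True

    def show_progress(self):
        self.hidden = False

    def done_progress(self):
        if self.depth == 0:
            raise RuntimeError("done_progress called without init_progress")
        self.depth -= 1
        if self.depth == 0:     # last level popped: stop reporting
            self.reporting = False


progress = ProgressStack()
progress.init_progress(total=3)    # outer task starts reporting
progress.track_progress()          # counted: depth is 1
progress.init_progress(total=100)  # nested task: depth becomes 2
progress.track_progress()          # ignored: depth is 2
progress.done_progress()           # back to depth 1
progress.track_progress()          # counted again
print(progress.done, progress.total)
progress.done_progress()
print(progress.reporting)
```

Counting only at depth 1 means a caller that already reports progress is not disturbed by a callee that would otherwise start its own progress display.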
@thomaskf
Collaborator

Dear Zhenzhi Han,
The "ERROR: Alignment sequence xxx does not appear in the tree" issue has been fixed in version 2.2.0.2 (https://github.com/iqtree/iqtree2/releases/tag/v2.2.0.2). Thank you!
Thomas
