Warning: taxomony id doesn't exists for NZ_AJTB01000092.1! #19
Comments
Can you show the relevant parts of your input files? Does the taxon exist in the taxonomy tree, too?
I have downloaded the latest taxonomy and split it into names and dump using Warning: taxomony id doesn't exists for NZ_AJTB01000101.1! and then this too..
This record has been removed from the NCBI nucleotide database (http://www.ncbi.nlm.nih.gov/nuccore/NZ_AJTB01000092.1). Usually we detect these cases by missing entries in the taxonomy dump, which I think is the case here. Note that the assembly_summary and taxonomy are not always in sync.
That is the issue: I am not using assembly_summary as my backbone. I am trying to build the index with all available sequences (plasmids, contigs, scaffolds), in all around 42080 species for bacteria and 5654 for viral.
Any solution for this?
Can you show us the line for NZ_AJTB01000101.1 in the seqid2taxa.map file and the lines around it? Is the corresponding tax id (1527292) in the nodes.dmp and names.dmp?
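A check like this can be scripted. The following is only a minimal sketch, assuming the map file has whitespace-separated `seqid taxid` columns and that nodes.dmp follows the usual NCBI dump layout with the tax ID as the first `|`-delimited field. The function names `load_taxids` and `missing_mappings` are hypothetical helpers, not part of Centrifuge, and the demo data is a toy example.

```python
import io


def load_taxids(nodes_handle):
    """Collect tax IDs from the first '|'-delimited column of a nodes.dmp-style file."""
    taxids = set()
    for line in nodes_handle:
        if line.strip():
            taxids.add(line.split("|", 1)[0].strip())
    return taxids


def missing_mappings(map_handle, taxids):
    """Return (seqid, taxid) pairs whose taxid does not appear in `taxids`.

    Assumes one whitespace-separated 'seqid taxid' pair per line (an assumption
    about the map layout); lines with fewer than two fields are skipped.
    """
    missing = []
    for line in map_handle:
        fields = line.split()
        if len(fields) < 2:
            continue
        seqid, taxid = fields[0], fields[1]
        if taxid not in taxids:
            missing.append((seqid, taxid))
    return missing


if __name__ == "__main__":
    # Toy data for illustration only; real files would be opened with open(...).
    nodes = io.StringIO("1\t|\t1\t|\tno rank\t|\n2\t|\t131567\t|\tsuperkingdom\t|\n")
    seq_map = io.StringIO("seqA\t2\nseqB\t999999\n")
    for seqid, taxid in missing_mappings(seq_map, load_taxids(nodes)):
        print(f"{seqid}: tax id {taxid} not found in nodes.dmp")
```

Any sequence ID reported here would be expected to trigger the warning, since its tax ID has no node in the taxonomy dump.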
Since I did not follow your online manual, I wrote my own script to build the seqid2taxa.map (it takes every accession ID from the FASTA headers and looks up the tax ID at NCBI). And yes, @fbreitwieser was right: the record has been removed from the database and hence is not in nodes.dmp. So the next question is why it is still in the FASTA file on the RefSeq website, and how do I handle this issue to build the Centrifuge index?
The thing is that RefSeq and the taxonomy database are not always in the same state. In Centrifuge, sequences with no mapping get added to the database with taxonomy ID 0, though maybe we should just skip them. But the database should build without problems, even if a mapping is missing.
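If you would rather skip such sequences yourself before building, one option is to filter the FASTA input. This is a rough sketch, not something Centrifuge does for you: `filter_fasta` is a hypothetical helper, and it assumes the sequence ID is the first whitespace-delimited token after `>` in each header.

```python
def filter_fasta(lines, drop_ids):
    """Yield FASTA lines, omitting every record whose sequence ID is in `drop_ids`.

    The ID is taken as the first token after '>' (an assumption about the
    header layout). Lines before the first header are passed through.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            seqid = header[0] if header else ""
            keep = seqid not in drop_ids
        if keep:
            yield line


if __name__ == "__main__":
    # Toy records for illustration; the sequence data is made up.
    fasta = [">seqA some description\n", "ACGT\n", ">seqB\n", "GGCC\n"]
    for out_line in filter_fasta(fasta, {"seqB"}):
        print(out_line, end="")
```

The set of IDs to drop could be the sequence IDs whose tax ID is missing from nodes.dmp; the matching lines should then be removed from the map file as well so both inputs stay consistent.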
Has anyone come across this error so far? Both my input_sequence file and my seqid2taxa.map file contain this ID, but centrifuge-build is still spitting this error out.