When creating the mapping file, try to cut the |
Dear Diamond Community!
I am writing to ask for help with taxonomic labeling of query sequences against a custom (UniRef-based) database.
The custom FASTA header that I would like to use looks like this:
>UniRef100_Q6GZX3:t10493:Viruses Uncharacterized protein 002L n=3 Tax=Frog virus 3 TaxID=10493 RepID=002L_FRG3G
Alongside it, there is a taxon map file that looks like the example below:
_accession.version{tab}taxid
UniRef100_Q6GZX3:t10493:Viruses{tab}10493
Using diamond version 2.1.4, the output of database creation is rather unusual:
Accessions in database 1813
Entries in accession to taxid file 1813
Yet 0 accessions or sequences are mapped to a taxid...
Database creation command:
/SSD/software/diamond/diamond-2.1.4/build/diamond makedb --in 10k_example.fasta --taxonmap 10k_example.taxmap --taxonnodes nodes.dmp --taxonnames names.dmp --db 10k_example
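One thing that can be checked independently of diamond is whether the IDs in the FASTA file and the first column of the taxon map actually agree. Below is a minimal sketch of such a check. It assumes the sequence ID is the first whitespace-delimited token of each FASTA header, and that the map is tab-separated with one header line, as in the examples above. The function names are hypothetical and this is not part of diamond; it only reports mismatches and does not say how diamond itself parses the files.

```python
import csv


def fasta_ids(path):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    yield fields[0]


def taxmap_entries(path):
    """Return (header_fields, {accession: taxid}) from a tab-separated map.

    Assumption: the first line is a header and the first two columns
    hold the accession and the taxid, as in the example map above.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        mapping = {row[0]: row[1] for row in reader if len(row) >= 2}
    return header, mapping


def compare(fasta_path, taxmap_path):
    """Report IDs present in only one file and taxids that are not integers."""
    ids = set(fasta_ids(fasta_path))
    header, mapping = taxmap_entries(taxmap_path)
    mapped = set(mapping)
    return {
        "header": header,
        "fasta_only": sorted(ids - mapped),
        "map_only": sorted(mapped - ids),
        "non_numeric_taxids": sorted(
            acc for acc, taxid in mapping.items() if not taxid.strip().isdigit()
        ),
    }


if __name__ == "__main__":
    # File names taken from the makedb command above.
    report = compare("10k_example.fasta", "10k_example.taxmap")
    for key, value in report.items():
        print(key, value[:10] if isinstance(value, list) else value)
```

If `fasta_only` and `map_only` are both empty and the header line matches what was intended, the two files are at least consistent with each other, and the problem is more likely in how the map format is interpreted.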
Then, when I run a search against the database with the -f 102 output format, I only get 0 0 next to each query:
UniRef100_Q6GZX3:t10493:Viruses 0 0
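To see at a glance whether any query received a taxon at all, the output can be tallied. The sketch below assumes each output line is whitespace-separated with the query ID first and the assigned taxid second, and that a taxid of 0 means no taxon was assigned, which is how the line above reads. The function name is hypothetical.

```python
from collections import Counter


def summarize_taxon_output(lines):
    """Count queries with and without an assigned taxid.

    Assumption: column 1 is the query ID, column 2 is the taxid,
    and taxid 0 means no taxon was assigned.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts["unclassified" if fields[1] == "0" else "classified"] += 1
    return counts


if __name__ == "__main__":
    example = ["UniRef100_Q6GZX3:t10493:Viruses 0 0"]
    print(dict(summarize_taxon_output(example)))
```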
Realizing that the custom FASTA headers in the example are rather long and ugly, I repeated the entire procedure with a shortened version, where the headers were trimmed back to the accession number. Example:
>UniRef100_Q6GZX3
and the matching taxon mapping:
accession.version{tab}taxid
UniRef100_Q6GZX4{tab}10492
During database generation I again get 1813 sequences, with all 1813 present in the taxid file, and yet 0 accessions mapped to a taxid...
The search output is also exactly the same as above, with 0 0 printed next to every ID...
Can someone please help me understand how I should perform taxon-enabled database generation and taxon labeling with diamond?