Search equivalent names with name2taxid? #87

Closed
2 tasks done
alvanuffelen opened this issue Oct 12, 2023 · 2 comments

Comments

alvanuffelen commented Oct 12, 2023

Prerequisites

  • make sure you are using the latest version (check with taxonkit version)
  • read the usage

Describe your issue

Running echo "Saccharomyces boulardii" | taxonkit name2taxid returns no taxid.
In names.dmp, 'Saccharomyces boulardii' is listed with the name class 'equivalent name':
252598 | Saccharomyces boulardii | | equivalent name |

Is there a reason why 'name2taxid' only searches for scientific names or synonyms?

if !(items[6] == "scientific name" || items[6] == "synonym") {
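
To illustrate why the lookup misses this name, here is a minimal sketch of a name index built from names.dmp rows, assuming each row is split on tabs so that index 0 is the taxid, 2 the name, and 6 the name class (which is what items[6] in the quoted line suggests). The helpers parseNameLine and buildIndex are hypothetical names for this example, not taxonkit's actual functions, and the second sample row is only illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// parseNameLine splits one names.dmp row on tabs. With the "\t|\t" layout,
// index 0 holds the taxid, index 2 the name, and index 6 the name class.
// (Hypothetical helper for illustration only.)
func parseNameLine(line string) (taxid, name, class string, ok bool) {
	items := strings.Split(line, "\t")
	if len(items) < 7 {
		return "", "", "", false
	}
	return items[0], items[2], items[6], true
}

// buildIndex maps each name to its taxids. If allowed is nil, every name
// class is kept; otherwise only rows whose class is in allowed are indexed.
func buildIndex(lines []string, allowed map[string]bool) map[string][]string {
	index := make(map[string][]string)
	for _, line := range lines {
		taxid, name, class, ok := parseNameLine(line)
		if !ok {
			continue // skip malformed rows
		}
		if allowed != nil && !allowed[class] {
			continue // name class filtered out
		}
		index[name] = append(index[name], taxid)
	}
	return index
}

func main() {
	lines := []string{
		"252598\t|\tSaccharomyces boulardii\t|\t\t|\tequivalent name\t|",
		"4932\t|\tSaccharomyces cerevisiae\t|\t\t|\tscientific name\t|",
	}
	query := "Saccharomyces boulardii"

	// Same restriction as the quoted condition: scientific names and synonyms only.
	restricted := buildIndex(lines, map[string]bool{"scientific name": true, "synonym": true})
	fmt.Println("restricted:", restricted[query])

	// No restriction: every name class is searchable.
	unrestricted := buildIndex(lines, nil)
	fmt.Println("unrestricted:", unrestricted[query])
}
```

With the restriction in place the query has no entry in the index; without it the query maps to 252598.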

shenwei356 (Owner) commented

Oh, I thought scientific names were sufficient; then someone said synonyms were of equal importance.
Besides equivalent names, are other kinds of names widely used? Is there a need to support all of them?

$ cut -f 7 names.dmp  | csvtk freq -Ht -nr | csvtk pretty -Ht
scientific name       2533553
authority             694337 
synonym               252076 
type material         241458 
includes              78861  
equivalent name       58225  
genbank common name   30413  
common name           14663  
acronym               2118   
in-part               667    
blast name            230    
genbank acronym       25

shenwei356 (Owner) commented

I just removed the restriction on name types, so name2taxid now searches all of them.

$ memusg -t -s 'echo "Saccharomyces boulardii" | taxonkit name2taxid --verbose '
14:04:26.645 [INFO] parsing names file: /home/shenwei/.taxonkit/names.dmp
14:04:29.704 [INFO] 3895687 names parsed
Saccharomyces boulardii 252598

elapsed time: 3.187s
peak rss: 912.28 MB
