Exclude species with no rank #64

Closed
mweberr opened this issue Feb 12, 2024 · 1 comment

Comments

@mweberr

mweberr commented Feb 12, 2024

Hi,

Is it possible to distinguish real species from subspecies-level entries that have no rank?
For example:
TAXID 634452: Acetobacter pasteurianus IFO 3283-01 (no rank according to NCBI)
TAXID 438: Acetobacter pasteurianus

In the resulting taxon table both have the same species name, and I am trying to exclude the first one from the results.

Best, Michael

@sherrillmix
Owner

I'm not sure of the exact definition of a real species versus a subspecies, but I think getRawTaxonomy might get you at least part of the way there. For example:

> taxa=taxonomizr::getRawTaxonomy(c(438,634452),'accessionTaxa.sql')
> print(taxa)
$`   438`
                   species                      genus 
"Acetobacter pasteurianus"              "Acetobacter" 
                    family                      order 
        "Acetobacteraceae"         "Rhodospirillales" 
                     class                     phylum 
     "Alphaproteobacteria"           "Proteobacteria" 
              superkingdom                    no rank 
                "Bacteria"       "cellular organisms" 

$`634452`
                               no rank                                species 
"Acetobacter pasteurianus IFO 3283-01"             "Acetobacter pasteurianus" 
                                 genus                                 family 
                         "Acetobacter"                     "Acetobacteraceae" 
                                 order                                  class 
                    "Rhodospirillales"                  "Alphaproteobacteria" 
                                phylum                           superkingdom 
                      "Proteobacteria"                             "Bacteria" 
                             no rank.1 
                  "cellular organisms"
> isRealSpecies=sapply(taxa,function(xx)names(xx)[1]=='species')
> print(isRealSpecies)
   438 634452 
  TRUE  FALSE

This assumes that a taxon is a "real species" only when its lowest rank is species, and that every other taxon is not. That seems reasonable-ish at first glance, but it wouldn't surprise me if there is some funny business somewhere in the taxonomy, so be a bit careful with it.
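
A minimal sketch of how that check could be used to drop the strain-level entry from a table. To keep it runnable without the accessionTaxa.sql database, `taxa` here is a hand-built, shortened stand-in for the getRawTaxonomy output shown above, and `taxonTable` with its `taxid` and `species` columns is a hypothetical table, not something taxonomizr produces under that name. The list names in the printed output are padded with spaces, so they are trimmed before being matched against the IDs.

```r
# Hand-built stand-in for the getRawTaxonomy() output shown above
# (assumption: shortened to the first few ranks, lowest rank first).
taxa <- list(
  "   438" = c(
    species = "Acetobacter pasteurianus",
    genus = "Acetobacter"
  ),
  "634452" = c(
    "no rank" = "Acetobacter pasteurianus IFO 3283-01",
    species = "Acetobacter pasteurianus",
    genus = "Acetobacter"
  )
)

# A taxon counts as a "real species" only if its lowest rank is species.
isRealSpecies <- vapply(taxa, function(xx) names(xx)[1] == "species", logical(1))

# The list names are space-padded, so trim them before converting
# back to numeric taxonomic IDs.
speciesIds <- as.numeric(trimws(names(taxa)[isRealSpecies]))

# Hypothetical taxon table: both rows carry the same species name.
taxonTable <- data.frame(
  taxid = c(438, 634452),
  species = c("Acetobacter pasteurianus", "Acetobacter pasteurianus")
)

# Keep only the rows whose ID passed the species-level check.
filtered <- taxonTable[taxonTable$taxid %in% speciesIds, ]
print(filtered)
```

With these two IDs, only the row for 438 remains. The same caveat applies: any taxon whose lowest rank is something other than species (subspecies, strain, "no rank") is dropped, which may or may not match what you mean by a real species.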
