In January I ran into a problem with the taxize API because of the number of bacterial taxa (10k+) for which I want to retrieve taxonomy (I posted about it here: ropensci/taxize#907).
People advised me to use taxizedb, since it works offline and should fix my problem. However, when I try to run a simple command such as:
test = classification(name2taxid(c(taxa$specie_ID)))
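taxa is a data frame with a single column named specie_ID, as follows:
> head(taxa$specie_ID)
[1] "Staphylococcus sp." "Acinetobacter sp." "Cutibacterium sp." "Sphingomonas sp." "Paenarthrobacter sp."
[6] "Paracoccus sp."
However, I receive an error:
> test = classification(name2taxid(c(taxa$specie_ID)))
Error in name2taxid(c(taxa$specie_ID)) : Some of the input names are ambiguous, try setting out_type to 'summary'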
When I set out_type to "summary", I get this:
> test = classification(name2taxid(c(taxa$specie_ID), out_type="summary"))
Error in `dplyr::summarize()`:
ℹ In argument: `taxids = paste(.data$tax_id, collapse = "|")`.
ℹ In group 1: `name = "Morganella sp."`.
Caused by error in `.data$tax_id`:
! Column `tax_id` not found in `.data`.
Backtrace:
Apparently Morganella sp. is not recognized by taxizedb. I'm not particularly familiar with dplyr or with taxize, so I would just like to know how I could retrieve the taxonomy for each of my bacterial species, preferably as a table with columns like these:
Specie_ID Kingdom Phylum Class Order Family Genus
A very small change to your approach should solve your issue: Run classification() on the id column of the name2taxid() output, not the whole object (maybe this is what you wanted to do in the first place, so it's just a typo thing?):
test = classification(name2taxid(c("morganella", "escherichia"), out_type = "summary")$id)
However, taxa with multiple taxids will inflate the number of elements in your results, which can cause problems in your downstream analysis. Because of this, I would probably run name2taxid(out_type = "summary") first, resolve the taxa with multiple taxids (investigate them manually, choose one and remove the rest from the tibble) and then run classification() on the data set with distinct taxa. I imagine there shouldn't be many taxa with multiple taxids.
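The workflow above could be sketched roughly as follows. This is only an illustration under stated assumptions: `split_ambiguous()` and `ranks_wide()` are hypothetical helpers (not part of taxizedb), the input is assumed to have the `name` and `id` columns that `name2taxid(out_type = "summary")` returns, and the default rank labels (for example `superkingdom` rather than `kingdom`) are an assumption that may differ with the version of the local NCBI database. The example runs on mock data so it does not need the database.

```r
# Hypothetical helpers for illustration; they are not part of taxizedb.

# Split a name -> taxid table (columns `name` and `id`) into names that map
# to exactly one taxid and names that map to several.
split_ambiguous <- function(ids) {
  counts <- table(ids$name)
  ambiguous <- names(counts)[counts > 1]
  list(
    unique    = ids[!ids$name %in% ambiguous, , drop = FALSE],
    ambiguous = ids[ids$name %in% ambiguous, , drop = FALSE]
  )
}

# Turn a classification()-style list (one data frame per taxid with columns
# `name`, `rank`, `id`; NA for taxids without a match) into one row per taxid
# and one column per requested rank. The rank labels are an assumption.
ranks_wide <- function(cls,
                       ranks = c("superkingdom", "phylum", "class",
                                 "order", "family", "genus")) {
  rows <- lapply(cls, function(x) {
    vals <- if (is.data.frame(x)) {
      x$name[match(ranks, x$rank)]          # NA where a rank is absent
    } else {
      rep(NA_character_, length(ranks))     # no classification found
    }
    as.data.frame(as.list(setNames(vals, ranks)), stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  data.frame(taxid = names(cls), out, stringsAsFactors = FALSE)
}

# Mock data: the ids below are placeholders, not real NCBI taxids.
ids <- data.frame(
  name = c("Morganella", "Morganella", "Escherichia"),
  id   = c("A1", "A2", "B1"),
  stringsAsFactors = FALSE
)
parts <- split_ambiguous(ids)   # parts$ambiguous holds the rows to review

cls <- list(
  B1 = data.frame(
    name = c("Bacteria", "Pseudomonadota", "Escherichia"),
    rank = c("superkingdom", "phylum", "genus"),
    id   = c("x1", "x2", "B1"),
    stringsAsFactors = FALSE
  ),
  Z9 = NA                       # mimics a taxid with no classification
)
wide <- ranks_wide(cls)

# With taxizedb itself (not run here; needs the local NCBI database):
# ids   <- taxizedb::name2taxid(taxa$specie_ID, out_type = "summary")
# parts <- split_ambiguous(ids)
# cls   <- taxizedb::classification(parts$unique$id)
# wide  <- ranks_wide(cls)
```

The rows in `parts$ambiguous` are the ones to inspect by hand. Once a single taxid has been chosen for each of those names, the chosen rows can be appended to `parts$unique` before calling `classification()`.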