Allow names or only taxonomic ids as input to higher level methods? #58

sckott · 2020-07-10T20:38:32Z

An aspect that's different from R taxize is that I didn't want to bring the interactive part to this package. That is, the taxize get_* functions prompt the user to pick a taxon when a taxon name returns more than one result from a given data source. But that's not reproducible and requires an interactive session. The various higher level functions in R taxize, like classification(), accept taxonomic names as well as ids because they pass names to the get_* functions, which resolve each name to a single taxonomic id before fetching the classification. However, here we don't have the prompt, so I think higher level methods like Classification/Children should only allow taxonomic ids as input. Thoughts, @Daniel-Davies?

Daniel-Davies · 2020-07-11T18:36:54Z

In a previous project where I had this issue, I decided to use a "consensus" protocol on the results of the API. That is, from the list of results returned by the API, taking the most commonly occurring value is usually enough to satisfy the query. Take classification as an example: querying GNR with "panthera tigris" returns 11 separate results; for "species", all of them agree on "panthera tigris". For genus, perhaps 6 results have "panthera" while 1 has "puma", so we take "panthera". Repeating this for each key gives an approximation of the classification of the entered name across the multiple sources, and it turns out to be reasonably robust.
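To make the "consensus" idea concrete, here is a minimal sketch. It assumes each API result has already been reduced to a dict mapping rank to name; that shape, the function name `consensus_classification`, and the sample data are all hypothetical, not what GNR or this package actually returns.

```python
from collections import Counter


def consensus_classification(results):
    """Pick the most common name per rank across several results.

    `results` is assumed to be a list of dicts like
    {"genus": "Panthera", "species": "Panthera tigris"} (hypothetical shape).
    Names are compared case-insensitively; empty values are ignored.
    """
    counts_by_rank = {}
    for record in results:
        for rank, name in record.items():
            if not name:
                continue  # skip sources that have no value for this rank
            key = name.strip().lower()
            counts_by_rank.setdefault(rank, Counter())[key] += 1
    # most_common(1) returns the top (name, count) pair; on a tie,
    # the name seen first wins.
    return {
        rank: counts.most_common(1)[0][0]
        for rank, counts in counts_by_rank.items()
    }


# Made-up data mirroring the example above: six sources say "Panthera",
# one says "Puma", and all agree on the species.
sample = [{"genus": "Panthera", "species": "Panthera tigris"}] * 6
sample.append({"genus": "Puma", "species": "Panthera tigris"})
consensus = consensus_classification(sample)
```

Ties are broken by whichever value appears first, which is one arbitrary choice among several; a real implementation might prefer to report the tie instead.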

I think the ID approach is good, since it gives the user the option of determinism, and it definitely needs to be a part of the package. However, if someone is willing to accept the risks, could they also try a "most-common-value-wins" approach? I don't have much training in taxonomy, so I don't know if this is valid...

sckott · 2020-07-13T18:30:42Z

That's a good idea for selecting names. We do something along those lines in the R get_* functions: we look for an exact match and, if there is one, return that match. It could be more complicated than that, of course. So it sounds like, for the Ids class, we should avoid the interactive prompt and make a best-effort attempt at returning a single id.
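A rough sketch of that non-interactive, best-effort selection might look like the following. The candidate shape (dicts with `name` and `id` keys), the function name `pick_single_id`, and the ids in the sample data are assumptions for illustration only, not the actual Ids class API.

```python
def pick_single_id(name, candidates):
    """Return one id for `name` without prompting, or None if ambiguous.

    `candidates` is assumed to be a list of dicts with "name" and "id"
    keys (hypothetical shape). The id is returned only when exactly one
    candidate matches the queried name exactly (case-insensitive).
    """
    wanted = name.strip().lower()
    exact = [c for c in candidates if c["name"].strip().lower() == wanted]
    if len(exact) == 1:
        return exact[0]["id"]
    # Zero or several exact matches: do not guess, let the caller decide.
    return None


# Made-up candidates and ids for illustration.
candidates = [
    {"name": "Panthera tigris", "id": "101"},
    {"name": "Panthera tigris altaica", "id": "102"},
]
chosen = pick_single_id("panthera tigris", candidates)
missing = pick_single_id("Puma concolor", candidates)
```

Returning `None` rather than raising keeps the behavior reproducible without an interactive session; whether ambiguous names should instead raise an error is a design choice this sketch leaves open.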

For the higher level methods (e.g., classification), it sounds like we go with ONLY allowing IDs as inputs, correct? So users have to get IDs first, either using the Ids class or some other method.

sckott mentioned this issue Jul 10, 2020

Ids class design #59

Open
