Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

taxa.py and validations/check_scientificname_and_ids.py have overlapping functionality #27

Open
jdpye opened this issue Dec 6, 2022 · 5 comments

Comments

@jdpye
Copy link
Member

jdpye commented Dec 6, 2022

Looks like the OBIS / WoRMS checks are replicated across these two files.

taxa.py works against obis's API, but doesn't actually implement anything to do with ITIS, just has an empty placeholder function there.

I think taxa.py tries to name things and put things where a R obistools user might expect them, but that the check_scientificname_and_ids.py implementation is far more advanced and actually functions.

Suggestions on how to move forward? We could migrate the functionality from the check_scientificname_and_ids.py into the naming scheme from taxa.py , rename the check_scientificname_and_ids.py functions in-place, or decide not to be congruous with the R implementation, as we please.

@MathewBiddle
Copy link
Collaborator

This is tangentially related, but I wanted to reference it here in case you don't have an implementation for the taxa check. There is an example in there that uses multiple sources for the taxa check, which could be helpful?

ioos/ioos_code_lab#43

@jdpye
Copy link
Member Author

jdpye commented Dec 6, 2022

That's an interesting API... how's it guaranteed/updated/maintained? I like the idea of a one-stop shop but I don't know whether it'd be better to plug straight into WoRMS or use this as middleware? For cases beyond marine species the allure is obvious and the implementation's impressive... hmm.

@MathewBiddle
Copy link
Collaborator

There is a python package for worms: https://github.com/iobis/pyworms (we recently pushed it to conda)

@jdpye
Copy link
Member Author

jdpye commented Dec 8, 2022

The analogous function on the obistools side is match_taxa(names) where a list is passed in and an option to ask the user to resolve any ambiguous matches interactively is defaulted to TRUE.

Now, the functionality of the ambiguous matching doesn't look to be implemented in match_taxa() (see: https://github.com/iobis/obistools/blob/master/R/match_taxa.R ) so I think the first order of business is to wrap our chosen WoRMS checking function in a match_taxa(list) wrapper to emulate the actual implemented functionality of the R version.

Future versions can attempt interactive resolution of ambiguous matching as needed.

Still soliciting other opinions, I don't want to make a call unilaterally but my motivation is to keep the functionality between the Python and R versions as similar as we can.

@pauline-chauvet
Copy link

Hi, I agree that we should be as similar as possible to R. However if check_scientificname_and_ids.py does the work of R + more, I would tend to rename the check_scientificname_and_ids.py to correspond to the R one but make sure that the differences are well documented. I like the idea of a global resolver http://resolver.globalnames.org/api... we could improve the python function if we are sure that the resolver guaranteed/updated/maintained.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants