Idea: show GBIF "enrichments" #3136

Closed
timrobertson100 opened this issue Oct 11, 2022 · 10 comments
Labels
enhancement Suggest an improvement to an existing function.

Comments

@timrobertson100

Feature or enhancement

Data published through GBIF go through a series of checks and "enrichments", such as aligning scientificNames to the complete name as published in the nomenclature databases, flagging issues like a country coordinate mismatch, and detecting related records such as herbarium duplicates which may carry more data (e.g. images) or have a different identification/determination.

I gather TaxonWorks now makes it possible to publish to GBIF. It would be a fairly simple integration on specimen pages to present the user with the observations GBIF makes on the published records. These could be pulled using the /occurrence/N, /occurrence/N/verbatim and /occurrence/N/experimental/related APIs. If the GBIF ID (N) is not known, the datasetKey and local occurrenceID can be used, e.g. like this. GBIF could provide a different endpoint that provides it all (JSON or HTML) if useful.
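
As a rough illustration of the three calls named above, here is a minimal JavaScript sketch. The base URL `https://api.gbif.org/v1` and the helper names `enrichmentUrls` and `fetchEnrichments` are assumptions made for this example; the issue itself only gives the paths.

```javascript
// Assumption: the public GBIF API base URL; the issue only names the paths.
const GBIF_API = "https://api.gbif.org/v1";

// Hypothetical helper: build the three URLs for a known GBIF ID (N).
function enrichmentUrls(gbifId) {
  const id = encodeURIComponent(String(gbifId));
  return {
    interpreted: `${GBIF_API}/occurrence/${id}`,
    verbatim: `${GBIF_API}/occurrence/${id}/verbatim`,
    related: `${GBIF_API}/occurrence/${id}/experimental/related`,
  };
}

// Hypothetical helper: fetch all three views in parallel.
// Needs a runtime with a global fetch, or pass your own as fetchImpl.
async function fetchEnrichments(gbifId, fetchImpl = fetch) {
  const entries = await Promise.all(
    Object.entries(enrichmentUrls(gbifId)).map(async ([name, url]) => {
      const res = await fetchImpl(url);
      if (!res.ok) {
        throw new Error(`GBIF request failed (${res.status}): ${url}`);
      }
      return [name, await res.json()];
    })
  );
  // Result shape: { interpreted: {...}, verbatim: {...}, related: {...} }
  return Object.fromEntries(entries);
}
```

A specimen page could call `fetchEnrichments(N)` once and render the three responses side by side.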

If it is of interest we would be happy to discuss how to do this, and how it should look, and offer a PR with implementation.

CC @mjy

Location

Specimen pages

Screenshot, napkin sketch of interface, or conceptual description

No response

Your role

Aggregator of content, looking for ways to give back

@timrobertson100 added the enhancement label Oct 11, 2022
@mjy
Member

mjy commented Oct 11, 2022

@timrobertson100 Thanks. Jim Beach's example of this yesterday got me thinking about it as well.

I think we're actually going to spin this off as a software-agnostic module, as a (more than) proof of concept. The idea is Google Maps widget style: anyone can add it to their page after including the JS, or into their pipeline, via CDN, etc.

This will further explore our thought of a DwC "vector" (data + headers) being a key UI driving data structure. I imagine that a minimal initialization will be possible with a single configuration option, among many other ways:

<div id='my_comparison' class='vue_dwc_enrichments'
  data-source-dwc="https://api.taxonworks.org/api/v1/collection_objects/123?extend[]=dwc_fields"
  data-option-1
  data-option-n
>
</div>
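
To make the single-option initialization concrete, here is a small sketch of how a widget script might read that configuration. The helper `readWidgetConfig` is hypothetical and not part of any published module; only the `data-source-dwc` attribute comes from the snippet above, and every other attribute is treated as an opaque option.

```javascript
// Hypothetical helper: turn an element's dataset (a DOMStringMap in the
// browser, or a plain object) into a widget configuration.
// Only `sourceDwc` (from data-source-dwc) is taken from the snippet above.
function readWidgetConfig(dataset) {
  const { sourceDwc, ...options } = dataset;
  if (!sourceDwc) {
    throw new Error("data-source-dwc is required");
  }
  // Constructing a URL throws on malformed input, so this doubles as validation.
  new URL(sourceDwc);
  return { sourceDwc, options };
}

// In a page, the widget could discover every configured element like this.
function findWidgetConfigs(root) {
  return Array.from(root.querySelectorAll(".vue_dwc_enrichments")).map((el) =>
    readWidgetConfig(el.dataset)
  );
}
```

In a browser, `findWidgetConfigs(document)` would return one configuration per matching element.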

@jlpereira is on holiday, we'll start poking at this for our updated CollectionObject page when he gets back.

@timrobertson100
Author

timrobertson100 commented Oct 11, 2022

Thanks, @mjy

One idea could be that GBIF offers this kind of JS widget, providing a view of the record that was published and the enrichments we apply. You may prefer to make it a TaxonWorks-specific thing or even software agnostic, bringing in more than just GBIF, like the Specify example Jim showed. It might make sense for TW to do one, and still for GBIF to offer this as standard for other publishers.

Either way, let us know if we can help

@mjy
Member

mjy commented Oct 11, 2022

Of course we'd love it if GBIF built widgets like this that would work in JS pipelines. We've spun off a handful of key internal libraries ourselves (see below for a couple examples), and it's in our mission to try and do this where possible, so I don't mind working on a very basic version of this one.

So maybe we'll try it and GBIF could refine, add code, fork it, use it for inspiration etc. as they see fit. Or if you beat us to it and provide an npm module, we'll just use that.

@mjy
Member

mjy commented Dec 5, 2022

@timrobertson100 Is there an API endpoint like /v1/occurrence/:occurrenceId? We want to hit a RESTful resource using a unique id to get a single record.

@jlpereira
Member

@timrobertson100 Is it possible to get the original data from GBIF, not just the interpreted?

@timrobertson100
Author

timrobertson100 commented Dec 5, 2022

Is there an API endpoint like /v1/occurrence/:occurrenceId?

Because occurrenceID from the publisher isn’t unique, you need the datasetKey as well, so .../occurrence/<datasetKey>/<occurrenceID> such as this example.

You can find datasetKeys using the registry API. If you don't have those, then you'd need to search using .../occurrence/search?occurrenceID=123, but you'd need to disambiguate the results when there is more than one.
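
The two lookup routes described above could be sketched like this. The base URL, the helper names, and the assumption that a search response carries a `results` array are all illustrative and should be checked against the API docs.

```javascript
// Assumption: the public GBIF API base URL.
const GBIF_API = "https://api.gbif.org/v1";

// Preferred route: datasetKey plus the publisher's occurrenceID.
// Assumption: IDs are percent-encoded into a single path segment.
function occurrenceByDatasetUrl(datasetKey, occurrenceId) {
  return `${GBIF_API}/occurrence/${encodeURIComponent(datasetKey)}/${encodeURIComponent(occurrenceId)}`;
}

// Fallback route: search by occurrenceID alone.
function occurrenceSearchUrl(occurrenceId) {
  const params = new URLSearchParams({ occurrenceID: occurrenceId });
  return `${GBIF_API}/occurrence/search?${params}`;
}

// Disambiguate a search response. Assumption: matches are in `results`.
// Returns the single match, null when there is none, and throws when the
// occurrenceID is ambiguous so the caller can ask for a datasetKey.
function pickSingleResult(searchResponse) {
  const results = searchResponse.results ?? [];
  if (results.length === 0) return null;
  if (results.length > 1) {
    throw new Error(`occurrenceID is ambiguous: ${results.length} matches`);
  }
  return results[0];
}
```

Throwing on multiple matches is a deliberate choice here: silently picking the first hit could show enrichments for another publisher's record.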

@timrobertson100 Is it possible to get the original data from GBIF, not just the interpreted?

Yes. We have 3 versions of the record:

  1. The raw view that the publisher provided, as picked up on our crawling stream; .../occurrence/gbifID/fragment. This may return JSON text for DwC-A, or XML, or possibly other text data. It's intended mainly for diagnostics.
  2. The verbatim view, which captures the raw data (1) reformatted into Darwin Core without interpretation beyond what is needed to represent it in DwC; .../occurrence/gbifID/verbatim
  3. The interpreted view, which you will be most familiar with; .../occurrence/gbifID or the method above
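
One way to surface the "enrichments" is to compare the verbatim and interpreted views term by term. The sketch below is only an illustration: `recordViewUrls`, `termName` and `changedTerms` are hypothetical helpers, the base URL is assumed, and the key normalization assumes verbatim keys may be full term URIs while interpreted keys are short names.

```javascript
// Assumption: the public GBIF API base URL.
const GBIF_API = "https://api.gbif.org/v1";

// URLs for the three versions of a record, given its gbifID.
function recordViewUrls(gbifId) {
  const base = `${GBIF_API}/occurrence/${encodeURIComponent(String(gbifId))}`;
  return { fragment: `${base}/fragment`, verbatim: `${base}/verbatim`, interpreted: base };
}

// Assumption: keys may be term URIs; compare on the last path segment.
function termName(key) {
  return key.slice(key.lastIndexOf("/") + 1);
}

// List terms present in both views whose values differ.
function changedTerms(verbatim, interpreted) {
  const interpretedByTerm = new Map(
    Object.entries(interpreted).map(([key, value]) => [termName(key), value])
  );
  const changes = [];
  for (const [key, rawValue] of Object.entries(verbatim)) {
    const term = termName(key);
    if (!interpretedByTerm.has(term)) continue;
    const value = interpretedByTerm.get(term);
    // Compare as strings so that "12" and 12 count as unchanged.
    if (String(value) !== String(rawValue)) {
      changes.push({ term, verbatim: rawValue, interpreted: value });
    }
  }
  return changes;
}
```

For example, a verbatim scientificName without authorship and an interpreted one with it would appear as a single entry in the returned list.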

The API docs are also here.

Please say if you need more. Thanks.

@mjy
Member

mjy commented Dec 5, 2022

@timrobertson100

Because occurrenceID is unique,

Do you mean not (globally) unique?

@timrobertson100
Author

timrobertson100 commented Dec 5, 2022

Yes. Sorry about that. Corrected above

The gbifID is unique, but what publishers provide in occurrenceID isn’t necessarily unique.

@mjy
Member

mjy commented Dec 15, 2022

@timrobertson100 https://github.com/SpeciesFileGroup/gbifference.

We have that widget going live in 0.30.0 this week. More docs coming there. Closing this in favor of issue tracking there.

@mjy closed this as completed Dec 15, 2022
@timrobertson100
Author

Thanks for letting me know - and congrats. Please ping us if you need anything changed.

3 participants