Source DOI as alternative identifier in EML? #15
Hi @timrobertson100, if I set the source dataset DOI as https://www.gbif-uat.org/dataset/0ef15f32-b41d-4274-ae96-eb5d0059fee6 I guess that means that GBIF would not mint a DOI for the dataset (in the example above it did that on first publication) and we would thus have a single DOI for both the source and the
Pro:
Con:
I'm not against this approach. @sarahcd @timrobertson100 what do you think? |
This looks good to me. Open to feedback from @timrobertson100 based on how this is used by others. If we were using DataCite's schema we could do this: |
The GBIF IPT (or EML) does not have a field for relatedIdentifiers, only |
Ok, I think this is fine as is then. |
I tend to agree too |
@timrobertson100 so to be clear, you are fine that the subsampled Darwin Core version of the dataset is not assigned a new DOI, but reuses the one from the source? |
Yes, this makes sense to me, because a dataset DOI in GBIF represents the concept of a living dataset rather than a specific version of that dataset (our downloads, on the other hand, are immutable datasets). Here, we are sharing a downsampled dataset 1) to aid discovery (i.e. people using a taxonomic/geographic/temporal filter will find it in GBIF), and 2) so that the broad location data can contribute to scientific questions asked of the GBIF aggregate dataset. @dnoesgaard leads all the GBIF citation tracking. Does this also seem reasonable to you? |
We don't track citations of DOIs that aren't minted by GBIF—except a few prefixes assigned and used by specific IPTs only. That being said, most use of GBIF-mediated data happens via downloads, in which case a GBIF DOI is minted to represent the download (which of course could be a single dataset). How many datasets are we talking about here? Will they all have Zenodo DOIs? |
The current scope is 11 datasets, all from Zenodo. In the future there could be more, with DOIs from the Movebank Data Repository. |
Ok. At the moment, we will not be able to track citations of datasets with non-GBIF DOIs. The reason being that we track based on DOI prefix, so it's only feasible for us to do that when a prefix is (almost) exclusively used for datasets published in GBIF. When/if proper dataset citations are more common and included in DOI metadata, we might be able to pull them directly from Event Data—as a supplement, if nothing else. |
I think given the proposed solution here, it's ok to leave it to the DOI-granting repositories to track dataset citations. Regarding GBIF-mediated data downloads: It would be great if there was a way to recognize DOIs of data that contribute to these downloads. That and/or other ways to track use of their data would certainly get more movement ecologists interested in having their data on GBIF. |
As I mentioned, most use of GBIF-mediated data happens through downloads that are also assigned (unique) GBIF-minted DOIs. We will still track use of downloaded data containing records from datasets without GBIF-assigned DOIs. This information is also aggregated at the level of the contributing datasets and their publishers. The metadata of download DOIs contains
Example: https://doi.org/10.15468/dl.5tm8an
This download was cited by this paper: https://doi.org/10.1016/j.ecss.2022.107883
You'll see this reflected in the DOI metadata (https://api.datacite.org/dois/10.15468/dl.5tm8an) as:
|
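For anyone who wants to inspect that DOI metadata themselves, here is a minimal Python sketch. It assumes the DataCite REST API returns a JSON:API document whose `data.attributes.relatedIdentifiers` list holds entries with `relationType` and `relatedIdentifier` fields. The helper names `fetch_doi_record` and `related_by_type` are made up for illustration and are not part of any GBIF or DataCite library.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

DATACITE_API = "https://api.datacite.org/dois/"


def fetch_doi_record(doi: str) -> dict:
    """Fetch the DataCite JSON record for a DOI (requires network access)."""
    with urlopen(DATACITE_API + doi) as response:
        return json.load(response)


def related_by_type(record: dict) -> dict:
    """Group the related identifiers of a DataCite record by relationType.

    Assumes the record layout data -> attributes -> relatedIdentifiers;
    a missing or empty list yields an empty dict.
    """
    attributes = record.get("data", {}).get("attributes", {})
    grouped = defaultdict(list)
    for item in attributes.get("relatedIdentifiers") or []:
        relation = item.get("relationType", "unknown")
        grouped[relation].append(item.get("relatedIdentifier"))
    return dict(grouped)
```

Calling `related_by_type(fetch_doi_record("10.15468/dl.5tm8an"))` should then list the identifiers attached to the example download, grouped by relation type. Which relation types actually appear depends on what GBIF registers for that download.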
That's cool! So if a paper cites a GBIF download, and that download includes a dataset with a non-GBIF-minted DOI, the GBIF dataset page for that dataset would show that paper as a citation. It's just that if the dataset were cited directly, the citation wouldn't show up, since you don't track these? |
Correct |
Documented in function documentation: https://inbo.github.io/movepub/reference/write_dwc.html#metadata
|
Reopening this with a question regarding versioned DOIs. How will GBIF handle the following workflow:
|
Fwiw, citations are linked to the GBIF dataset key (via downloads), so they remain unchanged. |
Great, so no issues to be expected when updating the DOI. |
Correct, a dataset can change its DOI and we'll still link the citations that GBIF has tracked in the GBIF database. If others are tracking citations through a DOI metadata graph (e.g. DataCite) they won't be updated. |
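To make the point about stable keys concrete, here is a toy Python model. It is not GBIF's actual schema: the `Registry` class and its methods are hypothetical. The only assumption taken from the thread is that tracked citations are stored against the dataset key rather than against the DOI.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy registry: citations hang off the dataset key, not the DOI."""

    dois: dict = field(default_factory=dict)       # dataset key -> current DOI
    citations: dict = field(default_factory=dict)  # dataset key -> citing DOIs

    def register(self, key: str, doi: str) -> None:
        """Add a dataset under its stable key with its current DOI."""
        self.dois[key] = doi
        self.citations.setdefault(key, [])

    def add_citation(self, key: str, citing_doi: str) -> None:
        """Record a citation against the dataset key."""
        self.citations[key].append(citing_doi)

    def update_doi(self, key: str, new_doi: str) -> None:
        """Swap the DOI; the citation list is untouched."""
        self.dois[key] = new_doi

    def citations_for(self, key: str) -> list:
        """Return a copy of the citations tracked for a dataset key."""
        return list(self.citations.get(key, []))
```

In this model, calling `update_doi` only rewrites the DOI entry, so `citations_for` returns the same list before and after. A citation graph keyed on the DOI itself would behave differently, which matches the caveat above about DataCite.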