Define how to populate metadata #9

Closed
25 tasks done
peterdesmet opened this issue Apr 21, 2022 · 6 comments
@peterdesmet
Member

peterdesmet commented Apr 21, 2022

Hi @sarahcd, here's a first attempt at a Movebank dataset on GBIF: https://www.gbif-uat.org/dataset/0ef15f32-b41d-4274-ae96-eb5d0059fee6

Metadata that is included:

  • Basic metadata:
    • Title + [subsampled representation]
    • Language + type + update frequency + publishing org: to be set by user in IPT
    • License: set, but see issue #19 (License not recognized by GBIF)
    • Description: copied from source dataset, first paragraph added
    • Contact: provided by user or first creator
    • Creators: original dataset authors/creators
    • Metadata provider: same as contact
    • Funding sources: provided as separate paragraph in source dataset and thus copied to IPT
  • Geographic coverage: not set, not directly available in source dataset
  • Taxonomic coverage: could be derived from data, not sure if worth it?
  • Temporal coverage: could be derived from data, not sure if worth it?
  • Keywords: copied from source dataset
  • Associated parties: not set
  • Project data: not set
  • Sampling methods: not set
  • Citations
    • Resource citation: left to automatic one by GBIF
    • Bibliography: not set, could potentially be derived from relatedIdentifiers in DataCite, not sure if worth it? no
  • Collection data: not applicable
  • External links: website set to Movebank Study ID (as a link)
  • Additional metadata:
@peterdesmet peterdesmet self-assigned this May 4, 2022
@sarahcd
Collaborator

sarahcd commented May 10, 2022

  • Citations: I think it would be good to include citations for 1 or more related publications.
  • Taxonomic/temporal/geographic coverage: @timrobertson100 how does this impact search results in GBIF? Are taxonomy, time, etc. queried based on what's in the occurrence table? If it would help with discovery, I imagine taxonomic and temporal scope are straightforward to derive from the data, geographic scope less so.

@sarahcd
Collaborator

sarahcd commented May 10, 2022

In the datasets published in the Movebank Repository, related publications will be in the DataCite metadata like this:

<relatedIdentifier relatedIdentifierType="DOI" relationType="IsSupplementTo">thePaperDOI</relatedIdentifier>
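A minimal sketch of pulling such related DOIs out of a DataCite XML record, using only Python's standard library. The sample XML is hand-written for illustration (`thePaperDOI` is the placeholder from above), and the namespace URI in it is an assumption, so the function matches on local tag names rather than hard-coding a namespace.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration; not copied from a real record.
# The namespace URI is an assumption about the DataCite schema version.
SAMPLE_XML = """\
<resource xmlns="http://datacite.org/schema/kernel-4">
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsSupplementTo">thePaperDOI</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsDocumentedBy">https://example.org/docs</relatedIdentifier>
  </relatedIdentifiers>
</resource>
"""


def related_dois(xml_text, relation_type="IsSupplementTo"):
    """Return DOI-typed related identifiers that have the given relation type."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        # Drop a "{namespace}" prefix, if any, to get the local tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if (
            local_name == "relatedIdentifier"
            and element.get("relatedIdentifierType") == "DOI"
            and element.get("relationType") == relation_type
        ):
            found.append((element.text or "").strip())
    return found


print(related_dois(SAMPLE_XML))
```

This only yields the identifier itself; it does not produce a formatted citation.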

@peterdesmet
Member Author

Decided not to add related publications in bibliography, because:

  1. Although it is possible to get a related DOI (e.g. isSupplementTo in https://api.datacite.org/dois/10.5441/001/1.vp4cf4qg), we would also have to provide the human-readable citation. That would require a call to another service (unclear which one, but not DataCite).
  2. The relevant publication is often added to or included in the description, so it will show up in human-readable form in the dataset description on GBIF.
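The two points above can be illustrated with a rough sketch that works on an already-parsed JSON response from the DataCite API: it lists the DOIs marked as `IsSupplementTo` and reports which of them are not mentioned in any description. The payload below is hand-written and trimmed, and its values are illustrative rather than copied from a real response; `supplement_dois` and `dois_missing_from_description` are hypothetical helper names, not part of movepub.

```python
# Hand-written, trimmed payload in the general shape of a response from
# https://api.datacite.org/dois/<doi>. All values are illustrative.
SAMPLE_RESPONSE = {
    "data": {
        "attributes": {
            "relatedIdentifiers": [
                {
                    "relatedIdentifier": "10.3389/fevo.2022.846171",
                    "relatedIdentifierType": "DOI",
                    "relationType": "IsSupplementTo",
                }
            ],
            "descriptions": [
                {
                    "description": "Example citation text. doi:10.3389/fevo.2022.846171",
                    "descriptionType": "Other",
                },
                {
                    "description": "Example abstract text ...",
                    "descriptionType": "Abstract",
                },
            ],
        }
    }
}


def supplement_dois(response):
    """Return related DOIs whose relation type is IsSupplementTo."""
    attributes = response["data"]["attributes"]
    return [
        item["relatedIdentifier"]
        for item in attributes.get("relatedIdentifiers") or []
        if item.get("relatedIdentifierType") == "DOI"
        and (item.get("relationType") or "").lower() == "issupplementto"
    ]


def dois_missing_from_description(response):
    """Return IsSupplementTo DOIs that no description mentions."""
    attributes = response["data"]["attributes"]
    text = " ".join(
        entry.get("description") or ""
        for entry in attributes.get("descriptions") or []
    ).lower()
    return [doi for doi in supplement_dois(response) if doi.lower() not in text]


print(supplement_dois(SAMPLE_RESPONSE))
print(dois_missing_from_description(SAMPLE_RESPONSE))
```

An empty second list means every related publication already appears in the description text, which is the situation point 2 relies on.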

https://api.datacite.org/dois/10.5441/001/1.h0t27719 (has an "Other" description):

Stabach JA, Hughey LF, Crego RD, Fleming CH, Hopcraft JGC, Leimgruber P, Morrison TA, Ogutu JO, Reid RS, Worden JS, Boone RB. 2022. Increasing anthropogenic disturbance restricts wildebeest movement across East African grazing systems. Front Ecol Evol. doi:10.3389/fevo.2022.846171

The ability to move is essential for animals to find mates, escape predation, and meet energy and water demands. This is especially important across grazing systems where vegetation productivity can vary drastically between seasons or years. With grasslands undergoing significant change ...

https://api.datacite.org/dois/10.5281/zenodo.5056105 (included):

MH_ANTWERPEN - Western marsh harriers (Circus aeruginosus, Accipitridae) breeding near Antwerp (Belgium) is a bird tracking dataset published by the Research Institute for Nature and Forest (INBO). It contains animal tracking data collected by the LifeWatch GPS tracking network for large birds (http://lifewatch.be/en/gps-tracking-network-large-birds) for the project/study MH_ANTWERPEN, using trackers developed by the University of Amsterdam Bird Tracking System (UvA-BiTS, http://www.uva-bits.nl). The study has been operational since 2018. In total 4 individuals of Western marsh harriers (Circus aeruginosus) have been tagged in their breeding area near the city of Antwerp (Belgium), mainly to study their habitat use and migration behaviour. Data are periodically uploaded from the UvA-BiTS database to Movebank and from there archived on Zenodo (see https://github.com/inbo/bird-tracking). See Milotic et al. (2020, https://doi.org/10.3897/zookeys.947.52570) for a more detailed description of this dataset. ...

@sarahcd
Collaborator

sarahcd commented May 10, 2022

Agreed that given there is no easy solution, if it is in the description it's not needed in the bibliography.

@timrobertson100

timrobertson100 commented May 11, 2022

> Taxonomic/temporal/geographic coverage: @timrobertson100 how does this impact search results in GBIF?

@sarahcd - they don't. It's good practice to try and document these as "structured" descriptive metadata, but since GBIF is indexing at the data record level, we can provide much more detailed results than metadata-based search. If it is easy to do I'd suggest adding them but I don't think they are critical. They can always be added in a future improvement if necessary as well.
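As a rough illustration of how temporal and taxonomic coverage could be derived from the data, here is a sketch over occurrence records keyed by the Darwin Core terms `eventDate` and `scientificName`. The records are invented for the example, the helper names are hypothetical, and this is not how movepub does it; geographic coverage is left out because it is less straightforward.

```python
from datetime import date

# Invented occurrence records for illustration, keyed by Darwin Core terms.
SAMPLE_OCCURRENCES = [
    {"scientificName": "Circus aeruginosus", "eventDate": "2018-05-30T10:15:00Z"},
    {"scientificName": "Circus aeruginosus", "eventDate": "2019-08-02T06:00:00Z"},
    {"scientificName": "Connochaetes taurinus", "eventDate": "2017-01-14T23:59:00Z"},
]


def temporal_coverage(records):
    """Return (first day, last day) across records, or None if there are no dates."""
    # Only the leading YYYY-MM-DD part of each ISO 8601 timestamp is used.
    days = [
        date.fromisoformat(record["eventDate"][:10])
        for record in records
        if record.get("eventDate")
    ]
    if not days:
        return None
    return min(days), max(days)


def taxonomic_coverage(records):
    """Return the sorted, de-duplicated scientific names found in the records."""
    return sorted(
        {record["scientificName"] for record in records if record.get("scientificName")}
    )


print(temporal_coverage(SAMPLE_OCCURRENCES))
print(taxonomic_coverage(SAMPLE_OCCURRENCES))
```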

@peterdesmet
Member Author

The EML mapping is now documented in the function description: https://inbo.github.io/movepub/reference/write_dwc.html#metadata
