Fill CoL gap: class/order **Diplura** #577

Open
yroskov opened this issue Nov 2, 2023 · 13 comments
Labels
source dataset request: Requests to change or add a new source dataset to the CoL

Comments

@yroskov

yroskov commented Nov 2, 2023

For consideration by the Taxonomy Group (@dhobern & @olafbanki), 2023-11-02:

Dataset title
Global checklist published in a 2021 paper (Ed pointed to it): "Diversity, ecology, distribution and biogeography of Diplura" by Alberto Sendra, Alberto Jiménez-Valverde, Jesús Selfa, Ana Sofia P. S. Reboleira
https://resjournals.onlinelibrary.wiley.com/doi/10.1111/icad.12480

Dataset contact & access
The checklist is available as a spreadsheet as a supplement to the paper
https://resjournals.onlinelibrary.wiley.com/doi/10.1111/icad.12480
Also, it might be available with extended data or as a database from the authors.
@gdower may crawl data in the CLB.

Taxonomic group & CoL sector
It's a gap in the CoL: kingdom: Animalia > phylum: Arthropoda > class: Diplura

Dataset description
Complete checklist with 1008 accepted species, which belong to 141 genera and 10 families

yroskov added the "source dataset request" label Nov 2, 2023
@yroskov
Author

yroskov commented Nov 2, 2023

@dhobern, as soon as the TG gives a green light, I'll contact the authors for more information about the checklist and for permission to use the data in the CoL.

@mdoering
Member

mdoering commented Nov 2, 2023

I have converted the excel sheet to coldp here:
https://github.com/CatalogueOfLife/data-diplura

and imported into a private dataset here:
https://www.checklistbank.org/dataset/275935/imports

@yroskov
Author

yroskov commented Nov 2, 2023

> I have converted the excel sheet to coldp here

Thank you, @mdoering. I am not sure that the spreadsheet attached to the paper contains all the data we need (for example, synonyms and combinations are missing). We need to contact the authors for more details after the TG has confirmed its interest.

@olafbanki

Thanks Yuri! If you get a go from the COL Taxonomy Group could you also immediately make a request for either a CC-0 or a CC-BY license? The present license of the article is not workable: https://creativecommons.org/licenses/by-nc-nd/4.0/
@camiplata can assist you with some text explanation of either a CC-0 or CC-BY license.

@mdoering
Member

Any progress here? Is there much debate needed if we have a true gap with no species at all currently? Anything is better than that

@DaveNicolson

I understood that complete gaps in COL should be filled with the first credible/robust data source available, and it sounds like that's this paper. If the data can be handled through Checklist Bank into COL that sounds relatively quick and easy if the author agrees to it.

Otherwise we can add it to the queue for ITIS work, so it would take time.

@mdoering
Member

I have digitised the list and it is accessible in CLB here:
https://www.checklistbank.org/dataset/275935/imports

941 accepted species, some infraspecies (all subspecies?) and a classification with subfamilies. There are no synonyms or vernacular names, but if the authors indeed have that somewhere these can easily be added in the next round and should not prevent us from filling this total gap.

@mdoering
Member

Things that can still be improved in the ColDP file:

  • use subspecies for all infraspecific names
  • link subspecies to species properly
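A minimal sketch of what those two clean-up steps could look like, assuming the ColDP NameUsage rows are loaded as dictionaries with `ID`, `parentID`, `rank` and `scientificName` fields. The helper name `normalise_infraspecies`, the set of ranks treated as infraspecific, and the placeholder names ("Aus bus") are all hypothetical and not taken from the actual data-diplura repository.

```python
# Hypothetical sketch: rows mimic a ColDP NameUsage table; sample data is invented.
INFRASPECIFIC_RANKS = {"subspecies", "variety", "form", "infraspecific name"}


def normalise_infraspecies(rows):
    """Return copies of rows where every infraspecific name is ranked as
    'subspecies' and parented to the species matching its first two words."""
    # Index accepted species by their binomial so trinomials can look them up.
    species_ids = {
        row["scientificName"]: row["ID"]
        for row in rows
        if row["rank"] == "species"
    }
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not mutated
        if row["rank"] in INFRASPECIFIC_RANKS:
            row["rank"] = "subspecies"
            binomial = " ".join(row["scientificName"].split()[:2])
            parent_id = species_ids.get(binomial)
            if parent_id is not None:
                # Only relink when the parent species is actually present.
                row["parentID"] = parent_id
        fixed.append(row)
    return fixed


rows = [
    {"ID": "g1", "parentID": None, "rank": "genus", "scientificName": "Aus"},
    {"ID": "s1", "parentID": "g1", "rank": "species", "scientificName": "Aus bus"},
    {"ID": "i1", "parentID": "g1", "rank": "variety", "scientificName": "Aus bus cus"},
]
result = normalise_infraspecies(rows)
```

Names whose parent species cannot be found keep their original `parentID`, so they can be reviewed by hand rather than silently misplaced.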

@camiplata
Contributor

@yroskov, should I contact the authors regarding the license?

@yroskov
Author

yroskov commented Oct 3, 2024

@camiplata, oh yes, it would be nice

@dhobern

dhobern commented Oct 3, 2024

Thanks @camiplata, that would be great. Here is what I wrote in November last year:

Dear Ana Sofia,

We are pleased to see the publication of the new paper from yourself and your colleagues on world diversity in Diplura. On behalf of the Catalogue of Life team, I am writing to ask whether you are willing to make the spreadsheet in the supplementary materials available for use as an update to this section of the Catalogue of Life Checklist. This would have several consequences, including:

  • A version of your dataset would be discoverable, searchable and downloadable via a DOI on the ChecklistBank website (https://checklistbank.org/).
  • Your species list would become the structure for Diplura data in GBIF and various other biodiversity data infrastructures.
  • We would be able to provide feedback and reporting on the use of your data through Catalogue of Life.
  • If you wish, we could also assist with getting the data into a platform that would make it easier to maintain and edit it collaboratively, or alternatively, we would be able to accept new versions of the spreadsheet with updates and corrections and show these as new versions (with a change history) in ChecklistBank.

If you are interested, I would be happy to discuss this further. The only obvious issue I can see right now is that your paper is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives licence. We can certainly provide clear attribution to you, your authors and to the paper, but the Catalogue of Life Checklist is published under a plain Creative Commons Attribution licence to ensure that these basic reference data can be used consistently anywhere by anyone. In principle, a paid environmental consultancy would be barred from using NonCommercial datasets. In the same way, we cannot handle a NoDerivatives licence, since we expect to support reuse in multiple formats and allow for users to access subsets or combinations of data from multiple species lists.

If you and your colleagues are comfortable with a Creative Commons Attribution (CC BY) licence for the supplementary spreadsheet, we can handle all the other aspects.

Many thanks for your consideration.

Best wishes,

Donald

@camiplata
Contributor

camiplata commented Oct 4, 2024

@dhobern, my apologies, I didn't remember you had been in contact with them. Did they give you any kind of answer?

@dhobern

dhobern commented Oct 5, 2024 via email

6 participants