Collection Descriptions Interest Group
This is the repository for the Collection Descriptions Interest Group.
About the group
The new Collection Descriptions standard will be a successor to the unratified draft Natural Collections Description data standard, whose development has been discontinued.
The day-to-day operations of the Interest Group are documented in this repository. You can also track and participate in the work of the group by watching this repository and monitoring the group's issue tracker.
Quick reference index
For quick reference, an index of classes and properties, along with summaries of the current data model, can be found in this Google Sheet.
|Name||Affiliation||GitHub name or email|
|Matt Woodburn||Natural History Museum London||@mswoodburn|
|Name||Affiliation||GitHub name or email|
|Wouter Addink||Naturalis Biodiversity Center||@wouteraddink|
|James Beach||University of Kansas||beach53 AT gmail.com|
|Ana Casino||CETAF||ana.casino AT cetaf.org|
|Heather Cole||Ag Canada||Heather.Cole AT AGR.GC.CA|
|Dag Endresen||University of Oslo Natural History Museum||@dagendresen|
|Falko Glöckler||MfN Berlin||falko.gloeckler AT mfn-berlin.de|
|Sharon Grant||Field Museum of Natural History||@rondlg|
|Quentin Groom||Meise Botanic Garden/TDWG/Synthesys+||@qgroom|
|Jana Hoffmann||MfN Berlin||jana.hoffmann AT mfn-berlin.de|
|Sharif Islam||Naturalis Biodiversity Center||@sharifX|
|Janeen Jones||Field Museum of Natural History||jjones AT fieldmuseum.org|
|Kerstin Lehnert||Columbia University||@klehnert55|
|Holly Little||Smithsonian Institution / National Museum of Natural History||@hollyel|
|Anissa Lybaert||Ag Canada||Anissa.lybaert AT agr.gc.ca|
|James Macklin||Ag Canada||@jmacklin|
|Mareike Petersen||MfN||Mareike.Petersen AT mfn.berlin|
|Judith Price||CMN (retired)|
|Niels Raes||Naturalis Biodiversity Center||niels.raes AT naturalis.nl|
|Connie Rinaldo||Harvard University||crinaldo AT oeb.harvard.edu|
|Dave Smith||Natural History Museum London||d.a.smith AT nhm.ac.uk|
|Barbara Thiers||NYBG||bthiers AT nybg.org|
|Maarten Trekels||Meise Botanic Garden/Synthesys+||@mtrekels|
|Mike Trizna||Smithsonian Institution||@MikeTrizna|
|Melissa Tulig||NYBG||mtulig AT nybg.org|
|William Ulate||Missouri Botanical Garden / Centro de Investigación en Informática de la Biodiversidad (CRBio.org)||@WUlate|
|Wim van Dongen||Picturae||@cannedit|
|Sarah Vincent||Natural History Museum London||@essvee|
|Kate Webbink||Field Museum of Natural History||@magpiedin|
Collection Descriptions Standard (CD) Repository Navigation
This README.md explains how to contribute and where to find materials related to the development of the Collection Descriptions data standard. Where needed, each link below is accompanied by a brief description of its contents. The group manages development using GitHub as much as possible.
A (not so) brief description of our group
A detailed description of our rationale, goals, motivation, tasks, and strategy. This document outlines the goals and objectives of the task group and the plan for reaching them.
The community is asked to review the use cases and to add any that are missing.
This document gathers some of the key known issues to keep in mind as the CD standard is developed. It is meant to help guide and structure both design and implementation considerations of CD and resulting products that CD enables.
CD Way of Work
As far as possible, each group takes on a self-selected task and manages its delivery as it chooses, meeting as needed. Groups may keep working documents wherever they choose (Google Docs, other, ...) but upload summaries and completed documents directly to GitHub in the appropriate folder for that task (e.g. meetings and documents). Where possible, links to external working documents should be added to the document links page so that TG members can find them easily.
The CD TG as a whole meets once a month, on the fourth Wednesday of each month (2019), except where holidays require a change of date or time. Two meetings are held on that day (one Eastern-time friendly, one Western).
We started with a spreadsheet acting as a template for a Gantt-style chart of all our envisioned tasks with dependencies. From this chart, we created GitHub milestones where each group can manage tracking the issues and timelines related to that task. These tasks are now each grouped into GitHub projects.
- Landscape and requirements analysis
- Communication plan
- Data model
- Data standards
- Reference examples
- Develop extensions
To manage group activities in more detail, TG members can add new issues and allocate them to the appropriate project and milestone on the right-hand side of the form. This will mean that issues are displayed on the appropriate project page, and their statuses can be easily monitored.
- 2016 met at TDWG
- 2017 met at TDWG
- 2018 met at SPNHC-TDWGNZ, with some online meetings
- 2019 plans to meet in person and at Biodiversity Next
- 2020 deliver a standard with implementations
Reference and Historical Materials
This effort builds on work started over 10 years ago by the Natural Collections Description Standard (NCD) IG/TG. Here we attempt to link to materials resulting from their efforts. These documents provide a foundation for the CD Group. Some have been copied over into this CD Repo to ensure they do not get lost.
Old NCD repository
Other historical docs
- CETAF - Consortium of European Taxonomic Facilities. CETAF is a European network of Natural Science Museums, Natural History Museums, Botanical Gardens and Biodiversity Research Centres with their associated biological collections and research expertise.
- EML - Ecological Metadata Language. The Ecological Metadata Language (EML) is a metadata standard developed by the ecology discipline and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.
- DiSSCo - Distributed System of Scientific Collections. DiSSCo is a new pan-European Research Infrastructure initiative of 21 European countries with a vision to position European natural science collections at the centre of data-driven scientific excellence and innovation in environmental research, climate change, food security, one health and the bioeconomy.
- GBIF - Global Biodiversity Information Facility. GBIF is an international network and research infrastructure funded by the world’s governments and aimed at providing anyone, anywhere, open access to data about all types of life on Earth.
- GRBio - Global Repository of Biodiversity Repositories. GRBio is now shepherded by GBIF. Developments underway include implementing the CD standard to support sharing of collection-level metadata worldwide.
- ICEDIG - Innovation and consolidation for large scale digitisation of natural heritage - is an EU-funded project that aims at supporting the implementation phase of the new Research Infrastructure DiSSCo (“Distributed System of Scientific Collections”) by designing and addressing the technical, financial, policy and governance aspects necessary to operate such a large distributed initiative for natural sciences collections across Europe.
- iDigBio - Integrated Digitized Biocollections. An NSF-funded initiative to provide access and capacity/community support for digitization, data mobilization, and use of scientific collections both neontological and paleontological.
- MOBILISE - Mobilising Data, Policies and Experts in Scientific Collections. European Natural Science Collections host approximately 1.5 billion biological and geological collection objects, which represent about 80% of the known current and past biological and geological diversity on earth. The scope of MOBILISE is to foster a cooperative network in Europe to support excellent research activities, and facilitate knowledge and technology transfer around natural science collections. This will prepare the ground for a future pan-European Distributed System of Scientific Collections (DiSSCo).
- RDF - Resource Description Framework. From the W3C Semantic Web: RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed. Also see the TDWG Beginner's Guide to RDF.
- SPNHC - The Society for the Preservation of Natural History Collections [SPNHC] is an international society whose mission is to improve the preservation, conservation and management of natural history collections to ensure their continuing value to society.
- SYNTHESYS - Synthesis of Systematic Resources. SYNTHESYS+ is a European Commission-funded project, creating an integrated European infrastructure for natural history collections.
The current repository structure is described below.
├── README.md               : Description of this repository
├── LICENSE                 : Repository license
│
├── charters                : Interest Group and Task Group charters
│   └── draft               : Draft charters and historical versions
│
├── documents
│   ├── draft               : Working folder for draft documents
│   ├── final               : Final versions of group documents
│   ├── historical          : Historical and deprecated documents, and snapshots of external drafts in Google Docs etc
│   └── DOCUMENT_LINKS.md   : Links to working documents in Google Docs, Office 365 etc
│
├── meetings                : Agendas and minutes of IG and TG meetings
│
├── reference
│   ├── crosswalks          : Crosswalks of existing and previous collection descriptions standards and initiatives
│   ├── use_cases           : Documented use cases for a collection descriptions standard
│   └── REFERENCE_LINKS.md  : Links to relevant information resources (publications, sites etc)
│
├── standard
│   ├── data_model          : Data model definitions, schemata and diagrams
│   └── vocabularies        : Controlled vocabularies, ontologies etc relevant to the standard
│
└── .gitignore              : Files and directories to be ignored by git
Collection Descriptions Interest Group. 2019. Collection Descriptions (CD), in development. Biodiversity Information Standards (TDWG). http://www.tdwg.org/standards/