Request for comment: Exploring DataCite metadata enrichments #209
KellyStathis
announced in
Requests for Comment
DataCite is exploring approaches for improving metadata quality through community-based enrichment processes. The focus is on key metadata properties and use cases for citation, discovery, and reuse, and the premise is that any enriched metadata will co-exist with (not replace or override) original metadata.
We have started modeling how we might represent enrichments in our system, and want to share the initial plans with you to provide a general sense of the direction we're heading and ask for your input.
Why metadata enrichments?
DataCite is exploring systems to support community curation and enrichment workflows in an effort to advance the following key goals:
Learn more about this initiative and other work we’re doing to make metadata better.
DataCite is collaborating with COMET, a community initiative to collaboratively enrich metadata, to source an initial pool of enrichments that target key use cases for DataCite DOI metadata. Learn about the COMET model and COMET’s current enrichment projects.
How might metadata enrichments work?
Enrichments, sourced initially from COMET, would be represented as a series of enrichment records.
What might enrichment records look like?
An individual enrichment record would specify an action that describes what metadata to change and how. Each enrichment record would also include a source (the entity that produced the record) and a process (the workflow that produced the record, identified with a DOI).
An enrichment record might look like this:
{
  "doi": "10.12345/dataset.0203273",
  "source": "comet",
  "process": "10.82461/process.6789",
  "action": "update_child",
  "field": "titles",
  "originalValue": { "title": "Dataset" },
  "enrichedValue": { "title": "Nilpotence, radicaux et structures monoıdales" },
  "created": "2025-06-22T10:00:00Z",
  "updated": "2025-06-22T10:00:00Z",
  "produced": "2025-06-22T10:00:00Z"
}

This mock enrichment record indicates a proposed update to the original DataCite DOI metadata for DOI 10.12345/dataset.0203273. The enrichment updates a title in the Titles property from “Dataset” to “Nilpotence, radicaux et structures monoıdales”. The source of this enrichment is COMET, which generated it through a process documented under DOI 10.82461/process.6789.
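For readers who want to experiment with this shape, the following is a minimal Python sketch that parses such a record and checks that the fields from the mock example are present. The names `EnrichmentRecord`, `parse_enrichment`, and `MOCK_RECORD`, and the choice to treat every field as required, are assumptions made for illustration; the proposal does not define a schema or validation rules.

```python
import json
from dataclasses import dataclass
from typing import Any

# Field names copied from the mock record; treating all of them as
# required is an assumption, not something the proposal specifies.
REQUIRED_FIELDS = (
    "doi", "source", "process", "action", "field",
    "originalValue", "enrichedValue", "created", "updated", "produced",
)

MOCK_RECORD = """
{
  "doi": "10.12345/dataset.0203273",
  "source": "comet",
  "process": "10.82461/process.6789",
  "action": "update_child",
  "field": "titles",
  "originalValue": { "title": "Dataset" },
  "enrichedValue": { "title": "Nilpotence, radicaux et structures monoıdales" },
  "created": "2025-06-22T10:00:00Z",
  "updated": "2025-06-22T10:00:00Z",
  "produced": "2025-06-22T10:00:00Z"
}
"""


@dataclass(frozen=True)
class EnrichmentRecord:
    """Hypothetical in-memory form of one enrichment record."""
    doi: str
    source: str          # entity that produced the record
    process: str         # DOI of the workflow that produced the record
    action: str          # what to change and how, e.g. "update_child"
    field_name: str      # metadata property the action targets
    original_value: Any
    enriched_value: Any
    created: str
    updated: str
    produced: str


def parse_enrichment(raw: str) -> EnrichmentRecord:
    """Parse a JSON enrichment record; raise ValueError if a field is missing."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return EnrichmentRecord(
        doi=data["doi"],
        source=data["source"],
        process=data["process"],
        action=data["action"],
        field_name=data["field"],
        original_value=data["originalValue"],
        enriched_value=data["enrichedValue"],
        created=data["created"],
        updated=data["updated"],
        produced=data["produced"],
    )


record = parse_enrichment(MOCK_RECORD)
print(record.source, record.action, record.field_name)
```

Keeping `original_value` next to `enriched_value` in one immutable object reflects the premise that enrichments sit alongside the original metadata rather than overwriting it.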
Each enrichment record field is described below:
What might enriched DataCite metadata records look like?
An enrichment record or series of enrichment records would be processed by DataCite to produce a new JSON metadata record. Individual enrichment records would be included with the enriched DataCite DOI metadata record for provenance. The original DataCite metadata record would be stored and accessible separately and would not be modified by the enrichment process. See below for a visualization of this workflow:
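In code terms, the processing step described in this section could be sketched roughly as follows. This is an illustration under stated assumptions, not DataCite's implementation: the function name `apply_enrichments` is hypothetical, only the `update_child` action from the mock record is handled, and its semantics (replace the first child of the target field that equals `originalValue`) are a guess at what the action means.

```python
import copy
from typing import Any


def apply_enrichments(
    original: dict[str, Any], enrichments: list[dict[str, Any]]
) -> dict[str, Any]:
    """Return an enriched copy of a DOI record; the original is left untouched."""
    enriched = copy.deepcopy(original)
    attributes = enriched["data"]["attributes"]
    for record in enrichments:
        if record["action"] != "update_child":
            # Only the action shown in the mock record is sketched here.
            raise NotImplementedError(record["action"])
        children = attributes[record["field"]]
        # Assumed semantics: replace the first child equal to originalValue.
        # list.index raises ValueError if no such child exists.
        index = children.index(record["originalValue"])
        children[index] = copy.deepcopy(record["enrichedValue"])
    # Attach the applied enrichment records for provenance.
    relationships = enriched["data"].setdefault("relationships", {})
    relationships["enrichments"] = {"data": copy.deepcopy(enrichments)}
    return enriched


# Trimmed-down stand-in for an original DataCite DOI record.
original = {
    "data": {
        "id": "10.12345/dataset.0203273",
        "type": "dois",
        "attributes": {"titles": [{"title": "Dataset"}]},
    }
}
enrichment = {
    "doi": "10.12345/dataset.0203273",
    "source": "comet",
    "process": "10.82461/process.6789",
    "action": "update_child",
    "field": "titles",
    "originalValue": {"title": "Dataset"},
    "enrichedValue": {"title": "Nilpotence, radicaux et structures monoıdales"},
}

enriched = apply_enrichments(original, [enrichment])
print(original["data"]["attributes"]["titles"])
print(enriched["data"]["attributes"]["titles"])
```

Working on a deep copy is what keeps the original record separate and unmodified, and copying the enrichment records into `relationships.enrichments.data` is what carries the provenance.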
An enriched DataCite DOI metadata record with the above enrichment record applied might look like this:
{
  "data": {
    "id": "10.12345/dataset.0203273",
    "type": "dois",
    "attributes": {
      "doi": "10.12345/dataset.0203273",
      "prefix": "10.12345",
      "suffix": "dataset.0203273",
      ...[Snipped for length]...
      "titles": [
        { "title": "Nilpotence, radicaux et structures monoıdales" }
      ],
      "publisher": { "name": "arXiv" },
      "container": {},
      "publicationYear": 2002,
      "subjects": [
        {
          "lang": "en",
          "subject": "Category Theory (math.CT)",
          "subjectScheme": "arXiv"
        },
      ...[Snipped for length]...
      "created": "2022-03-19T08:11:37.000Z",
      "registered": "2022-03-19T08:11:38.000Z",
      "published": "2002",
      "updated": "2025-08-27T02:08:12.000Z"
    },
    "relationships": {
      "enrichments": {
        "data": [
          {
            "doi": "10.12345/dataset.0203273",
            "source": "comet",
            "process": "10.82461/process.6789",
            "action": "update_child",
            "field": "titles",
            "originalValue": { "title": "Dataset" },
            "enrichedValue": { "title": "Nilpotence, radicaux et structures monoıdales" },
            "created": "2025-06-22T10:00:00Z",
            "updated": "2025-06-22T10:00:00Z",
            "produced": "2025-06-22T10:00:00Z"
          }
        ]
      },
      "client": {
        "data": { "id": "arxiv.content", "type": "clients" }
      },
      ...[Snipped for length]...
    }
  }
}

Next steps
As we continue to develop our approach to community-based enrichment processes, we’d love to hear your feedback and suggestions. We invite you to share input via this survey or email. Input is welcome at any time, and we will keep this discussion updated as we move forward.