harmonize formats for metadata schema and dataset creation #4451

Closed
pameyer opened this issue Feb 2, 2018 · 9 comments
Labels: Feature: Metadata; Type: Suggestion; User Role: Superuser

Comments

@pameyer
Contributor

pameyer commented Feb 2, 2018

User stories:

  • As a repository administrator / curator / metadata person, I would like to deal with hierarchical metadata schema in a way that is less awkward than a spreadsheet.
  • As a curator / metadata person, I would like to be able to configure which metadata fields should be included in particular export formats without having to do additional development.
  • As a data depositor, I would like there to be a clearer relationship between the definition of the metadata schema for my dataset and the API calls to create that dataset.
  • As an external developer / data depositor (i.e., someone developing things that interoperate with Dataverse), I would like to be able to verify that my attempt to create or edit a dataset via API will match what Dataverse is expecting before making an API call and having it fail.
  • As a metadata person / Dataverse developer, I would like to have the metadata crosswalk handled programmatically (rather than requiring a person to keep TSV files in sync with Google spreadsheets).
  • As a researcher, I would like to reduce the amount of development time required for Dataverse to provide metadata to newly developed/discovered systems, and to adapt to updated versions of existing metadata schemas.
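
To make the "verify before calling the API" story concrete, here is a minimal sketch of client-side validation against a hierarchical schema. Everything in it is an assumption for illustration: the `SCHEMA` layout, its keys (`type`, `required`, `multiple`, `children`), and the `validate` helper are hypothetical and do not reflect an existing Dataverse format or API.

```python
from typing import Any

# Hypothetical hierarchical schema: the keys and layout are illustrative
# only, not Dataverse's actual metadata block format.
SCHEMA = {
    "title": {"type": "text", "required": True, "multiple": False},
    "author": {
        "type": "compound",
        "required": True,
        "multiple": True,
        "children": {
            "authorName": {"type": "text", "required": True, "multiple": False},
            "authorAffiliation": {"type": "text", "required": False, "multiple": False},
        },
    },
}


def validate(record: dict[str, Any], schema: dict[str, Any], path: str = "") -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    # Flag fields the schema does not know about.
    for name in record:
        if name not in schema:
            errors.append(f"{path}{name}: unknown field")
    for name, spec in schema.items():
        where = f"{path}{name}"
        if name not in record:
            if spec["required"]:
                errors.append(f"{where}: required field is missing")
            continue
        value = record[name]
        if spec["multiple"]:
            if not isinstance(value, list):
                errors.append(f"{where}: expected a list of values")
                continue
            values = value
        else:
            values = [value]
        for i, item in enumerate(values):
            item_path = f"{where}[{i}]" if spec["multiple"] else where
            if spec["type"] == "compound":
                # Compound fields recurse into their child definitions.
                if isinstance(item, dict):
                    errors.extend(validate(item, spec["children"], item_path + "."))
                else:
                    errors.append(f"{item_path}: expected an object")
            elif not isinstance(item, str):
                errors.append(f"{item_path}: expected text")
    return errors
```

A depositor could run `validate(payload, SCHEMA)` locally and only send the request when the returned list is empty.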

A first step towards addressing this could be replacing the TSV files used for metadatablocks / DatasetFieldTypes with a file format supporting hierarchical structures (JSON, YAML, XML), and updating the Dataverse APIs that read them (without adding any new information at this stage). The format should be one that can validate dataset creation/edit API input files, either directly or after an easy transformation.
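
As a rough sketch of what this first step might look like, the snippet below turns flat tab-separated rows into a nested JSON structure. It assumes the flat file expresses nesting through a `parent` column; the three columns shown are a simplified, hypothetical stand-in rather than the full metadata block TSV layout, and `nest_fields` is an invented helper name.

```python
import csv
import io
import json

# Simplified, hypothetical stand-in for a metadata block TSV: real files
# have more columns and sections than shown here.
FLAT_TSV = (
    "name\tfieldType\tparent\n"
    "title\ttext\t\n"
    "author\tnone\t\n"
    "authorName\ttext\tauthor\n"
    "authorAffiliation\ttext\tauthor\n"
)


def nest_fields(tsv_text: str) -> list[dict]:
    """Turn flat rows with a 'parent' column into a tree of field dicts."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    # Build one node per row first so children can be attached in any order.
    nodes = {
        row["name"]: {"name": row["name"], "fieldType": row["fieldType"], "children": []}
        for row in rows
    }
    roots = []
    for row in rows:
        node = nodes[row["name"]]
        parent = row["parent"]
        if parent:
            nodes[parent]["children"].append(node)  # KeyError if the parent is undefined
        else:
            roots.append(node)
    return roots


print(json.dumps(nest_fields(FLAT_TSV), indent=2))
```

The nested output makes the parent/child relationship explicit in the file itself instead of leaving it implied by a column value.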

  • For a second step, define fields in the metadata schema format to indicate a metadata element should be exported (or potentially a more detailed indication about which types of exporters the element should be sent to), and indicate which other metadata schema this element corresponds to.
  • For a third step, places where metadata is currently exported (HTML meta tags, schema.org, DataCite/EZID DOI registration / Handle registration?, OAI-PMH feeds, others?) would be modified to use the provided information (i.e., remove hard-coding).
  • For a fourth step, new (or updated) dataset creation/edit/export APIs should be made compatible with the schema in the first step.
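
The second and third steps could be sketched roughly as follows: each field carries its own export flags and crosswalk targets, and a generic exporter reads them instead of hard-coding the mapping. The `exportTo` and `crosswalk` keys, the `export_record` helper, and the target element names are all hypothetical and chosen only for illustration.

```python
# Hypothetical schema annotations: "exportTo" and "crosswalk" are invented
# keys for illustration, and the target element names are examples only.
FIELDS = [
    {
        "name": "title",
        "exportTo": ["schema.org", "datacite"],
        "crosswalk": {"schema.org": "name", "datacite": "title"},
    },
    {
        "name": "dsDescription",
        "exportTo": ["schema.org"],
        "crosswalk": {"schema.org": "description"},
    },
    {"name": "internalNote", "exportTo": [], "crosswalk": {}},
]


def export_record(values: dict[str, str], fields: list[dict], target: str) -> dict[str, str]:
    """Map dataset values to a target schema using only the schema annotations."""
    out: dict[str, str] = {}
    for field in fields:
        # Skip fields that are not flagged for this export target.
        if target not in field["exportTo"]:
            continue
        if field["name"] in values:
            out[field["crosswalk"][target]] = values[field["name"]]
    return out


values = {"title": "My data", "dsDescription": "About", "internalNote": "secret"}
print(export_record(values, FIELDS, "schema.org"))
```

With this shape, supporting a new export target would mean adding annotations to the schema file rather than writing a new hard-coded exporter.
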
@pdurbin
Member

pdurbin commented May 29, 2018

I keep thinking that if we tweak our TSV files a bit, they'll look nice on GitHub, but isaacs/github#848 has convinced me that it won't be easy to make them "beautiful and searchable":

[Image: screenshot, 2018-05-29, 8:37 am]

@pdurbin
Member

pdurbin commented Jun 19, 2018

#3168 is related in the sense that perhaps we should document our crazy TSV format before we switch to JSON or YAML or XML or whatever.

@jggautier
Contributor

Just adding that #5960 is related to some of the points in this issue.

@pdurbin
Member

pdurbin commented Feb 27, 2020

Related:

@BPeuch
Contributor

BPeuch commented Mar 23, 2020

I have a question about metadata customization. We have created a new metadata block for our Dataverse installation at the State Archives. For now we can rely on it appearing in the JSON metadata output, but we would also like to see it included in the DDI output (where it would follow the standard by mapping to DDI 2.5 approved fields).

pameyer wrote:

  • As a curator / metadata person, I would like to be able to configure which metadata fields should be included in particular export formats without having to do additional development.

I believe this means that customizing metadata output is not yet possible but is under consideration. Is that correct?

@pdurbin
Member

pdurbin commented Apr 6, 2020

@BPeuch fields in custom metadata blocks can be exported in Dataverse's native JSON format but are not automatically exported in any other format (such as DDI). If you would like to see some new fields appear in DDI output you should open an issue about this. 😄

@BPeuch
Contributor

BPeuch commented Apr 7, 2020

Okay, I will. Thanks @pdurbin. So good to have you back aboard!

@qqmyers
Member

qqmyers commented May 4, 2020

Just reading metadata-related issues. FWIW: the OAI-ORE format automatically includes all the metadata in added metadata blocks, and associates each with whatever community vocab URL was specified in the block.

@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024