This repository has been archived by the owner on Jan 5, 2021. It is now read-only.

Metadata

Nathan Tallman edited this page Nov 4, 2019 · 14 revisions

Basic Principles

  • Resources are described and documented by metadata that is defined by work type schemas.
  • The data dictionary contains all possible fields that resources might have and defines the properties of each field.
  • Some fields are designated as core fields. All work and collection resources have these fields, regardless of their work type.
  • Work type metadata schemas pull fields from the data dictionary and define customizations needed for different classes of resources.
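
The relationship between the data dictionary, core fields, and work type schemas can be sketched roughly as follows. This is only an illustration in Python, not CHO's implementation: the names `Field`, `DATA_DICTIONARY`, `CORE_FIELDS`, and `build_schema` are hypothetical, and the choice of which fields are core, repeating, or required is assumed for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    """One data dictionary entry (hypothetical shape)."""
    label: str
    repeating: bool = False
    required: bool = False


# Hypothetical data dictionary: every field a resource might have.
DATA_DICTIONARY = {
    "title": Field("title", required=True),
    "access_rights": Field("access_rights", required=True),
    "creator": Field("creator", repeating=True),
    "alternate_ids": Field("alternate_ids", repeating=True),
}

# Assumed for illustration: fields every work and collection carries.
CORE_FIELDS = {"title", "access_rights"}


def build_schema(extra_fields, overrides=None):
    """Build a work type schema from core fields plus extra fields.

    Fields are pulled from the data dictionary; `overrides` maps a field
    name to property changes that apply only to this work type.
    """
    overrides = overrides or {}
    schema = {}
    for name in sorted(CORE_FIELDS | set(extra_fields)):
        field = DATA_DICTIONARY[name]  # KeyError if not in the dictionary
        schema[name] = replace(field, **overrides.get(name, {}))
    return schema


# A hypothetical work type that requires a creator.
example_schema = build_schema(
    ["creator", "alternate_ids"],
    overrides={"creator": {"required": True}},
)
```

In this sketch a customization changes only the copy of the field held by the work type schema; the data dictionary entry itself stays the same.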

MVP

  • During MVP, metadata is either editable (descriptive) or non-editable (administrative, technical, rights, etc.).
  • External controlled vocabularies will be pulled from their source on a regular basis and cached in CHO for MVP. (1.x will use a triplestore.)
  • Local controlled vocabularies will be managed outside of CHO and imported/exported as CSV files. A term must be indexed for a resource to be created.
  • Metadata for works can be created and updated in batch with CSV spreadsheets. These requirements are in flux and are best documented in the batch file specifications.

Default behavior

  • If no titles are included (work or file set), the filename is used.
  • CSV headers must use the data dictionary label for field names.
    • Note: alternate_ids = Identifier
  • batch_id is not a CHO field; it is assigned by the Digital Production Team. Most batch_id values should end in YYYY-MM-DD.
  • Repeating fields are entered in the same cell, delimited with a double pipe (||) with no spaces around it.
  • The creator, which is repeating, may also contain creator roles. Roles are delimited from their creator with a single pipe |.
    • e.g. Doe, John|au||Doe, Jane|ill
  • Some file sets and files are in a bag but are not assigned metadata, listed in a CSV, or assigned an identifier:
    • Representative file sets
    • _media files
  • As of 2019-06, the access_rights field name cannot be changed without breaking the application. The application uses this field to control access to collections and resources via application roles. This strategy may change in the future.
  • Only an All Fields search searches extracted text; extracted text is excluded from all other searches.
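
The cell conventions above (double pipe between repeated values, single pipe between a creator and its role, filename as the fallback title) could be parsed along the following lines. This is a minimal sketch in Python, not the application's own code; the function names `split_repeating`, `parse_creators`, and `default_titles` are hypothetical, and representing a creator as a dict with `name` and `role` keys is an assumption.

```python
def split_repeating(cell):
    """Split a CSV cell into its repeated values on the double pipe.

    Values are not stripped, because the convention is that no spaces
    surround the delimiter.
    """
    if not cell:
        return []
    return cell.split("||")


def parse_creators(cell):
    """Parse a creator cell into name/role pairs.

    A role, when present, follows the creator after a single pipe.
    """
    creators = []
    for value in split_repeating(cell):
        name, _, role = value.partition("|")
        creators.append({"name": name, "role": role or None})
    return creators


def default_titles(title_cell, filename):
    """Return the titles from the cell, or the filename if none are given."""
    titles = split_repeating(title_cell)
    return titles if titles else [filename]


creators = parse_creators("Doe, John|au||Doe, Jane|ill")
# creators == [{"name": "Doe, John", "role": "au"},
#              {"name": "Doe, Jane", "role": "ill"}]
```

Splitting on the double pipe first and only then on the single pipe keeps the two delimiters from interfering with each other.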