The Clearinghouse is intended to be an extensible repository of metadata application profiles, mappings, and related specifications that guide descriptive metadata conventions in digital repository collections. The Clearinghouse aims to make these example documents freely available as downloadable files, because web pages are subject to change and web links are subject to "reference rot."
The development of metadata guidelines is often a broad community task and may involve managers of services, experts in the materials being used, application developers, and potential end-users of the services. Just as often, however, the creation of metadata guidelines is an institutional task, designed to serve an organization’s or department’s specific needs. Diverse needs within the metadata community have resulted in a proliferation of metadata practices, even across institutions with common metadata needs. While this disparate implementation is inevitable, it has negative impacts on the community at large, including, but not limited to, hobbled interoperability and a lack of standardization in term and descriptor usage.
The mission of this Clearinghouse project is to provide the metadata community with a central hub of varied example approaches to metadata guidance, and with a way to share work with peers in that community. Whether creating an application profile from scratch or updating a legacy profile, it can be helpful to review the metadata guidelines of other institutions and, where it makes sense to do so, align local practices with community standards.
The starting scope for this project is digital repository descriptive metadata documentation and specifications. This initial groundwork does not preclude the inclusion of other metadata types or sources, and future iterations of this work may broaden its scope.
Potential users include anyone responsible for managing, maintaining, or analyzing digital collection metadata across a wide range of collection types, institutions, and communities. Metadata librarians, digital curators, archivists, preservation specialists, catalogers, and taxonomy/information management professionals are examples of the roles that may benefit from the Clearinghouse project. This work is intended for all audiences regardless of technical expertise.
Definition of Terms/Concepts
The Clearinghouse aims to present a broad overview of three different kinds of metadata guideline documents and procedures: application profiles (APs), mappings and crosswalks, and code.
An application profile is a document that outlines an institutional or consortial metadata schema practice. It defines metadata elements and properties, and delineates obligations and constraints for their use. An application profile also establishes context for metadata implementers and aggregators. The document provides a human-readable summary of a schema’s characteristics, which is critical for metadata assessment planning, review, and revision. These guidelines establish a foundation for developing an approach to metadata assessment by clearly specifying requirements, ranges (e.g., controlled vocabularies and/or data types), and permissible cardinality. Application profiles can also describe how external standards and schemas map to institutional metadata.
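To make the idea of obligations, ranges, and cardinality more concrete, the following is a minimal sketch of how a few application profile rules might be expressed and checked in Python. The element names, the vocabulary values, and the `ElementRule` and `validate` names are hypothetical and chosen only for illustration; they are not taken from any profile in the Clearinghouse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementRule:
    """One element of a hypothetical application profile."""
    name: str
    required: bool = False                  # obligation
    repeatable: bool = True                 # cardinality
    vocabulary: Optional[frozenset] = None  # range; None means free text


# Hypothetical profile with three elements (illustration only).
PROFILE = [
    ElementRule("title", required=True, repeatable=False),
    ElementRule("creator"),
    ElementRule("type", required=True,
                vocabulary=frozenset({"Text", "Image", "Sound"})),
]


def validate(record, profile=PROFILE):
    """Return a list of problems; an empty list means the record conforms.

    A record is assumed to be a dict mapping element names to lists of values.
    """
    problems = []
    known = {rule.name for rule in profile}
    for rule in profile:
        values = record.get(rule.name, [])
        if rule.required and not values:
            problems.append(f"{rule.name}: required element is missing")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"{rule.name}: element is not repeatable")
        if rule.vocabulary is not None:
            for value in values:
                if value not in rule.vocabulary:
                    problems.append(
                        f"{rule.name}: '{value}' is not in the controlled vocabulary"
                    )
    # Elements that the profile does not define are reported as well.
    for name in record:
        if name not in known:
            problems.append(f"{name}: element is not defined in the profile")
    return problems
```

Calling `validate({"title": ["A"], "type": ["Text"]})` returns an empty list, while a record with two titles or an unlisted type value yields one message per violated rule. A real profile would normally carry more detail (definitions, usage notes, mappings) than this sketch models.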
Mappings and Crosswalks
Mappings/crosswalks describe the "translation" of metadata from one distinct format, model, or standard to another. Mapping supports data transformation (or data mediation) between a source and a destination for migration, metadata harvesting, or record exchange, and it helps identify redundant data for consolidation or elimination. Mappings and crosswalks are often carried out with transformative tools (see next section), but they require clear and precise definitions of the elements in each standard before transformation. Mappings/crosswalks can be a part of APs or can stand alone as separate documents.
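As a rough illustration of a crosswalk in practice, the sketch below maps a handful of local field names to Dublin Core elements and applies that mapping to a single record. The local field names, the sample record, and the `apply_crosswalk` function are assumptions made for this example, not part of any Clearinghouse document.

```python
# Hypothetical crosswalk: local source field -> target Dublin Core element.
# Two source fields may map to the same target element.
CROSSWALK = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Photographer": "dc:creator",
    "Date Created": "dc:date",
}


def apply_crosswalk(source_record, crosswalk=CROSSWALK):
    """Translate one flat source record into the target element set.

    Returns a tuple (target_record, unmapped_fields). Target values are
    lists because several source fields can feed one target element.
    """
    target = {}
    unmapped = []
    for field, value in source_record.items():
        target_field = crosswalk.get(field)
        if target_field is None:
            # Keep track of fields the crosswalk does not cover so they
            # can be reviewed rather than silently dropped.
            unmapped.append(field)
            continue
        target.setdefault(target_field, []).append(value)
    return target, unmapped
```

For a record such as `{"Title": "Harbor at dusk", "Author": "A. Writer", "Photographer": "B. Lens", "Shelf": "12"}`, both name fields land in `dc:creator` and `Shelf` is reported as unmapped. Reporting unmapped fields is one possible design choice; it reflects the point above that element definitions need to be settled before transformation.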
For the purposes of the Clearinghouse, “code” refers to transformative tools, such as scripts, programs, or other transformation methods, that convert metadata from one format or style to another.
Documents providing best-practice guidelines for metadata.
Clearinghouse Data Standards and Types of Documents
Documentation, despite our best efforts, is always changing and never perfect; therefore, contributions to the Clearinghouse do not need to be an institution’s ideal or even current working practices, but rather instructive examples that others in the community can look to as they gather ideas for their own strategic metadata guidelines and decisions. Working documents are actively encouraged for inclusion as static snapshots (such as PDFs), but others are welcome in the following broad, preferred file formats:
Documents either published originally as a PDF document, or exported as a PDF or another format to generate a static snapshot of a living document (e.g., web pages, text files, spreadsheets).
Documents published originally by an institution as a text document (e.g., plain/rich text; open/proprietary formats, such as ODF and Microsoft Word).
Structured or encoded data files (e.g., spreadsheets, comma/tab-separated files, JSON, XML, XSL). Bear in mind that some file formats, like Microsoft Excel (.xls, .xlsx), contain embedded formatting that may hinder sharing across platforms.
Scripts, crosswalks, and other transformational tools. Using common coding languages minimizes the risk of limited interoperability.
Archival file formats (ZIP, RAR, TAR) that facilitate the bulk upload of many documents and/or tools.