9 Document the identifiers you issue and use
Rule 9: Document the identifiers you issue and use
Permalink URI: https://w3id.org/id-rules/1
The global-scale identification cycle is a shared responsibility and provider/consumer roles often overlap in the context of data integration. Whether you issue your own identifiers or just reference those of others, you must document your identifier policies.
Supplemental Table S6 provides a set of questions that data providers and re-distributors can use to develop such documentation. Documentation should be published alongside and/or included in a dataset description, as outlined in the recommendations for Dataset Descriptions developed by the W3C Semantic Web in the Health Care and Life Sciences Interest Group. For examples of such documentation, see ChEMBL and Monarch; the format may vary.
Questions that good identifier documentation should answer
|Scope||Question to answer||Recommendation|
|Provider||What types of entities are identified, and what is the scope of these entities? (See Note 1)||Must include|
|Provider||What is your primary resolving namespace, if only one exists? If multiple, equally valid resolving namespaces co-exist, what are these? (e.g. INSDC.org has four such schemes as the entire dataset is fully represented by each of four authorities: NCBI, GenBank, ENA, and DDBJ)|| |
|Provider||Are you aware of any alternate URIs (e.g. different resolvers) that other groups use for your identifiers? (Even though alternates are not recommended for use, knowing which URIs are equivalent facilitates data integration.)||Should include|
|Provider||What is the prefix you wish others to use if they reference your entities in an abbreviated way? If this prefix is registered, where? What is the compact URI you wish others to use? (See Note 2)||Must include|
|Provider||What is your persistence policy regarding maintenance of the URIs? What is your persistence policy regarding the corresponding entities and metadata?||Must include|
|Provider||Can machine-readable representations of your entities be accessed? If so, where and in what formats?||Must include|
|Provider||What is the regular expression of your Local Resource Identifiers and URIs?||Strongly recommended|
|Provider||Are there relationships between your identifiers? Where are these described? (See Note 1)||Should include|
|Provider||Under what license are identifiers made available?||Should include|
|Provider||Does the lifecycle of the entities potentially include versioning, splitting, merging, or deprecation? How are these changes managed, communicated, and synchronized between those using that entity? (See Note 1)||Must include|
|Provider-Redistributor||Do you identify entities that are also identified by others? Who are these others? Where are these mappings found and who, if anyone, maintains them?||Strongly recommended|
|Provider-Redistributor||Do you reference identifiers that are issued by other authorities? If so, in what cases? How often are the identifiers synchronized?||Must include|
|Provider-Redistributor||If you reference identifiers that are issued by other authorities, what are the prefix-to-resolving-namespace mappings used? What is the source of these mappings (e.g. manual or identifier service)? Where can your mappings be found?||Must include|
Note 1: Adapted from the Metadata Recommendations For Linked Open Data Vocabularies
Note 2: If your LRIs already have a colon, make it clear to users what your preferred corresponding compact URI syntax is. We recommend referencing the LRI as if it were already a compact URI. For instance, in the case of GO:0007049, the prefix GO can be expanded to http://purl.obolibrary.org/obo/GO_ and prepended to the numeric fragment (the part after the colon) to yield http://purl.obolibrary.org/obo/GO_0007049, in accordance with their documentation.
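The expansion described in Note 2, together with the table's questions about regular expressions and prefix-to-resolving-namespace mappings, can be sketched in a few lines of Python. This is only an illustration: the names `PREFIX_MAP`, `LOCAL_ID_PATTERNS`, `expand_curie`, and `compact_uri` are hypothetical, and the seven-digit pattern for GO local identifiers is an assumption that should be checked against the provider's own documentation.

```python
import re

# Hypothetical prefix-to-resolving-namespace map. Only the GO entry is taken
# from Note 2; a real map would come from your documented mappings.
PREFIX_MAP = {"GO": "http://purl.obolibrary.org/obo/GO_"}

# Assumed regular expression for the local part of each identifier
# (seven digits for GO); confirm against the provider's documentation.
LOCAL_ID_PATTERNS = {"GO": re.compile(r"\d{7}")}


def expand_curie(curie: str) -> str:
    """Expand a compact URI such as 'GO:0007049' into a full URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        raise ValueError(f"not a compact URI (no colon): {curie!r}")
    if prefix not in PREFIX_MAP:
        raise ValueError(f"undocumented prefix: {prefix!r}")
    pattern = LOCAL_ID_PATTERNS.get(prefix)
    if pattern is not None and not pattern.fullmatch(local_id):
        raise ValueError(f"local identifier does not match pattern: {local_id!r}")
    return PREFIX_MAP[prefix] + local_id


def compact_uri(uri: str) -> str:
    """Abbreviate a full URI back to its compact form using the same map."""
    for prefix, namespace in PREFIX_MAP.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    raise ValueError(f"no documented namespace matches: {uri!r}")


if __name__ == "__main__":
    print(expand_curie("GO:0007049"))
    print(compact_uri("http://purl.obolibrary.org/obo/GO_0007049"))
```

Keeping the prefix map and the patterns as plain data is one way to let the same content be published as part of the identifier documentation and reused by consumers who need to validate or expand references.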