3 Document Prefix and Namespace

Stian Soiland-Reyes edited this page Mar 12, 2016 · 5 revisions

Rule 3: Help local identifiers travel well by documenting Prefix and Namespace

Permalink URI: https://w3id.org/id-rules/3


Data does not live in silos: it is reused, broken into parts and integrated with other data, most notably in database external references (aka “XRefs”), in the Semantic Web, and in publications (articles and research datasets). The Local Resource Identifier (Box 2) alone is insufficient for these tasks because it is only guaranteed to be locally unique. For instance, the LRI “9606” corresponds to numerous entities whose local accessions are based on simple digits, including a PubMed article, a CGNC gene, a PubChem chemical, an NCBI taxon, a BOLD taxon, and a GRIN taxon.

Despite its vulnerabilities, the location-based identifier scheme (URI, Box 2) is the best available identifier form for machine-driven global data integration because a) it is widely adopted and b) its uniqueness is ensured by a single well-established name-granting process (DNS). Juty et al. [10] summarise why name-based global identifier schemes (e.g. URNs) have had poor uptake by comparison.

The length of URIs (Box 2) can make them unwieldy for tasks involving human readability, even within structured machine-parsable documents[11]. Compact URIs (CURIEs[12], Box 1) are a well-established convention in such contexts (e.g. JSON-LD and RDFa) because they let URIs be read and referenced conveniently. CURIEs complement URIs rather than replace them. Therefore, document the prefix that others may use to abbreviate your identifiers for human readability, wherever needed. If you are a database provider, it is in your best interests to document a) the prefix (Box 2) that you would like others to use and b) its binding to a resolving namespace (Box 2). Your chosen prefix should be unique, at least among datasets that are likely to be used in the same context. To facilitate this, we strongly recommend that you register your prefix and resolving namespace; different prefix and namespace registries may be suitable depending on the kind of data.
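As a minimal sketch of what a documented prefix-to-namespace binding makes possible, the Python below expands a CURIE into a full URI and compacts a URI back into a CURIE. The prefixes `taxon` and `pmid`, the `example.org` namespaces, and the function names are hypothetical placeholders chosen for illustration; they are not registered values, and a real provider would publish its own.

```python
# Hypothetical prefix -> namespace bindings. The prefixes and the
# example.org namespaces are illustrative assumptions only.
PREFIX_MAP = {
    "taxon": "http://example.org/taxonomy/",
    "pmid": "http://example.org/pubmed/",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand 'prefix:local_id' into a full URI using the documented binding."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        # A bare local identifier such as '9606' is ambiguous without a prefix.
        raise ValueError(f"not a CURIE (no prefix): {curie!r}")
    if prefix not in prefix_map:
        raise ValueError(f"undocumented prefix: {prefix!r}")
    return prefix_map[prefix] + local_id


def compact_uri(uri, prefix_map=PREFIX_MAP):
    """Abbreviate a URI to a CURIE, preferring the longest matching namespace."""
    matches = [(ns, prefix) for prefix, ns in prefix_map.items()
               if uri.startswith(ns)]
    if not matches:
        raise ValueError(f"no documented namespace matches: {uri!r}")
    namespace, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(namespace):]}"


if __name__ == "__main__":
    # The same local identifier resolves to different URIs per prefix.
    print(expand_curie("taxon:9606"))
    print(expand_curie("pmid:9606"))
    print(compact_uri("http://example.org/taxonomy/9606"))
```

Under these assumed bindings, the local identifier “9606” stays ambiguous on its own but becomes globally distinguishable once paired with a prefix, which is why the binding needs to be documented and, ideally, registered.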

The language of this section was edited from the original to provide more natural hyperlinks.