editorial work on first c.300 lines
pwin committed Mar 15, 2019
1 parent 4a85130 commit 37c98af
Showing 1 changed file with 30 additions and 29 deletions.
59 changes: 30 additions & 29 deletions dcat/index.html
@@ -39,35 +39,34 @@
This can increase the discoverability of datasets and data services.
It also makes it possible to have a decentralized approach to publishing data catalogs and makes federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure.
Aggregated DCAT metadata can serve as a manifest file as part of the digital preservation process.</p>
<p style="text-align: center;">The namespace for DCAT terms is <code>http://www.w3.org/ns/dcat#</code></p>
<p style="text-align: center;">The suggested prefix for the DCAT namespace is <code>dcat</code></p>
<p style="text-align: center;">The (revised) DCAT vocabulary is available <a href="https://raw.githubusercontent.com/w3c/dxwg/gh-pages/dcat/rdf/dcat.ttl">here</a>.</p>
<p style="text-indent: 100px;">The namespace for DCAT terms is <code>http://www.w3.org/ns/dcat#</code></p>
<p style="text-indent: 100px;">The suggested prefix for the DCAT namespace is <code>dcat</code></p>
<p style="text-indent: 100px;">The (revised) DCAT vocabulary is available <a href="https://raw.githubusercontent.com/w3c/dxwg/gh-pages/dcat/rdf/dcat.ttl">here</a>.</p>
</section>

<section id="sotd">

<p>
Since the <a href="https://www.w3.org/TR/2018/WD-vocab-dcat-2-20180508/">First Public Working Draft</a>, the main changes to the DCAT vocabulary have been:</p>
<ul>
<li>addition of a <a href="#Class:Resource"><code>dcat:Resource</code></a> class for representing any resource than can be included in the catalog, this is
<li>addition of a <a href="#Class:Resource"><code>dcat:Resource</code></a> class for representing any asset that can be included in the catalog; this is
now the super-class of <a href="#Class:Dataset"><code>dcat:Dataset</code></a></li>
<li>addition of <a href="#Class:Data_Service"><code>dcat:DataService</code></a>, as a sub-class of <a href="#Class:Resource"><code>dcat:Resource</code></a>, to support cataloguing service end-points providing access to resources</li>
<li>addition of <a href="#Class:Data_Service"><code>dcat:DataService</code></a>, as a sub-class of <a href="#Class:Resource"><code>dcat:Resource</code></a>, to support cataloguing service end-points providing access to data assets</li>
<li>addition of <a href="#Class:Data_Distribution_Service"><code>dcat:DataDistributionService</code></a>, as a sub-class of <a href="#Class:Data_Service"><code>dcat:DataService</code></a>,
representing service end-points providing access to datasets through their distributions, respectively </li>
<li>addition of ways to representing <a href="#bag-of-files">loosely structured catalogs</a>, where there is no distinction between a dataset and its distributions</li>
<li>more details for the ways of representing <a href="#examples-dataset-provenance">dataset provenance</a> and <a href="#quality-information">dataset quality</a></li>
representing service end-points providing access to datasets through their distributions </li>
<li>addition of ways to represent <a href="#bag-of-files">loosely structured catalogs</a>, where there is no distinction between a dataset and its distributions</li>
<li>more details for the ways of representing <a href="#examples-dataset-provenance">dataset provenance</a> and <a href="#quality-information">quality</a></li>
<li>an <a href="#dcat-sdo">alignment</a> between the DCAT vocabulary and the schema.org vocabulary</li>
</ul>


<p>
The detailed differences between the two documents can be seen <a href="https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FTR%2F2018%2FWD-vocab-dcat-2-20180508%2F&doc2=https%3A%2F%2Fwww.w3.org%2FTR%2F2018%2FWD-vocab-dcat-2-20181011%2F">here</a>
and the list of all the changes since the previous version of DCAT in the <a href="#changes">Change History</a> section.
The detailed differences between the two documents can be seen <a href="https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.w3.org%2FTR%2F2018%2FWD-vocab-dcat-2-20180508%2F&doc2=https%3A%2F%2Fwww.w3.org%2FTR%2F2018%2FWD-vocab-dcat-2-20181011%2F">here</a>.
</p>

<h3 id="dcat_history">DCAT history</h3>
<p>The original DCAT vocabulary was developed and <a href="http://vocab.deri.ie/dcat">hosted</a> at the Digital Enterprise Research Institute (DERI), then refined by the <a href="http://www.w3.org/egov/">eGov Interest Group</a>, and finally standardized in 2014 [[?VOCAB-DCAT-20140116]] by the <a href="http://www.w3.org/2011/gld/">Government Linked Data (GLD)</a> Working Group.</p>
<p>This revised version of DCAT was developed by the <a href="https://www.w3.org/2017/dxwg/">Dataset Exchange Working Group</a> in response to a new set of Use Cases and Requirements [[?DCAT-UCR]] submitted on the basis of experience with the DCAT vocabulary from the time of the original version, and new applications not originally considered. A summary of the changes from [[?VOCAB-DCAT-20140116]] can be found at <a href="#changes">Change History</a></p>
<p>This revised version of DCAT was developed by the <a href="https://www.w3.org/2017/dxwg/">Dataset Exchange Working Group</a> in response to a new set of Use Cases and Requirements [[?DCAT-UCR]] gathered from people's experience with the DCAT vocabulary since the original version, and from new applications that were not considered in the first version. A summary of the changes from [[?VOCAB-DCAT-20140116]] can be found at <a href="#changes">Change History</a>.</p>

<h3 id="external_terms">External terms</h3>
<p>DCAT incorporates terms from pre-existing vocabularies where stable terms with appropriate meanings could be found, such as <a href="http://xmlns.com/foaf/0.1/homepage">foaf:homepage</a> and <a href="http://purl.org/dc/terms/title">dct:title</a>.
@@ -86,14 +85,14 @@ <h2>Introduction</h2>
<p>Sharing data resources among different organizations, researchers, governments and citizens requires the provision of metadata.
This is irrespective of the data being open or not.
DCAT is a vocabulary for publishing data catalogs on the Web, which was originally developed in the context of government data catalogs
such as <a href="https://www.data.gov/">data.gov</a> and <a href="https://data.gov.uk">data.gov.uk</a>, but it has also been used in other contexts.
such as <a href="https://www.data.gov/">data.gov</a> and <a href="https://data.gov.uk">data.gov.uk</a>, but it is also applicable and has been used in other contexts.
</p>

<p>
This revision of DCAT has extended the previous version to support further use cases and requirements [[?DCAT-UCR]].
These include the possibility of cataloguing other data resources in addition to
datasets, such as data services. The revision also supports describing relationships between datasets as well as between
datasets and other catalogued resources, guidance on how to document licenses and rights statements associated with the catalogued items.
datasets and other catalogued resources. Guidance on how to document licenses and rights statements associated with the catalogued items is provided.
</p>

<p>
@@ -112,31 +111,31 @@ <h2>Introduction</h2>


<p>
Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various specialty formats.
DCAT does not make any assumptions about the serialization format of the datasets described in a catalog, but it does
Data described in a catalog can come in many formats, ranging from spreadsheets, through XML and RDF to various specialized formats.
DCAT does not make any assumptions about the serialization formats of the datasets, but it does
distinguish between the abstract dataset and its different manifestations or distributions.
</p>

<p>
Data is often provided through a service, accessed through a form or API which supports selection of an extract, sub-set, or combination of data.
Data is often provided through a service accessed through a form or API which supports selection of an extract, sub-set, or combination of data.
DCAT allows the description of a data access service to be included in a catalog.
</p>

<p>
Complementary vocabularies can be used together with DCAT to provide more detailed format-specific information.
For example, properties from the VoID vocabulary [[?VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.
For example, properties from the VoID vocabulary [[?VOID]] can be used with DCAT to express various statistics about a dataset if that dataset is in RDF format.
</p>

<p>
This document does not prescribe any particular method of deploying data expressed in DCAT.
DCAT is applicable in many contexts including RDF accessible via SPARQL endpoints, embedded in HTML pages as [[?HTML-RDFa]], or serialized as RDF/XML [[?RDF-SYNTAX-GRAMMAR]], [[?N3]], [[?Turtle]], [[?JSON-LD]] or other formats.
This document does not prescribe any particular method of deploying data catalogs expressed in DCAT.
DCAT information can be presented in many forms including RDF accessible via SPARQL endpoints, embedded in HTML pages as [[?HTML-RDFa]], or serialized as RDF/XML [[?RDF-SYNTAX-GRAMMAR]], [[?N3]], [[?Turtle]], [[?JSON-LD]] or other formats.
Within this document the examples use Turtle because of its readability.
</p>
</section>
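
The distinction drawn above between a catalog, an abstract dataset and its distributions can be illustrated with a short sketch. This is not part of the commit or of the specification: it is a minimal Python helper, using only string formatting, that emits a Turtle description in the `dcat` namespace given earlier. The function name `turtle_catalog`, the `https://example.org/` identifiers and the `dataset-001` local names are assumptions made up for illustration; `dcat:dataset`, `dcat:distribution`, `dcat:downloadURL` and `dct:title` are used here on the understanding that they are the usual DCAT and Dublin Core terms for these relationships.

```python
# Illustrative sketch only: every IRI under https://example.org/ is hypothetical.
DCAT_NS = "http://www.w3.org/ns/dcat#"   # namespace stated in the document
DCT_NS = "http://purl.org/dc/terms/"     # Dublin Core terms (dct:title)


def escape_literal(text):
    """Escape backslashes and double quotes for a short Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def turtle_catalog(base, dataset_title, download_url):
    """Return Turtle text for one catalog, one dataset and one distribution.

    `base` is a hypothetical IRI prefix ending in '/'; the local names
    'catalog', 'dataset-001' and 'dataset-001-csv' are made up.
    """
    lines = [
        f"@prefix dcat: <{DCAT_NS}> .",
        f"@prefix dct: <{DCT_NS}> .",
        "",
        # The catalog lists the dataset it describes.
        f"<{base}catalog> a dcat:Catalog ;",
        f"    dcat:dataset <{base}dataset-001> .",
        "",
        # The dataset is the abstract resource.
        f"<{base}dataset-001> a dcat:Dataset ;",
        f'    dct:title "{escape_literal(dataset_title)}" ;',
        f"    dcat:distribution <{base}dataset-001-csv> .",
        "",
        # The distribution is one accessible form, here a downloadable file.
        f"<{base}dataset-001-csv> a dcat:Distribution ;",
        f"    dcat:downloadURL <{download_url}> .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(turtle_catalog(
        "https://example.org/",
        "Example dataset",
        "https://example.org/files/001.csv",
    ))
```

The helper deliberately avoids an RDF library so that it runs anywhere; a real catalog would more likely be produced with an RDF toolkit and validated against the vocabulary.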

<section id="motivation" class="informative"><h2>Motivation for change</h2>
<p>The original Recommendation [[?VOCAB-DCAT-20140116]], published in January 2014, provided the basic framework for describing datasets. Importantly, it made the distinction between a dataset as an abstract idea and a distribution as a manifestation of the dataset. Although DCAT has been widely adopted, it has become clear that the original specification lacked a number of essential features that were added either through application profiles, such as the European Commission's DCAT-AP [[?DCAT-AP]], or the development of larger vocabularies that, to a greater or lesser extent, built upon the base standard, such as the Healthcare and Life Sciences Community Profile [[?HCLS-Dataset]], the Data Tag Suite [[?DATS]] and more. This version of DCAT has been developed to address the specific shortcomings that have come to light through the experiences of different communities, the aim being, of course, to improve interoperability between the outputs of these larger vocabularies.
For example, in this new DCAT version, we provide classes, properties and guidance to address <a href="#Dereferenceable-identifiers">identifiers</a>, <a href="#quality-information">quality information</a>, <a href="#data-citation">data citation</a> issues.</p>
<p>The original Recommendation [[?VOCAB-DCAT-20140116]] published in January 2014 provided the basic framework for describing datasets. It made an important distinction between a <i>dataset</i> as an abstract idea and a <i>distribution</i> as a manifestation of the dataset. Although DCAT has been widely adopted, it has become clear that the original specification lacked a number of essential features that were added either through the mechanism of an application profile, such as the European Commission's DCAT-AP [[?DCAT-AP]], or the development of larger vocabularies that to a greater or lesser extent built upon the base standard, such as the Healthcare and Life Sciences Community Profile [[?HCLS-Dataset]], the Data Tag Suite [[?DATS]] and more. This revision of DCAT has been developed to address the specific shortcomings that have come to light through the experiences of different communities, the aim being to improve interoperability between the outputs of these larger vocabularies.
For example, in this new DCAT version we provide classes, properties and guidance to address <a href="#Dereferenceable-identifiers">identifiers</a>, <a href="#quality-information">dataset quality information</a>, and <a href="#data-citation">data citation</a> issues.</p>
<p>This draft includes re-writing of the specification throughout. Significant changes from the 2014 Recommendation are marked within the text using "Note" sections, as well as being described in the <a href="#changes">Change History</a>.</p>

</section>
@@ -179,14 +178,14 @@ <h2 >Namespaces</h2>
<ul>
<li> Access to data is organized into datasets, distributions, and data-services. </li>
<li> An RDF description of the catalog itself and its datasets, distributions, and data-services is available (but the choice of
RDF syntaxes, access protocols, and access policies is not mandated by this specification).</li>
<li> The contents of all metadata fields that are held in the catalog, and that contain data about the catalog itself and its datasets, distributions, and data-services, are included in this RDF description, expressed using the appropriate classes and properties from DCAT, except where no such class or property exists.</li>
RDF syntax, access protocol, and access policy are not mandated by this specification).</li>
<li> The contents of all metadata fields that are held in the catalog and that contain data about the catalog itself and its datasets, distributions, and data-services, are included in this RDF description and are expressed using the appropriate classes and properties from DCAT, except where no such class or property exists.</li>
<li> All classes and properties defined in DCAT are used in a way consistent with the semantics declared in this specification.</li>
<li>DCAT-compliant catalogs <em title="MAY" class="rfc2119">MAY</em> include additional non-DCAT metadata fields and additional RDF data in the catalog's RDF description.</li>
</ul>

<p>
A <strong>DCAT profile</strong> is a specification for data catalogs that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile <em title="MAY" class="rfc2119">MAY</em> include:
A <strong>DCAT profile</strong> is a specification for a data catalog that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile <em title="MAY" class="rfc2119">MAY</em> include:
</p>
<ul>
<li> Cardinality constraints, including a minimum set of required metadata fields </li>
@@ -215,7 +214,7 @@ <h3>DCAT scope</h3>
DCAT is based around eight main classes:</p>
<ul>
<li>
<a href="#Class:Catalog"><code>dcat:Catalog</code></a> represents a catalog, which is a dataset in which each individual item is a metadata record describing some resource; the scope of <code>dcat:catalog</code> is collections of metadata about <b>datasets</b> or <b>data services</b>.
<a href="#Class:Catalog"><code>dcat:Catalog</code></a> represents a catalog, which is a dataset in which each individual item is a metadata record describing some resource; the scope of <code>dcat:Catalog</code> is collections of metadata about <b>datasets</b> or <b>data services</b>.
</li>
<li>
<a href="#Class:Resource"><code>dcat:Resource</code></a> represents an individual item in a catalog.
@@ -226,18 +225,20 @@ <h3>DCAT scope</h3>
<li>
<a href="#Class:Dataset"><code>dcat:Dataset</code></a> represents a dataset in a catalog.
A dataset is a collection of data, published or curated by a single agent.
Data comes in many forms, including numbers, words, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
Data comes in many forms including numbers, words, pixels, imagery, sound and other multi-media, and potentially other types, any of which might be collected into a dataset.
</li>
<li>
<code><a href="#Class:Distribution">dcat:Distribution</a></code> represents an accessible form of a dataset, for example a downloadable file.
<code><a href="#Class:Distribution">dcat:Distribution</a></code> represents an accessible form of a dataset such as a downloadable file.
</li>
<li>
<a href="#Class:Data_Service"><code>dcat:DataService</code></a> represents a data service in a catalog.
A data service is a collection of operations, accessible through an interface (API) that provide access to one or more datasets or data processing functions.
A data service is a collection of operations accessible through an interface (API) that provide access to one or more datasets or data processing functions.
</li>
<!--
<li>
<a href="#Class:Data_Distribution_Service"><code>dcat:DataDistributionService</code></a> represents a kind of data service that provides access to distributions of one or more datasets or extracts of datasets.
</li>
-->
<!--
<li>
<a href="#Class:DataTransformationService"><code>dcat:DataTransformationService</code></a> represents a service that can transform a dataset, e.g. spatial coordinate transformation; interpolation or resampling of a dataset.
@@ -258,7 +259,7 @@ <h3>DCAT scope</h3>
</figure>
<p class="note">
Along with the rest of the <a href="#vocabulary-overview">Vocabulary overview</a>, this diagram is <b>non-normative</b>.
Furthermore, while the diagram uses UML-style class notation, it should be interpreted following the usual RDF open-world assumptions around the presence/absence of properties, relationships, and their cardinality.
Furthermore, while the diagram uses UML-style class notation it should be interpreted following the usual RDF open-world assumptions around the presence/absence of properties, relationships, and their cardinality.
The properties shown in each class reflect those recommended in the descriptions of classes in the <a href="#vocabulary-specification">Vocabulary specification</a>.
To assist in understanding the full scope of each class, properties are copied down from each '::super-class'.
Cardinalities are shown in a few places to reinforce expectations, but these are not axiomatized or enforced in any way by this recommendation.