From 8c2bbda291c6dbdeed2acc4125982b8187ca8283 Mon Sep 17 00:00:00 2001
From: Simon Cox
From DCAT 2014 [[!VOCAB-DCAT-20140116]]
-Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various speciality formats. DCAT does not make any assumptions about the serialisation format of the datasets described in a catalog. Other, complementary vocabularies may be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [[VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.
+Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various speciality formats. DCAT does not make any assumptions about the serialisation format of the datasets described in a catalog. Other, complementary vocabularies MAY be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [[VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.
This document does not prescribe any particular method of deploying data expressed in DCAT. DCAT is applicable in many contexts including RDF accessible via SPARQL endpoints, embedded in HTML pages as RDFa, or serialized as e.g. RDF/XML or Turtle. The examples in this document use Turtle simply because of Turtle's readability.
The namespace for DCAT is
A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats".
-A dataset is a conceptual entity, and may be represented by one or more distributions that serialize the dataset for transfer.
+A dataset is a conceptual entity, and can be represented by one or more distributions that serialize the dataset for transfer.
Distributions of a dataset can be provided via data distribution services.
Detailed properties for a data distribution service API are out of the scope of this version of DCAT.
- Datasets and data services, and potentially other types of thing, may be included in a catalog.
+ Datasets and data services, and potentially other types of thing, MAY be included in a catalog.
Types of data service that might be found in a catalog include data distribution services, discovery services such as portals and catalog services, data transformation services such as coordinate transformation services, re-sampling and interpolation services, and various data processing services.
Introduction
Namespaces
http://www.w3.org/ns/dcat#
.
- However, it should be noted that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [[DCTERMS]].
+ However, it can be noted that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [[DCTERMS]].
DCAT itself defines a minimal set of classes and properties of its own.
A full set of namespaces and prefixes used in this document is shown in the table below.
Vocabulary Overview
@@ -244,7 +244,7 @@ Classifying dataset types
- The type or genre of a dataset may be indicated using the dct:type property:
+ The type or genre of a dataset MAY be indicated using the dct:type property:
:dataset-001 dct:type dctype:Text .
@@ -329,7 +329,7 @@
- The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism may also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below.
+ The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism can also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below.
-The definitions (including domain and range) of terms outside the dcat namespace are provided here only for convenience and must not be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].
+The definitions (including domain and range) of terms outside the dcat namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].
Description of DCAT vocabulary elements from DCAT 2014 [[VOCAB-DCAT-20140116]] except where indicated.
@@ -507,7 +507,7 @@
If a catalog is represented as an RDF Dataset with named graphs (as defined in [[SPARQL11-QUERY]]), then it is appropriate to place the description of each dataset (consisting of all RDF triples that mention the dcat:Dataset, dcat:CatalogRecord, and any of its dcat:Distributions)
-into a separate named graph. The name of that graph should be the IRI of the catalog record.
+into a separate named graph. The name of that graph SHOULD be the IRI of the catalog record.
RDF Property: | dct:accessRights |
---|---|
-Definition: | Access Rights may include information regarding access or restrictions based on privacy, security, or other policies. |
+Definition: | Access Rights MAY include information regarding access or restrictions based on privacy, security, or other policies. |
Range: | dct:RightsStatement |
- Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset should be avoided as this may create legal conflicts.
+ Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
@@ -1143,7 +1143,7 @@
This section is not-normative as it provides guidance on how to document the quality of DCAT first class entities (e.g., datasets, distributions) and it does not define new DCAT terms. The guidance relies on the Data Quality Vocabulary(DQV)[[vocab-dqv]], which is a W3C Group Note.
The need to choose or define a data quality model has been identified as a requirement to be satisfied in the revision of DCAT.
-
+ The Data Quality Vocabulary (DQV) offers common modelling patterns for different aspects of Data Quality. It can relate DCAT datasets and distributions with different types of quality information including
The following examples make no comments on where the quality information would reside and how it is managed. That is out of scope for the DCAT vocabulary. The assumption made is that the quality individuals are available using the URIs indicated. Besides, the examples and more in general the DQV is neutral to the data portal design choices on how to collect quality information. For example, data portals can collect DQV instances by implementing specific UI to annotate data or by taking inputs from 3rd-party services.
@@ -1695,19 +1695,19 @@
The need to provide hook for quality information concerning a dcat:Dataset has been identified as a requirement to be satisfied in the revision of DCAT.
A data consumer (:consumer1) describes the quality of the dataset :genoaBusStopsDataset that includes a georeferenced list of bus stops in Genoa. He/she annotates the dataset with a DQV quality note (:genoaBusStopsDatasetCompletenessNote) about data completeness (ldqd:completeness) to warn that the dataset includes only 20500 out of the 30000 stops.
-
+:genoaBusStopsDataset
  a dcat:Dataset ;
  dqv:hasQualityAnnotation :genoaBusStopsDatasetCompletenessNote .
-:genoaBusStopsDatasetCompletenessNote
+:genoaBusStopsDatasetCompletenessNote
  a dqv:UserQualityFeedback ;
  oa:hasTarget :genoaBusStopsDataset ;
  oa:hasBody :textBody ;
@@ -1719,9 +1719,9 @@ Providing quality information
The activity :myQualityChecking employs the service :myQualityChecker to check the quality of the :genoaBusStopsDataset dataset. The metric :completenessWRTExpectedNumberOfEntities is applied to measure the dataset completeness (ldqd:completeness) and it results in the quality measurement :genoaBusStopsDatasetCompletenessMeasurement.
:textBody a oa:TextualBody ;
  rdf:value "Incomplete dataset: it contains only 20500 out of 30000 existing bus stops" ;
- dc:language "en" ;
- dc:format "text/plain"
- .
+ dc:language "en" ;
+ dc:format "text/plain"
+ .
:genoaBusStopsDataset
@@ -1734,18 +1734,18 @@ Providing quality information
-
+
  dqv:value "0.6833333"^^xsd:decimal ;
  prov:wasAttributedTo :myQualityChecker ;
  prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
- prov:wasGeneratedBy :myQualityChecking
+ prov:wasGeneratedBy :myQualityChecking
  .
-:completenessWRTExpectedNumberOfEntities
+:completenessWRTExpectedNumberOfEntities
  a dqv:Metric ;
  skos:definition "it returns the degree of completeness as ratio between the actual number of entities included in the dataset and the declared expected number of entities."@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension ldqd:completeness .
-# :myQualityChecker is a service computing some quality metrics
+# :myQualityChecker is a service computing some quality metrics
:myQualityChecker
- a prov:SoftwareAgent ;
+ a prov:SoftwareAgent ;
  rdfs:label "A quality assessment service"^^xsd:string .
# Further details about quality service/software can be provided, for example,
# deploying vocabularies such as Dataset Usage Vocabulary (DUV), Dublin Core or ADMS.SW
@@ -1760,7 +1760,7 @@ Providing quality information
  prov:endedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime;
  prov:startedAtTime "2018-05-27T00:52:02Z"^^xsd:dateTime .
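As a quick check of the measurement value, the ratio defined by :completenessWRTExpectedNumberOfEntities can be worked out by hand. This assumes, as the annotation example suggests but does not state outright, that the measurement uses the same counts: 20500 stops present out of 30000 expected.

```latex
\[
\text{completeness}
  = \frac{\text{actual number of entities}}{\text{expected number of entities}}
  = \frac{20500}{30000}
  \approx 0.6833333
\]
```

Under that assumption the result agrees with the dqv:value literal "0.6833333"^^xsd:decimal in the example.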
DCAT should be aligned with other recent Linked Data based Recommendations.
-
+
- DCAT provides a data model for representation of metadata about datasets in the form of Linked Data, but it does not specify how this metadata should be accessed or modified.
+ DCAT provides a data model for representation of metadata about datasets in the form of Linked Data, but it does not specify how this metadata can be accessed or modified.
The DCAT compatible metadata can be viewed as collections of Catalog Records, Datasets and Data Services contained in a Catalog, and a collection of Distributions contained in a Dataset. The Linked Data Platform [[ldp]] specification deals with access to and modification of Linked Data Platform Containers (LDPCs). This section provides guidance on how to represent DCAT metadata as LDP Containers, which supports namely the implementation of Solid based DCAT catalogs.
@@ -1905,15 +1905,15 @@
In the second example, we add LDPCs </records/> for Catalog Records and </services/> for Data Services, discoverable using dcat:records and dcat:services predicates from the Catalog:
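The example markup itself is not preserved in this patch text, so the following is only a minimal Turtle sketch of what such a description might look like. The base IRI, the catalog IRI </catalog> and the choice of ldp:BasicContainer are assumptions made for illustration. dcat:records and dcat:services are used as named in this draft text and might not exist in the published DCAT namespace.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .
@base <http://example.org/> .   # hypothetical base IRI

# The catalog points at its containers through the discovery
# predicates named in the text (draft terms, see the note above).
</catalog> a dcat:Catalog ;
    dcat:records  </records/> ;
    dcat:services </services/> .

# The container type is an assumption; LDP also defines
# ldp:DirectContainer and ldp:IndirectContainer.
</records/>  a ldp:BasicContainer .
</services/> a ldp:BasicContainer .
```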
For catalogs with many datasets, catalog records, data services or distributions,
- the Linked Data Platform Paging mechanism [[ldp-paging]] SHOULD be used to provide access to them.
-
+
+For catalogs with many datasets, catalog records, data services or distributions,
+ the Linked Data Platform Paging mechanism [[ldp-paging]] SHOULD be used to provide access to them.
+In the next sections we formally define the additional properties used for discovery of LDP containers.
-
+
Linked Data Notifications (LDN) [[ldn]] can be used with DCAT e.g. for feedback collection.
- Any resource can have an LDN Inbox.
+ Any resource can have an LDN Inbox.
In the following example we show a dataset </datasets/001> as an LDN Target with an LDN Inbox.
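The example markup is not preserved in this patch text. A minimal Turtle sketch of a dataset advertising its inbox, using the ldp:inbox predicate defined by the LDN specification, could look as follows. The base IRI and the inbox IRI are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .
@base <http://example.org/> .   # hypothetical base IRI

# The dataset is the LDN Target; ldp:inbox tells senders
# where to deliver notifications such as feedback.
</datasets/001> a dcat:Dataset ;
    ldp:inbox </datasets/001/inbox/> .   # hypothetical inbox IRI
```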
-The definitions (including domain and range) of terms outside the dcat namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].
+The definitions (including domain and range) of terms outside the DCAT namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].
Description of DCAT vocabulary elements from DCAT 2014 [[VOCAB-DCAT-20140116]] except where indicated.