diff --git a/dcat/index.html b/dcat/index.html index 94abf142e..895ebf035 100644 --- a/dcat/index.html +++ b/dcat/index.html @@ -29,7 +29,7 @@

Introduction

From DCAT 2014 [[!VOCAB-DCAT-20140116]]

-

Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various speciality formats. DCAT does not make any assumptions about the serialisation format of the datasets described in a catalog. Other, complementary vocabularies may be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [[VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.

+

Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various speciality formats. DCAT does not make any assumptions about the serialisation format of the datasets described in a catalog. Other, complementary vocabularies MAY be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [[VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.

This document does not prescribe any particular method of deploying data expressed in DCAT. DCAT is applicable in many contexts including RDF accessible via SPARQL endpoints, embedded in HTML pages as RDFa, or serialized as e.g. RDF/XML or Turtle. The examples in this document use Turtle simply because of Turtle's readability.
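
As a non-normative sketch of the VoID pairing mentioned above, the Turtle below types one resource as both a dcat:Dataset and a void:Dataset and attaches two VoID statistics. The `http://example.org/` namespace, the `:dataset-001` identifier, the title and the counts are illustrative assumptions, not values taken from this document.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix :     <http://example.org/> .

# Hypothetical dataset described with DCAT and, because it is RDF, with VoID.
:dataset-001
    a dcat:Dataset , void:Dataset ;
    dct:title "Example RDF dataset"@en ;
    void:triples 1000000 ;   # illustrative count of triples in the dataset
    void:entities 50000 .    # illustrative count of described entities
```

The DCAT description stays format-neutral; the VoID statements add detail only for consumers that understand RDF-specific statistics.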

@@ -43,7 +43,7 @@

Introduction

Namespaces

The namespace for DCAT is http://www.w3.org/ns/dcat#. - However, it should be noted that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [[DCTERMS]]. + However, it can be noted that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [[DCTERMS]]. DCAT itself defines a minimal set of classes and properties of its own. A full set of namespaces and prefixes used in this document is shown in the table below.

@@ -113,11 +113,11 @@

Vocabulary Overview

A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". -A dataset is a conceptual entity, and may be represented by one or more distributions that serialize the dataset for transfer. +A dataset is a conceptual entity, and can be represented by one or more distributions that serialize the dataset for transfer. Distributions of a dataset can be provided via data distribution services. Detailed properties for a data distribution service API are out of the scope of this version of DCAT.

-

Datasets and data services, and potentially other types of thing, may be included in a catalog. +

Datasets and data services, and potentially other types of thing, MAY be included in a catalog. Types of data service that might be found in a catalog include data distribution services, discovery services such as portals and catalog services, data transformation services such as coordinate transformation services, re-sampling and interpolation services, and various data processing services.

@@ -144,7 +144,7 @@

Vocabulary Overview

UML model of DCAT classes and properties
- Overview of DCAT model, showing the classes of resources that may be members of a Catalog and the relationships between them. + Overview of DCAT model, showing the classes of resources that can be members of a Catalog and the relationships between them.
@@ -244,7 +244,7 @@

Classifying dataset types

- The type or genre of a dataset may be indicated using the dct:type property: + The type or genre of a dataset MAY be indicated using the dct:type property:

  :dataset-001 dct:type	dctype:Text . 
@@ -329,7 +329,7 @@

Vocabulary specification

RDF representation

- The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism may also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below. + The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism can also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below.

@@ -357,7 +357,7 @@

RDF representation

alignments to other vocabularies
  • - additional axioms, which may be useful in some contexts + additional axioms, which can be useful in some contexts
  • some profiles of DCAT, including a profile that corresponds to the 2014 version of DCAT [[VOCAB-DCAT-20140116]] @@ -376,7 +376,7 @@

    RDF representation

    Dependencies

    -The definitions (including domain and range) of terms outside the dcat namespace are provided here only for convenience and must not be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]]. +The definitions (including domain and range) of terms outside the dcat namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].

    Description of DCAT vocabulary elements from DCAT 2014 [[VOCAB-DCAT-20140116]] except where indicated.

    @@ -507,7 +507,7 @@

    Property: homepage

    Definition:The homepage of the catalog. Range:foaf:Document - Usage note:foaf:homepage is an inverse functional property (IFP) which means that it should be unique and precisely identify the catalog. This allows smushing various descriptions of the catalog when different URIs are used. + Usage note:foaf:homepage is an inverse functional property (IFP) which means that it SHOULD be unique and precisely identify the catalog. This allows smushing various descriptions of the catalog when different URIs are used. @@ -566,7 +566,7 @@

    Property: license

    Definition:This links to the license document under which the catalog is made available and not the datasets. Even if the license of the catalog applies to all of its - datasets and distributions, it should be replicated on each distribution. + datasets and distributions, it SHOULD be replicated on each distribution. Range:dct:LicenseDocument See also:catalog rights, distribution license @@ -581,7 +581,7 @@

    Property: rights

    Definition:This describes the rights under which the catalog can be used/reused and not the datasets. Even if theses rights apply to all the catalog - datasets and distributions, it should be replicated on each distribution. + datasets and distributions, it SHOULD be replicated on each distribution. Range:dct:RightsStatement See also:catalog license, distribution rights @@ -705,7 +705,7 @@

    Class: Catalog record

    Usage noteThis class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset and metadata about the dataset's entry in the catalog. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. - In cases where both dates differ, or where only the latter is known, the publication date should only be specified for the catalog record. + In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [[PROV-O]] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset. See alsoDataset @@ -716,7 +716,7 @@

    Class: Catalog record

    If a catalog is represented as an RDF Dataset with named graphs (as defined in [[SPARQL11-QUERY]]), then it is appropriate to place the description of each dataset (consisting of all RDF triples that mention the dcat:Dataset, dcat:CatalogRecord, and any of its dcat:Distributions) -into a separate named graph. The name of that graph should be the IRI of the catalog record. +into a separate named graph. The name of that graph SHOULD be the IRI of the catalog record.

    @@ -888,7 +888,7 @@

    Property: access rights

    - +
    RDF Property:dct:accessRights
    - Definition:Access Rights may include information regarding access or restrictions based on privacy, security, or other policies.
    + Definition:Access Rights MAY include information regarding access or restrictions based on privacy, security, or other policies.
    Range:dct:RightsStatement
    @@ -909,7 +909,7 @@

    Class: Dataset

    - Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset should be avoided as this may create legal conflicts. + Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
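
A minimal, non-normative sketch of this guidance, with the licence stated on the Distribution and not on the Dataset alone. The `http://example.org/` identifiers are hypothetical, and the Creative Commons licence IRI is only an example choice.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix :     <http://example.org/> .

# Hypothetical dataset with a single distribution.
:dataset-001
    a dcat:Dataset ;
    dcat:distribution :dataset-001-csv .

# The licence is given at the level of the Distribution.
:dataset-001-csv
    a dcat:Distribution ;
    dct:license <https://creativecommons.org/licenses/by/4.0/> .
```

If the same licence is also stated on `:dataset-001`, it would be in addition to, and identical with, the statement on the distribution, which avoids the conflict described above.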

    @@ -1143,7 +1143,7 @@

    Property: release date

    Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]] - Usage note:This property should be set using the first known date of issuance. + Usage note:This property SHOULD be set using the first known date of issuance. See also: dataset release date @@ -1169,7 +1169,7 @@

    Property: license

    Definition:This links to the license document under which the distribution is made available. Range:dct:LicenseDocument - Usage note:Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset should be avoided as this may create legal conflicts. + Usage note:Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not in stead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. See also: distribution rights, catalog license @@ -1186,7 +1186,7 @@

    Property: rights

    Usage note:dct:license, which is a sub-property of dct:rights, can be used to link a distribution to a license document. However, dct:rights allows linking to a rights statement that can include licensing information as well as other information that supplements the licence such as attribution.
    - Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset should be avoided as this may create legal conflicts. + Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts. See also: distribution license, catalog rights @@ -1262,7 +1262,7 @@

    Property: download URL

    Domain:dcat:Distribution Range:rdfs:Resource Usage note: - dcat:downloadURL is a specific form of dcat:accessURL. Nevertheless, DCAT does not define dcat:downloadURL as a subproperty of dcat:accessURL not to enforce this entailment as DCAT profiles may wish to impose a stronger separation where they only use dcat:accessURL for non-download locations. + dcat:downloadURL is a specific form of dcat:accessURL. Nevertheless, DCAT does not define dcat:downloadURL as a subproperty of dcat:accessURL not to enforce this entailment as DCAT profiles can impose a stronger separation where they only use dcat:accessURL for non-download locations. See alsodistribution access URL @@ -1371,7 +1371,7 @@

    Class: Catalogued resource

    This class carries properties common to all catalogued resources, including datasets and data services. It is strongly recommended to use a more specific sub-class when available. - See also: + See also:Catalog record @@ -1410,7 +1410,7 @@

    Property: release date

    Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]] - Usage note:This property should be set using the first known date of issuance. + Usage note:This property SHOULD be set using the first known date of issuance.
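
As an illustration of the range described here, the release date can be written as an ISO 8601 string typed with an XML Schema datatype. The resource identifier and the date in this sketch are assumptions made for the example.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://example.org/> .

# Hypothetical resource; dct:issued carries the first known date of issuance.
:dataset-001
    a dcat:Dataset ;
    dct:issued "2011-12-05"^^xsd:date .
```

When a time of day is known, `xsd:dateTime` can be used in place of `xsd:date`.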
    @@ -1424,7 +1424,7 @@

    Property: update/modification date

    Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]] - Usage note:The value of this property indicates a change to the actual item, not a change to the catalog record. An absent value may indicate that the item has never changed after its initial publication, or that the date of last modification is not known, or that the item is continuously updated. + Usage note:The value of this property indicates a change to the actual item, not a change to the catalog record. An absent value MAY indicate that the item has never changed after its initial publication, or that the date of last modification is not known, or that the item is continuously updated. See also:frequency @@ -1670,23 +1670,23 @@

    Quality information

    This section is not-normative as it provides guidance on how to document the quality of DCAT first class entities (e.g., datasets, distributions) and it does not define new DCAT terms. The guidance relies on the Data Quality Vocabulary(DQV)[[vocab-dqv]], which is a W3C Group Note.

  • - +

    The need to choose or define a data quality model has been identified as a requirement to be satisfied in the revision of DCAT. -

    +

    The Data Quality Vocabulary (DQV) offers common modelling patterns for different aspects of Data Quality. It can relate DCAT datasets and distributions with different types of quality information including -Each type of quality information can pertain to one or more quality dimensions, namely, quality characteristics relevant to the consumer. The practice to see the quality as a multi-dimensional space is consolidated in the field of quality management to split the quality management into addressable chunks. DQV does not define a normative list of quality dimensions. It offers the quality dimensions proposed in ISO/IEC 25012 [[ISOIEC25012]] and Zaveri et al. [[ZaveriEtAl]] as two possible starting points. It also provides an RDF representation for the quality dimensions and categories defined in the latter. Ultimately, implementers will need to choose themselves the collection of quality dimensions that best fits their needs. +Each type of quality information can pertain to one or more quality dimensions, namely, quality characteristics relevant to the consumer. The practice to see the quality as a multi-dimensional space is consolidated in the field of quality management to split the quality management into addressable chunks. DQV does not define a normative list of quality dimensions. It offers the quality dimensions proposed in ISO/IEC 25012 [[ISOIEC25012]] and Zaveri et al. [[ZaveriEtAl]] as two possible starting points. It also provides an RDF representation for the quality dimensions and categories defined in the latter. Ultimately, implementers will need to choose themselves the collection of quality dimensions that best fits their needs. -The following section shows how DCAT and DQV can be coupled to describe the quality of datasets and distributions. -For a comprehensive introduction and further examples of use, please refer to the Data Quality Vocabulary (DQV) group note [[vocab-dqv]]. 
+The following section shows how DCAT and DQV can be coupled to describe the quality of datasets and distributions. +For a comprehensive introduction and further examples of use, please refer to the Data Quality Vocabulary (DQV) group note [[vocab-dqv]].

    The following examples make no comments on where the quality information would reside and how it is managed. That is out of scope for the DCAT vocabulary. The assumption made is that the quality individuals are available using the URIs indicated. Besides, the examples and more in general the DQV is neutral to the data portal design choices on how to collect quality information. For example, data portals can collect DQV instances by implementing specific UI to annotate data or by taking inputs from 3rd-party services. @@ -1695,19 +1695,19 @@

    Quality information

    We might want to include examples of quality documentation related to services. -

    - +

    +

    Providing quality information

    The need to provide hook for quality information concerning a dcat:Dataset has been identified as a requirement to be satisfied in the revision of DCAT.

    A data consumer (:consumer1) describes the quality of the dataset :genoaBusStopsDataset that includes a georeferenced list of bus stops in Genoa. He/she annotates the dataset with a DQV quality note (:genoaBusStopsDatasetCompletenessNote) about data completeness (ldqd:completeness) to warn that the dataset includes only 20500 out of the 30000 stops. - +
    :genoaBusStopsDataset a dcat:Dataset ;
         dqv:hasQualityAnnotation :genoaBusStopsDatasetCompletenessNote .
     
    -:genoaBusStopsDatasetCompletenessNote  
    +:genoaBusStopsDatasetCompletenessNote
         a dqv:UserQualityFeedback ;
         oa:hasTarget :genoaBusStopsDataset ;
         oa:hasBody :textBody ;
    @@ -1719,9 +1719,9 @@ 

    Providing quality information

    :textBody a oa:TextualBody ; rdf:value "Incomplete dataset: it contains only 20500 out of 30000 existing bus stops" ; - dc:language "en" ; - dc:format "text/plain" - . + dc:language "en" ; + dc:format "text/plain" + .
    The activity :myQualityChecking employs the service :myQualityChecker to check the quality of the :genoaBusStopsDataset dataset. The metric :completenessWRTExpectedNumberOfEntities is applied to measure the dataset completeness (ldqd:completeness) and it results in the quality measurement :genoaBusStopsDatasetCompletenessMeasurement.
    :genoaBusStopsDataset
    @@ -1734,18 +1734,18 @@ 

    Providing quality information

    dqv:value "0.6833333"^^xsd:decimal ; prov:wasAttributedTo :myQualityChecker ; prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ; - prov:wasGeneratedBy :myQualityChecking + prov:wasGeneratedBy :myQualityChecking . -:completenessWRTExpectedNumberOfEntities +:completenessWRTExpectedNumberOfEntities a dqv:Metric ; skos:definition "it returns the degree of completeness as ratio between the actual number of entities included in the dataset and the declared expected number of entities."@en ; dqv:expectedDataType xsd:decimal ; dqv:inDimension ldqd:completeness . -# :myQualityChecker is a service computing some quality metrics +# :myQualityChecker is a service computing some quality metrics :myQualityChecker - a prov:SoftwareAgent ; + a prov:SoftwareAgent ; rdfs:label "A quality assessment service"^^xsd:string . # Further details about quality service/software can be provided, for example, # deploying vocabularies such as Dataset Usage Vocabulary (DUV), Dublin Core or ADMS.SW @@ -1760,7 +1760,7 @@

    Providing quality information

    prov:endedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime; prov:startedAtTime "2018-05-27T00:52:02Z"^^xsd:dateTime .
    - +
    @@ -1875,12 +1875,12 @@

    Relation to other W3C Recommendations

    DCAT should be aligned with other recent Linked Data based Recommendations.

    - +

    Linked Data Platform (LDP)

    - +

    - DCAT provides a data model for representation of metadata about datasets in the form of Linked Data, but it does not specify how this metadata should be accessed or modified. + DCAT provides a data model for representation of metadata about datasets in the form of Linked Data, but it does not specify how this metadata can be accessed or modified. The DCAT compatible metadata can be viewed as collections of Catalog Records, Datasets and Data Services contained in a Catalog, and a collection of Distributions contained in a Dataset. The Linked Data Platform [[ldp]] specification deals with access to and modification of Linked Data Platform Containers (LDPCs). This section provides guidance on how to represent DCAT metadata as LDP Containers, which supports namely the implementation of Solid based DCAT catalogs. @@ -1905,15 +1905,15 @@

    Linked Data Platform (LDP)

    <> a dcat:Catalog ; dcat:datasets </datasets/> ; dcat:dataset </datasets/001> . - + </datasets/> a ldp:Container, ldp:DirectContainer ; ldp:membershipResource <> ; ldp:hasMemberRelation dcat:dataset ; ldp:contains </datasets/001> . - + </datasets/001> a dcat:Dataset .
    - +

    In the second example, we add LDPCs </records/> for Catalog Records and </services/> for Data Services, discoverable using dcat:records and dcat:services predicates from the Catalog:

    @@ -1932,7 +1932,7 @@

    Linked Data Platform (LDP)

    dcat:datasets </datasets/> ; dcat:services </services/> ; dcat:dataset </datasets/001> . - + </records/> a ldp:Container, ldp:DirectContainer ; ldp:membershipResource <> ; ldp:hasMemberRelation dcat:record ; @@ -1942,7 +1942,7 @@

    Linked Data Platform (LDP)

    ldp:membershipResource <> ; ldp:hasMemberRelation dcat:dataset ; ldp:contains </datasets/001> . - + </services/> a ldp:Container, ldp:DirectContainer ; ldp:membershipResource <> ; ldp:hasMemberRelation dcat:service ; @@ -1950,7 +1950,7 @@

    Linked Data Platform (LDP)

    </records/001> a dcat:CatalogRecord ; foaf:primaryTopic </datasets/001> . - + </datasets/001> a dcat:Dataset ; </services/001> a dcat:DataService . @@ -1980,14 +1980,14 @@

    Linked Data Platform (LDP)

    </datasets/001/distributions/001> a dcat:Distribution . - -

    For catalogs with many datasets, catalog records, data services or distributions, - the Linked Data Platform Paging mechanism [[ldp-paging]] SHOULD be used to provide access to them.

    - + +

    For catalogs with many datasets, catalog records, data services or distributions, + the Linked Data Platform Paging mechanism [[ldp-paging]] SHOULD be used to provide access to them.

    +

    In the next sections we formally define the additional properties used for discovery of LDP containers.

    - +

    Property: datasets

    @@ -2044,10 +2044,10 @@

    Property: distributions

    Linked Data Notifications (LDN)

    - +

    Linked Data Notifications (LDN) [[ldn]] can be used with DCAT e.g. for feedback collection. - Any resource can have an LDN Inbox. + Any resource can have an LDN Inbox. In the following example we show a dataset </datasets/001> as an LDN Target with an LDN Inbox.