# DCAT - Landscape analysis 

## What is [DCAT](https://www.w3.org/TR/vocab-dcat-3/)? 
**DCAT** (Data Catalog Vocabulary) is a vocabulary for sharing data catalogs on the web. It provides standardized guidelines, rules and conventions for structuring and describing metadata about datasets, with the goal of increasing the interoperability and discoverability of datasets across domains. The vocabulary is machine-readable because it is defined as [RDF](https://www.w3.org/RDF/) (Resource Description Framework) classes and properties. 

DCAT uses terms from other well-established metadata standards such as [FOAF](http://xmlns.com/foaf/spec/) and [Dublin Core](https://www.dublincore.org/), and introduces new classes and properties in the [`dcat` namespace](http://www.w3.org/ns/dcat#). 
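
To make the role of namespaces concrete, the following minimal sketch expands prefixed names into full IRIs. The `dcat` namespace is the one linked above; the `dcterms` and `foaf` IRIs are the namespaces conventionally used for Dublin Core terms and FOAF. The `PREFIXES` table and the `expand` helper are hypothetical names introduced only for this illustration.

```python
# Namespace IRIs for the vocabularies mentioned above.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dcat:Dataset' into a full IRI."""
    prefix, _, local_name = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local_name


if __name__ == "__main__":
    # One term from each vocabulary: a DCAT class, a Dublin Core property, a FOAF class.
    for term in ("dcat:Dataset", "dcterms:title", "foaf:Agent"):
        print(term, "->", expand(term))
```

A description of a dataset typically mixes terms from all three namespaces: the class comes from `dcat`, while generic properties such as the title are reused from Dublin Core.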


Here's a timeline of the development of the vocabulary: 
- 2010: First development of DCAT by [Vassilios Peristeras](https://www.ihu.gr/ucips/cv/vassilios-peristeras) at the former [Digital Enterprise Research Institute](https://en.wikipedia.org/wiki/Digital_Enterprise_Research_Institute) in Ireland. 
- 2012: DCAT is taken over by the [W3C](https://www.w3.org/) Government Linked Data Working Group. 
- 2014: Fully standardized version of DCAT is released. 
- 2020: Release of DCAT version 2. 
- 2024: Release of DCAT version 3 as a W3C Recommendation. 

The different versions of DCAT are backwards compatible, meaning that metadata conforming to an older version remains conformant with newer versions. The updates mainly added new classes and properties and relaxed some constraints. 
                                                          


### Structure 
DCAT is built around seven main classes. Each class provides guidelines for describing a particular kind of object. 

- `dcat:Dataset`: represents a collection of data. 
- `dcat:Catalog`: represents a catalog, or collection of datasets, in which each individual item is a metadata record. 
- `dcat:Resource`: represents a dataset (or other resource) that may be described by a metadata record in a catalog. 
- `dcat:Distribution`: represents an accessible form of a dataset, such as a downloadable file. 
- `dcat:DataService`: represents a collection of operations accessible through an interface (API). 
- `dcat:DatasetSeries`: a dataset that represents a collection of datasets that are published separately, but share some characteristics that group them. 
- `dcat:CatalogRecord`: represents a metadata record in the catalog. 
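
How three of these classes nest can be sketched as a small JSON-LD document built in Python. This is an illustration under stated assumptions, not a complete or validated catalog: every IRI under `example.org` is a placeholder, the titles are invented, and the `collect_types` helper is a hypothetical name used only to show the nesting.

```python
import json

# Hypothetical catalog; every IRI under example.org is a placeholder.
catalog = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dcterms": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dcterms:title": "Example catalog",
    # dcat:dataset links a catalog to the datasets it lists.
    "dcat:dataset": [
        {
            "@id": "https://example.org/dataset/1",
            "@type": "dcat:Dataset",
            "dcterms:title": "Example dataset",
            "dcterms:description": "A placeholder dataset used for illustration.",
            # dcat:distribution links a dataset to its accessible forms.
            "dcat:distribution": [
                {
                    "@id": "https://example.org/dataset/1/csv",
                    "@type": "dcat:Distribution",
                    "dcat:accessURL": {"@id": "https://example.org/files/data.csv"},
                }
            ],
        }
    ],
}


def collect_types(node):
    """Return the @type of every typed node in a nested JSON-LD structure."""
    found = []
    if isinstance(node, dict):
        if "@type" in node:
            found.append(node["@type"])
        for value in node.values():
            found.extend(collect_types(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_types(item))
    return found


if __name__ == "__main__":
    print(json.dumps(catalog, indent=2))
    print(collect_types(catalog))
```

The catalog points to its datasets with `dcat:dataset`, and each dataset points to its distributions with `dcat:distribution`, so the metadata forms a tree from catalog to downloadable file.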

Each class has a number of properties. DCAT itself does not make properties mandatory; levels of obligation (mandatory, recommended, optional) are assigned by application profiles such as DCAT-AP, which for example makes the title and description of a dataset mandatory. 

![Overview of the classes](img/dcat_class_diagram.png)

See [this file](basic-example.ttl) for an example of a data catalog described with DCAT. 

- Mandatory/recommended items
- [OWL 2](https://www.w3.org/TR/owl2-overview/)


## EU Recommendations: [DCAT-AP](https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap) 
In 2013, the European Union revised some legislation related to the re-use of public data. The revision adopted the "open by default" principle, which states that data held by public bodies should be open and free by default and by design, and easily accessible via an API where possible. In response, European public administrations set up open data portals, taking the first steps towards a European open data ecosystem. A lack of standardization led to a fragmented landscape in which it was difficult to exchange metadata between the more than 150 data portals, each of which had to be queried individually. 


The European Commission recognized the need to improve the fragmented open data environment in Europe. A common metadata language would make it easier to discover and reuse datasets across portals. With this in mind, the Commission extended the existing DCAT with additional requirements and recommendations suited to the European context, resulting in **DCAT-AP** (DCAT Application Profile). DCAT-AP is a specification of DCAT that reuses its terms but adds more specificity. It identifies which elements are mandatory, recommended or optional in particular situations, and recommends controlled vocabularies that can be used. DCAT-AP is extensible, meaning that it can be specialized further when additional needs arise. 
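
The idea of obligation levels can be sketched as a simple completeness check on a metadata record. The table below is illustrative and not copied from the DCAT-AP specification: the mandatory entries follow the title and description example mentioned earlier in this document, the recommended entries are assumptions, and `OBLIGATIONS` and `check_record` are hypothetical names. In practice such constraints are usually validated with dedicated tooling rather than hand-written checks.

```python
# Illustrative obligation table for one class. The mandatory entries follow the
# title/description example in the text; the recommended entries are assumptions.
OBLIGATIONS = {
    "dcat:Dataset": {
        "mandatory": ["dcterms:title", "dcterms:description"],
        "recommended": ["dcat:distribution", "dcat:keyword"],
    },
}


def check_record(record: dict) -> dict:
    """Report which properties a record lacks, grouped by obligation level."""
    levels = OBLIGATIONS.get(record.get("@type"), {})
    return {
        level: [prop for prop in props if not record.get(prop)]
        for level, props in levels.items()
    }


if __name__ == "__main__":
    # Hypothetical record with no description and no distribution.
    record = {
        "@type": "dcat:Dataset",
        "dcterms:title": "Example dataset",
        "dcat:keyword": ["example"],
    }
    print(check_record(record))
```

A record with missing mandatory properties would not conform to the profile, while missing recommended properties only reduce how well the dataset can be discovered and reused.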


This section is based on the information found in [this paper](https://doi.org/10.1504/EG.2022.121856). 


## Other data repositories that use DCAT-AP 

## Overview of elements 


* [DCAT](https://www.w3.org/TR/vocab-dcat-3/) v3: Ontology
    * what is it? ontology = definition of terms (classes and properties)
    * development history (when was 1st created; goals; current version 3)
* [DCAT-AP](https://semiceu.github.io/DCAT-AP/releases/3.0.0/): Schema (how to use the ontology terms: mandatory, property ranges and domains)
     * what is DCAT-AP?
     * focus: around dataset class
     * how is the schema defined: in documentation and SHACL shapes
     * what is the recommendation?
     * who is using DCAT-AP? how are they publishing the metadata in this format?
     * links:
          * https://semiceu.github.io/DCAT-AP/releases/3.0.0/
          * https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap
          * https://interoperable-europe.ec.europa.eu/collection/semic-support-centre/dcat-ap
          * https://interoperable-europe.ec.europa.eu/collection/semic-support-centre/solution/dcat-application-profile-data-portals-europe/release/300
          * https://interoperable-europe.ec.europa.eu/collection/semic-support-centre/solution/dcat-application-profile-data-portals-europe
* [DCAT-AP NL](https://geonovum.github.io/DCAT-AP-NL30/): Dutch version of schema
    * reasoning behind the development of DCAT-AP NL?
    * what are the additions introduced by DCAT-AP NL?
    * state of development? 
    * who is using DCAT-AP NL?
* DCAT Health (EU) : Health domain version of the schema 
    * reasoning behind the development of DCAT Health?
    * what are the additions introduced by DCAT Health?
    * state of development? 
    * who is using DCAT Health?
    * what is the relation between DCAT Health and [Health-RI metadata](https://github.com/Health-RI/health-ri-metadata/tree/develop)?
    * links:
        * https://healthdcat-ap.github.io/
        * https://www.health-ri.nl/nieuws/input-gevraagd-consultatie-nederlands-profiel-op-dcat-ap-30
        * https://ehds2pilot.eu/upcoming_results/extension-of-dcat-ap-healthdcat-ap/
* [Health-RI](https://github.com/Health-RI/health-ri-metadata/tree/develop)
    * what is it?
     