Catalogues in which dataset is an undifferentiated set of files #256

Closed
dr-shorthair opened this issue Jun 19, 2018 · 5 comments

dr-shorthair commented Jun 19, 2018

Catalogues in which dataset is an undifferentiated set of files

Status: Proposed

Identifier: ID53 (Proposed)

Creator: Simon Cox

Deliverable(s): DCAT

Tags

dcat, dcat:Dataset

Stakeholders

Operators of legacy catalogues, consumers of legacy catalogues

Problem statement

In many legacy catalogues and repositories (e.g. CKAN), ‘datasets’ are ‘just a bag of files’. There is no distinction made between part/whole, distribution (representation), and other kinds of relationship (e.g. documentation, schema, supporting documents) from the dataset to each of the files.

While the precision we provide in DCAT is valuable in terms of semantics, it is often difficult to implement on these legacy systems. In particular, we only mention one property to link from dcat:Dataset to another artefact, i.e. dcat:distribution. This is designed to link from a dataset to a representation of the whole dataset. Guidance is required for links to resources which are deemed to be elements of a dataset, but where the nature of the relationship is unspecified.

Existing approaches

People mostly use the dcat:distribution relationship for links to all dataset elements. This is strictly incorrect in many cases where dct:hasPart, dct:conformsTo, dct:requires, dct:references, or another relationship, would properly describe the relationship.
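
As a minimal sketch of the practice described above: the snippet below maps a CKAN-style package to triples by linking every file with dcat:distribution. The package contents, the example.org URLs, and the helper name `naive_links` are hypothetical and chosen only for illustration; triples are plain tuples rather than a real RDF library.

```python
# Hypothetical CKAN-style package: a dataset that is "just a bag of files".
package = {
    "name": "river-gauges",
    "resources": [
        {"name": "readings.csv", "url": "https://example.org/files/readings.csv"},
        {"name": "readme.pdf", "url": "https://example.org/files/readme.pdf"},
        {"name": "schema.json", "url": "https://example.org/files/schema.json"},
    ],
}


def naive_links(pkg, base="https://example.org/dataset/"):
    """Link every file with dcat:distribution, regardless of its actual role."""
    dataset = base + pkg["name"]
    return [(dataset, "dcat:distribution", res["url"]) for res in pkg["resources"]]


triples = naive_links(package)
for s, p, o in triples:
    print(s, p, o)
```

Here the documentation (readme.pdf) and the schema (schema.json) end up asserted as distributions of the dataset, which is the over-claim this use case is about.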

Links

''Optional link list to documents and projects this use case refers to''

Requirements

  1. add a recommendation ('should' statement) to use dct:relation to link from a dcat:Dataset to each file or other resource that is part of the dataset package, where the nature of the relationship is unknown
  2. add the axiom dcat:distribution rdfs:subPropertyOf dct:relation . so that the recommendation is consistent with all potential relationships between datasets and their elements

Note: the other relevant relationship predicates dct:hasPart, dct:conformsTo, dct:requires, dct:references are already sub-properties of dct:relation.
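
The effect of the axiom in requirement 2 can be sketched with the standard RDFS sub-property rule (rdfs7): if `p rdfs:subPropertyOf q` and `s p o` hold, then `s q o` holds. This is only an illustration under stated assumptions: triples are plain tuples, `apply_rdfs7` is a hypothetical helper, and `ex:genericLink` is a placeholder standing in for the generic linking property named in requirement 1.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"


def apply_rdfs7(triples):
    """Close a set of triples under: (p subPropertyOf q), (s p o) => (s q o)."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        axioms = [(s, o) for s, p, o in closed if p == SUBPROPERTY_OF]
        for sub, sup in axioms:
            for s, p, o in list(closed):
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed


GENERIC = "ex:genericLink"  # placeholder for the property named in requirement 1

graph = {
    ("dcat:distribution", SUBPROPERTY_OF, GENERIC),     # the proposed axiom
    ("ex:ds", "dcat:distribution", "ex:readings.csv"),  # a known distribution
    ("ex:ds", GENERIC, "ex:readme.pdf"),                # role unknown
}

closed = apply_rdfs7(graph)
files = sorted(o for s, p, o in closed if s == "ex:ds" and p == GENERIC)
print(files)
```

With the axiom in place, a consumer that queries only the generic property finds every file in the package, whether the publisher used the precise predicate or the generic one.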

Related use cases

ID32 - Relationships between datasets

Comments

''Optional section for editorial comments, suggestion and their interactive resolution''


@dr-shorthair

This issue formally creates the use case for the UCR document. Discussion is already under way in #253.

@dr-shorthair

Also raised in #81

@dr-shorthair

A comment received by email from Nuno Freire is potentially related to this issue and also to #259. Nuno suggests that void:rootResource might be a useful property for linking to a dataset 'manifest'.

@dr-shorthair

UC accepted by DXWG plenary https://www.w3.org/2018/07/03-dxwg-minutes#x09

@dr-shorthair changed the title from "Catalogues in which dataset is a bag of files" to "Catalogues in which dataset is an undifferentiated set of files" on Jul 5, 2018
@dr-shorthair added this to "To do" in DCAT revision (via automation) on Jul 9, 2018
@dr-shorthair moved this from "To do" to "In progress" in DCAT revision on Jul 9, 2018