distribution > dcat:accessURL #1334

Open
bertvannuffelen opened this issue Mar 22, 2021 · 3 comments
Labels
dcat:accessURL, dcat:DataService, dcat:Distribution, dcat, feedback (Issues stemming from external feedback to the WG), future-work (issue deferred to the next standardization round), requires discussion (Issue to be discussed in a telecon (group or plenary))

Comments

@bertvannuffelen

The DCAT 2 specification states, at https://www.w3.org/TR/vocab-dcat-2/#Property:distribution_access_url:

"dcat:accessURL matches the property-chain dcat:accessService/dcat:endpointURL. In the RDF representation of DCAT this is axiomatized as an OWL property-chain axiom."

Why is it stated this way, and why is it axiomatized? It imposes a very strict one-to-one relationship between data service and distribution, which I do not think is often the case.

In many services the data service's endpoint URL serves many datasets, and thus many distributions.
E.g. consider https://docs.github.com/en/rest/reference: all resources are on the same endpoint (https://api.github.com), but for each entity there is a different dataset. See https://docs.github.com/en/rest/reference/projects, where there is a collection of projects per organisation and a collection of all projects. So for the same endpoint I have many access URLs.

Conversely, starting from the distribution, the accessURL for a distribution of GitHub projects is https://docs.github.com/en/rest/reference/projects and not https://api.github.com/projects.

The above axiomatisation may be possible in some (selective) contexts, but to me it is more a severe restriction than a gain.
Also, this axiomatisation creates a binding between distribution and data service, which should be clarified at the level of Distribution.
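
To make the concern concrete, here is a minimal Python sketch of what the property-chain axiom entails, using plain tuples rather than an RDF library. The identifiers prefixed with ex: are hypothetical; the URLs are the ones from the GitHub example in this comment. It only illustrates the entailment rule and is not how any particular DCAT reasoner is implemented.

```python
# Property names as plain strings; no RDF library is assumed.
ACCESS_SERVICE = "dcat:accessService"
ENDPOINT_URL = "dcat:endpointURL"
ACCESS_URL = "dcat:accessURL"


def apply_access_url_chain(triples):
    """Return the new triples entailed by the chain
    dcat:accessService / dcat:endpointURL -> dcat:accessURL."""
    # Collect the endpoint URLs of every data service.
    endpoints = {}
    for s, p, o in triples:
        if p == ENDPOINT_URL:
            endpoints.setdefault(s, set()).add(o)
    # Every distribution pointing at a service inherits its endpoint URLs.
    inferred = set()
    for s, p, o in triples:
        if p == ACCESS_SERVICE:
            for url in endpoints.get(o, ()):
                inferred.add((s, ACCESS_URL, url))
    return inferred - set(triples)


# Hypothetical identifiers (ex:...) for two distributions served by one API.
triples = {
    ("ex:githubApi", ENDPOINT_URL, "https://api.github.com"),
    ("ex:orgProjects", ACCESS_SERVICE, "ex:githubApi"),
    ("ex:orgProjects", ACCESS_URL,
     "https://docs.github.com/en/rest/reference/projects"),
    ("ex:allProjects", ACCESS_SERVICE, "ex:githubApi"),
}

for triple in sorted(apply_access_url_chain(triples)):
    print(triple)
```

Under the axiom, both distributions receive the shared endpoint https://api.github.com as an access URL, in addition to whatever access URL was asserted for them.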

@andrea-perego added the dcat, dcat:accessURL, and feedback (Issues stemming from external feedback to the WG) labels Mar 22, 2021
@andrea-perego added this to To do in DCAT Sprint: Feedback via automation Mar 22, 2021
@andrea-perego added this to the DCAT3 2PWD milestone Mar 22, 2021
@andrea-perego added the requires discussion (Issue to be discussed in a telecon (group or plenary)) label Mar 22, 2021
@andrea-perego
Contributor

andrea-perego commented Mar 28, 2021

Just for the record, the decision behind the specification of the property chain stems from #124

@dr-shorthair
Contributor

@bertvannuffelen you may be correct that this property-chain axiom is too restrictive. While it will be true in some cases, it is unlikely to match all deployments in practice. The statement and the axiomatization were partly pedagogical: they explain the relationships between the various URLs and elements of the DCAT backbone.

However, I wonder if you have noticed that there is also dcat:downloadURL which might be a better match to the scenarios that you are describing? dcat:accessURL is different to this - it is the URL of an API from where the Distribution may be accessed (maybe along with many others).

@bertvannuffelen
Author

Before continuing my explanation: for me, Distributions and Data Services are different things. So if you are combining them, you are talking about entities that are both Distributions and Data Services.

For many reasons I would try to separate them, keeping the intersection as small as possible in the usage explanations. I understand that there are communities that would maximize the intersection, but this is not my perspective.

For our discussion it should be clear whether the semantic descriptions apply to entities that are both Distributions and Data Services, or to entities that are Distributions but NOT Data Services.


In the context where a distribution is distinct from any data service,
dcat:accessURL is for me the generic way of getting access to the actual data. It might be a webpage stating that one has to phone a person, who will then send a tape.

dcat:downloadURL is the URL from which one can download the data with one click in a browser. It is a restriction of dcat:accessURL. So an application should first consider dcat:downloadURL and then dcat:accessURL.
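
The lookup order described here can be sketched as follows. This is only an illustration in Python; the function name `resolve_data_url`, the dictionary layout, and the example.org URLs are assumptions and not part of DCAT.

```python
def resolve_data_url(distribution):
    """Prefer dcat:downloadURL (direct download); fall back to dcat:accessURL."""
    download = distribution.get("dcat:downloadURL")
    if download:
        return ("download", download)
    access = distribution.get("dcat:accessURL")
    if access:
        return ("access", access)
    # Neither property is present.
    return (None, None)


# Hypothetical distributions with example.org URLs.
dump = {
    "dcat:downloadURL": "https://example.org/data/dump.csv",
    "dcat:accessURL": "https://example.org/data/",
}
landing_only = {"dcat:accessURL": "https://example.org/how-to-order-a-tape"}

print(resolve_data_url(dump))
print(resolve_data_url(landing_only))
```

The first call returns the direct download; the second can only return the generic access URL, which a client has to treat as something a human may need to follow up on.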

dcat:downloadURL is actually why, for me, a distribution is not a REST API. For a REST API I would have no intuitive semantics to give to dcat:downloadURL. It is not part of REST terminology, nor of API terminology. If one talks about downloads in API design, one implements them in a different way, using different technology and different technical specifications. Very often it is even the responsibility of a different team.

So personally I think the axiomatisation creates confusion and questions rather than illustrating semantics.

Let's consider a snapshot that is also exposed through a REST API.
Then I would model the snapshot dump as a distribution (having dcat:accessURL or dcat:downloadURL) and have dcat:accessService point to the REST API. The endpoint of the REST API is then different from the dcat:accessURL. The values are not bound in any way.
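
A small Python sketch of this modelling, with hypothetical example.org URLs and class names chosen only for illustration: the snapshot is one distribution, the REST API is a separate data service, and the check lists the endpoint URLs that the property chain would add as dcat:accessURL although the publisher never asserted them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataService:
    endpoint_url: str  # dcat:endpointURL


@dataclass
class Distribution:
    access_urls: set = field(default_factory=set)        # dcat:accessURL
    download_url: Optional[str] = None                   # dcat:downloadURL
    access_services: list = field(default_factory=list)  # dcat:accessService


def unasserted_chain_urls(dist):
    """Endpoint URLs the chain would entail as dcat:accessURL
    but that are not among the asserted access URLs."""
    return {s.endpoint_url for s in dist.access_services} - dist.access_urls


# Hypothetical REST API and snapshot dump.
api = DataService(endpoint_url="https://example.org/api")
snapshot = Distribution(
    access_urls={"https://example.org/dumps/"},
    download_url="https://example.org/dumps/2021-03.zip",
    access_services=[api],
)

print(unasserted_chain_urls(snapshot))
```

A non-empty result means the asserted description and the axiom disagree about what the access URL of the snapshot is.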


In the context where an entity is expressed as both a distribution and a data service, I do not know what the semantics are. Should an implementation give priority to the value of dcat:endpointURL or of dcat:downloadURL? Should they be the same? If they differ, what is the intention? I cannot explain that, and for that reason I would rather stick to the context above.

@andrea-perego modified the milestones: DCAT3 2PWD, DCAT3 3PWD May 4, 2021
@riccardoAlbertoni added this to To do in DCAT Sprint: Data services via automation Jun 21, 2021
@andrea-perego modified the milestones: DCAT3 3PWD, DCAT3 4PWD Jan 26, 2022
@davebrowning added the future-work (issue deferred to the next standardization round) label Feb 13, 2023
@davebrowning added this to To analyse in DCAT: Potential new requirements via automation Feb 13, 2023

5 participants