Is one of dcat:accessURL, dcat:accessService and dcat:downloadURL required in a dcat:Distribution? #1197

Closed
nfreire opened this issue Dec 18, 2019 · 13 comments
Labels
dcat · due for closing (Issue that is going to be closed if there are no objection within 6 days) · feedback (Issues stemming from external feedback to the WG)

Comments

@nfreire

nfreire commented Dec 18, 2019

Hello DXWG,

At Europeana, we are designing a profile where we need to use the class dcat:Distribution without using any of the three properties dcat:accessURL, dcat:accessService and dcat:downloadURL.
Question: Is this a correct usage of dcat:Distribution? Or should one of the three properties always be stated?

We felt unsure after reading the DCAT 2 latest editor draft.
I believe these are the passages that made us unsure:

The description of dcat:Distribution states, in the last paragraph of section 6.7:

Links between a dcat:Distribution and services or Web addresses where it can be accessed are expressed using dcat:accessURL, dcat:accessService, dcat:downloadURL, as shown in Figure 1 and described in the definitions below.

In the usage notes for these three properties, it is written:

dcat:accessService SHOULD be used to link to a description of a dcat:DataService that can provide access to this distribution.

dcat:accessURL SHOULD be used for the URL of a service or location that can provide access to this distribution, typically through a Web form, query or API call.

dcat:downloadURL SHOULD be used for the URL at which this distribution is available directly,

Many thanks in advance.

Kind regards,
Nuno

@nfreire
Author

nfreire commented Dec 18, 2019

(Note to @aisaac: FYI)

@aisaac
Contributor

aisaac commented Dec 18, 2019

Interested to hear from the DCAT authors, e.g. @andrea-perego and @riccardoAlbertoni :-)

@aisaac aisaac added the dcat label Dec 18, 2019
@makxdekkers
Contributor

@nfreire @aisaac
Can you maybe give more information about your use case for using dcat:Distribution without a link to the actual data? As far as I see it, the main, or maybe the only function of dcat:Distribution is to give access to the data.

@riccardoAlbertoni
Contributor

Hi @nfreire and @aisaac,

Hello DXWG,

At Europeana, we are designing a profile where we need to use the class dcat:Distribution without using any of the three properties dcat:accessURL, dcat:accessService and dcat:downloadURL.

I admit I have some problems imagining the use case. What do these distributions represent?

The description of dcat:Distribution states, in the last paragraph of section 6.7:

Links between a dcat:Distribution and services or Web addresses where it can be accessed are expressed using dcat:accessURL, dcat:accessService, dcat:downloadURL, as shown in Figure 1 and described in the definitions below.

In the usage notes for these three properties, it is written:

dcat:accessService SHOULD be used to link to a description of a dcat:DataService that can provide access to this distribution.

dcat:accessURL SHOULD be used for the URL of a service or location that can provide access to this distribution, typically through a Web form, query or API call.

dcat:downloadURL SHOULD be used for the URL at which this distribution is available directly,

Many thanks in advance.

Your use of dcat:Distribution is not the classic intended case, but in general, I like to take a liberal interpretation where "anything is possible unless explicitly forbidden".
In this case, I do not remember an explicit prohibition, but other editors/contributors might recall otherwise here.

I have reproduced the BCP 14 definitions below, as I think the difference between "MUST" and "SHOULD" helps:

  1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.
    ...
  2. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

In this spirit, I think you can use dcat:Distribution without the properties above, provided that
(i) you are not "reinventing" a solution for a use case we have already solved differently;
(ii) you are not using other properties in place of those suggested by DCAT to express what you would otherwise have expressed with the original DCAT properties.

I hope this general consideration is helpful. If it is not, I am afraid that we need further information about your specific case to give a more specific reply.
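
The difference between "MUST" and "SHOULD" can be illustrated with a small check. This is only a sketch and not part of DCAT or of any existing validator: the function name `check_distribution` and the flat dictionary used to represent a distribution are assumptions made for the example. A missing access property is reported as a warning, in line with SHOULD, rather than as an error, which would correspond to MUST.

```python
# Hypothetical sketch: the three access properties are treated as recommended
# (BCP 14 "SHOULD"), so their absence yields a warning, not an error.
ACCESS_PROPERTIES = ("dcat:accessURL", "dcat:accessService", "dcat:downloadURL")


def check_distribution(properties):
    """Return (errors, warnings) for a distribution given as a dict.

    `properties` maps property names to values; this flat representation
    is an assumption of the example, not a DCAT data structure.
    """
    errors = []    # would hold violations of MUST-level requirements
    warnings = []  # holds departures from SHOULD-level recommendations
    if not any(properties.get(p) for p in ACCESS_PROPERTIES):
        warnings.append(
            "none of dcat:accessURL, dcat:accessService, dcat:downloadURL is present"
        )
    return errors, warnings


errors, warnings = check_distribution({"dct:title": "Europeana subset"})
print(errors)    # []
print(warnings)
```

A profile that decides to leave out all three properties would then see a warning to weigh, which matches the "full implications must be understood" wording of the SHOULD definition.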

@nfreire
Author

nfreire commented Dec 18, 2019

Thanks a lot for the feedback.

Let me try to explain the use case briefly.
We want data providers of Europeana to describe a dataset with a distribution having a SPARQL endpoint access service. This is all achievable with DCAT2.

The part we are missing from DCAT2 is that the SPARQL endpoint serves more data than the dataset to be delivered to Europeana. So providers specify the SPARQL query that will provide the resources that are in the dataset for Europeana.

To represent this distribution of the dataset, we are considering using a pattern mostly based on PROV. The pattern is this one:

[Image: Dataset SPARQL]

The alternative we thought of was to refer to the AccessService from the prov:Activity and also from the dcat:Distribution, as shown below:

[Image: Dataset SPARQL (1)]

And thanks again for your support.

@andrea-perego
Contributor

Thanks for your explanation and contribution, @nfreire .

I would say that your option 2 (i.e., the one using dcat:accessService) is more in line and conformant with DCAT 2.

About the pattern prov:wasGeneratedBy / schema:SearchAction / prov:used (which is not in DCAT 2):

  • IMO, this is an interesting approach that could potentially also be applied to other service types / protocols. Actually, you are outlining a solution for an issue that was raised in the collected use cases (see, e.g., https://www.w3.org/TR/dcat-ucr/#ID18), but that was not addressed in DCAT 2.
  • I wonder whether you considered using prov:wasDerivedFrom instead, possibly in its qualified form (so as to link together the distribution, the data service and the search action with a ternary relationship).

An alternative option would of course be to instantiate the SPARQL endpoint URL with the query itself, and use it as the dcat:downloadURL of the distribution.
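
For future readers, here is a rough sketch of how option 2 might look in Turtle, as far as it can be inferred from this thread (the diagrams are not reproduced here, so the exact shape is an assumption). Every `ex:` identifier, the endpoint URL and the query are made up for illustration, and the helper names `turtle_literal` and `describe_distribution` are hypothetical. The small Python wrapper only fills the query into the template.

```python
# Hypothetical sketch of "option 2" as it can be inferred from this thread.
# All ex: identifiers, the endpoint URL and the query are invented examples.
SPARQL_QUERY = "SELECT ?s WHERE { ?s a <http://schema.org/CreativeWork> }"

TURTLE_TEMPLATE = """\
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix prov:   <http://www.w3.org/ns/prov#> .
@prefix schema: <http://schema.org/> .
@prefix ex:     <http://example.org/> .

ex:dataset a dcat:Dataset ;
    dcat:distribution ex:distribution .

ex:distribution a dcat:Distribution ;
    dcat:accessService ex:sparqlService ;
    prov:wasGeneratedBy ex:selection .

ex:sparqlService a dcat:DataService ;
    dcat:endpointURL <http://example.org/sparql> .

ex:selection a prov:Activity, schema:SearchAction ;
    prov:used ex:sparqlService ;
    schema:query {query_literal} .
"""


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def describe_distribution(query):
    """Fill the SPARQL query into the Turtle template above."""
    return TURTLE_TEMPLATE.format(query_literal=turtle_literal(query))


turtle = describe_distribution(SPARQL_QUERY)
print(turtle)
```

The distribution keeps its DCAT 2 link to the service through dcat:accessService, while the PROV part carries the query that selects the subset.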

@nfreire
Author

nfreire commented Jan 8, 2020

Thanks @andrea-perego
Your feedback is giving me much to think about.
I'll need to discuss these options within Europeana, but meanwhile I prepared a diagram with the suggestion of the qualification of prov:wasDerivedFrom:

[Image: Dataset SPARQL]

Any further feedback is welcome.

Best

@andrea-perego andrea-perego added this to To do in DCAT revision via automation Jan 15, 2020
@andrea-perego andrea-perego added the feedback Issues stemming from external feedback to the WG label Jan 15, 2020
@aisaac
Contributor

aisaac commented Jan 24, 2020

@andrea-perego we've given a second thought to using prov:wasDerivedFrom. It seems that saying that a distribution derives from an activity stretches the semantics of the property in PROV a bit. prov:wasGeneratedBy seems more natural between an entity like a dcat:Distribution and a prov:Activity. Actually, if prov:wasDerivedFrom were to be used, it would feel more natural between the Distribution and the DataService (SPARQL endpoint). But then it competes with dcat:accessService, and probably we prefer to keep using dcat:accessService, don't we?

@andrea-perego
Contributor

Yes, @aisaac , I see your point.

About this

Actually, if prov:wasDerivedFrom were to be used, it would feel more natural between the Distribution and the DataService (SPARQL endpoint).

I am not completely sure. I'd rather say that the distribution d is "derived from" the set D of data items available from a data service (e.g., in the sense that d is a subset of D).

@andrea-perego andrea-perego added the future-work issue deferred to the next standardization round label Feb 7, 2020
@riccardoAlbertoni
Contributor

@andrea-perego said:

An alternative option would of course be to instantiate the SPARQL endpoint URL with the query itself, and use it as the dcat:downloadURL of the distribution.

For the record, I expand and make a little more explicit what was already said by @andrea-perego as a possible alternative. I think this can be useful to others for future reference.

You (@nfreire) can indicate the URL for the SPARQL query in a dcat:Distribution. Something like
dcat:accessURL http://data.nationallibrary.fi/bib/sparql?query=PREFIX+schema%3A+%3Chttp%3A%2F%2Fschema.org%2F%3E+++Select+%3Furi+where+%7B%3Furi+a+schema%3ACreativeWork+.+%3Furi+schema%3Aurl+%3Furl%7D+%0D%0ALIMIT+10%0D%0A&format=application%2Fsparql-results%2Bjson

In this way, you do not need to specify a dcat:DataService, as you are neglecting the fact that there is an endpoint. You might need to specify the proper media type (application/sparql-results+json, I guess).

This alternative is compatible with DCAT 2 and DCAT 2014.
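
For reference, a URL of this kind can be assembled with a few lines of Python. This is a minimal sketch: the helper name `sparql_query_url` is hypothetical, the query is a simplified variant of the one in the URL above, and the `format` parameter is simply copied from that URL (whether a given endpoint honors it is an assumption).

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint taken from the example URL above.
ENDPOINT = "http://data.nationallibrary.fi/bib/sparql"

# Simplified variant of the query encoded in the URL above.
QUERY = (
    "PREFIX schema: <http://schema.org/> "
    "SELECT ?uri WHERE { ?uri a schema:CreativeWork . ?uri schema:url ?url } "
    "LIMIT 10"
)


def sparql_query_url(endpoint, query, result_format):
    """Append the URL-encoded query and format parameters to the endpoint URL."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})


url = sparql_query_url(ENDPOINT, QUERY, "application/sparql-results+json")
print(url)

# Decoding the query string again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0] == QUERY)  # True
```

The resulting string is what would be given as the value of dcat:accessURL (or dcat:downloadURL, as suggested earlier in the thread) on the distribution.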

@riccardoAlbertoni
Contributor

@nfreire and @aisaac: it seems to me that our replies have explained the support DCAT 2 can provide for this issue. If there are no new updates/doubts from your side, I think we can close this issue.

As we are now considering new requirements for the next round of specification (DCAT 3), I wonder if the group, and in particular the other editors, want to mint a new specific requirement for DCAT 3 from this issue.

Something like
"DCAT should provide its own way to indicate the query to pose to the data service endpoints."
That would be in the spirit of promoting interoperability beyond @nfreire's specific case.

Any opinion on this?

@nfreire
Author

nfreire commented Apr 7, 2020

@riccardoAlbertoni, @aisaac, @andrea-perego
In my opinion, the proposed solution is adequate and we may close the issue.

Further refining a solution for this use case in DCAT 3 would be good, in my opinion. We may have other data services besides SPARQL with similar requirements.
...and thanks all for helping.

@riccardoAlbertoni riccardoAlbertoni added the due for closing Issue that is going to be closed if there are no objection within 6 days label Apr 7, 2020
@riccardoAlbertoni
Contributor

The new candidate requirement is registered as a new issue now. So we can close this issue.

DCAT revision automation moved this from To do to Done Apr 8, 2020
@andrea-perego andrea-perego removed the future-work issue deferred to the next standardization round label Mar 26, 2022