Values for dcterms:conformsTo of instances dcat:DataService (sparql endpoints, for example) #1211
Comments
It seems that DCAT uses mostly URLs of specifications as in Example 45. Is it a best practice we should follow? |
My preference would be for the use of specification URIs, not namespace URIs. This is because, according to all of the profiles thinking we have done in DXWG, a specification/profile is fundamentally a different thing from a namespace. Obviously specifications might have namespaces for their technical content, but a specification is a "larger" thing than just a namespace. The problem here is, of course, that most specifications' URIs are not set up for any sort of machine actioning, so the best you could currently hope for is that the URI provides a universally unique identifier. I hope that, in time, specifications that wish to be well machine-readable will provide Linked Data functionality for their specification URI, so that you can get not only a human-readable specification document, as you can presently for a spec like, say, DCAT, but also an RDF version of the specification. That RDF version should probably be something like a Profiles Vocabulary description of it, which then tells you where all the other profile parts, such as machine-actionable constraints etc., are. This question is interesting for the profiles work so I'm tagging it profiles-vocabulary too. |
Namespaces just give you a collection of elements with some axioms - i.e. it is the ingredients, not really the recipe. It would usually be the recipe that you want to conform to - the patterns and usage rules. These might be expressed in a machine actionable form (e.g. SHACL) but will often be supplemented by additional instructions that are expressed in natural language. As @nicholascar says, there will typically be a suite of artefacts associated with a specification. The profiles-vocabulary provides one way to describe and 'package' them. I would expect a document conforming-to the profiles vocabulary to be accessible from the specification URI (i.e. the /TR/ URI, not the /ns/ URI) (using conneg by profile of course ...) |
In our case we would like to use the |
@luizbonino you have a noble goal in mind here, but the issue is that the use you're describing is not standardised; therefore, if you do this, others might not. The proposal in the Profiles Vocabulary, which is aiming at being a standard, is to identify specifications and then link things like SHACL documents to them. The documents can have roles so you can know that I think the Profiles Vocabulary can cater for your requirement and it would have you define your specifications of interest (and profiles of them) and the relevant validation resources. |
@luizbonino We also have a similar requirement to yours. In our work around Europeana, we also have difficulties with the use of dcterms:conformsTo for data resources. Some data sources will state that the data conforms to a namespace, others use the URL to an XML-Schema. And we know also that any of these choices will not be appropriate in cases of general data container formats. It is also necessary to know the data profile in use. @nicholascar you pointed out the key aspects of unique identification and the machine readability. |
@nicholascar has pointed out that this use case is a motivation for the Profiles Vocabulary as an information resource that can address such use cases. It's worth pointing out that implicit in this is the httpRange-14 problem: you need to reference the conceptual specification, as @dr-shorthair suggested, and then decide which information resource best represents it. When you are talking about "schema", that's a perfect example of where flexibility is needed for the different resources that form part of the recipe for a specification. For XML that will be XSD, for JSON that will be JSON Schema, for RDF it's RDFS, OWL and/or SHACL. But if using JSON you also might want JSON-LD contexts. At this point you can describe all the available resources using Profiles, but you can also dereference the specification URI using content negotiation and content-negotiation-by-profile to access resources directly without processing the PROF description. Ultimately you need a framework for disseminating the identifiers to the community: having a URL link to some information resource is inherently fragile and non-extensible, so you need to manage non-information-resource URIs and the infrastructure to dereference them. At the OGC I am working through publishing profile descriptions for specifications, and setting up such infrastructure, but there is a legacy of embedded document references to think about too. Some members of the DXWG are interested in best practices for publishing specifications from a practical sense, but whether this emerges as a formal deliverable in some guidance document is not clear yet. Please include us in reviews of any proposals to formalise requirements. |
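As a rough illustration of the dereferencing step described above, the following minimal sketch builds the request headers a client might send to a specification URI when asking for one particular representation. It assumes the `Accept-Profile` header from the Content Negotiation by Profile draft; the media type and the profile URI shown are hypothetical placeholders, not identifiers defined by any specification in this thread.

```python
# Hypothetical identifier, for illustration only.
SHACL_PROFILE_URI = "https://example.org/profile/shacl-constraints"


def build_profile_headers(media_type, profile_uri):
    """Return HTTP request headers asking for a representation that has the
    given media type and conforms to the given profile.

    Assumption: the server implements content negotiation by profile and
    understands the Accept-Profile header; a server that does not will
    simply ignore it.
    """
    return {
        "Accept": media_type,
        # The profile URI is wrapped in angle brackets.
        "Accept-Profile": f"<{profile_uri}>",
    }


headers = build_profile_headers("text/turtle", SHACL_PROFILE_URI)
for name, value in headers.items():
    print(f"{name}: {value}")
```

The headers can be passed to any HTTP client together with the specification URI. The point is only that the same URI can yield a human-readable page, a PROF description, or a specific artefact depending on what the client asks for.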
The intent of dct:conformsTo has to do with "established standards". The full definition is: "An established standard to which the described resource conforms." This is where you can say that your resource conforms to ISO 2709 or Oasis-Open "legaldocml-akomantoso". An internal document or application, such as a SHACL file, doesn't really fit this definition. The profiles vocabulary may be better suited to this. I don't see PROF as a substitute for this DC term, but a statement with a different meaning. |
@kcoyle, thank you Karen, this distinction between established standards and internal documents is key for our needs. I think that, for our case, PROF would be appropriate. PROF's example 1 (https://www.w3.org/TR/dx-prof/#eg-initial-example) is very close to what we intend. |
It seems to me the important question is from the client perspective. The client parses a metadata document and finds various distributions for a resource of interest; the dct:conformsTo property should provide the criteria to identify a distribution that the client will be able to parse and use. The problem is that there is a dependency on the sophistication of the client: some clients might require a very specific serialization of the resource representation to work (e.g. XML according to schema X, using vocabulary Y for property values), while others might have the logic necessary to handle any application/xml. Thus the dct:conformsTo profile URI needs to be hierarchical or multivalued. Established standards are good, but for a particular community and client, as long as the client recognizes a conformsTo URI for a distribution it can work with, it really doesn't matter if it's a 'standard'. |
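The client-side selection described in this comment could look something like the sketch below. It assumes the metadata has already been parsed into plain dictionaries with a multivalued `conformsTo` list; the distribution identifiers and the example.org URIs are hypothetical.

```python
# Hypothetical, already-parsed distribution records.
DISTRIBUTIONS = [
    {
        "id": "dist-xml",
        "conformsTo": [
            "https://example.org/spec/schema-x",
            "https://example.org/spec/vocab-y",
        ],
    },
    {"id": "dist-json", "conformsTo": ["https://example.org/spec/json-profile"]},
    {"id": "dist-unknown"},  # no conformance claim at all
]

# URIs this particular client knows how to handle (hypothetical).
CLIENT_SUPPORTS = {"https://example.org/spec/schema-x"}


def select_distributions(distributions, supported):
    """Keep the distributions that claim conformance to at least one
    specification URI the client recognizes."""
    supported = set(supported)
    return [
        d for d in distributions
        if supported & set(d.get("conformsTo", []))
    ]


usable = select_distributions(DISTRIBUTIONS, CLIENT_SUPPORTS)
print([d["id"] for d in usable])
```

A multivalued property lets a simple client match on a broad URI while a stricter client matches on a narrower one; a hierarchical scheme would need extra logic that this sketch does not attempt.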
@smrgeoinfo What does matter is not to change the semantics of the dct:conformsTo, since that is defined by DCMI. If you need different semantics then you need a different property. So, yes, it does matter that it is a 'standard' although the interpretation of 'established standard' is somewhat loose. I would say that using dct:conformsTo to link from a dataset to a specific processing program or internal schema is more than a stretch, and doesn't help the client select an application vs a defined standard. Obviously there are no Dublin Core Cops to stop you, but you may not be communicating well outside of your own narrow community if you use dct:conformsTo with two very different meanings. |
PROF uses To achieve what @smrgeoinfo wants, we just need to see more things that are used as "An established standard...", as per the I imagine that for a "big" standard like ISO 2709, you would use PROF to indicate conformance to it with |
I'll just add that I've joined ISO's Technical Committee 211 with a view to ensuring that the ISO 19* series standards (about geographic information) have sensible URIs to use for conformance claims. This is partly already the case, see https://def.isotc211.org/ which shows how to generate URIs for parts of 19* standards. |
Just remember there are three things happening - all of which are fairly common sense unless mixed up...
So PROF isn't a prerequisite of |
I'm concerned that the dots are not being fully connected here. That begs the question of what do you get when you dereference the URI denoting a 'standard'? But you could also do content-negotiation on the same URI and get a PROF resource, which is an alternative representation of a standard, in particular to support traversal to a variety of artefacts, some of which are executable in a particular environment. |
@dr-shorthair - there is no actual grounding for "established"; that's up to whoever uses the URI to determine if it meets their needs. There is thus no control over what dereferencing will mean, only a best practice. Generally humans should be able to cope with an HTML rendering of PROF as a "landing page"; the only thing that would be upset would be automated document harvesters that are unable to ask for the right profile (which would be advertised). Still working through this implementation detail. At least we have canonical mechanisms available now; the question is only one of transition strategy ;-) |
Actually I was not particularly interested in the 'established' question. |
@kcoyle |
Thanks everyone for the input! With @nfreire we've decided that we're going to put our bet on PROF. So ideally we would have URIs for the specifications used with |
This is all food for the Profile Guidance document which looks like it’s still definitely needed! |
Yes good point @nicholascar ! I guess that instead of closing it we could keep it open and tag it with "guidance" so that we don't forget to add it there. And remove the tags "profile-vocabulary" and "DCAT"... |
@aisaac said:
Yep, better to create a separate issue, not to overload this one. @rob-metalinkage , I've copy-pasted your comment in #1338 . I'll reply to you there. |
@aisaac said:
You're right, @aisaac. The revision you refer to stems from the discussion in #1225 (in particular, #1225 (comment)), and I'm afraid I forgot about what was discussed in this thread.
Following the discussion in #1225, we have updated the draft by providing some guidelines: see the last part of §13.2.1 (Conformance to a standard). But those guidelines do not specifically discuss whether you should point to the protocol or the query language specification. |
@andrea-perego thanks for the answer. I must say I am mildly convinced by the discussion on #1225. The protocol URI seems to have appeared out of the blue there, without explanation. I have tried to see if the European Data Portal could give us expectation of the usage of the protocol URI vs the query language URI. Firing the query
at https://www.europeandataportal.eu/sparql-manager/en/ Maybe this is something we could ask advice from the wider group? Unless you can show me some more motivation, which I would probably accept :-) |
@aisaac said:
I think it came out after I made a reference - see #1225 (comment) - to the list of URIs maintained by OSGeo, where they use https://www.w3.org/TR/sparql11-protocol : https://github.com/OSGeo/Cat-Interop/blob/master/LinkPropertyLookupTable.csv
An argument could be that what distinguishes a service / API are the protocol and query parameters, whereas it is not necessarily bound to a specific query language (there are services / APIs that support different query languages). But we should indeed ask input from the group. |
@aisaac this is not proof as this comes from our Czech catalog where I used it in connection to the referenced discussion. I agree with @andrea-perego in the argumentation that the protocol (e.g. HTTP methods and Media types) is what characterizes the data service more than the specification of the query language. |
Hi, following a suggestion from my colleague @Abbe98, we could have two |
Sorry for my late reply, @aisaac . Not sure The current approach is minimal: the description of the service / API interface is meant to be included in the "document" linked to from That said, it might be worth extending the current approach by making DCAT able to specify at least a subset of the information in the endpoint description (such as the supported query languages, formats, and profiles) if relevant for uses in scope of DCAT (e.g., search / filtering services / APIs). @aisaac , I wonder whether you could elaborate on the requirement of having the query languages included in DCAT records, and the related use cases. PS: Probably, we'd better open a new issue on this specific point. |
The last DXWG plenary (https://www.w3.org/2021/05/18-dxwg-minutes#t02) discussed the option of creating a new GH repo for the Profile Guidance document, and move there all the issues labelled with As this one is also labelled with |
I'm not sure why this is labelled profile-guidance. I don't expect that group to be discussing individual properties. I think we can remove profile-guidance from this, and if it comes up again (?) in that group we can create our own issue. |
I think it is there to remind us to explain how to go about describing a data service using DCAT. |
I don't think we'll be getting into that level of detail in the guidance document. I've been thinking more along the lines of general guidance, like:
If we are going to get into specific aspects of DCAT, like how to define data services, that will be quite a bit more work, and will be less extensible to profiles in general. I can go either way, but we should decide on our scope ASAP. |
Looking over the thread again, I think my initial impression was just wrong. |
The underlying issue is we (the wider world) have a hole around generic description of conformance and specifications and relationship to data, and another one around direct vs. service orientated access to resources. PROF helps but only handles two immediate needs around direct inheritance and qualified link to support annotation of resources that may be used to describe a specification. Specification models - description of types of conformance target - seem to be the key - how to separately describe a range of specifications in play, including but not limited to:
Probably a good thing for DCAT to separate the concerns here and consider canonical properties for different aspects, or a qualified role mechanism. |
Hi, sorry for having let so much time pass. First, I can confirm that my comment was indeed not about profile guidance in general, only about the specific DCAT model. Second, to react to @rob-metalinkage , while I see there might be a need for more advanced machinery, for the moment I can live with anything simple that allows me to indicate that a data service complies with SPARQL. This was the requirement, @andrea-perego ! As for whether the solution based on a 'document' would be acceptable for data consumers, I guess this should be passed on to the owners of the European Data Portal, as I do not own a catalogue of data services; I am a mere data provider :-) Originally we had agreed that this requirement was to be achieved by using the URI of the SPARQL language with dcterms:conformsTo. If this is deemed less appropriate than using the URI of the SPARQL protocol, I can live with it, especially because there is a one-to-one relationship between the SPARQL protocol and the query language. |
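For concreteness, here is a minimal sketch of what such a conformance claim could look like when serialized as Turtle. It is not taken from the DCAT draft: the service and endpoint URIs are hypothetical, and whether to list the protocol URI, the query language URI, or both is exactly the open question in this thread, so the function simply accepts whichever list the publisher settles on.

```python
SPARQL_PROTOCOL = "https://www.w3.org/TR/sparql11-protocol/"
SPARQL_QUERY = "https://www.w3.org/TR/sparql11-query/"


def describe_sparql_service(service_uri, endpoint_url, conforms_to):
    """Return a Turtle description of a dcat:DataService whose
    dcterms:conformsTo values are the given specification URIs."""
    objects = " ,\n        ".join(f"<{uri}>" for uri in conforms_to)
    return (
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
        "\n"
        f"<{service_uri}> a dcat:DataService ;\n"
        f"    dcat:endpointURL <{endpoint_url}> ;\n"
        f"    dcterms:conformsTo {objects} .\n"
    )


# Hypothetical service and endpoint URIs, for illustration only.
print(describe_sparql_service(
    "https://example.org/service/sparql",
    "https://example.org/sparql",
    [SPARQL_PROTOCOL],
))
```

Passing `[SPARQL_PROTOCOL, SPARQL_QUERY]` instead would emit both values, which is one way to sidestep the choice until the group gives guidance.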
Hi, I would like to know if this issue is still tracked somewhere. I'm still keen to discuss this with my colleague @nfreire ! |
Dear DXWG,
I'm working on a profile for dataset description in cultural heritage (within Europeana).
I'm looking for guidance on how to specify the values that dcterms:conformsTo should have, so as to ensure machine interoperability.
I've observed some real-life cases and noticed that sometimes the values are namespaces and in other cases the values are links to the specifications.
Since we are interested in machine interoperability, namespaces look more appropriate but I'm far from certain.
Here is an example. In the case of a SPARQL endpoint...
... should namespaces defined by the protocol be used:
... or the specification documentation:
https://www.w3.org/TR/sparql11-query/
Is there a common practice, or recommendation, within DCAT?
Thanks in advance,
Nuno