Usage notes [RUN] #86
In theory, yes, although in my experience such information is typically specified at the dataset level. But I don't think we should be prescriptive here and prevent people from specifying usage notes / caveats at the distribution level, if they want to do so.
+1 to add this info also at the distribution level.
I think we should also provide some common sense sort of guidance, for instance:
Below are some possible solutions for representing a Usage Note, based in part on the discussion we had with @andrea-perego and @agbeltran at the last DCAT teleconference. Please let me know which seems most reasonable to you so that I can prepare a Pull Request.
Solution 1: Using skos:note and its subproperties (e.g., skos:scopeNote, skos:editorialNote).
Pros: This is a simple pattern I have seen applied. For example, we are documenting properties and classes in the DCAT ttl with skos:scopeNote.
Solution 2: If we think that skos:scopeNote is misleading, we might define a fresh new DCAT term such as dcat:usageNote.
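To make the simple-property pattern of Solutions 1 and 2 concrete, here is a minimal Python sketch that renders a usage note as a single Turtle statement. The function name, the example dataset IRI and the note text are illustrative assumptions, and dcat:usageNote is only the term proposed above, not an existing DCAT property.

```python
import json

# Namespace IRIs for the two candidate properties.
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dcat": "http://www.w3.org/ns/dcat#",
}

def usage_note_turtle(dataset_iri, note, prop="skos:scopeNote", lang="en"):
    """Render one usage note as a prefix declaration plus one Turtle triple.

    `prop` is a prefixed name: "skos:scopeNote" (Solution 1) or the
    hypothetical "dcat:usageNote" (Solution 2).
    """
    prefix = prop.split(":", 1)[0]
    # json.dumps gives a double-quoted string whose escapes (\" \\ \n \uXXXX)
    # are also valid in Turtle; adequate for ordinary text in this sketch.
    literal = json.dumps(note, ensure_ascii=False)
    return (
        f"@prefix {prefix}: <{PREFIXES[prefix]}> .\n"
        f"<{dataset_iri}> {prop} {literal}@{lang} ."
    )

if __name__ == "__main__":
    print(usage_note_turtle("http://example.org/dataset/1",
                            "Not suitable for navigation purposes.",
                            prop="dcat:usageNote"))
```

Either property yields the same flat shape: one triple from the dataset (or distribution) to a language-tagged literal.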
Solution 3: Using Web Annotation Vocabulary
We probably need to be more specific about the motivation of the annotation. Following the instructions about Extending Motivations, we can define a dedicated motivation (i.e., dcat:usageNote, or dcat:useNote if dcat:usageNote clashes with the name of the property defined in Solution 2) as follows
Solution 4: Using dqv:QualityAnnotation. The concept of Usage Note is somewhat related to fitness for use, and hence to the notion of quality. We can express the same note above with dqv:QualityAnnotation (which is a subclass of oa:Annotation).
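Solutions 3 and 4 share the same annotation shape and differ only in the annotation class and motivation. The sketch below builds that shape as plain (subject, predicate, object) tuples, assuming a simple string body via oa:bodyValue. The function name and example IRIs are made up for illustration, the dcat:usageNote motivation is the hypothetical one proposed above, and IRIs and literals are both plain Python strings here, which a real RDF library would distinguish.

```python
OA = "http://www.w3.org/ns/oa#"
DQV = "http://www.w3.org/ns/dqv#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT_CREATOR = "http://purl.org/dc/terms/creator"

def usage_annotation(annotation, dataset, note, creator, quality=False):
    """Return triples describing a usage note as a web annotation."""
    if quality:
        # Solution 4: a dqv:QualityAnnotation, motivated by dqv:qualityAssessment.
        ann_type = DQV + "QualityAnnotation"
        motivation = DQV + "qualityAssessment"
    else:
        # Solution 3: a generic oa:Annotation with the hypothetical
        # dcat:usageNote motivation discussed in the thread.
        ann_type = OA + "Annotation"
        motivation = "http://www.w3.org/ns/dcat#usageNote"
    return [
        (annotation, RDF_TYPE, ann_type),
        (annotation, OA + "hasTarget", dataset),
        (annotation, OA + "bodyValue", note),
        (annotation, OA + "motivatedBy", motivation),
        # Records who made the statement, e.g. the data provider.
        (annotation, DCT_CREATOR, creator),
    ]

if __name__ == "__main__":
    for triple in usage_annotation("http://example.org/annotation/1",
                                   "http://example.org/dataset/1",
                                   "Not suitable for navigation purposes.",
                                   "http://example.org/publisher",
                                   quality=True):
        print(triple)
```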
Balancing simple and complex solutions
As far as I remember, you cannot have a property chain ending in a literal, so we might think of providing a SPARQL CONSTRUCT/INSERT to convert solution 3 or 4 into 1 or 2. For example,
Probably 1+4 is the combination that maximises the reuse of what is already out there, as it reuses SKOS, DQV, and OA, since dqv:QualityAnnotation is based on the Web Annotation Vocabulary.
Let's try a quick vote on the solutions to see how the group is oriented. I suggest using the emoticons with the following meanings
By combining more than one emoticon, people can try to express a fairly composite opinion; let's see if it works ;)
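The conversion idea can be sketched in Python over plain triples; it does the same job a SPARQL CONSTRUCT would, deriving one flat statement for every (target, body) pair of an annotation. This assumes annotations carry their text in oa:bodyValue; the function name and the default dcat:usageNote property are hypothetical.

```python
OA = "http://www.w3.org/ns/oa#"
# Hypothetical flat property from Solution 2.
USAGE_NOTE = "http://www.w3.org/ns/dcat#usageNote"

def flatten_usage_notes(triples, prop=USAGE_NOTE, motivation=None):
    """Derive (target, prop, note) triples from annotation triples.

    If `motivation` is given, only annotations with that oa:motivatedBy
    value are converted; otherwise every annotation is.
    """
    targets, bodies, motives = {}, {}, {}
    for s, p, o in triples:
        if p == OA + "hasTarget":
            targets.setdefault(s, []).append(o)
        elif p == OA + "bodyValue":
            bodies.setdefault(s, []).append(o)
        elif p == OA + "motivatedBy":
            motives.setdefault(s, []).append(o)
    flat = []
    for ann, ann_targets in targets.items():
        if motivation is not None and motivation not in motives.get(ann, []):
            continue  # annotation made for some other purpose
        for target in ann_targets:
            for body in bodies.get(ann, []):
                flat.append((target, prop, body))
    return sorted(flat)

if __name__ == "__main__":
    ann = "http://example.org/annotation/1"
    data = [
        (ann, OA + "hasTarget", "http://example.org/dataset/1"),
        (ann, OA + "bodyValue", "Not suitable for navigation purposes."),
        (ann, OA + "motivatedBy", "http://www.w3.org/ns/dqv#qualityAssessment"),
    ]
    print(flatten_usage_notes(data))
```

Passing `prop` as the skos:scopeNote IRI would target Solution 1 instead of Solution 2.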
Thanks for elaborating this proposal, @riccardoAlbertoni .
I would not opt for SKOS, as none of the documentation properties seems to denote "usage".
I would nonetheless support the use of a simple property, although not excluding the possibility of using a more articulated representation (in case people need it). So, I welcome the idea of providing a mutual mapping. BTW, as I mentioned during the last DCAT call, this links back to the issue around the mapping of qualified and non-qualified relationships (#80).
For the simple property, an option (documented in UC9) is
Although VANN is meant to provide a way to annotate vocabularies, the definition of
The possible issue relates to a discussion we had on the use of this property for annotating DCAT, around the question on whether this is a datatype or object property, since it is a subproperty of
However, if the group decides that this option is not acceptable, I would be in favour of Solution 2 - namely, defining a property
About the alternative, more complex representation, I have yet to make up my mind on your 2 proposals, @riccardoAlbertoni - namely, using either OA (more generic) or DQV (focussing on data quality aspects). It is indeed true that usually this is about fitness for purpose, but there are also other cases, and I'm not sure all of them may fit into the notion of data quality (even considered in its broadest sense). So, maybe it is "safer" to use OA. But, again, at the moment I don't have a strong position in favour of / against either option.
That said, I think the complex option needs to take into account another aspect. Both OA and DQV have been designed to enable the creation of annotations by anyone - providers and consumers. Since the use case/requirement we are discussing here is about giving data providers a way to specify usage notes, the annotation creator must be explicitly indicated.
Based on this, I would suggest we first decide whether or not we want to support the solution based on a simple property, and consider the more articulated option afterwards.
About the simple property, my vote is as follows (in cascade):
BTW, I've also contributed my vote accordingly, but it's incomplete, as Solution 0 is not included in the options.
My main concern is that at the moment the VANN URI, http://purl.org/vocab/vann/, is not HTTP dereferenceable and I can't find an RDF representation for VANN except for the LOV copy. I wonder if that is another factor that might keep us from recommending vann:usageNote in DCAT. What do you think?
Members can continue to express their preference as previously explained; they can cast a "heart" on @andrea-perego's previous post to support solution 0.
I agree that as the VANN vocabulary is not available (getting 404 with the term http://purl.org/vocab/vann/usageNote), it wouldn't be good to recommend it. @larsgsvensson mentioned that this might be a server configuration issue, but the fact is that this seems to be a problem since last year (see #233 (comment)).
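For anyone who wants to repeat the dereferencing check, here is a minimal standard-library sketch. The helper names and the Accept header value are illustrative assumptions, and the result depends on the state of the servers at the time of the request, so no expected output is shown.

```python
import urllib.error
import urllib.request

def rdf_request(iri, accept="text/turtle, application/rdf+xml;q=0.9"):
    """Build a GET request that asks for an RDF serialisation."""
    return urllib.request.Request(iri, headers={"Accept": accept})

def dereference(iri, timeout=10):
    """Return (HTTP status, final URL after redirects) for the IRI.

    HTTP error responses such as 404 are returned as a status code;
    network-level failures (DNS, timeouts) still raise URLError.
    """
    try:
        with urllib.request.urlopen(rdf_request(iri), timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        return err.code, err.filename

if __name__ == "__main__":
    print(dereference("http://purl.org/vocab/vann/usageNote"))
```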
While it is quite compact in its definitions, it does provide a class
However, I note that while the definitions seem to refer to how the data can be used (that is the requirement we are trying to address), the example given is more about how the data is actually used (see example 6 at https://www.w3.org/TR/vocab-duv/#examples). The property name
In terms of vocabularies for specifying data use restrictions or limitations for Data Use Agreements, the Data Use Ontology has been developed in the context of the Global Alliance for Genomics and Health. It defines common terms for determining how the data can be used (see, for example, data use limitations such as 'data available for future general research use', etc).
My understanding is that also ODRL could be used to express conditions of usage.
So, we need a property that allows us to refer to some textual description of the data use conditions (usage notes), an ontology term (such as those from DUO), as well as a more detailed expression of the usage conditions (such as an ODRL expression).
(BTW, in the context of DATS, we did some analysis on data access and use requirements that is available in this pre-print: https://doi.org/10.1101/518571)
@agbeltran , use conditions are what licences are meant to describe. The purpose of "usage notes" here is to provide guidance on how to use (as well as how not to use) the data, indicate for which types of "experiments" they are fit, and/or advise about possible limitations, errors, etc. So, I don't think ODRL would be fit for this.
About DUV vs. DQV, I would go for the latter in this case. As you said, DUV has been designed to represent feedback on how data is used. However, "usage notes" (in the sense we mean here) are meant to be a statement from data producers, and I see it more strictly related to intrinsic (qualitative) characteristics of the data.
That said, I still think we should first provide a simple / flat way of specifying usage notes, and go for DQV (or other vocabularies such as OA) for more complex cases.
Funny, I just tried it and it works for me... (redirects to http://vocab.org/vann/#usageNote):
(Already sent by e-mail: https://lists.w3.org/Archives/Public/public-dxwg-wg/2019Mar/0275.html)
I think we need to be careful. A statement of this kind by an individual has less weight than a statement from organisations like W3C or Dublin Core. A person might not be able to honour that commitment -- as in the case of the Canadian cryptocurrency exchange.
A problem is that the vocab.org site recommends all terms to have URIs in purl.org space, but it looks like they were not moved over when purl.org changed owner. So all of them are basically broken.
Unless we can find a way to un-break the VANN URIs, I don't think we can use them.
This is correct. However, as far as I can tell the URIs are still resolvable.
They are not hosted on purl.org but are registered as a redirect to vocab.org so there should not be any issue with purl.org changing owner.
It's a shame that the purl.org URIs appear to have some availability issues, but this is the web so occasional temporary breakages are to be expected. The meaning of the URIs stands regardless of whether they can be resolved at any particular time.
I chose purl.org as a URI provider to give more stability and longevity than could be expected from an individual. That said, I have provided the vocab.org website for 15 years and have no plans to cease providing it. I also have an availability statement on the site that states I will hand it over to another party if I am unable to continue hosting it: http://vocab.org/policies/availability
One mitigation for purl.org availability would be to import the schema from vocab.org directly http://vocab.org/vann/vann-vocab-20050401.rdf
@iand Makx has pointed out in email that if you go to the purl.org site and search for "vann" or "vocab/vann" you get zero hits. A search on "vocab" gets various uses of that term. On the second page you see "vocab" alone and that links to vocab.org but this is the full list of subdomains:
Try a search on "iot" and you can see that this works better than on vocab.org. (I'm on Skype and findable if you want to chat.)