Usage notes [RUN] #86

Closed
jpullmann opened this issue Jan 18, 2018 · 20 comments
Labels
dcat due for closing Issue that has been addressed and it is going to be closed if there are no objection within 6 days provenance referencing requirement roles

@jpullmann

Usage notes [RUN]

Ability to provide information on how to use the data.

Should this also apply to specific distributions?


Related requirements: Profiles listing [RPFL] Dataset aspects [RDSAT] Dataset publications [RDSP] 
Related use cases: Common requirements for scientific data [ID9] 
@andrea-perego
Contributor

andrea-perego commented Jan 20, 2018

Should this also apply to specific distributions?

In theory, yes, although in my experience such information is typically specified at the dataset level. But I don't think we should be prescriptive here and prevent people from specifying usage notes / caveats at the distribution level, if they want to do so.

@riccardoAlbertoni
Contributor

+1 to add this info also at the distribution level.

I think we should also provide some common-sense guidance. For instance, we might suggest that information on how to use the data is provided at the dataset level when it applies independently of the specific distributions, whilst details that apply only to particular distributions should be provided at the distribution level.
Does that make sense?

@nicholascar
Contributor

nicholascar commented Jan 30, 2018

This Requirement, interpreted as a provenance Use Case, in the RDA system: http://patterns.promsns.org/usecase/44

@riccardoAlbertoni
Contributor

I suggest removing the "quality" tag: this is not a quality-specific requirement, as it is only indirectly related to data quality assessment.

@azaroth42

As previously, and with apologies for the broken record sound, could someone clarify the relationship to content_negotiation for this issue or remove the tag please?

@larsgsvensson
Contributor

I agree with @azaroth42 that this relates neither to content_negotiation nor to profile_negotiation, so I am removing the tag.

@riccardoAlbertoni
Contributor

Below are some possible solutions to support the representation of usage notes, also based on the discussion we had with @andrea-perego and @agbeltran at the last DCAT teleconference. Please let me know which seems most reasonable to you so that I can prepare a Pull Request.

Solution 1: Using skos:note and its subproperties (e.g., skos:scopeNote, skos:editorialNote).

 ex:dataset a dcat:Dataset;
    skos:scopeNote "This dataset is not updated, it is served as an example."@en.

Pros: This is a simple pattern I have seen applied. For example, we are documenting properties and classes in the DCAT ttl with skos:scopeNote.

Solution 2: If we think that skos:scopeNote is misleading, we might define a brand-new DCAT term such as dcat:usageNote.

 ex:dataset a dcat:Dataset;
     dcat:usageNote "This dataset is not updated, it is served as an example."@en.

Solution 3: Using Web Annotation Vocabulary

 :note   
     a oa:Annotation;
     oa:hasTarget  ex:dataset  ;
     oa:hasBody :textBody ;
     oa:motivatedBy  oa:commenting .
     
 :textBody a oa:TextualBody ;
     rdf:value "This dataset is not updated, it is served as an example." ;
     dc:language "en" ; 
     dc:format "text/plain".

We probably need to be more specific about the motivation of the annotation. Following the instructions about Extending Motivations, we can define a dedicated motivation (i.e., dcat:usageNote, or dcat:useNote if dcat:usageNote clashes with the name of the property defined in Solution 2) as follows:

dcat:usageNote a oa:Motivation;
    skos:broader  oa:commenting .

Solution 4: Using dqv:QualityAnnotation. The concept of a usage note is somewhat related to fitness for use, and hence to the notion of quality. We can express the same note as above with dqv:QualityAnnotation (which is a subclass of oa:Annotation).

#DQV
   ex:dataset a dcat:Dataset ;
        # With DQV, dqv:hasQualityAnnotation relates the DCAT dataset to the (quality) annotation
        dqv:hasQualityAnnotation :note .

    :note
        # instance of dqv:QualityAnnotation instead of oa:Annotation
        a dqv:QualityAnnotation ;
        oa:hasTarget  ex:dataset  ;
        oa:hasBody :textBody ;
        # added dqv:qualityAssessment
        oa:motivatedBy dqv:qualityAssessment,  dcat:usageNote .

    :textBody a oa:TextualBody ;
        rdf:value "This dataset is not updated, it is served as an example." ;
        dc:language "en" ;
        dc:format "text/plain" .

Balancing simple and complex solutions
There might be reasons for supporting both one of the straightforward solutions (i.e., Solution 1 or 2) and one of the more complex patterns based on OA and DQV (i.e., Solution 3 or 4).
Among the possible reasons, I can count flexibility and the reuse of existing patterns.

As far as I remember, you cannot have property chains ending in literals, so we might think of providing a SPARQL CONSTRUCT/INSERT to convert Solution 3 or 4 into Solution 1 or 2. For example:

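PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dqv: <http://www.w3.org/ns/dqv#>
PREFIX oa: <http://www.w3.org/ns/oa#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>

# Copy the text of each usage-note annotation onto its target as a plain dcat:usageNote literal.
INSERT {
  ?a a dcat:Dataset ;
     dcat:usageNote ?blang .
}
WHERE {
  ?a dqv:hasQualityAnnotation ?note .
  # dcat:useNote is the alternative motivation name mentioned above;
  # it has to match whichever motivation IRI the annotations actually use.
  ?note oa:hasBody ?textBody ;
        oa:motivatedBy dcat:useNote .
  ?textBody a oa:TextualBody ;
            rdf:value ?b ;
            dc:language ?lang ;
            dc:format "text/plain" .
  # Rebuild a language-tagged literal from the body text and its language code.
  BIND (STRLANG(?b, ?lang) AS ?blang)
}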
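
For readers without a SPARQL endpoint at hand, the same flattening can be sketched in plain Python over an in-memory list of triples. This is only an illustration under stated assumptions: the tuple representation, the `flatten_usage_notes` helper, and the choice of dcat:useNote as the motivation are hypothetical and simply mirror the examples in this comment; none of it is an agreed DCAT pattern or a real RDF library.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples of
# prefixed names. This mimics the SPARQL INSERT above without an RDF library.
RDF_TYPE = "rdf:type"


def flatten_usage_notes(triples, motivation="dcat:useNote"):
    """Return the set of triples that the SPARQL INSERT above would add."""

    def objects(subject, predicate):
        return [o for (s, p, o) in triples if s == subject and p == predicate]

    added = set()
    for target, predicate, note in triples:
        if predicate != "dqv:hasQualityAnnotation":
            continue
        # Only annotations carrying the usage-note motivation are flattened.
        if motivation not in objects(note, "oa:motivatedBy"):
            continue
        for body in objects(note, "oa:hasBody"):
            if "oa:TextualBody" not in objects(body, RDF_TYPE):
                continue
            if "text/plain" not in objects(body, "dc:format"):
                continue
            for text in objects(body, "rdf:value"):
                for lang in objects(body, "dc:language"):
                    added.add((target, RDF_TYPE, "dcat:Dataset"))
                    # The (text, lang) pair stands in for a language-tagged literal.
                    added.add((target, "dcat:usageNote", (text, lang)))
    return added


graph = [
    ("ex:dataset", "dqv:hasQualityAnnotation", ":note"),
    (":note", "oa:hasBody", ":textBody"),
    (":note", "oa:motivatedBy", "dqv:qualityAssessment"),
    (":note", "oa:motivatedBy", "dcat:useNote"),
    (":textBody", RDF_TYPE, "oa:TextualBody"),
    (":textBody", "rdf:value", "This dataset is not updated, it is served as an example."),
    (":textBody", "dc:language", "en"),
    (":textBody", "dc:format", "text/plain"),
]

new_triples = flatten_usage_notes(graph)
```

If the annotations use a different motivation IRI, it can be passed through the `motivation` argument; with a motivation that no annotation carries, the function returns an empty set, just as the query would match nothing.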

Probably 1+4 is the combination that maximises the reuse of what is already out there, as it reuses SKOS, DQV, and OA (dqv:QualityAnnotation being based on the Web Annotation Vocabulary).

Let's try a quick vote on the solutions to see how the group is oriented. I suggest using the emoji reactions with the following meanings:

  • laugh : solution 1
  • hooray: solution 2
  • confused : solution 3
  • heart : solution 4
  • rocket : in favour of combined solutions
  • eyes : against combined solutions

By combining more than one emoji, people can try to express a fairly composite opinion. Let's see if it works ;)
Comments and other proposed solutions are welcome as well.

@andrea-perego
Contributor

Thanks for elaborating this proposal, @riccardoAlbertoni .

I would not opt for SKOS, as none of the documentation properties seems to denote "usage".

I would nonetheless support the use of a simple property, although not excluding the possibility of using a more articulated representation (in case people need it). So, I welcome the idea of providing a mutual mapping. BTW, as I mentioned during the last DCAT call, this links back to the issue around the mapping of qualified and non-qualified relationships (#80).

For the simple property, an option (documented in UC9) is vann:usageNote.

Although VANN is meant to provide a way to annotate vocabularies, the definition of vann:usageNote does not constrain its use to vocabularies. Quoting:

A reference to a resource that provides information on how this resource is to be used.

The possible issue relates to a discussion we had on the use of this property for annotating DCAT, around the question of whether this is a datatype or object property, since it is a subproperty of rdfs:seeAlso, which is supposed to be used as an object property (see #233). As far as I know, vann:usageNote is being used in practice either with a "literal" (narrative text) or to point to a document. So, I'm personally not concerned about using it in DCAT for both options (as could be done with SKOS documentation properties). I would refer to this option as Solution 0.

However, if the group decides that this option is not acceptable, I would be in favour of Solution 2 - namely, defining a property dcat:usageNote, that could be used either as a datatype or an object property. For this purpose, it can also be made a subproperty of skos:note.

About the alternative, more complex representation, I have yet to make up my mind on your two proposals, @riccardoAlbertoni - namely, using either OA (more generic) or DQV (focussing on data quality aspects). It is indeed true that usually this is about fitness for purpose, but there are also other cases, and I'm not sure all of them fit into the notion of data quality (even considered in its broadest sense). So, maybe it is "safer" to use OA. But, again, at the moment I don't have a strong position in favour of / against either option.

That said, I think the complex option needs to take into account another aspect. Both OA and DQV have been designed to enable the creation of annotations by anyone - providers and consumers. Since the use case/requirement we are discussing here is about giving data providers a way to specify usage notes, the annotation creator must be explicitly indicated.
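
To make the point about the creator concrete, here is a minimal sketch that flags annotations lacking an explicit creator. It assumes triples held as plain Python tuples of prefixed names and uses dcterms:creator, the property the Web Annotation model uses for this purpose; the `annotations_missing_creator` helper and the `ex:dataPublisher` and `:feedback` names are hypothetical.

```python
def annotations_missing_creator(triples):
    """Return annotation nodes that have no dcterms:creator statement."""
    annotation_types = {"oa:Annotation", "dqv:QualityAnnotation"}
    annotations = {s for (s, p, o) in triples
                   if p == "rdf:type" and o in annotation_types}
    with_creator = {s for (s, p, o) in triples if p == "dcterms:creator"}
    return sorted(annotations - with_creator)


graph = [
    # A usage note published by the data provider, with the creator stated.
    (":note", "rdf:type", "oa:Annotation"),
    (":note", "oa:hasTarget", "ex:dataset"),
    (":note", "dcterms:creator", "ex:dataPublisher"),  # hypothetical provider IRI
    # An annotation of unknown origin: it could come from a provider or a consumer.
    (":feedback", "rdf:type", "dqv:QualityAnnotation"),
    (":feedback", "oa:hasTarget", "ex:dataset"),
]

missing = annotations_missing_creator(graph)
```

Without the creator statement, a consumer of the metadata has no way to tell a provider's usage note apart from third-party feedback on the same dataset.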

Based on this, I would suggest we first take a decision on whether we would like to support or not the solution based on a simple property, and consider the more articulated option afterwards.

About the simple property, my vote is as follows (in cascade):

  1. Solution 0 (vann:usageNote)
  2. Solution 2 (dcat:usageNote with open range) if the group rules out Solution 0.

BTW, I've also contributed my vote accordingly, but it's incomplete, as Solution 0 is not included in the options.

@riccardoAlbertoni
Contributor

Thanks, @andrea-perego.
In principle, I like the idea of reusing vann:usageNote.
I recall that we have used VANN, including the vann:usageNote property with literals, for documenting DQV. Moreover, judging by the VANN-related LOV page, VANN looks very widely adopted.

My main concern is that at the moment the VANN URI, http://purl.org/vocab/vann/, is not HTTP dereferenceable, and I can't find an RDF representation of VANN except for the LOV copy. I wonder if that is another factor that could hold us back from recommending vann:usageNote in DCAT. What do you think?

Members can continue to express their preference as previously explained; to support Solution 0, they can add a "heart" to @andrea-perego's previous post.

@agbeltran
Member

I agree that, as the VANN vocabulary is not available (I get a 404 for the term http://purl.org/vocab/vann/usageNote), it wouldn't be good to recommend it. @larsgsvensson mentioned that this might be a server configuration issue, but the fact is that this seems to have been a problem since last year (see #233 (comment)).

@agbeltran
Member

@riccardoAlbertoni @andrea-perego I wonder if there is a particular reason you didn't refer to the Data Use Vocabulary, https://www.w3.org/ns/duv, which we also mentioned on the last call?

While it is quite compact in its definitions, it does provide a class duv:Usage and a property duv:hasUsage:

duv:Usage a owl:Class, rdfs:Class ;
  rdfs:label "Usage"@en ;
  rdfs:comment "A helpful description of actions that can be performed on a given dataset or distribution."@en .
 
duv:hasUsage a rdf:Property ;
    rdfs:label "has dataset/distribution usage"@en ;
    rdfs:comment "Dataset/distribution usage guidance or instructions."@en ;
.

However, I note that while the definitions seem to refer to how the data can be used (which is the requirement we are trying to address), the example given is more about how the data is actually used (see example 6 at https://www.w3.org/TR/vocab-duv/#examples). The property name duv:hasUsage also suggests how the data is used rather than any restrictions on usage.

In terms of vocabularies for specifying data use restrictions or limitations for Data Use Agreements, the Data Use Ontology has been developed in the context of the Global Alliance for Genomics and Health. It defines common terms for determining how the data can be used (see, for example, data use limitations such as 'data available for future general research use', etc).

My understanding is that ODRL could also be used to express conditions of usage.

So, we need a property that allows us to refer to some textual description of the data use conditions (usage notes), an ontology term (such as those from DUO), as well as a more detailed expression of the usage conditions (such as an ODRL expression).

If duv:hasUsage is appropriate for this, we could re-use it. Otherwise, I think it would be good to create our own property - maybe dcat:usageNote or perhaps a better name could be dcat:useCondition if we restrict it to describe "how to use the data."

(BTW, in the context of DATS, we did some analysis on data access and use requirements that is available in this pre-print: https://doi.org/10.1101/518571)

@andrea-perego
Contributor

andrea-perego commented Mar 11, 2019

@agbeltran , use conditions are what licences are meant to describe. The purpose of "usage notes" here is to provide guidance on how to use (as well as how not to use) the data, indicate for which types of "experiments" they are fit, and/or advise about possible limitations, errors, etc. So, I don't think ODRL would be fit for this.

About DUV vs. DQV, I would go for the latter in this case. As you said, DUV has been designed to represent feedback on how data is used. However, "usage notes" (in the sense we mean here) are meant to be a statement from data producers, and I see them as more closely related to intrinsic (qualitative) characteristics of the data.

That said, I still think we should first provide a simple / flat way of specifying usage notes, and go for DQV (or other vocabularies such as OA) for more complex cases.

@larsgsvensson
Contributor

I agree that as the VANN vocabulary is not available (getting 404 with the term http://purl.org/vocab/vann/usageNote),

Funny, I just tried it and it works for me... (redirects to http://vocab.org/vann/#usageNote):

$ curl -I -L http://purl.org/vocab/vann/usageNote
HTTP/1.1 302 FOUND
Server: nginx/1.14.0 (Ubuntu)
Date: Tue, 12 Mar 2019 09:04:43 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 194
Connection: keep-alive
Location: http://vocab.org/vann/usageNote
Access-Control-Allow-Origin: *

HTTP/1.1 303 See Other
Server: nginx/1.6.2
Date: Tue, 12 Mar 2019 09:04:43 GMT
Content-Type: text/html
Content-Length: 168
Connection: keep-alive
Location: http://vocab.org/vann/#usageNote

HTTP/1.1 200 OK
Server: nginx/1.6.2
Date: Tue, 12 Mar 2019 09:04:43 GMT
Content-Type: text/html
Content-Length: 12563
Last-Modified: Mon, 19 Oct 2015 00:02:14 GMT
Connection: keep-alive
ETag: "56243306-3113"
Accept-Ranges: bytes
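
The three hops in the output above can also be extracted mechanically, which is handy when re-checking the PURL at a later date. A minimal sketch, assuming the `curl -I -L` output has already been captured as a string (an abridged copy of the headers above is pasted in here); the `redirect_chain` helper is hypothetical and performs no network access itself.

```python
def redirect_chain(header_dump):
    """Split `curl -I -L` output into (status_code, location) hops."""
    hops = []
    # curl separates the header block of each hop with an empty line.
    for block in header_dump.replace("\r\n", "\n").strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines or not lines[0].startswith("HTTP/"):
            continue
        status = int(lines[0].split()[1])
        location = None
        for ln in lines[1:]:
            name, _, value = ln.partition(":")
            if name.lower() == "location":
                location = value.strip()
        hops.append((status, location))
    return hops


captured = """\
HTTP/1.1 302 FOUND
Location: http://vocab.org/vann/usageNote

HTTP/1.1 303 See Other
Location: http://vocab.org/vann/#usageNote

HTTP/1.1 200 OK
Content-Type: text/html
"""

hops = redirect_chain(captured)
```

A chain whose last hop has a 200 status and no Location header, as here, indicates that the term resolved at the time the output was captured.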

@makxdekkers
Contributor

(Already sent by e-mail: https://lists.w3.org/Archives/Public/public-dxwg-wg/2019Mar/0275.html)

I think we need to be careful. A statement of this kind by an individual has less weight than a statement from organisations like W3C or Dublin Core. A person might not be able to honour that commitment -- as in the case of the Canadian cryptocurrency exchange.

A problem is that the vocab.org site recommends all terms to have URIs in purl.org space, but it looks like they were not moved over when purl.org changed owner. So all of them are basically broken.

Unless we can find a way to un-break the VANN URIs, I don't think we can use them.

@kcoyle
Contributor

kcoyle commented Mar 14, 2019

I'm attempting to ping Ian Davis (vocab.org) to see if he knows something. I'm confused because Lars was successful, so I'm wondering if PURL is just unreliable. Is there a reason why we can't use the vocab.org namespace?

@makxdekkers
Contributor

I think we can't because vocab.org says the canonical URIs are in purl.org. It's like referring to dublincore.org for the DCMI terms.

@iand

iand commented Mar 14, 2019

I think we can't because vocab.org says the canonical URIs are in purl.org. It's like referring to dublincore.org for the DCMI terms.

This is correct. However as far as I can tell the URIs are still resolvable.

They are not hosted on purl.org but are registered as a redirect to vocab.org so there should not be any issue with purl.org changing owner.

It's a shame that the purl.org URIs appear to have some availability issues, but this is the web so occasional temporary breakages are to be expected. The meaning of the URIs stands regardless of whether they can be resolved at any particular time.

I chose purl.org as a URI provider to give more stability and longevity than could be expected from an individual. That said, I have provided the vocab.org website for 15 years and have no plans to cease providing it. I also have an availability statement on the site that states I will hand it over to another party if I am unable to continue hosting it: http://vocab.org/policies/availability

One mitigation for purl.org availability would be to import the schema from vocab.org directly http://vocab.org/vann/vann-vocab-20050401.rdf

@kcoyle
Contributor

kcoyle commented Mar 15, 2019

@iand Makx has pointed out in email that if you go to the purl.org site and search for "vann" or "vocab/vann" you get zero hits. A search on "vocab" gets various uses of that term. On the second page you see "vocab" alone and that links to vocab.org but this is the full list of subdomains:


    /vocab
    /vocab/cpsv
    /vocab/nsrf-gr
    /vocab/nsrf-gr/
    /vocab/parlipro
    /vocab/thingful/qu
    /vocab/vfp
    /vocab/vfp/
    /vocab/xxwed

Try a search on "iot" and you can see that this works better than on vocab.org. (I'm on Skype and findable if you want to chat.)

@davebrowning davebrowning added the future-work issue deferred to the next standardization round label Sep 25, 2019
@andrea-perego andrea-perego moved this from To do to In progress in DCAT Sprint: Requirements Nov 3, 2020
@andrea-perego
Contributor

Trying to revive this issue, taking stock of the discussion in this thread:

Should we define a property dcat:usageNote (or similar), and suggest the use of DQV for more extended descriptions?

Otherwise, if we don't plan to address this requirement, I propose we close this issue.

@andrea-perego andrea-perego added the due for closing Issue that has been addressed and it is going to be closed if there are no objection within 6 days label Mar 13, 2021
@andrea-perego
Contributor

Noting no reactions, I'm closing this issue.

DCAT revision automation moved this from To do to Done Mar 20, 2021
DCAT Sprint: Requirements automation moved this from In progress to Done Mar 20, 2021
@andrea-perego andrea-perego removed the future-work issue deferred to the next standardization round label Mar 26, 2022