Indicators for R1.2: (meta)data are associated with detailed provenance #28
Points raised in online meeting 3 on 18 June 2019
To add a little more: there are many ways of recording provenance such that it can be managed automatically. The W3C PROV recommendations are not the only way. In fact, provenance information in data models other than PROV has been used for a long time in many research domains, since it is commonly critical for evaluating the reusability (relevance, quality) of the asset.
@keithjeffery I am not sure the aspects you mention cannot be satisfied using PROV. As I understand it, PROV is very flexible with its Expanded and Qualified terms and might be able to express all of that. On the other hand, I think no one is proposing (yet) that an indicator reference PROV-O specifically. How could an indicator be formulated? Could it enumerate some critical provenance items (like the ones you list), or should we link to existing standards/guidelines that could form the basis for the indicator? If so, which standards/guidelines would be candidates for such a reference?
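As an illustration of the flexibility mentioned above, here is a minimal sketch (not from the thread; all identifiers are hypothetical) of how PROV's expanded terms such as `prov:used` and `prov:startedAtTime` can carry instrument- and time-level detail, emitted as N-Triples using only the Python standard library:

```python
# Sketch: detailed provenance with PROV expanded terms, serialised as
# N-Triples. The dataset, activity, and instrument IRIs are hypothetical.

PROV = "http://www.w3.org/ns/prov#"

def triple(s, p, o):
    """Format one N-Triples line; pre-quoted strings are treated as literals."""
    obj = o if o.startswith('"') else f"<{o}>"
    return f"<{s}> <{p}> {obj} ."

dataset = "https://example.org/dataset/42"        # hypothetical entity
activity = "https://example.org/activity/scan7"   # hypothetical activity
instrument = "https://example.org/instrument/maldi-tof-1"

triples = [
    triple(dataset, PROV + "wasGeneratedBy", activity),
    # The activity itself records how and with what the data were made.
    triple(activity, PROV + "used", instrument),
    triple(activity, PROV + "startedAtTime", '"2019-06-18T09:00:00Z"'),
]
print("\n".join(triples))
```

The point is only that object-specific details (which instrument, when) need not fall outside PROV; whether an indicator should require this is exactly the question under discussion.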
Makx –
My preference always is not to be prescriptive (defining which standard to use) but analytical (defining which constraints have to be satisfied) – after all there are ‘many ways to skin a cat’ and whatever method is used is more-or-less irrelevant as long as it meets the objectives. So I would say a mechanism for provenance has to satisfy certain (to be defined) criteria.
Best
Keith
@keithjeffery Let's see if there are suggestions for those criteria from others in the WG.
What we expect is that communities identify which provenance information is crucial to the understanding of the digital resource. Of course, we can expect general properties (e.g. who created the resource, when it was created, etc.), but there will also be provenance specific to the kind of object (e.g. which instrument was used, what chip array was used, what detector was used). We expect that in many cases communities have already specified elements of provenance in their own data formats; FAIR then simply asks that these be mapped to more general-purpose provenance languages such as PROV.
@micheldumontier Are you suggesting that an indicator be added for the mapping of object-specific or domain-specific provenance items to a more general-purpose provenance language? E.g. 'Mapping of object-specific or domain-specific provenance information'.
I think we need to be very, very careful about being prescriptive (either negatively or positively) about any metadata element when acting as a high-level working group. As I said during the call, a piece of ancient pottery doesn't have an author. Nor does a mammoth fossil. Nor does an animal in a zoo. Relevant metadata elements cannot be predicted and therefore, IMO, should not be within the scope of a high-level working group. If I were to "invent a metric" for R1.2 (which I have been avoiding, because I think it is absolutely none of my business to do so; it's a community-level task!), I would design something like this:
Vis-à-vis mapping: I like the idea of mapping, though I'm loath to encourage communities to continue to create new vocabularies that represent existing concepts. There's also the problem of providing a common way for agents to discover mappings, so we end up (potentially) inventing new standards for how to publish mapping resources, which the communities then have to build (and may not have the expertise to do so, depending on how they are implemented). Mapping isn't really a trivial problem; just ask those who have spent their careers doing schema mapping in databases and XML ;-) Nevertheless, if we had mapping-made-easy (something similar to what identifiers.org does for mapping between GUIDs of the same thing in different databases), then I am OK with this idea. Anything harder than that, I suspect, would not be sustainable. (It isn't even clear that identifiers.org is sustainable.)
@markwilkinson I understand your reluctance to prescribe a particular set of provenance descriptors, because it very much depends on the type of resource and the community in which the resource is used. In that sense, it could be left to community-specific guidelines. This creates maximum potential for reuse within that community. It seems to me that the indicators given in the first comment above, which were based on the contributions in the collaborative document, are probably too specific. Maybe we could propose two new ones:
R1.2-01 Provenance information based on community-specific guidelines relevant for the resource
R1.2-02 Mapping of object-specific or domain-specific provenance information to a cross-domain language
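To make the intent of an R1.2-02-style indicator concrete, here is a small sketch (hypothetical field names and mapping table, not a WG-endorsed design): a community record carries both general and object-specific provenance keys, and only the general-purpose part is mapped onto PROV terms, with the domain-specific remainder kept as a documented community extension:

```python
# Sketch: mapping community-specific provenance keys to general-purpose
# PROV terms. Field names and the mapping table are hypothetical.

community_record = {
    "created_by": "J. Smith",       # general provenance
    "created_on": "2019-06-18",
    "chip_array": "AffyChip-X1",    # object-specific provenance
    "detector": "CCD-9000",
}

# Only the general-purpose part maps cleanly onto cross-domain terms.
TO_PROV = {
    "created_by": "prov:wasAttributedTo",
    "created_on": "prov:generatedAtTime",
}

def map_provenance(record):
    """Split a record into PROV-mapped terms and the community-specific rest."""
    mapped, unmapped = {}, {}
    for key, value in record.items():
        if key in TO_PROV:
            mapped[TO_PROV[key]] = value
        else:
            unmapped[key] = value   # kept under the community's own format
    return mapped, unmapped

mapped, unmapped = map_provenance(community_record)
print(mapped)
print(unmapped)
```

The design choice sketched here matches the discussion above: the indicator would not force instrument- or chip-level detail into a cross-domain vocabulary, only ask that the generic core be expressed in one so agents outside the domain can interpret it.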
@keithjeffery The last bullet says 'e.g.' so it's not prescriptive. Would you have another example that could be included alongside PROV-O? |
@keithjeffery Absolutely. I was suggesting exactly the same thing. We need "a thick cloud" of metadata, but we cannot pre-determine what that cloud is composed of (and shouldn't try!)
This discussion also links nicely with the content of the RDA FAIRsharing WG registry, which is now one of the formally approved RDA outputs. As detailed in #29, many domain/discipline-specific community standards (for representing/reporting digital objects) already contain some provenance, both general information and information specific to the kind of object (who created it, when and how, what technology was used, what analytical method, etc.); these communities are not using PROV. Adding R1.2-02 would be too specific.
@SusannaSansone R1.2-02 tries not to be too specific; it contains a reference to PROV-O only as an example. The objective was to encourage mapping from domain-specific approaches to more general approaches so that people in other domains can also understand the provenance information.
Please find below the current version of the indicator(s) and their respective maturity levels for this FAIR principle. Indicators and maturity levels will be presented, as they stand, to the next working group meeting for approval. In the meantime, any comments are still welcome. The editorial team will now concentrate on weighing and prioritising these indicators. More information soon.
@makxdekkers I understand this "from domain-specific approaches to more general approaches", but then it has to be clear that this only refers to general approaches, because there are many community-specific (which can also imply domain/discipline-specific) models/formats (expressed in one or more of metamodels, XML, TAB, etc.) that include provenance information without using PROV. Just to pick one example: https://doi.org/10.25504/FAIRsharing.s51qk5
@SusannaSansone Indicator R1.2-01M is indeed about provenance information according to community-specific guidelines or standards. Is that not sufficiently clear? If not, how could it be formulated better?
@makxdekkers If you just say "provenance information according to community-specific guidelines or standards", that is OK. My comment was on the example of PROV, which some domain-specific and community-specific standards do not use, even though they capture provenance information.
Dear contributors,

Below you can find the indicators and their maturity levels in their current state as a result of the above discussions and workshops. Please note that this thread is going to be closed within a short period of time. The current state of the indicators, as of early October 2019, is now frozen, with the exception of the indicators for the principles that are concerned with 'richness' of metadata (F2 and R1).

The current indicators will be used for the further steps of this WG, which are prioritisation and scoring. Later on, they will be used in a testing phase where owners of evaluation approaches are going to be invited to compare their approaches (questionnaires, tools) against the indicators. The editorial team, in consultation with the Working Group, will define the best approach to test the indicators and evaluate their soundness. As such, the current set of indicators can be seen as an 'alpha version'. In the first half of 2020, the indicators may be revised and improved, based on the results of the testing.

If you have any further comments or suggestions regarding this specific discussion, please share them with us. Besides, we invite you to have a look at the following two sets of issues.

Prioritisation
• Indicators prioritisation for Findability

Scoring
• Indicators for FAIRness | Scoring