Indicate a conventional way to automatically validate data instances of application profiles #698

Open
nicholascar opened this issue Jan 26, 2019 · 9 comments

Comments

@nicholascar
Contributor

commented Jan 26, 2019

Feedback from @paulwalk:

we don't have a conventional way to automatically validate data instances of application profiles (i.e. data which allegedly conforms to the constraints of a given application profile)

@nicholascar nicholascar added this to the PROF 2PWD milestone Jan 26, 2019

@nicholascar nicholascar self-assigned this Jan 26, 2019

@nicholascar nicholascar added this to To do in Profiles Ontology via automation Jan 26, 2019

@nicholascar

Contributor Author

commented Jan 26, 2019

You can imagine a method, derived from the structures of the PROF ontology, where code (a Linked Data crawler) finds a Profile's resource with prof:hasRole Validation (and perhaps a particular formalism for that Validation resource, such as SHACL or similar) and then recurses up an isProfileOf hierarchy, finding all similar resources. Then, joining all those resources together, data claiming conformance could be validated against all of them. This would both ensure data is valid against a Profile and its dependencies and also potentially allow Profile implementors to define only their extensions to the things they profile, not the full set of constraints.

In pseudo code, for data_x claiming adherence to profile_y:

profiles = get_profiles(profile_y)
validators = gather_validating_resources(profiles, conformance_type)  # e.g. SHACL
validate(data_x, validators)

function get_profiles(profile_uri):
    recurse up a profile hierarchy indicated by isProfileOf statements and return all profiles' RDF

function gather_validating_resources(profiles_metadata, conformance_type):
    profile_validators = empty list
    for each profile in profiles_metadata:
        for each resource with role Validation and conformsTo conformance_type:
            add resource content to profile_validators

    return profile_validators
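
As a rough illustration only, the pseudocode above could be sketched in Python as follows. The in-memory `PROFILES` dictionary, its keys, and the example URIs are hypothetical stand-ins for the RDF a crawler would fetch; a real implementation would parse PROF descriptions and hand the gathered artifacts to a SHACL (or other) validator.

```python
from typing import Dict, List, Optional

SHACL = "http://www.w3.org/ns/shacl"

# Hypothetical stand-in for profile descriptions fetched as RDF.
PROFILES: Dict[str, dict] = {
    "http://example.org/profile/base": {
        "isProfileOf": [],
        "resources": [
            {"role": "validation", "conformsTo": SHACL,
             "artifact": "http://example.org/base.shacl.ttl"},
        ],
    },
    "http://example.org/profile/x": {
        "isProfileOf": ["http://example.org/profile/base"],
        "resources": [
            {"role": "validation", "conformsTo": SHACL,
             "artifact": "http://example.org/x.shacl.ttl"},
            {"role": "guidance", "conformsTo": None,
             "artifact": "http://example.org/x.html"},
        ],
    },
}


def get_profiles(profile_uri: str, seen: Optional[List[str]] = None) -> List[str]:
    """Return profile_uri followed by every profile reachable via isProfileOf."""
    seen = [] if seen is None else seen
    if profile_uri in seen:  # guard against cycles in the hierarchy
        return seen
    seen.append(profile_uri)
    for parent in PROFILES[profile_uri]["isProfileOf"]:
        get_profiles(parent, seen)
    return seen


def gather_validating_resources(profile_uris: List[str], conformance_type: str) -> List[str]:
    """Collect artifacts with role 'validation' that conform to conformance_type."""
    validators = []
    for uri in profile_uris:
        for resource in PROFILES[uri]["resources"]:
            if resource["role"] == "validation" and resource["conformsTo"] == conformance_type:
                validators.append(resource["artifact"])
    return validators
```

Calling `gather_validating_resources(get_profiles("http://example.org/profile/x"), SHACL)` would then yield the validator artifacts for the profile and for everything it profiles, which is the set the data would be validated against.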

@paulwalk

commented Jan 28, 2019

I'm not sure I quite followed this. I'm talking about the need for a conventional way to validate data against a specified application profile. If any given software validation process has to first traverse a graph in order to assemble and reconcile a collection of related validation resources, then I can't imagine that being widely implemented, or even scaling if it is. The opportunity, it seems to me, is for a documented application profile to offer a single, documented resource to facilitate automated validation of data which declares 'conformance' with that application profile.

Most probably I have misunderstood though!

@nicholascar

Contributor Author

commented Jan 28, 2019

@paulwalk I'm not sure you'll get exactly what you want, as no one's yet able to agree on The Correct Way to validate against a profile (application profile). For instance: what constraint language is to be used, does validation need to run against all dependency validators or not, what if dependencies use different forms of validators, etc. At the moment, everyone's talking SHACL as a constraint language... except for those who prefer ShEx. And in my XML life, I use Schematron. So right there we don't have language uniformity.

Let's break this point down into sub points, just as I listed them above, but adding a few more lower-level ones:

To "Indicate a conventional way to automatically validate data instances of application profiles" PROF may need to:

  1. Indicate which validators are available
  2. Indicate a particular constraint language validator (standard & format)
  3. Indicate the role the validator plays (all constraints, partial etc.)
  4. Indicate whether dependency validators need to be consulted for validation or if this profile's validator is sufficient

For 1.: Currently Possible. The Profile just includes one or more prof:ResourceDescriptor instances describing validators.

<http://example.org/profile/x> a prof:Profile ;
    prof:hasResource [
        prof:hasArtifact <SOME_FILE_URI> ;
        ...
    ] ;
    ....

For 2.: Currently Possible. PROF suggests use of dct:conformsTo & dct:format to indicate this:

<http://example.org/profile/x> a prof:Profile ;
    prof:hasResource [
        prof:hasArtifact <SOME_FILE_URI> ;
        dct:conformsTo <http://www.w3.org/ns/shacl> ;
        dct:format <https://w3id.org/mediatype/text/turtle> ;
        ...
    ] ;
    ....

For 3.: Currently Possible. PROF uses prof:hasRole to indicate the particular role a prof:ResourceDescriptor plays with suggested roles defined at https://w3c.github.io/dxwg/profilesont/resource_roles.html but more are possible.

<http://example.org/profile/x> a prof:Profile ;
    prof:hasResource [
        prof:hasArtifact <SOME_FILE_URI> ;
        dct:conformsTo <http://www.w3.org/ns/shacl> ;
        dct:format <https://w3id.org/mediatype/text/turtle> ;
        prof:hasRole <http://www.w3.org/ns/dx/prof/role/validation> ;
        ...
    ] ;
    ....
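
To make points 1 to 3 concrete, here is a minimal sketch of how a client might pick a validator it can actually run. It assumes the resource descriptors have already been parsed into plain dictionaries keyed by the property names used above; that representation, the `select_validator` helper, the artifact URIs, and the `SHEX` identifier are all hypothetical and not part of PROF.

```python
from typing import List, Optional

SHACL = "http://www.w3.org/ns/shacl"
VALIDATION = "http://www.w3.org/ns/dx/prof/role/validation"
SHEX = "http://example.org/lang/shex"  # hypothetical identifier, for illustration only


def select_validator(descriptors: List[dict], supported: List[str]) -> Optional[dict]:
    """Return the first validation descriptor in the client's order of preference.

    `supported` lists constraint-language identifiers the client can process,
    most preferred first. Returns None if nothing suitable is described.
    """
    for language in supported:
        for descriptor in descriptors:
            if (descriptor.get("prof:hasRole") == VALIDATION
                    and descriptor.get("dct:conformsTo") == language):
                return descriptor
    return None


# Hypothetical descriptors for one profile: a guidance document and a SHACL file.
descriptors = [
    {"prof:hasArtifact": "http://example.org/profile/x/guide.html",
     "prof:hasRole": "http://www.w3.org/ns/dx/prof/role/guidance"},
    {"prof:hasArtifact": "http://example.org/profile/x/shapes.ttl",
     "dct:conformsTo": SHACL,
     "dct:format": "https://w3id.org/mediatype/text/turtle",
     "prof:hasRole": VALIDATION},
]
```

A client that prefers ShEx but also handles SHACL would call `select_validator(descriptors, [SHEX, SHACL])` and fall through to the SHACL artifact.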

For 4.: Not currently spelled out but perhaps able to be indicated with roles. E.g. if a validator with Role Full Constraints is used (defn: "Complete set of constraints for a profile") then it is sufficient to use this for profile validation. If only a validator with Part Constraints is available, then more info is needed.

I could imagine a new Role: "Differential Constraints", which would be this Profile's constraints only, i.e. only those that this profile adds on top of those belonging to the things it is profiling. If this were available, you'd know you'd have to pull in dependency constraints to perform a complete validation.

I will suggest updates to the Roles Vocab for this.
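
The role-based reasoning for point 4 might look something like the sketch below. The role labels are plain strings used for illustration, "Differential Constraints" is only the role proposed in this comment, and the `needs_dependency_validators` helper is hypothetical.

```python
from typing import Iterable, Optional

FULL = "Full Constraints"
PARTIAL = "Part Constraints"
DIFFERENTIAL = "Differential Constraints"  # proposed here, not an existing role


def needs_dependency_validators(roles: Iterable[str]) -> Optional[bool]:
    """Decide from validator roles whether dependency validators must be consulted.

    Returns False if a Full Constraints validator is available, True if only
    a Differential Constraints validator is, and None when the roles alone
    do not settle the question (e.g. only Part Constraints).
    """
    available = set(roles)
    if FULL in available:
        return False  # this profile's validator is sufficient on its own
    if DIFFERENTIAL in available:
        return True   # constraints of the profiled things must be pulled in
    return None       # more information is needed
```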

@paulwalk

commented Jan 28, 2019

Thanks, Nick, for taking the time to go into the detail like this.

I think we have a fundamental difference in understanding on this. In the use-case where someone wishes to validate data against some declared conformance with some metadata profile, I just don't see any advantage in the validation process being able to somehow automatically interrogate the application profile to determine which mechanisms it supports. This can be simply documented in prose: it is for the systems developer to decide which mechanism they want to use. The mechanisms themselves need formal descriptions, but that is a separate issue being taken care of by those communities.

This slightly reminds me of the huge efforts made to introduce Universal Description, Discovery, and Integration (UDDI) to web-services around the beginning of this century. Then Web2.0 came along and showed that all that was really needed was some nice clear documentation aimed at developers - what has become known as the 'Web API'.

@agreiner

Contributor

commented Mar 7, 2019

I think Paul's interpretation is what we should be aiming for. He says "The opportunity, it seems to me, is for a documented application profile to offer a single, documented resource to facilitate automated validation of data which declares 'conformance' with that application profile." +1
The discussion around inheritance and traversing a graph just to check conformance seems to be introducing barriers to use and to mixing vocabulary from multiple standards.

@tombaker

commented Mar 18, 2019

> The discussion around inheritance and traversing a graph just to check conformance seems to be introducing barriers to use and to mixing vocabulary from multiple standards.

@agreiner I completely agree. Notions of inheritance differ across the various technologies that people are likely to use, and it does not seem realistic to coin properties, such as prof:isInheritedFrom, with the assumption that people will share a common notion of inheritance.

@rob-metalinkage

Contributor

commented Mar 19, 2019

OK, so let's just focus on what is proposed again:

  1. the ability to declare conformance of a resource to a specific profile
  2. the ability to indicate that a specific profile conforms to more general profiles

(and I think we have just shown that this can't be done with any universally applicable constraint language, so that's the motivator here)

  3. a flexible way of referencing the various forms of documentation in a way a machine can find an appropriate resource for a particular function.

Whilst we can understand a desire for a universal validation use case, that's just not realistic (that would be the UDDI equivalent). So this related Use Case is limited to finding what resources are available and a canonical description of them, not a canonical form of them.

So, we can accept the comment "we don't have a conventional way to automatically validate data instances of application profiles (i.e. data which allegedly conforms to the constraints of a given application profile)" as a truism. I don't see a concrete need or proposal for change beyond stressing the scope. Adding in the "competency questions" section ( #732 ) should help keep scope in mind.

@rob-metalinkage

Contributor

commented Mar 19, 2019

re "The discussion around inheritance and traversing a graph just to check conformance seems to be introducing barriers to use and to mixing vocabulary from multiple standards." This is exactly what happens if you import RDFS or OWL axiomatisation. It is a barrier perhaps, but it's the chosen set of constraint languages and how they are used that determines whether navigation is necessary. The Profiles Vocabulary is agnostic about this choice. It does, however, provide the new capability of declaring intent w.r.t. conformance in hierarchies, and this thread has not affected that underlying requirement or the proposed solution, so I think it can be closed as out-of-scope.

@tombaker

commented Mar 19, 2019

@rob-metalinkage writes:

> So this related Use Case is limited to finding what
> resources are available and canonical description of
> them

Okay, the requirement is resource discovery.

> its the responsibility of the chosen set of constraints
> languages and how they are used that determines if
> [traversing a graph to check conformance] is necessary.
> The Profiles Vocabulary is agnostic about this choice.

The Profiles Vocabulary implies a generalized notion of
"inheritance". Can SHACL, PDF, Schematron, CSV, etc, be
said to follow a common notion of inheritance?

> 1. the ability to declare conformance of a resource to a
>    specific profile

"Declaring conformance" goes well beyond resource
discovery. The Profiles Vocabulary cannot be used to
determine conformance of data to a specific profile;
that is the responsibility of the chosen constraint
language. Is the goal simply to record a conformance
result in RDF data? If so, are we to assume that
profiles and datasets are immutable? (Because otherwise,
the RDF data could be making an assertion that is no
longer true.)

Users can use appropriate technologies to test for
themselves whether a resource conforms. Is this not
better than trusting a "declaration of intent" (as you
put it) made in some RDF data at a particular point in
time?

Or is this not about conformance of data at all?

It is confusing to talk about conformance of "a resource"
to a profile when profiles are typically used to test
the conformance of data, and you actually seem to mean
conformance of a particular expression of the profile
(e.g., SHACL, Schematron) to The Profile in a more general
sense.

"Resource" is unhelpful as a choice of words because in
RDF, everything is a resource.

> 2. the ability to indicate that a specific profile
>    conforms to more general profiles
>
> (and I think we have just shown that this can't be done
> with any universally applicable constraint language - so
> that's the motivator here)

I'm seeing several types of conformance here:

  1. Conformance of a profile to a standard used to express
    the profile (one example in PV shows something,
    presumably a SHACL graph, which conformsTo SHACL).

  2. Conformance of data to a profile (which is perhaps out
    of scope of the PV?).

  3. Conformance of a profile to another profile. I am
    struggling to see what this means in the general
    sense. Does a profile "conform" if it restricts
    another? If it "extends" another? Or do you mean to
    say that one profile actually validates another?
    Or do you mean that alternative expressions of a
    profile conform to each other?

A requirement for discovering profiles related to a
particular dataset would make sense to me, but not a
requirement for taking a SHACL/ShEx/Schematron/whatever
validation result at a given point in time and expressing
it as a "declaration of intent" in RDF.

> 3. a flexible way of referencing the various forms of
>    documentation in a way a machine can find an
>    appropriate resource for a particular function.

Right, resource discovery, though I'm unclear on what
"particular function" means here.

> It provides the new capability of declaration of intent
> w.r.t. to conformance in hierarchies however, and this
> thread has not affected that underlying requirement or
> the proposed solution, so I think it can be closed as
> out-of-scope.

I do not see a requirement to express "declaration of
intent w.r.t. to conformance in hierarchies" in RDF data
about profile documentation. Where is the use case? I
am struggling to see why this is considered to be in
scope.
