Data in ont_graph is being subjected to validation? #89

Open
sa-bpelakh opened this issue Aug 4, 2021 · 2 comments

Comments

@sa-bpelakh

I am invoking pySHACL with 3 separate graphs:

  • A data graph that I wish to validate
  • A shape graph
  • An additional ontology graph, which I do not wish to validate, as it is not well-behaved, but is needed to materialize inferences in my data graph that are needed for validation (mostly the superclasses being referenced in sh:targetClass).

However, I am finding that my shapes are triggering on (and generating violations for) entities in the ont_graph. Is that the intended behavior? And if so, what is the point of providing that graph separately?

@ashleysommer
Collaborator

ashleysommer commented Aug 19, 2021

Hi @sa-bpelakh

Sorry, I missed this issue when you submitted it. Just getting to addressing it now.
The very same topic came up in a different discussion thread recently.

PySHACL is applying validation to the nodes from the ont_graph, as a by-product of the way the ont_graph mix-in feature is implemented. And at this stage, yes, it is the intended behaviour.

> And if so, what is the point of providing that graph separately?

The main use of ont_graph is to facilitate the use of the pre-inferencing feature in PySHACL. For example, you may need to run RDFS or OWL-RL entailment on your datagraph to inflate it prior to validating it. The target datagraph usually doesn't have the ontological constructs embedded in it (like rdfs:Class and owl:Class definitions, etc.) that the inferencing engine needs to run correctly. So many users requested a feature to specify an extra ontological graph, which gets merged into (mixed in with) the datagraph prior to the pre-inferencing step. This gives the inferencing engine the knowledge it needs to complete the entailment. We have found in real-world testing that this feature is similarly useful for enabling SHACL-AF Rules to expand the graph based on the ont_graph definitions.

There is however no way to un-mix the ont_graph from the datagraph after this is done. So when the validator finally runs, it does target nodes which came from the ont_graph, if they match the targeting selector rules of the shapes in the Shapes Graph.
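To make the side effect concrete, here is a toy model in plain Python. It is only a sketch, not PySHACL's code: a graph is modelled as a set of triples, and the `ex:` names and both helper functions are made up for illustration. It shows that the ontology is needed for the subclass inference to work, and that once it is mixed in, its own instances become targets too.

```python
# Toy model only: a "graph" is a set of (subject, predicate, object) tuples.
# The ex: names and both helper functions are hypothetical, not PySHACL APIs.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_subclass_closure(graph):
    """Return a new graph with rdf:type propagated along rdfs:subClassOf."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for cls, p2, parent in list(inferred):
                if p2 == SUBCLASS_OF and cls == o:
                    new_triple = (s, RDF_TYPE, parent)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


def target_class_nodes(graph, cls):
    """Nodes that a shape with sh:targetClass `cls` would select in this model."""
    return {s for s, p, o in graph if p == RDF_TYPE and o == cls}


data_graph = {("ex:rex", RDF_TYPE, "ex:Dog")}
ont_graph = {
    ("ex:Dog", SUBCLASS_OF, "ex:Animal"),
    ("ex:sampleAnimal", RDF_TYPE, "ex:Animal"),  # an instance living in the ontology
}

# Without the ontology, inference cannot learn that ex:rex is an ex:Animal.
alone = target_class_nodes(rdfs_subclass_closure(data_graph), "ex:Animal")

# With the ontology mixed in, ex:rex is found, but so is the ontology's instance.
mixed = target_class_nodes(
    rdfs_subclass_closure(data_graph | ont_graph), "ex:Animal"
)

print(sorted(alone))
print(sorted(mixed))
```

After the union there is nothing left in `mixed` that records which graph a node came from, which is the "no way to un-mix" problem described above.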

The ont_graph feature was implemented that way for two reasons:

  1. The inferencing engine library we use does not work on multi-graph objects like ConjunctiveGraph or Dataset. It works by taking in a single graph, and expanding it based on RDFS and OWL definitions within that graph. So they need to be in the graph when it runs.
  2. When PySHACL was first built, it similarly only operated on the datagraph in a single-graph manner. I.e., it could not validate a datagraph that was a multi-graph object like ConjunctiveGraph or Dataset. It assumed everything needed to validate the datagraph was within the datagraph itself.

Since then, PySHACL has developed support for multi-graph objects, so it can validate across named graphs within a ConjunctiveGraph or a Dataset. Given the discussion in the other thread about this, I am currently working on a change to the way PySHACL implements the ont_graph feature, taking advantage of named-graph separation in an RDFLib Dataset. This requires quite a big refactoring of PySHACL core features, so it's a delicate process.
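As a rough illustration of what named-graph separation could buy, here is another plain-Python sketch. It is one possible reading of the idea, not the actual or planned PySHACL implementation: the dataset is modelled as a dict from graph name to a set of triples, and the graph names, `ex:` terms and helper functions are all assumptions. Class membership is resolved over the union of all graphs, but only subjects that occur in the data graph are kept as focus nodes.

```python
# Toy model only: a "dataset" maps a graph name to a set of triples.
# Graph names, ex: terms and helper functions are hypothetical.

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclasses(triples, cls):
    """All classes reachable from `cls` via rdfs:subClassOf, including `cls`."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def focus_nodes(dataset, data_name, target_class):
    """Select targets using every graph's knowledge, but only data-graph subjects."""
    union = set().union(*dataset.values())
    data_subjects = {s for s, _, _ in dataset[data_name]}
    return {
        s
        for s, p, o in union
        if p == RDF_TYPE
        and s in data_subjects
        and target_class in superclasses(union, o)
    }


dataset = {
    "urn:example:data": {("ex:rex", RDF_TYPE, "ex:Dog")},
    "urn:example:ont": {
        ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
        ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
        ("ex:sampleAnimal", RDF_TYPE, "ex:Animal"),  # ontology-only instance
    },
}

focus = focus_nodes(dataset, "urn:example:data", "ex:Animal")
print(sorted(focus))
```

Because each triple stays in its own named graph, the ontology can still inform the class hierarchy while its own instances (here `ex:sampleAnimal`) are left out of validation.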

I'll update this thread when I have more progress.

@ajnelson-nist
Contributor

ajnelson-nist commented Sep 16, 2022

(EDIT: Comment migrated to Issue 170.)

ajnelson-nist added a commit to casework/CASE-Utilities-Python that referenced this issue Sep 16, 2022
References:
* RDFLib/pySHACL#89

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>