
Packaging rules, shapes, and data with a defined execution order #643

@liviorobaldo


In the SHACL Inference Rules Task Force, we are defining the table of contents for the SHACL 1.2 Rules draft.

See discussion 637. The provisional table of contents includes "7. Attaching Rules to Shapes", which should explain how rules and shapes work together. Further down in the discussion, I propose moving this content to "2. Packaging SHACL". "7. Attaching Rules to Shapes" could then contain just a brief pointer to "2. Packaging SHACL".

Concerning how rules and shapes should be executed together: in discussion 603, based on my experience with SHACL-SPARQL, I proposed associating sets of shapes with sets of rules. This can be done with a new resource we might call a "cluster" (though we should pick a better name, such as "bundle", "package", or "module"), which would group data with shapes, with rules, or with both (three options in total). Something like:

:cluster-3
  rdf:type srl:Cluster;
  srl:data (
    ...
  );
  srl:ruleSet (
    ...
  );
  srl:shapeSet (
    ...
  ).

If both shapes and rules are present, we need to decide the execution order: should the shapes run first, or the rules? As I argued in my recent paper (link provided at the beginning of the discussion), shapes should be executed at least once after the rules. This is because what we want to validate is the inferred knowledge graph, which consists of the initially asserted triples plus all triples that can be logically inferred from them. This aligns with the reasonable assumption that if something logically inferred is invalid, then the originally asserted triples must also be considered invalid. Rules then serve to "modularize the effort" when it is too complex, if not impossible, to write a single shape that internally computes all inferred triples before validation.

However, nothing prevents executing the shapes multiple times. For example, if the shapes already invalidate the initially asserted triples, there is no need to compute the inferred knowledge graph. Shapes could also be re-executed after each "round" of rule execution (rules are iteratively re-executed until no new triples are inferred), although this approach could be computationally expensive.

In light of this, perhaps the best approach is for the SHACL 1.2 recommendation to eventually specify that executing the shapes is mandatory only once the rules have been executed to saturation, while libraries remain free to optionally execute the shapes at the beginning or after each individual round. They could even let the programmer make this choice via special parameters in the relevant functions.
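
To make the proposed execution order concrete, here is a minimal sketch in plain Python. It is not based on any existing SHACL library: triples are modelled as string tuples, rules and shapes as ordinary functions, and the names `run_cluster`, `validate_first`, and `validate_each_round` are hypothetical, chosen only for illustration. The early returns follow the suggestion above that there is no need to continue once the shapes report a violation.

```python
from typing import Callable, Iterable

# Illustrative stand-ins, not a real SHACL API.
Triple = tuple[str, str, str]
Graph = set[Triple]
Rule = Callable[[Graph], Iterable[Triple]]  # returns triples to infer
Shape = Callable[[Graph], list[str]]        # returns violation messages


def validate(graph: Graph, shapes: list[Shape]) -> list[str]:
    """Run every shape against the graph and collect all violations."""
    return [msg for shape in shapes for msg in shape(graph)]


def run_cluster(data, rules, shapes,
                validate_first=False, validate_each_round=False):
    """Saturate the data with the rules, then validate (mandatory step).

    The two flags enable the optional extra validation passes.
    """
    graph = set(data)
    if validate_first:
        violations = validate(graph, shapes)
        if violations:  # asserted triples already invalid: skip inference
            return graph, violations
    while True:
        # One "round": apply every rule to the current graph.
        new = {t for rule in rules for t in rule(graph)} - graph
        if not new:     # saturation: no new triples were inferred
            break
        graph |= new
        if validate_each_round:
            violations = validate(graph, shapes)
            if violations:
                return graph, violations
    # Mandatory validation of the inferred knowledge graph.
    return graph, validate(graph, shapes)


# Hypothetical rule: anyone with a child is an Adult.
def parents_are_adults(graph):
    return [(s, "type", "Adult") for (s, p, o) in graph if p == "hasChild"]


# Hypothetical shape: nobody may be both Adult and Minor.
def not_adult_and_minor(graph):
    return [f"{s} is both Adult and Minor" for (s, p, o) in graph
            if p == "type" and o == "Adult" and (s, "type", "Minor") in graph]


data = [("alice", "hasChild", "bob"), ("alice", "type", "Minor")]
graph, violations = run_cluster(data, [parents_are_adults], [not_adult_and_minor])
print(violations)  # ['alice is both Adult and Minor']
```

In this toy example the asserted triples alone pass the shape, and the violation only appears once the rule has inferred that alice is an Adult, which is why the final validation pass is the one that must not be skipped.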

PS. I see that in 2. Packaging SHACL, specifically 2.1 Motivation, there are already three issues listed. Can someone add this one? I’m not sure how to do it myself (and perhaps I cannot, as I don’t have the necessary permissions). Thanks.

Labels: Profiles (for SHACL 1.2 Profiles spec), Rules (for SHACL 1.2 Rules spec)