
Use case: A boolean property being true disallowing another property #341


Open
ajnelson-nist opened this issue Mar 31, 2025 · 19 comments
Labels: Inferencing (for SHACL 1.2 Inferencing spec), Node Expressions (for SHACL 1.2 Node Expressions), UCR (Use Cases and Requirements)

Comments

@ajnelson-nist
Contributor

This is a fuller illustration of this comment of Issue 311, in response to this later comment also on 311. This is a modeling exercise, but rather than discuss the "right" way to model this topic, I mean to look through strategies available to SHACL 1.0 and potentially available to Node Expressions.

This example considers effects of a single boolean-valued property on other properties, but discussion should consider that there are more complex conditions that could involve multiple properties.

A community I work with has a class for actions, which might or might not link a performer of the action. For example, this would be Alice sending an email:

ex:sendEmail-1
  a ex:Action ;
  ex:performer ex:Alice ;
  .

We would like to represent that some actions are automated. If an action is automated, it would not have a performer.

ex:receiveEmail-1
  a ex:Action ;
  ex:isAutomated true ;
  .

The designation of automation was initially proposed as a boolean variable. The question later arose whether an automated action should be allowed to have a performer or should be disallowed from having one. We have a class hierarchy rooted at ex:Action, which makes it slightly less attractive to define a subclass ex:AutomatedAction, because of some desires in the community (e.g., considerations for programming language bindings) to avoid multi-typing.

We also have automatically generated documentation, within which this rule should "render nicely."

The validation rule to implement would be, "If ex:isAutomated is true (explicitly), then the maximum count of ex:performer is 0."
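As a hedged illustration of that rule, the data sketch below shows one node that should fail and one that should pass. The node names ex:receiveEmail-2 and ex:sendEmail-2, the individual ex:Bob, and the namespace IRI are hypothetical and are not taken from the community's data.

```turtle
@prefix ex: <http://example.org/ns#> .  # hypothetical namespace

# Should fail: explicitly automated, yet a performer is present.
ex:receiveEmail-2
  a ex:Action ;
  ex:isAutomated true ;
  ex:performer ex:Bob ;
  .

# Should pass: the rule only triggers on an explicit "true".
ex:sendEmail-2
  a ex:Action ;
  ex:isAutomated false ;
  ex:performer ex:Bob ;
  .
```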

The condition universal to the below examples is the constraint that there is at most one value of ex:isAutomated on any action, and it's boolean-valued:

ex:Action
  a rdfs:Class, sh:NodeShape ;
  sh:targetClass ex:Action ;
  sh:property [
    sh:path ex:isAutomated ;
    sh:datatype xsd:boolean ;
    sh:maxCount 1 ;
  ] ;
  .

Styles available to SHACL 1.0

Define a class

This implementation style eschews the boolean variable entirely.

ex:AutomatedAction
  a rdfs:Class, sh:NodeShape ;
  rdfs:subClassOf ex:Action ;
  sh:targetClass ex:AutomatedAction ;
  sh:property [
    sh:path ex:performer ;
    sh:maxCount 0 ;
  ] ;
  .

If both subclass and boolean variable are in the model, they can at least be constrained to be consistent within the subclass:

ex:AutomatedAction
  sh:not [
    sh:property [
      sh:path ex:isAutomated ;
      sh:hasValue false ;
    ] ;
  ] ;
  .

Disjunctive form of implication

Following "p → q ⇔ ¬p ∨ q", the consistency check across the two properties would be an sh:or on the node shape:

ex:Action
  sh:or (
    [
      sh:not [
        sh:property [
          sh:path ex:isAutomated ;
          sh:hasValue true ;
        ] ;
      ] ;
    ]
    [
      sh:property [
        sh:path ex:performer ;
        sh:maxCount 0 ;
      ] ;
    ]
  ) ;
  .

Without an sh:message, my understanding is this form gets pretty verbose in a SHACL validation report where the message is generated from the reporting constraint component.
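One possible way to keep the report readable, sketched here under assumptions, is to move the sh:or into its own node shape and attach an sh:message there. In SHACL 1.0, a shape's sh:message is used for the validation results produced by that shape. The shape name ex:Action-automation-shape and the namespace IRI are hypothetical.

```turtle
@prefix ex: <http://example.org/ns#> .  # hypothetical namespace
@prefix sh: <http://www.w3.org/ns/shacl#> .

# Hypothetical helper shape: holds only the implication check,
# so the message applies to nothing else.
ex:Action-automation-shape
  a sh:NodeShape ;
  sh:targetClass ex:Action ;
  sh:message "Automated actions cannot have a performer." ;
  sh:or (
    [
      # not p: ex:isAutomated is not explicitly true
      sh:not [
        sh:property [
          sh:path ex:isAutomated ;
          sh:hasValue true ;
        ] ;
      ] ;
    ]
    [
      # q: no performer
      sh:property [
        sh:path ex:performer ;
        sh:maxCount 0 ;
      ] ;
    ]
  ) ;
  .
```

Keeping the check in a separate shape is a design choice for this sketch: it avoids the message also being attached to unrelated constraints declared directly on ex:Action.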

Styles available to Node Expressions (?)

Please excuse misunderstandings of the current proposals of node expression syntax. Corrections welcome. I saw sh:if mentioned on #215 and #222 .

Conditional constraints

This is drawn from #311 .

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount [
      sh:if [
        sh:property [
          sh:path ex:isAutomated ;
          sh:hasValue true ;
        ] ;
      ] ;
      sh:then 0 ;
      sh:else 1 ;
    ] ;
  ] ;
  .

Targeting nodes

This is drawn from a remark as @tpluscode and @HolgerKnublauch were discussing #311 on today's WG call. The gist is that this case should be handled well enough by a node expression on sh:targetNode.

Note: I'm shaky on this syntax. Can corrections please be ported over to #339 , which I think is the most specific Issue now for sh:targetNode? I'm eyeing sh:SPARQLTarget and sh:condition from the 2017 state of SHACL-AF for syntax hints.

ex:Action-extra-shape
  a sh:NodeShape ;
  sh:property [
    sh:path ex:performer ;
    sh:maxCount 0 ;
  ] ;
  sh:targetNode [
    a sh:NodeExpression ;
    sh:condition [
      a sh:NodeShape ;
      sh:property [
        a sh:PropertyShape ;
        sh:path ex:isAutomated ;
        sh:hasValue true ;
      ] ;
    ] ;
  ] ;
  .

Styles available to Rules

Entail a class

This strategy does not use a node expression, and instead uses a sh:TripleRule from SHACL-AF. That is a topic for the inferencing document, IIRC, but topic-scheduling corrections are welcome.

If the subclass ex:AutomatedAction from above is defined in the ontology and shapes graph, and it's desired for users to be relieved of needing to assign rdf:type ex:AutomatedAction in their data, then a rule can entail ex:AutomatedAction before validation is run to check the no-performers constraint.

ex:Action
  sh:rule [
    a sh:TripleRule ;
    sh:subject sh:this ;
    sh:predicate rdf:type ;
    sh:object ex:AutomatedAction ;
    sh:condition [
      a sh:NodeShape ;
      sh:property [
        a sh:PropertyShape ;
        sh:path ex:isAutomated ;
        sh:hasValue true ;
      ] ;
    ] ;
  ] ;
  .
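A small sketch of the intended effect, assuming the rule is executed before validation and its inferred triples are added to the data being validated. The namespace IRI is hypothetical; ex:receiveEmail-1 is the automated action from the earlier example.

```turtle
@prefix ex: <http://example.org/ns#> .  # hypothetical namespace

# Asserted by the user (no subclass typing required).
ex:receiveEmail-1
  a ex:Action ;
  ex:isAutomated true ;
  .

# Triple the rule would be expected to infer, because the node
# conforms to the sh:condition shape (ex:isAutomated has value true).
ex:receiveEmail-1
  a ex:AutomatedAction ;
  .
```

With that inferred type in place, the ex:AutomatedAction shape from the "Define a class" style would then apply its sh:maxCount 0 constraint on ex:performer.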

This strategy most closely mirrors what would be available in OWL by defining ex:AutomatedAction with an equivalent class, and running OWL inferencing before SHACL validation:

ex:AutomatedAction
  a owl:Class ;
  # Equivalent to: an ex:Action whose ex:isAutomated value is true.
  owl:equivalentClass [
    a owl:Class ;
    owl:intersectionOf (
      ex:Action
      [
        a owl:Restriction ;
        owl:onProperty ex:isAutomated ;
        owl:hasValue true ;
      ]
    ) ;
  ] ;
  .
@VladimirAlexiev
Contributor

@ajnelson-nist thanks for the detailed "styles" analysis.

I can give plenty more examples of conditional fields.
Many years ago I worked on Customs software/forms/validation, and they go like this:

  • if key field has value X then enable field group Y (and some of them are mandatory) and disable field group Z.

For example: if TARIC code indicates motor vehicle (a list of 5 code patterns)
then enable and make mandatory a number of BG-specific fields for make, model, engine volume, VIN, etc.

@afs
Contributor

afs commented Apr 1, 2025

Excellent use case!

#311 (comment)

dynamic computation of target nodes

A library of shapes would not often have targets. It is the application (the user of the library) that is in control of targets. So while sh:targetNode [ NE ] gives control, it does so only at the top level of imports.

How does a library provide conditional effects?

if key field has value X then enable field group Y (and some of them are mandatory) and disable field group Z.

It is a small step to wanting a way to conveniently write:

    sh:if [ condition1 ]
    sh:then ??1 
    sh:else 
    [ 
      sh:if [ condition2 ]
      sh:then ??2
      sh:else
      [ 
        sh:if [ condition3 ]
        sh:then ??3
        sh:else
        [ 
        . . . 

that is, an ordered decision flow.

In the email example from this discussion, the email could be sent "on behalf of", "by legal representative", or "by department", each of which could have specific shapes to apply.

While maybe it could all be done with subclasses, that isn't always desirable. When adding to an existing data model, changing the class structure can have widespread effects, whereas adding conditions is more local.

@HolgerKnublauch
Contributor

Yes I am also aware of such use cases and this is a good example.

As an aside: In the dash namespace we added https://datashapes.org/constraints.html#CoExistsWithConstraintComponent for one such use case. If these are common enough, then maybe we should add them as general SHACL Core constraint components for things like either-or. But the boolean flag makes this hard to generalize.


Here is an alternative assuming we allow node expressions at sh:deactivated AND #173

The following is using RDF 1.2 (assuming I have understood the current draft syntax correctly https://www.w3.org/TR/rdf12-turtle/#annotation-syntax)

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount 1 ;
    sh:maxCount 0 ~ :t {| sh:deactivated [
        sh:not [
            sh:property [
                sh:path ex:isAutomated ;
                sh:hasValue true ;
            ] ;
        ] ;
    ] |} ;
  ] ;
  .

This means "by default use maxCount 1. Activate sh:maxCount 0 if isAutomated = true."

This mechanism of course isn't ideal either, but it applies the policy that node expressions are only allowed at sh:values, sh:targetNode and sh:deactivated. What isn't ideal is that you need to formulate the condition as the inverse of what you would probably want to write. Furthermore, there still is the sh:maxCount 1, which may lead to double evaluation. However, having the sh:maxCount 1 there is good for a UI tool because it can at least know something about this property, which is difficult when sh:maxCount is computed with a node expression.

The if-then-else is easier to read and understand. Whether this is enough to introduce the complexity is another question. Happy to see more examples.


Another option that already works in 1.0 would be to use SPARQL:

ex:Action
    ...
    sh:sparql [
        sh:message "Automated actions cannot have a performer." ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this ex:isAutomated true .
                $this ex:performer ?value .
            }
        """
    ] ;
    sh:property [
        sh:path ex:performer ;
        sh:maxCount 1 ;
    ] .

which to me doesn't look TOO bad either. SPARQL remains the swiss army knife for almost anything.
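One detail the snippet above leaves implicit: in SHACL-SPARQL, prefixes used inside the query string are resolved through sh:prefixes and sh:declare, not through the Turtle file's own @prefix lines. The sketch below is a fuller version under assumptions; the namespace IRI and the use of ex: as the declaring ontology node are hypothetical choices, not part of the original example.

```turtle
@prefix ex:  <http://example.org/ns#> .  # hypothetical namespace
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Declare the "ex" prefix for use inside SPARQL strings.
ex:
  a owl:Ontology ;
  sh:declare [
    sh:prefix "ex" ;
    sh:namespace "http://example.org/ns#"^^xsd:anyURI ;
  ] ;
  .

ex:Action
  a sh:NodeShape ;
  sh:targetClass ex:Action ;
  sh:sparql [
    sh:prefixes ex: ;  # pulls in the sh:declare entries above
    sh:message "Automated actions cannot have a performer." ;
    sh:select """
      SELECT $this ?value
      WHERE {
        $this ex:isAutomated true .
        $this ex:performer ?value .
      }
    """ ;
  ] ;
  .
```

Each solution of the SELECT query yields one validation result, so an automated action with a performer would be reported once per performer value.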

@HolgerKnublauch
Contributor

HolgerKnublauch commented Apr 1, 2025

BTW the original snippet wouldn't work with the current node expression draft from SHACL-AF, because the value of sh:if must return a boolean. So there would need to be some intermediate node expression such as sh:hasShape that takes a shape as its value, e.g.:

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount [
      sh:if [
        sh:hasShape [
            sh:property [
                sh:path ex:isAutomated ;
                sh:hasValue true ;
            ] ;
        ] ;
      ] ;
      sh:then 0 ;
      sh:else 1 ;
    ] ;
  ] .

Surely with the Node Expr document not even started, the syntax can be optimized for similar use cases. For example the hasValue pattern could become a single function such as

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount [
      sh:if [
        sh:hasPropertyValue ( ex:isAutomated true )
      ] ;
      sh:then 0 ;
      sh:else 1 ;
    ] ;
  ] .

but similar syntactic sugar could be introduced for the sh:targetNode use cases too.

And yeah, SPARQL again is another option:

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount [
      sh:if [
        sh:sparqlExpr "EXISTS { $this ex:isAutomated true }" ;
      ] ;
      sh:then 0 ;
      sh:else 1 ;
    ] ;
  ] .

@afs
Contributor

afs commented Apr 1, 2025

sh:maxCount 0 ~ :t {| sh:deactivated [

[] sh:maxCount 0 is still a triple in the shape regardless of the annotation.

Annotation syntax asserts the triple and separately talks about the triple, but the truth value of the triple isn't changed.

@HolgerKnublauch
Contributor

Yes I know, @afs. But the presence of sh:deactivated true would be picked up by the SHACL engine and then deactivate the triple for validation.

@ajnelson-nist
Contributor Author

Thank you all for the reminder about SPARQL. I'd just mentally left it as a given.

@HolgerKnublauch , I think your RDF 1.2 example wouldn't work unless we relaxed this part of SHACL-SHACL:

sh:property [
  sh:path sh:maxCount ;
  sh:datatype xsd:integer ;  # maxCount-datatype
  sh:maxCount 1 ;            # maxCount-maxCount
] ;

In some conditions, we'd end up with 2 sh:maxCount constraints on the same sh:PropertyShape. I suspect trying the sh:deactivated trick on the sh:maxCount 1 as well would just end up being & looking too brutish.

Also, thank you for the note on the whoopsie'd sh:if. Though, why did you suggest a new predicate sh:hasPropertyValue instead of sh:condition? It looks like sh:condition is primed to be a Boolean-returning mechanism we'd want on sh:if.

@HolgerKnublauch
Contributor

@ajnelson-nist with node expressions, any static analysis like shacl-shacl will be next to impossible anyway. For example, node expressions can also (easily) produce multiple values for sh:maxCount, return strings, etc. That's in fact one of the reasons why I am sceptical about this feature: it just allows a bit too much flexibility (besides potential performance problems that I still need to write about).

But yes, I fully agree that the sh:deactivated pattern here doesn't look good. It was just here as another option to consider. I can construct other examples where it actually looks better.

In SHACL-AF 1.1, sh:condition is only defined for rules, not general node expressions. In 1.2 we could, however, reuse that same property as a node expression function.

@afs
Contributor

afs commented Apr 1, 2025

Yes I know, @afs. But the presence of sh:deactivated true would be picked up by the SHACL engine and then deactivate the triple for validation.

Why would it look for an annotation?
There is a SHACL 1.0 meaning.

@HolgerKnublauch
Contributor

It would look for the reification when we have agreed to do #173

@afs
Contributor

afs commented Apr 2, 2025

I don't think this is the same as #173 which is additional information.

If sh:maxCount is going to have the additional capability to have a calculated value, then using the node expression blank node syntax can be seen as an advantage because the syntax breaks SHACL 1.0 which might otherwise come to the wrong conclusion.

@recalcitrantsupplant

The use case sounds like a cardinality version of qualified value shape, so something like:

ex:ActionShape
  a sh:NodeShape ;
  sh:targetClass ex:Action ;

  sh:qualifiedCardinalityConstraint [
    sh:path ex:isAutomated ;
    sh:hasValue true ;
    sh:qualifiedMinCount 1 ;
    sh:qualifiedMaxCount 1 ;
    sh:qualifiedCardinalityConstraintDisjoint true ;
  ] ;

  sh:qualifiedCardinalityConstraint [
    sh:path ex:performer ;
    sh:qualifiedMinCount 1 ;
    sh:qualifiedMaxCount 1 ;
    sh:qualifiedCardinalityConstraintDisjoint true ;
  ] ;
  .

Given all of the other options listed above I don't know that another property needs to be added - it just looks conceptually similar to me.

@VladimirAlexiev
Contributor

@HolgerKnublauch wrote

ex:Action
  sh:property [
    sh:path ex:performer ;
    sh:maxCount 1 ;
    sh:maxCount 0 ~ :t {| sh:deactivated [
        sh:not [
            sh:property [
                sh:path ex:isAutomated ;
                sh:hasValue true ;
            ] ;
        ] ;
    ] |} ;
  ] ;
  .

  • maxCount 0,1 should not be allowed. maxCount should be a single-valued property (maybe SHACL-SHACL already checks that)
  • From a data modeler's point of view, this is an awfully complicated way of saying
    "if isAutomated then performer is not allowed":
    all these double negations make my head spin

@ajnelson-nist
Contributor Author

ajnelson-nist commented Apr 8, 2025

@VladimirAlexiev - SHACL-SHACL already checks for maxCount being single-valued. I'd noted this above.

@recalcitrantsupplant - I haven't used that disjointness component before (I found the docs in section 4.7.3 of SHACL 1.0; the predicate is sh:qualifiedValueShapesDisjoint). Indeed, that's another way of spelling out this rule-set. Thank you for noting that!

@HolgerKnublauch
Contributor

@VladimirAlexiev yes I am aware this is not a better syntax for this use case, and the double negation is indeed bad. Maybe we also need sh:activated to allow expressing both directions easily.

In any case, this example here is the only one anyone has yet produced as an argument in favor of allowing this flexibility. Yes it's good, but I hope to see more.

Also: Is anyone aware of any other schema/modeling language where constraint parameters can be computed dynamically?

@TallTed
Member

TallTed commented Apr 8, 2025

this example here is the only one anyone has yet produced as an argument in favor of allowing this flexibility

I'm generally in favor of flexibility, even when it has the effect of providing users with an efficient foot-gun. Making the self-destructive behavior non-default, requiring the user to take one or more specific steps to cause that behavior, is generally sufficient protection.

@HolgerKnublauch
Contributor

FWIW in earlier iterations of SHACL 1.0 we had something like sh:filterShape which was supposed to define pre-conditions that should be evaluated before the actual constraint check happens. It was dropped due to time constraints of the WG but could be revived in 1.2 to (arguably) better solve IF-THEN scenarios like this one here. The syntax would have been

ex:Action
    sh:property [
        sh:path ex:performer ;
        sh:maxCount 1 ;    # This is global default, which is IMO useful to state in any case
    ] ;
    sh:property [
        sh:path ex:performer ;
        sh:maxCount 0 ;
        sh:filterShape [
            sh:property [
                sh:path ex:isAutomated ;
                sh:hasValue true ;
            ] ;
        ] ;
        sh:message "Automated actions cannot have a performer." ;
    ] .

That solution wouldn't need to introduce node expressions and is IMHO quite readable. It also allows us to define a custom sh:message, which is more user friendly than treating 1 and 0 in the same property shape without explanation.

@afs
Contributor

afs commented Apr 10, 2025

This is using the fact that maxCount 1 is harmless when maxCount 0 applies.

Do you have to write the filter shape twice, once in positive form, once in negative form?

@HolgerKnublauch
Contributor

For this particular case, yes. If this weren't harmless, then another filterShape could be added.

As I said earlier, this particular scenario looks quite attractive with if/then/else, but I do wonder how common this really is. It is so far the only example that we are discussing, and I don't think this is a good foundation for introducing the general capability yet. We need more examples.
