Resource of http://www.w3.org/ns/shacl#value is empty in validation report #55

Closed
michielhildebrand opened this issue Jul 9, 2020 · 10 comments


@michielhildebrand

In the validation report, the resource referenced by http://www.w3.org/ns/shacl#value is empty; see the JSON below.

[
  {
    ...
    "@type": [
      "http://www.w3.org/ns/shacl#ValidationResult"
    ],
    "http://www.w3.org/ns/shacl#focusNode": [
      {
        "@id": "http://vangoghmuseum.nl/data/artwork/d0005V1962"
      }
    ],
    ...
    "http://www.w3.org/ns/shacl#value": [
      {
        "@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
      }
    ]
  },
  {
    "@id": "_:N6087b61f1f1d44e08519420c185ba3f2"
  },

This report is the result of a property shape with a sh:node constraint. The first validation result in the example contains the information of the shape containing the sh:node. This is fine. The value (N6087b61f1f1d44e08519420c185ba3f2) should contain the information of the result for the sh:node. I confirmed this in TopBraid.
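One way to spot this symptom in a JSON-LD report is to look for blank nodes that are referenced by sh:value but carry no properties beyond `@id`. The following is a minimal sketch using only the Python standard library. The function name `find_empty_value_nodes` and the sample data are illustrative assumptions, not part of PySHACL, and the check is only a heuristic for finding candidates to inspect.

```python
import json

SH_VALUE = "http://www.w3.org/ns/shacl#value"


def find_empty_value_nodes(report):
    """Return the @id of each blank-node sh:value target that has no properties.

    `report` is a list of expanded JSON-LD node objects (hypothetical input
    shape, modelled on the report excerpt above).
    """
    # Index every node object in the report by its @id.
    by_id = {node["@id"]: node for node in report if "@id" in node}
    empty = []
    for node in report:
        for ref in node.get(SH_VALUE, []):
            ref_id = ref.get("@id")
            # Skip literals (no @id) and IRIs (not blank-node labels).
            if ref_id is None or not ref_id.startswith("_:"):
                continue
            target = by_id.get(ref_id, {})
            # A target with nothing except @id is considered empty.
            if set(target) <= {"@id"}:
                empty.append(ref_id)
    return empty


# Made-up sample data with the same structure as the excerpt above.
sample = json.loads("""
[
  {
    "@id": "_:result1",
    "@type": ["http://www.w3.org/ns/shacl#ValidationResult"],
    "http://www.w3.org/ns/shacl#value": [{"@id": "_:b1"}]
  },
  {"@id": "_:b1"}
]
""")

print(find_empty_value_nodes(sample))
```
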

@ashleysommer
Collaborator

ashleysommer commented Jul 10, 2020

Hi @michielhildebrand
Thanks for the bug report. I'm guessing this is output from the PySHACL command-line tool?
Looks like you're using JSON-LD output. Does the same thing happen when the report is output in Turtle (TTL) format?
Another question: does it have the same problem if the sh:value is a Literal or a URI? It looks like the value in this example is a BNode that is not serialized correctly.
I think this is a limitation of the command-line tool's output that might need to be expanded.

When PySHACL is used as a library inside a larger Python application and called from code, this problem is not present; the sh:value property is always populated correctly.

@michielhildebrand
Author

The sh:value is also empty in Turtle:

@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix vgw: <https://vangoghworldwide.org/shapes/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [ a sh:ValidationResult ;
            sh:focusNode <http://vangoghmuseum.nl/data/artwork/d0005V1962> ;
            sh:resultMessage "Dimension does not conform to shape vgw:Dimension" ;
            sh:resultPath crm:P43_has_dimension ;
            sh:resultSeverity sh:Violation ;
            sh:sourceConstraintComponent sh:NodeConstraintComponent ;
            sh:sourceShape [ a sh:PropertyShape ;
                    sh:group vgw:DimensionGroup ;
                    sh:message "Dimension does not conform to shape vgw:Dimension" ;
                    sh:node vgw:Dimension ;
                    sh:path crm:P43_has_dimension ;
                    sh:severity sh:Violation ] ;
            sh:value [ ] ] .

I also tried it as a Python library. I get the exact same result.

@michielhildebrand
Author

I looked into it a bit more closely. I hope this helps to figure out where it goes wrong.

PySHACL returns this report:

[] a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [ a sh:ValidationResult ;
            sh:focusNode [ ] ;
            sh:resultMessage "Dimension requires a value (crm:P90_has_value)" ;
            sh:resultPath crm:P90_has_value ;
            sh:resultSeverity sh:Violation ;
            sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
            sh:sourceShape [ a sh:PropertyShape ;
                    sh:message "Dimension requires a value (crm:P90_has_value)" ;
                    sh:minCount 1 ;
                    sh:path crm:P90_has_value ;
                    sh:severity sh:Violation ] ],
        [ a sh:ValidationResult ;
            sh:focusNode <http://vangoghmuseum.nl/data/artwork/d0005V1962> ;
            sh:resultMessage "Dimension does not conform to shape vgw:Dimension" ;
            sh:resultPath crm:P43_has_dimension ;
            sh:resultSeverity sh:Violation ;
            sh:sourceConstraintComponent sh:NodeConstraintComponent ;
            sh:sourceShape [ a sh:PropertyShape ;
                    sh:group vgw:DimensionGroup ;
                    sh:message "Dimension does not conform to shape vgw:Dimension" ;
                    sh:node vgw:Dimension ;
                    sh:path crm:P43_has_dimension ;
                    sh:severity sh:Violation ] ;
            sh:value [ ] ] .

Here I would expect the sh:value of the last ValidationResult to contain the sh:focusNode of the first ValidationResult. In TopBraid this is the case:

[ a       <http://www.w3.org/ns/shacl#ValidationReport> ;
  <http://www.w3.org/ns/shacl#conforms>
          false ;
  <http://www.w3.org/ns/shacl#result>
          [ a       <http://www.w3.org/ns/shacl#ValidationResult> ;
            <http://www.w3.org/ns/shacl#focusNode>
                    _:b0 ;
            <http://www.w3.org/ns/shacl#resultMessage>
                    "Dimension requires a value (crm:P90_has_value)" ;
            <http://www.w3.org/ns/shacl#resultPath>
                    crm:P90_has_value ;
            <http://www.w3.org/ns/shacl#resultSeverity>
                    <http://www.w3.org/ns/shacl#Violation> ;
            <http://www.w3.org/ns/shacl#sourceConstraintComponent>
                    <http://www.w3.org/ns/shacl#MinCountConstraintComponent> ;
            <http://www.w3.org/ns/shacl#sourceShape>
                    [] 
          ] ;
  <http://www.w3.org/ns/shacl#result>
          [ a       <http://www.w3.org/ns/shacl#ValidationResult> ;
            <http://www.w3.org/ns/shacl#focusNode>
                    <http://vangoghmuseum.nl/data/artwork/d0005V1962> ;
            <http://www.w3.org/ns/shacl#resultMessage>
                    "Dimension does not conform to shape vgw:Dimension" ;
            <http://www.w3.org/ns/shacl#resultPath>
                    crm:P43_has_dimension ;
            <http://www.w3.org/ns/shacl#resultSeverity>
                    <http://www.w3.org/ns/shacl#Violation> ;
            <http://www.w3.org/ns/shacl#sourceConstraintComponent>
                    <http://www.w3.org/ns/shacl#NodeConstraintComponent> ;
            <http://www.w3.org/ns/shacl#sourceShape>
                    []  ;
            <http://www.w3.org/ns/shacl#value>
                    _:b0
          ]
] .

@ashleysommer
Collaborator

ashleysommer commented Jul 22, 2020

Thanks @michielhildebrand. I think I've determined why this is not working properly for you; I just need to isolate the exact set of conditions to replicate it.

Edit: yep, I see that it affects sh:focusNode too. The Validation Report graph is a bit weird because it contains nodes from both the SHACL Shapes Graph and the Data Graph, smooshed into one representation. What is important here is that the focus node and the value node are the only two entries in each ValidationResult that originate from the Data Graph.

Since v0.11.0, PySHACL uses rdflib.Dataset rather than rdflib.Graph for its internal RDF store. This means that when I pull nodes from the Data Graph to put in the Validation Report, it doesn't know which named graph to query, so the result just comes back blank.
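The effect described above can be illustrated with a toy model in which plain dictionaries stand in for named graphs. This is not rdflib's API. The graph names, the `ex:Dimension` type, and both helper functions are hypothetical, and the merge shown is only one conceivable remedy, not a description of how PySHACL actually resolves the problem.

```python
# Toy model: a dataset maps graph names to {subject: {predicate: object}}.
dataset = {
    "default": {},  # nothing was loaded into the default graph
    "urn:example:data": {
        "_:dim1": {"rdf:type": "ex:Dimension"},
    },
}


def describe_node(dataset, node, graph_name="default"):
    """Return the properties of `node` as seen from a single named graph."""
    return dict(dataset.get(graph_name, {}).get(node, {}))


def describe_node_anywhere(dataset, node):
    """Merge the properties of `node` across every named graph."""
    merged = {}
    for graph in dataset.values():
        merged.update(graph.get(node, {}))
    return merged


# Querying the wrong graph yields an empty description, even though the
# node has properties elsewhere in the dataset.
print(describe_node(dataset, "_:dim1"))
print(describe_node(dataset, "_:dim1", "urn:example:data"))
print(describe_node_anywhere(dataset, "_:dim1"))
```
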

I don't know why the tests haven't caught this; there are tests in place to ensure this kind of regression doesn't happen.

@ashleysommer
Collaborator

@michielhildebrand
I've pushed a new version of PySHACL to PyPI, v0.12.1, which includes a potential fix for this issue. Can you please test whether your problem is resolved in this new version?

@ashleysommer
Collaborator

Fixed by #cf6df94

@michielhildebrand
Author

Cool! I see something in the valueNode now. Thanks a lot.

BTW, sorry for the delay. I was offline for a few weeks.

@michielhildebrand
Author

michielhildebrand commented Aug 11, 2020

Hi Ashley, I noticed an issue with the fix. When the value is a blank node, you seem to generate a new ID for it. As a result, the same blank node occurring in a value and in a focusNode cannot be compared by ID.

@ashleysommer
Collaborator

Hi @michielhildebrand
Thanks for mentioning that.
I don't generate the IDs for blank nodes; that is done by RDFLib. Blank node IDs are only relevant to the graph they are currently in. For the value node to be displayed in the Report Graph, it needs to be copied over from the Data Graph. When a blank node is copied from one graph to another, it no longer has the same ID. The ID it had in the Data Graph does not exist in the Report Graph.

I think I might be able to fix the second part you mentioned, about the valueNode and focusNode not having the same ID in the Report Graph. In the current implementation they are copied over separately, so they become two new blank nodes. I could add a simple check: if they are the same node, copy it into the Report Graph only once and use that copy for both valueNode and focusNode, so they will have the same ID. I'll create a new issue for it.
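The "copy only once" idea can be sketched with a small cache that remembers which new label each source blank node was given. This is an illustrative model in plain Python, not PySHACL or RDFLib code. The class name `BlankNodeCopier`, the `_:r<n>` label scheme, and the sample labels are all assumptions.

```python
import itertools


class BlankNodeCopier:
    """Hands out report-graph labels, reusing the label for a repeated source node."""

    def __init__(self):
        self._counter = itertools.count()
        self._mapping = {}  # source label -> label in the report graph

    def copy(self, source_label):
        # Mint a fresh label only the first time a source node is seen.
        if source_label not in self._mapping:
            self._mapping[source_label] = f"_:r{next(self._counter)}"
        return self._mapping[source_label]


copier = BlankNodeCopier()
focus = copier.copy("_:dataNodeA")  # used as sh:focusNode in one result
value = copier.copy("_:dataNodeA")  # same data node used as sh:value in another
other = copier.copy("_:dataNodeB")  # a different data node

print(focus == value, focus == other)
```

With such a cache, the same data-graph node keeps a single identity in the report, so a value and a focus node can be compared by ID within that report.
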

@ashleysommer
Collaborator

New issue #57
