SHACL shapes do not pass SHACL-SHACL validation #139

Closed
ajnelson-nist opened this issue Nov 17, 2023 · 10 comments
Labels: Final Review (Tagged for final review before closing); SHACL (Issues related to SHACL files)

Comments

@ajnelson-nist

Name: Alex Nelson

Affiliation: I am an employee of the National Institute of Standards and Technology. I am also a community member of the Cyber Domain Ontology in some leadership roles.

Type of issue: Schema (specifically, SHACL shapes)

Issue: A review of the file dcat-us_3.0_shacl_shapes.ttl in today's state raises several SHACL-SHACL validation errors -- that is, errors specific to SHACL syntax. Unfortunately, these errors cause the shapes graph to fail to load in a SHACL-executing engine. An example shape that has errors is dcat-us-shp:Document_Shape-creator (link is to today's version of that file).

  • It has two objects for sh:path, while SHACL syntactically requires exactly one object.
    • Even though dcterms:creator is a sub-property of dc:creator, there are different effects for either being the object of sh:path.
  • It has multiple objects for some predicates that are required to have at most one object: sh:minCount and sh:nodeKind.
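
To make the two bullets above concrete, here is a minimal sketch that writes a small shapes graph with the same kinds of syntax problems. Everything under the `ex:` prefix, the function name `write_demo_shapes`, and the file names are hypothetical and invented for illustration; this is not the content of dcat-us_3.0_shacl_shapes.ttl.

```shell
# Hypothetical helper: writes a deliberately malformed shapes graph and an
# empty data graph into the directory given as $1.
write_demo_shapes() {
  cat > "$1/bad_shape.ttl" <<'EOF'
@prefix dc:      <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/> .
@prefix sh:      <http://www.w3.org/ns/shacl#> .

# Illustrative only: two objects for sh:path and two for sh:minCount,
# where SHACL syntax allows exactly one and at most one, respectively.
ex:Creator_Shape
  a sh:PropertyShape ;
  sh:path dcterms:creator , dc:creator ;
  sh:minCount 0 , 1 .
EOF
  # An empty data graph, so that only the shapes graph is of interest.
  : > "$1/empty.ttl"
}
```

After calling `write_demo_shapes some_dir`, the same style of command used in this issue (`pyshacl --metashacl --shacl some_dir/bad_shape.ttl some_dir/empty.ttl`) could be pointed at the two files. I have not recorded an expected report here.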

In total, this pySHACL (version 0.24.0) command [1], which runs SHACL-SHACL validation of the shapes graph before attempting to validate the data graph, reports 70 errors across the graph:

# (current working directory: top source directory of repository)
pyshacl \
  --metashacl \
  --shacl shacl/dcat-us_3.0_shacl_shapes.ttl \
  docs/examples/activity.ttl

Recommended change(s):

  • Address issues with the SHACL shapes graph until a SHACL-SHACL review reports conformance.
  • Add some Continuous Integration process that exercises the SHACL shapes graph against some example, or perhaps all examples under docs/examples.
    • I would be happy to discuss strategies my community has used to review example data conformance against our ontology and shapes if you'd like.
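
As a starting point for the Continuous Integration suggestion, the sketch below loops over the example files and reports a failure if any of them does not conform. The function names `validate_one` and `check_examples` and the `SHAPES` and `EXAMPLES_DIR` variables are assumptions of this sketch, not part of the repository. It decides pass or fail from the `Conforms: True` line that pyshacl prints in its report.

```shell
# Hypothetical CI helper; paths default to the ones used in this issue.
SHAPES="${SHAPES:-shacl/dcat-us_3.0_shacl_shapes.ttl}"
EXAMPLES_DIR="${EXAMPLES_DIR:-docs/examples}"

# Print the validation report for one data graph ($1).
validate_one() {
  pyshacl --metashacl --shacl "$SHAPES" "$1" 2>&1
}

# Validate every *.ttl file in EXAMPLES_DIR.
# Returns 0 if all conform, 1 if at least one does not.
check_examples() {
  local failed=0 f
  for f in "$EXAMPLES_DIR"/*.ttl; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
    if validate_one "$f" | grep -q '^Conforms: True'; then
      echo "PASS $f"
    else
      echo "FAIL $f"
      failed=1
    fi
  done
  return "$failed"
}
```

A CI job could source this file and run `check_examples` as its last step, so that the job fails whenever an example stops conforming. Whether every example should be expected to conform is a separate policy question.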

Footnotes

  1. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

@ajnelson-nist
Author

Apologies, I should have included this in the initial post:
This contribution is made only by myself, and is not being made by the National Institute of Standards and Technology or any other organization.

ajnelson-nist added a commit to ajnelson-nist/dcat-us that referenced this issue Nov 21, 2023
The shape-component transcribed from DASH was transcribed incorrectly.
The RDF List was constructed incorrectly, needing both a `rdf:first` and
`rdf:rest` predicate.  Also, from DASH-prescribed usage with `sh:or`,
the RDF List members must be `sh:Shape`s, each housing a `sh:datatype`
constraint predicate.

To avoid other transcription or reformatting errors, this patch copies
the original shape from DASH, adding an `rdfs:isDefinedBy` to cite the
source.  Because DASH does not provide a `owl:versionIRI`, an extra
comment noting the copy date and destination file is also added.

This patch addresses 37 of the 70 reported SHACL-SHACL validation errors
noted in Issue 139.

References:
* https://datashapes.org/dash.html#StringOrLangString
* DOI-DO#139

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>
@ajnelson-nist
Author

I've drafted some changes to address SHACL-SHACL validation errors, and will be happy to send some more PRs akin to #162 after this week's holiday.

However, I also tried running the shapes against the examples in this repository, and saw there are several validation issues raised. A shell transcript is at the end of this post.

I'd made a remark in my initial post on adding a Continuous Integration process. That seems like it might be a bigger discussion that will expand into whether the examples should conform to all, or just some, of the SHACL shapes. Would that be better handled in a separate Issue? (I'm not sure how amenable your workflow is to receiving new Issues at the moment.)

Shell transcript, using pyshacl [1]:

pyshacl \
  --metashacl \
  --shacl shacl/dcat-us_3.0_shacl_shapes.ttl \
  docs/examples/activity.ttl
Validation Report
Conforms: False
Results (10):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Concept_Shape-inScheme
	Focus Node: ex:CensusActivity
	Result Path: skos:inScheme
	Message: Less than 1 values on ex:CensusActivity->skos:inScheme
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Concept_Shape-prefLabel
	Focus Node: ex:CensusActivity
	Result Path: skos:prefLabel
	Message: Less than 1 values on ex:CensusActivity->skos:prefLabel
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Catalog_Shape-dataset
	Focus Node: ex:NationalCensus
	Result Path: dcat:dataset
	Message: Less than 1 values on ex:NationalCensus->dcat:dataset
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Catalog_Shape-description
	Focus Node: ex:NationalCensus
	Result Path: dcterms:description
	Message: Less than 1 values on ex:NationalCensus->dcterms:description
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Catalog_Shape-publisher
	Focus Node: ex:NationalCensus
	Result Path: dcterms:publisher
	Message: Less than 1 values on ex:NationalCensus->dcterms:publisher
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Catalog_Shape-title
	Focus Node: ex:NationalCensus
	Result Path: dcterms:title
	Message: Less than 1 values on ex:NationalCensus->dcterms:title
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Dataset_Shape-description
	Focus Node: ex:Census2020Dataset
	Result Path: dcterms:description
	Message: Less than 1 values on ex:Census2020Dataset->dcterms:description
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Dataset_Shape-identifier
	Focus Node: ex:Census2020Dataset
	Result Path: dcterms:identifier
	Message: Less than 1 values on ex:Census2020Dataset->dcterms:identifier
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Dataset_Shape-publisher
	Focus Node: ex:Census2020Dataset
	Result Path: dcterms:publisher
	Message: Less than 1 values on ex:Census2020Dataset->dcterms:publisher
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
	Severity: sh:Violation
	Source Shape: dcat-us-shp:Dataset_Shape-title
	Focus Node: ex:Census2020Dataset
	Result Path: dcterms:title
	Message: Less than 1 values on ex:Census2020Dataset->dcterms:title

Footnotes

  1. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

@fellahst added the SHACL label (Issues related to SHACL files) Nov 27, 2023
@fellahst
Collaborator

Alex,

Thank you so much for your detailed feedback. Your ticket is currently under review and we will come back to you as soon as possible with remediation for the issue you raised. Thanks for your patience.

@hkdctol

hkdctol commented Jan 9, 2024

Would like to discuss further as I don't have background/understanding on this

@fellahst
Collaborator

fellahst commented Feb 1, 2024

An updated SHACL shapes file has been committed to the repository that fixes the issues. I also updated activity.ttl to include the required fields so that it passes validation. A more complete example that passes validation has been added to the Git repository: https://github.com/DOI-DO/dcat-us/blob/main/docs/examples/example1-dcat-us-3.0.ttl

@fellahst added the Final Review label (Tagged for final review before closing) Feb 1, 2024
@ajnelson-nist
Author

Hi @fellahst ,

Thank you for the updates!

I've checked the DCAT-US 3 SHACL graph as I did before, and it now appears to be conformant with SHACL syntactic requirements.

I noticed that not all of the examples under docs/examples currently conform to the shapes, though. I tried this Bash one-liner [1]:

for x in docs/examples/*.ttl; do echo "$x"; pyshacl --shacl shacl/dcat-us_3.0_shacl_shapes.ttl "$x"; done 2>&1 | grep -E '^Conforms' | sort | uniq -c

I got these results:

  16 Conforms: False
  16 Conforms: True
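
The counts above do not say which files fall into which bucket. As a minimal sketch, the filter below reads the loop's output before it reaches `grep` and pairs each echoed file name with the `Conforms:` line that follows it. The function name `pair_names_with_status` is hypothetical, and the sketch assumes each file name is echoed on its own unindented line ending in `.ttl`, as in the one-liner above.

```shell
# Hypothetical filter: reads "file name, then validation report" text on
# stdin and prints one "<file><TAB><True|False>" line per report.
pair_names_with_status() {
  awk '
    /^[^ \t].*\.ttl$/ { name = $0 }            # remember the last echoed file name
    /^Conforms: /     { print name "\t" $2 }   # emit it with the report status
  '
}
```

For example, replacing `grep -E '^Conforms' | sort | uniq -c` at the end of the one-liner with `pair_names_with_status` would list each example next to its status, which could also seed the README discussed in this thread.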

I appreciate that making all of the examples conformant might distract from each example's purpose. On the other hand, examples might be copied with the hope of starting an application from a "known passing" state.

Could a README be added to docs/examples/ to describe which examples are provided expecting to be minimally-conformant demonstrations, and/or which are provided to highlight only certain terms' usage?

Also, with example1-dcat-us-3.0.{json,ttl} now generated, do you have any further thoughts on adding a Continuous Integration process? I'm happy to discuss a few different practices that have been used in a community I work with.

Footnotes

  1. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

@mrratcliffe

@fellahst, what is the response to Alex Nelson's question above about adding a README?

@fellahst
Collaborator

fellahst commented Feb 2, 2024

Thank you for your insightful feedback and the excellent suggestion regarding the addition of a README to clarify the purpose and conformance status of the examples in the docs/examples directory. We will make more examples SHACL-conformant by adding the missing required fields, and we will add a README.txt if we cannot make them all compliant without significant work.

Incorporating a Continuous Integration (CI) process, particularly for the automated validation of new examples, is indeed a wise move forward. I am eager to discuss the CI practices you have suggested when we enter the implementation phase. Such practices promise to significantly enhance our project by maintaining consistent compliance and streamlining updates.

@fellahst closed this as completed Feb 2, 2024
@fellahst reopened this Feb 2, 2024
@mrratcliffe

+1

@fellahst closed this as completed Feb 8, 2024
@fellahst
Collaborator

fellahst commented Feb 8, 2024

I went the extra mile to make sure that every one of the 123 example files validates against the SHACL file.
You can run the following command to check:

find docs/examples/ -name '*.ttl' -print0 | xargs -0 -I{} sh -c 'echo "$1"; pyshacl --shacl shacl/dcat-us_3.0_shacl_shapes.ttl "$1"' sh {} | grep -E '^Conforms' | sort | uniq -c

5 participants