# SHACL validation with `pySHACL`

This example shows how to validate graph data with the [W3C Shapes Constraint Language](https://www.w3.org/TR/shacl/) (SHACL), using the [`pySHACL`](https://github.com/RDFLib/pySHACL) library.

To build knowledge graphs (KGs), we work in layers:

  * [SKOS](https://www.w3.org/2001/sw/wiki/SKOS) - *thesauri* and *classification*
  * [SHACL](https://www.w3.org/TR/shacl/) - *requirements*
  * [OWL](https://www.w3.org/OWL/) - *concepts*
  * [RDF](https://www.w3.org/TR/rdf11-primer/) - *represent nodes, predicates, literals*

In [1]:
shapes_file = """
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <http://schema.org/> .

schema:PersonShape
    a sh:NodeShape ;
    sh:targetClass schema:Person ;
    sh:property [
        sh:path schema:givenName ;
        sh:datatype xsd:string ;
        sh:name "given name" ;
    ] ;
    sh:property [
        sh:path schema:birthDate ;
        sh:lessThan schema:deathDate ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path schema:gender ;
        sh:in ( "female" "male" ) ;
    ] ;
    sh:property [
        sh:path schema:address ;
        sh:node schema:AddressShape ;
    ] .

schema:AddressShape
    a sh:NodeShape ;
    sh:closed true ;
    sh:property [
        sh:path schema:streetAddress ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [
        sh:path schema:postalCode ;
        sh:datatype xsd:integer ;
        sh:minInclusive 10000 ;
        sh:maxInclusive 99999 ;
    ] .
"""

In [2]:
data_file = """
{
    "@context": { "@vocab": "http://schema.org/" },
    "@id": "http://example.org/ns#Bob",
    "@type": "Person",
    "givenName": "Robert",
    "familyName": "Junior",

    "birthDate": "1971-07-07",
    "deathDate": "1968-09-10",
    "address": {
        "@id": "http://example.org/ns#BobsAddress",
        "streetAddress": "1600 Amphitheatre Pkway",
        "postalCode": 9404
    }
}
"""

In [3]:
import pyshacl

results = pyshacl.validate(
    data_file,
    shacl_graph=shapes_file,
    data_graph_format="json-ld",
    shacl_graph_format="turtle",
    inference="rdfs",
    debug=True,
    serialize_report_graph=False,
    )

conforms, v_graph, v_text = results

print("conforms", conforms)
print("graph", v_graph)
print("text", v_text)

Constraint Violation in LessThanConstraintComponent (http://www.w3.org/ns/shacl#LessThanConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:lessThan schema:deathDate ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:path schema:birthDate ]
	Focus Node: <http://example.org/ns#Bob>
	Value Node: Literal("1971-07-07")
	Result Path: schema:birthDate
	Message: Value of <http://example.org/ns#Bob>->schema:deathDate <= Literal("1971-07-07")

Constraint Violation in MinInclusiveConstraintComponent (http://www.w3.org/ns/shacl#MinInclusiveConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:datatype xsd:integer ; sh:maxInclusive Literal("99999", datatype=xsd:integer) ; sh:minInclusive Literal("10000", datatype=xsd:integer) ; sh:path schema:postalCode ]
	Focus Node: <http://example.org/ns#BobsAddress>
	Value Node: Literal("9404", datatype=xsd:integer)
	Result Path: schema:postalCode
	Message: Value is not >= Literal("10000", datatype=xsd:integer)

Constraint Violatio

conforms False
graph [a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory2']].
text Validation Report
Conforms: False
Results (2):
Constraint Violation in LessThanConstraintComponent (http://www.w3.org/ns/shacl#LessThanConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:lessThan schema:deathDate ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:path schema:birthDate ]
	Focus Node: <http://example.org/ns#Bob>
	Value Node: Literal("1971-07-07")
	Result Path: schema:birthDate
	Message: Value of <http://example.org/ns#Bob>->schema:deathDate <= Literal("1971-07-07")
Constraint Violation in NodeConstraintComponent (http://www.w3.org/ns/shacl#NodeConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:node schema:AddressShape ; sh:path schema:address ]
	Focus Node: <http://example.org/ns#Bob>
	Value Node: <http://example.org/ns#BobsAddress>
	Result Path: schema:address
	Message: Value does not conform to Shape schema:AddressShape



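The `birthDate` value is later than the `deathDate` value, so it results in a `LessThanConstraintComponent` violation. The `postalCode` value `9404` falls below `sh:minInclusive 10000`, so the address node fails `schema:AddressShape`; the report lists this on the person node as a `NodeConstraintComponent` violation.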
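To make the two failures concrete, here is a minimal plain-Python paraphrase of the constraints, using the values from the JSON-LD data above. This is only a sketch of what the shapes ask for, not how `pySHACL` evaluates them internally; the variable names are made up for illustration, and parsing the dates with `datetime.date` is an assumption, since the data gives them as plain strings.

```python
from datetime import date

# Values copied from the JSON-LD data in the first example.
birth_date = date.fromisoformat("1971-07-07")
death_date = date.fromisoformat("1968-09-10")
postal_code = 9404

# sh:lessThan schema:deathDate: birthDate must be strictly earlier than deathDate.
less_than_ok = birth_date < death_date

# sh:minInclusive 10000 and sh:maxInclusive 99999 on schema:postalCode.
postal_code_ok = 10000 <= postal_code <= 99999

print("birthDate < deathDate:", less_than_ok)
print("postalCode in range:", postal_code_ok)
```

Both checks come out false, which matches the two results in the validation report.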

---

The same kind of validation can be run on a `kglab` knowledge graph: load the graph, define the shapes, then call `kg.validate()`.

In [4]:
import kglab

namespaces = {
    "wtm": "http://purl.org/heals/food/",
    "ind": "http://purl.org/heals/ingredient/",
    "skos": "https://www.w3.org/2004/02/skos/core#",
    "nom": "https://github.com/DerwenAI/kglab/wiki/Vocab#",
    }

kg = kglab.KnowledgeGraph(
    name = "A recipe KG example based on Food.com",
    base_uri = "https://www.food.com/recipe/",
    language = "en",
    namespaces = namespaces,
    )

kg.load_ttl("tmp.ttl")

In [5]:
shacl = """
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix nom: <https://github.com/DerwenAI/kglab/wiki/Vocab#> .
@prefix wtm: <http://purl.org/heals/food/> .
@prefix ind: <http://purl.org/heals/ingredient/> .
@prefix skos: <https://www.w3.org/2004/02/skos/core#> .

nom:RecipeShape
    a sh:NodeShape ;
    sh:targetClass wtm:Recipe ;
    sh:property [
        sh:path wtm:hasIngredient ;
        sh:node wtm:Ingredient ;
        sh:minCount 3 ;
    ] ;
    sh:property [
        sh:path skos:definition ;
        sh:datatype xsd:string ;
        sh:maxLength 50 ;
    ] .
"""

In [6]:
conforms, v_graph, v_text = kg.validate(shacl_graph=shacl)

Constraint Violation in MaxLengthConstraintComponent (http://www.w3.org/ns/shacl#MaxLengthConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:datatype xsd:string ; sh:maxLength Literal("50", datatype=xsd:integer) ; sh:path skos:definition ]
	Focus Node: <https://www.food.com/recipe/137158>
	Value Node: Literal("pikkuleipienperustaikina  finnish butter cookie dough")
	Result Path: skos:definition
	Message: String length not <= Literal("50", datatype=xsd:integer)

Constraint Violation in MaxLengthConstraintComponent (http://www.w3.org/ns/shacl#MaxLengthConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:datatype xsd:string ; sh:maxLength Literal("50", datatype=xsd:integer) ; sh:path skos:definition ]
	Focus Node: <https://www.food.com/recipe/61108>
	Value Node: Literal("german pancakes  from the mennonite treasury of recipes")
	Result Path: skos:definition
	Message: String length not <= Literal("50", datatype=xsd:integer)

Constraint Violation in MaxLengthCons

In [7]:
print("conforms", conforms)
print("graph", v_graph)
print("text", v_text)

conforms False
graph b'@prefix sh: <http://www.w3.org/ns/shacl#> .\n@prefix skos: <https://www.w3.org/2004/02/skos/core#> .\n@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n\n[] a sh:ValidationReport ;\n    sh:conforms false ;\n    sh:result [ a sh:ValidationResult ;\n            sh:focusNode <https://www.food.com/recipe/61108> ;\n            sh:resultMessage "String length not <= Literal(\\"50\\", datatype=xsd:integer)" ;\n            sh:resultPath skos:definition ;\n            sh:resultSeverity sh:Violation ;\n            sh:sourceConstraintComponent sh:MaxLengthConstraintComponent ;\n            sh:sourceShape _:ub3bL16C17 ;\n            sh:value "german pancakes  from the mennonite treasury of recipes" ],\n        [ a sh:ValidationResult ;\n            sh:focusNode <https://www.food.com/recipe/137158> ;\n            sh:resultMessage "String length not <= Literal(\\"50\\", datatype=xsd:integer)" ;\n            sh:resultPath skos:definition ;\n            sh:resultSeverity sh:Vi
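
When a report lists many violations, a quick tally per constraint component can be easier to read than the full text. The sketch below assumes the report text follows the `Constraint Violation in <Component> (...)` format printed in the first example; `count_violations` and `sample_report` are hypothetical names introduced here for illustration, with `sample_report` as a shortened stand-in for `v_text`.

```python
import re
from collections import Counter


def count_violations(report_text: str) -> Counter:
    """Tally the 'Constraint Violation in <Component>' headers in a text report."""
    pattern = re.compile(r"^Constraint Violation in (\w+)", re.MULTILINE)
    return Counter(pattern.findall(report_text))


# Shortened stand-in for `v_text`, following the report format shown earlier.
sample_report = """Validation Report
Conforms: False
Results (2):
Constraint Violation in LessThanConstraintComponent (http://www.w3.org/ns/shacl#LessThanConstraintComponent):
    Severity: sh:Violation
Constraint Violation in NodeConstraintComponent (http://www.w3.org/ns/shacl#NodeConstraintComponent):
    Severity: sh:Violation
"""

counts = count_violations(sample_report)

for component, n in counts.most_common():
    print(component, n)
```

In the notebook, passing `v_text` in place of `sample_report` should give the same kind of summary, provided `v_text` is a string in that format.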

---

## Exercises

1. Fix the errors in the ABox graph data of the first example.
2. How would you validate that each recipe has a cooking time?