### <font color='red'>Go to *File > Save a copy in Drive*</font>.

**<h1>Declarative construction and validation of Knowledge Graphs</h1>**

<img src="https://drive.google.com/uc?export=view&id=1fmIPBG5_eY-fJ_O2H2o1xC2QEwDKRh4l" alt="drawing" width="200"/>
<img src="https://drive.google.com/uc?export=view&id=1MCrbbICvyKYoUKZHsczHBjXMNElxdEsA" alt="drawing" width="200"/>

<img src="https://drive.google.com/uc?export=view&id=1dh5BvfK1f1nFg18MO6dtNQ2jE8i06HeD" alt="drawing" width="200"/>
<img src="https://drive.google.com/uc?export=view&id=1SZl-DnXWUA6eLVsle0yG0SIvYPllZqR9" alt="drawing" width="200"/>

> Ana Iglesias-Molina (ana.iglesiasm@upm.es) and Xuemin Duan (xuemin.duan@kuleuven.be)

> Half-day tutorial at K-CAP23

**<h2>KG Validation</h2>**

 Xuemin Duan (xuemin.duan@kuleuven.be)

> This is the Python notebook for KG validation with SHACL.

> You will learn

>> how to write SHACL shapes by hand,

>> how to generate SHACL shapes automatically from RML mapping rules,

>> how to check that the SHACL shapes you create are well-formed,

>> and how to validate RDF graphs against those shapes.



### Install packages

**[RDFLib](https://github.com/RDFLib/rdflib)** for working with RDF graphs.

**[pySHACL](https://github.com/RDFLib/pySHACL)** for validating SHACL shapes and RDF graphs.

In [None]:
!pip install rdflib pyshacl

In [None]:
from rdflib import Graph
import pyshacl

## SHACL-SHACL

SHACL-SHACL is a shapes graph for validating shapes graphs.

It enforces many of the syntactic constraints on SHACL Core that the specification defines.

We use it here to check whether a shapes graph we have written is well-formed.

In [None]:
shacl_shacl_graph=Graph().parse("https://www.w3.org/ns/shacl-shacl")

In [None]:
shapes = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://example.org/> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:datatype xsd:string ;
    ] .
"""
shapes_graph = Graph().parse(data=shapes, format="turtle")

# Validating SHACL graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(shapes_graph, shacl_graph=shacl_shacl_graph)

# print validation report
print(f"Conforms: {conforms}")
print(results_text)

In [None]:
shapes = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://example.org/> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        #sh:path ex:name ;
        sh:datatype xsd:string ;
    ] .
""" # a property shape must have exactly one property path (sh:path)

shapes_graph = Graph().parse(data=shapes, format="turtle")

# Validating SHACL graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(shapes_graph, shacl_graph=shacl_shacl_graph)

# print validation report
print(f"Conforms: {conforms}")
print(results_text)

Explore your own shapes

In [None]:
shapes = """
YOUR SHAPES HERE
"""

shapes_graph = Graph().parse(data=shapes, format="turtle")
conforms, results_graph, results_text = pyshacl.validate(shapes_graph, shacl_graph=shacl_shacl_graph)
print(f"Conforms: {conforms}")
print(results_text)

## SHACL creation by hand

Play with simple examples and tasks!

In [None]:
# Simple example: validate that a person's name is a string and their age is an integer.
# Note: "20" below is a plain string literal, so the ex:age check is expected to fail.

data = """
@prefix ex: <http://example.org/> .

ex:Alice
    a ex:Person ;
    ex:name "Alice" ;
    ex:age "20" .
"""

shapes = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://example.org/> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:datatype xsd:string ;
    ] ;
    sh:property [
        sh:path ex:age ;
        sh:datatype xsd:integer ;
    ] .
"""

# Load as data graph, shapes graph
data_graph = Graph().parse(data=data, format="turtle")
shapes_graph = Graph().parse(data=shapes, format="turtle")

# Validating RDF graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(data_graph,shacl_graph=shapes_graph)

# print validation report
print(f"Conforms: {conforms}")
print(results_text)


### Tasks
Complete the tasks by writing shapes that validate the following data graph.





In [None]:
data = """
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://www.ex.org/> .

<http://www.ex.org/Aircraft/151UW>
      a ex:Aircraft ;
      ex:hasAircraftId 9824;
      ex:hasAircraftModelVersion <http://data.sfgov.org/AircraftModelVersion/200> ;
      ex:hasCreationDate
              "2013-09-03T05:53:00.0"^^xsd:dateTime;
      ex:hasModificationDate
              "2013-09-03T05:53:00.0"^^xsd:dateTime;
      ex:hasTailNumber "151UW" ;
      ex:isActive "true"^^xsd:boolean;
      ex:owningAirline <http://data.sfgov.org/Company/US+Airways> ;
      ex:has_Aircraft_Model <http://data.sfgov.org/AircraftModel/A321> .

<http://data.sfgov.org/AircraftModelVersion/200>
  a ex:AircraftModelVersion;
  ex:hasModelVersion "200"^^xsd:string .

"""
# Load as data graph
data_graph = Graph().parse(data=data, format="turtle")

Task 1:

Fill in $<$FILL HERE$>$ to require that instances of ex:Aircraft are IRIs.

In [None]:
shapes_graph_s = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix ex: <http://www.ex.org/> .

ex:AircraftShape
    a sh:NodeShape ;
    sh:targetClass ex:Aircraft ;
    <FILL HERE> .
"""
shapes_graph = Graph().parse(data=shapes_graph_s, format="turtle")

# Validating RDF graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(data_graph,shacl_graph=shapes_graph)
print(f"Conforms: {conforms}")
print(results_text)

Task 2:

Fill in $<$FILL HERE$>$ to require that the value of ex:hasAircraftId on instances of ex:Aircraft is an integer.

In [None]:
shapes_graph_s = """<FILL HERE>
"""
shapes_graph = Graph().parse(data=shapes_graph_s, format="turtle")

# Validating RDF graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(data_graph,shacl_graph=shapes_graph)
print(f"Conforms: {conforms}")
print(results_text)

Task 3:

Fill in $<$FILL HERE$>$ to require that the value of ex:hasAircraftModelVersion on ex:Aircraft instances is an IRI,

that this value is an instance of the class ex:AircraftModelVersion, and that it has exactly one ex:hasModelVersion property.

In [None]:
shapes_graph_s = """<FILL HERE>
"""
shapes_graph = Graph().parse(data=shapes_graph_s, format="turtle")

# Validating RDF graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(data_graph,shacl_graph=shapes_graph)
print(f"Conforms: {conforms}")
print(results_text)

## RML2SHACL

### Install tools and download dataset

**[RML2SHACL](https://github.com/RMLio/RML2SHACL)** for translating RML mapping rules to SHACL shapes.

In [None]:
!git clone https://github.com/RMLio/RML2SHACL.git

In [None]:
import os
os.chdir("RML2SHACL")

Download the RDF graph and RML mappings from the previous KG construction session.

In [None]:
import gdown
url = 'https://drive.google.com/uc?id=11NP5wZK5U1ZI2EsICTptXHSQFWI3FxH7'
output = 'KG_validation_data.zip'
gdown.download(url, output, quiet=False)

In [None]:
!unzip KG_validation_data.zip

In [None]:
data_graph = Graph().parse("KG_validation_data/result.nt", format="nt")

### Try RML2SHACL

Generate SHACL shapes from RML mapping rules using RML2SHACL.

In [None]:
!python main.py KG_validation_data/aircraft_mappings.ttl

### Validate RDF graph using RML-driven shapes

In [None]:
shapes_graph = Graph().parse("shapes/KG_validation_data/aircraft_mappings.ttl-output-shape.ttl", format="turtle")

# Validating RDF graphs using pySHACL
conforms, results_graph, results_text = pyshacl.validate(data_graph,shacl_graph=shapes_graph)
print(f"Conforms: {conforms}")
print(results_text)