# Discussions on Validating Graphs

## RDA 18: Metadata Working Group

[Session page](https://www.rd-alliance.org/plenaries/rda-18th-plenary-meeting-virtual/recommendations-publishing-structure-metadata-web)

[Slides](https://docs.google.com/presentation/d/1F9E1zyfhnniSQJv18RPdYCutKwSNKTzl6uml3zvDcOs/edit#slide=id.gd08e71dd0b_0_246)

## Validation Breakout Group

Authors:
Adam Shepherd (0000-0003-4486-9448) & Douglas Fils (0000-0002-2257-9127) 



# Agenda for Session

- Who are the Personas in Validation, and how do they relate to the GO FAIR Implementation Network?
    - RDF Conceptual Model Basis
    - Role of Validation
- What is the basic ecosystem we are working in?
    - RDF as JSON-LD
    - Web architectures (robots.txt, sitemaps.xml, HTML+JSON-LD, HTTP)
    - A community (typically, though a single user exposing to the world is also fine)	
        - Here the logical "world" might be GDSS, community indexing or general SEO more broadly
- Why Validation?
    - Align to guidance or principles
        - works best when tied to a purpose such as a community index, FAIR, CARE, or regulations
    - Align to application
        - perhaps the more common or useful reason, addressing elements of FAIR such as F or I
- How to Validate
    


# Validation Personas and the GO FAIR Implementation Network


<div>
<img src="relations.png"  width="400"/>
</div>


    
- IMPLEMENT clearly defined plans and deliverables to implement an element of the Internet of FAIR Data and Services (IFDS) within a defined time period;
- FOSTER a community of harmonized FAIR practices;
- COMMUNICATE together on critical issues on which consensus has been reached and which are of generic importance for the community.


## RDF Conceptual Model Basis

A brief tour of the RDF conceptual model, the RDF ecosystem, and the place of SHACL (and JSON-LD) in that ecosystem.

<div>
<img src="ecosystem.png"  width="400"/>
</div>

Image credit: Pierre-Antoine Champin, [https://www.w3.org/Talks/2021/09-19-ddi-cdi/?full#rdf-ecosystem](https://www.w3.org/Talks/2021/09-19-ddi-cdi/?full#rdf-ecosystem)

## Role of Validation

The personas are described at [https://book.oceaninfohub.org/personas/persona.html](https://book.oceaninfohub.org/personas/persona.html). They give an idea of some of the players in an implementation. The relations between these personas are potential areas where the shape of the graph may matter for query or other functions.

Validation also supports applications, in particular querying the graph.
**If we allow anything in "author" space, then the query in "user" space becomes very complex** (to
deal with the variety). The result can be poor performance and poor retrieval.
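
To make the point concrete, here is a minimal sketch in plain Python (no RDF library) of what "dealing with the variety" looks like on the user side. The records and the `provider_labels` helper are hypothetical and only illustrate the idea: when authors may express `provider` as a literal, as a bare IRI reference, or as a nested node, every consumer has to branch on each form.

```python
def provider_labels(record):
    """Collect a display label for each provider, whatever form the author used."""
    value = record.get("provider")
    if value is None:
        return []
    # Authors may give a single value or a list of values.
    items = value if isinstance(value, list) else [value]
    labels = []
    for item in items:
        if isinstance(item, str):
            labels.append(item)              # plain literal
        elif isinstance(item, dict) and "name" in item:
            labels.append(item["name"])      # nested node carrying a name
        elif isinstance(item, dict) and "@id" in item:
            labels.append(item["@id"])       # bare IRI reference
    return labels


# Hypothetical author-space records showing four different conventions.
records = [
    {"@type": "Course", "provider": "Ocean Institute"},
    {"@type": "Course", "provider": {"@id": "https://example.org/org/1"}},
    {"@type": "Course", "provider": [{"@type": "Organization", "name": "Sea Lab"}]},
    {"@type": "Course"},
]

all_labels = [provider_labels(r) for r in records]
```

If a shape guaranteed a single form for `provider`, the helper would reduce to one line. That trade of author-side constraints for user-side simplicity is the role validation plays here.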

# Validation Options

- JSON Schema
- ShEx
- SHACL
- Others (like Cue lang)

## Why SHACL?

SHACL is a W3C Recommendation, while ShEx is developed as a W3C Community Group specification. SHACL has also seen wider adoption in the JSON-LD and broader structured-data-on-the-web communities, including Solid.


## A brief aside on JSON-LD Structure Validation

### Validate the structure of the JSON-LD data graph

These tools test that your document is well formed, but not necessarily that it is valid against a vocabulary, profile, or guidance.

* [JSON-LD Playground](https://json-ld.org/playground/)
* [Structured data Linter](http://linter.structured-data.org/)
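
As a rough local counterpart to these tools, the sketch below uses only the Python standard library to pull JSON-LD out of an HTML page and check that each block parses as JSON. The `JSONLDExtractor` class, the `check_structure` function, and the sample page are hypothetical. The check for `@context` and `@type` is an assumption made for illustration and not a JSON-LD requirement, and nothing here validates against a vocabulary.

```python
import json
from html.parser import HTMLParser


class JSONLDExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def check_structure(html):
    """Return one (document, problems) pair per JSON-LD block found in the page."""
    parser = JSONLDExtractor()
    parser.feed(html)
    parser.close()
    report = []
    for raw in parser.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError as err:
            report.append((None, [f"not valid JSON: {err.msg}"]))
            continue
        if not isinstance(doc, dict):
            report.append((doc, ["top level is not a JSON object"]))
            continue
        # Illustrative convention only: expect a context and a type on the top node.
        problems = [f"missing {key}" for key in ("@context", "@type") if key not in doc]
        report.append((doc, problems))
    return report


page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Course", "name": "Ocean basics"}
</script>
<script type="application/ld+json">{"name": "no context or type"}</script>
</head></html>"""

report = check_structure(page)
```

A block that passes this kind of check can still fail the Schema.org or SHACL checks discussed next; structure and conformance are separate questions.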


### Validate against Schema.org usage

This includes checks for domain and range issues and for the predicate and type terms used.

* [SDO Validator](https://validator.schema.org/)


## SHACL Resources

- [W3C SHACL](https://www.w3.org/TR/shacl/)  
- [Editors Draft](https://w3c.github.io/data-shapes/shacl/)
- [Implementation Report](https://w3c.github.io/data-shapes/data-shapes-test-suite/)

You can try SHACL at the [SHACL Playground](https://shacl.org/playground/)


# Some example SHACL Shapes



Shapes Graphs:

SHACL (the Shapes Constraint Language) is a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. RDF graphs that are used in this manner are called "shapes graphs".

Data Graphs:

In SHACL, the RDF graphs that are validated against a shapes graph are called "data graphs".

reference: [https://www.w3.org/TR/shacl/#sparql-constraints-example](https://www.w3.org/TR/shacl/#sparql-constraints-example)

## A quick example

This is a basic example, but it shows things like checking for node type, setting min and max counts, and setting severity. We can visit the [core constraints](https://www.w3.org/TR/shacl/#core-components) for SHACL to see some, but not all, of the patterns SHACL can address. More complex (or at least alternative) approaches include SPARQL-based constraints or [SHACL Advanced Features](https://www.w3.org/TR/shacl-af/).

[GitHub resource link](https://github.com/iodepo/odis-arch/blob/master/book/tooling/notebooks/validation/shapes/oih_search.ttl)

```turtle
@prefix schema: <https://schema.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix shacl: <http://www.w3.org/ns/shacl#> .
@prefix oihval: <https://oceans.collaborium.io/voc/validation/1.0.1/shacl#> .

oihval:IDShape
    a shacl:NodeShape ;
    shacl:targetClass schema:Course ;
    shacl:message "Graph must have an ID"@en ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    shacl:nodeKind shacl:IRI ;
    .

oihval:DatasetCommonShape
    a shacl:NodeShape ;
    shacl:targetClass schema:Course ;
    shacl:message "OIH Learning Resource validation suite" ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    shacl:property
        oihval:nameResourceProperty,
        oihval:urlResourceProperty,
        oihval:descriptionResourceProperty,
        oihval:identifierProviderProperty,
        oihval:keywordsResourceProperty,
        oihval:licenseResourceProperty
    .

oihval:nameResourceProperty
    a shacl:PropertyShape ;
    shacl:path schema:name ;
    shacl:nodeKind shacl:Literal ;
    shacl:minCount 1 ;
    shacl:message "Name is required "@en ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

oihval:keywordsResourceProperty
    a shacl:PropertyShape ;
    shacl:path schema:keywords ;
    shacl:minCount 1 ;
    shacl:nodeKind shacl:Literal ;
    shacl:severity shacl:Warning ;
    shacl:message "A resource should include descriptive keywords" ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

oihval:licenseResourceProperty
    a shacl:PropertyShape ;
    shacl:path schema:license ;
    shacl:minCount 1 ;
    shacl:nodeKind shacl:Literal ;
    shacl:severity shacl:Info ;
    shacl:message "Though not required, it is good practice to include a license if one exists" ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

oihval:urlResourceProperty
    a shacl:PropertyShape ;
    shacl:path schema:url ;
    shacl:maxCount 1 ;
    shacl:minCount 1 ;
    shacl:nodeKind shacl:IRIOrLiteral ;
    shacl:message "URL required for the location of the resource described by this metadata"@en ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

oihval:descriptionResourceProperty
    a shacl:PropertyShape ;
    shacl:path schema:description;
    shacl:nodeKind shacl:Literal ;
    shacl:minCount 1 ;
    shacl:message "Resource must have a description"@en ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

oihval:identifierProviderProperty
    a shacl:PropertyShape ;
    shacl:path schema:provider;
    shacl:minCount 1 ;
    shacl:nodeKind shacl:IRIOrLiteral ;
    shacl:message "A provider must be noted"@en ;
    shacl:description "https://book.oceaninfohub.org/validation/README.html" ;
    .

```

## Tooling

The examples here use pySHACL ([https://github.com/RDFLib/pySHACL/](https://github.com/RDFLib/pySHACL/)).

The [kglab](https://derwen.ai/docs/kgl/) documentation includes a tutorial on [SHACL validation with pySHACL](https://derwen.ai/docs/kgl/ex5_0/).

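```python
from pyshacl import validate
from rdflib import Graph

# File names below are placeholders; point them at your own graphs.
data_graph = Graph().parse("data.jsonld", format="json-ld")  # graph to validate
sg = Graph().parse("oih_search.ttl", format="turtle")        # shapes graph
og = None                                                    # optional ontology graph

r = validate(data_graph,
      shacl_graph=sg,
      ont_graph=og,
      inference='rdfs',       # apply RDFS inference before validating
      abort_on_first=False,   # collect all results rather than stopping early
      allow_warnings=False,   # sh:Warning and sh:Info results also fail conformance
      meta_shacl=False,       # do not validate the shapes graph itself
      advanced=False,         # SHACL Advanced Features off
      js=False,               # SHACL-JS off
      debug=False)
# conforms is a bool, results_graph is the report as RDF, results_text is readable text
conforms, results_graph, results_text = r
```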

## Severity

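A brief note on severity levels. SHACL defines [three levels of severity](https://www.w3.org/TR/shacl/#severity). The two lower levels are useful for reporting issues that should not count as violations and are only warnings or informational items. The table uses the conventional `sh:` prefix for the same namespace that the shapes above bind to `shacl:`.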

| Severity     | Description                                                            |
|--------------|------------------------------------------------------------------------|
| sh:Info      | A non-critical constraint violation indicating an informative message. |
| sh:Warning   | A non-critical constraint violation indicating a warning.              |
| sh:Violation | A constraint violation.                                                |



# Links to the OIH Notebooks for demonstration

## Examples of using pySHACL 

[Basic SHACL](https://book.oceaninfohub.org/tooling/notebooks/validation/OIH_Simple_SHACL.html)

