# Queensland Spatial Information Canonical Examples Scenario Demonstrator
## For the [FrontierSI](https://frontiersi.com.au) Queensland Land Information Project
### By Nicholas J. Car

## 1. Introduction

This Workbook is a _Scenario Demonstrator_ for the _QSI Cadastre Address Project_. It lists some of that project's Requirements and shows how the models proposed for it cater for them.

The goal of this Scenario Demonstrator is not to implement Canonical Examples and tests for _all_ Requirements, but to implement just enough of them to indicate the likely viability of the models against the Requirements.

It shows model support for Requirements by presenting _Canonical Examples_, which are data files stored in this Notebook's repository and made according to the project models, and then showing validation of, and queries against, those examples.

The project models are documented within the [Queensland Land Information Supermodel](https://nicholascar.com/qli-supermodel/supermodel.html) and the Canonical Example data files are named according to the Requirements they are for and stored in the [canonical-examples/ folder](https://github.com/nicholascar/qsi-sd-canonical/tree/master/canonical-examples).

### 1.1 Canonical Examples data format
The Canonical Examples data is [RDF](https://www.w3.org/RDF/) data, that is, data made according to the conceptual model used to create [_Knowledge Graphs_](https://ai.stanford.edu/blog/introduction-to-knowledge-graphs/): data models in which nodes representing ideas are connected by edges representing defined relationships.

RDF data is system-independent - it can be stored and used in many different systems - but it is validatable and queryable too, using a standard query language called [SPARQL](https://www.w3.org/TR/sparql11-query/) which is similar to SQL, but for graphs.

### 1.2 Executable Workbook

This Workbook is a piece of documented code that can be run online, in a browser, step-by-step, without the runner having to install anything. The code sections you see below can be run, results seen, altered and then re-run, to allow users to really get to grips with the data, queries and so on. Please feel free to experiment: changes are not stored, so you can't really break anything!

### 1.3 Initial Requirement Explained

The first Requirement demonstrated here, A18 in Section 2.1, is explained in more detail than subsequent Requirements to ensure all details of the technical implementation are understood.

## 2. Requirements

### 2.1 Req. A18

#### 2.1.1 Context
Requirement A18 is an Address Requirement with the following details (from the _QSI Cadastre Address Requirements_ list):

**ID** | **Statement** | **Relevant Model Elements** | **Canonical Examples** | **Scenario Demonstrator**
--- | --- | --- | --- | ---
A18 | Ideally entities as address components that are maintained elsewhere e.g. locality (and Local Government Area even though this isn't part of an official address string) should be linked to rather than repeated in the addressing database. | Address Model, Overarching Model | R18-v01, R18-i01, R18-i02 | Notebook SD01, EG01

Satisfying this Requirement requires two models:

1. [Address Model](https://nicholascar.com/anz-nat-addr-model-candidate/model.html) - to ensure an Address object has the properties required
2. QLI Supermodel's [Overarching Model](https://nicholascar.com/qli-supermodel/supermodel.html#_overarching_model) - since the "...entities...maintained elsewhere e.g. locality..." are objects for which there is no Supermodel Component Model but for which the Overarching Model has a generic slot

By exercising a link between the Address Model and the Administrative Areas model, this example also implicitly relies on the Overarching Model of the Supermodel, but this is a background concern and not specifically called out here.

The Address Model contains an `Address` class and a property, `hasAddressComponent`, which links an `Address` to instances of the `AddressComponent` class. `AddressComponent` objects then have `hasComponentType`, `hasValue` and `hasValueText` properties, see the [`AddressComponent` class documentation](https://nicholascar.com/anz-nat-addr-model-candidate/model.html#AddressComponent).

The Overarching Model contains an `AdministrativeArea` class which will be what the Supermodel considers Localities and other such entities to be. Instances of this class will have identifiers that are IRIs - data web addresses.

#### 2.1.2 Demonstration Method

Can data be formulated according to the various models, some of which is valid and some of which is invalid according to Requirement A18? Can such data be tested to prove its validity or invalidity?

The Address Model [has a validator](http://w3id.org/profile/anz-address/validator) which has a validation Shape, that is a graph data test function, that can demonstrate that the model meets this Requirement:

```turtle
<sh-02>
	a sh:NodeShape ;
	sh:targetObjectsOf addr:hasAddressComponent ; # addr:AddressComponent instances
	sh:sparql [
		a sh:SPARQLConstraint ;
		sh:message "The hasValue property of an AddressComponent with addressComponentType isov1:locality must indicate an IRI (of a Concept from a vocabulary), not text" ;
		sh:prefixes addr: ;
		sh:select """
			SELECT $this ?value
			WHERE {
                $this addr:hasComponentType isov1:locality .
                $this addr:hasValue ?value .
                FILTER (!isIRI(?value))
			}
			""" ;
	] ;
.
```

This Shape checks that each instance of `AddressComponent` that is of type `locality` (according to [an extended version of ISO's Address Component Types Vocabulary](https://nicholascar.com/anz-nat-addr-model-candidate/model.html#_address_component_type_vocabulary)) has a value that is an IRI - a web address - and not a numerical or textual value.

Separately from this validator, a mapping between validator Shapes and Requirements is maintained within [the QLI Supermodel's listing of Requirements](https://nicholascar.com/qli-supermodel/requirements.html#_model_mappings).

Canonical Examples of system-independent data that does and does not conform to this Shape are presented within this Scenario Demonstrator:

* A18-01-v.ttl - valid example 01
* A18-02-i.ttl - invalid example 02

The graph validation tool [pySHACL](https://github.com/RDFLib/pySHACL) can be run using the Address Model validator and each example as input, and the results should be valid (01) and invalid (02), as expected.

#### 2.1.3 Demonstration Steps

##### 1. show example data (just for visual review)

```python
from rdflib import Graph
g = Graph().parse("canonical-examples/A18-01-v.ttl")
print(g.serialize())
```

_Note the components "20", "Oxford", "Place" & example LGA `<http://example.com/lga/1234>`_

##### 2. load validator tool, pySHACL

```python
from pyshacl import validate
```

##### 3. validate 01 - expect True

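```python
# validate() returns a tuple: (conforms, results_graph, results_text)
v1 = validate("canonical-examples/A18-01-v.ttl", shacl_graph="validators/addr.shacl.ttl")

# v1[0] is the boolean conformance result
if v1[0]:
    print("A18-01-v is valid, as expected")
else:
    print("ERROR: A18-01-v is unexpectedly invalid")
```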

##### 4. validate 02 - expect False

```python
# validate() returns a tuple: (conforms, results_graph, results_text)
v2 = validate("canonical-examples/A18-02-i.ttl", shacl_graph="validators/addr.shacl.ttl")

if not v2[0]:
    print("A18-02-i is invalid, as expected")
    print("The error message from the validator is:")
    print()
    # v2[2] is the human-readable validation report
    print(v2[2])
else:
    print("ERROR: A18-02-i is unexpectedly valid")
```