# Prerequisites

Since most resources are retrieved from URLs, **this notebook requires an Internet connection** to work.

In [1]:
from collections import deque
import json, sys                           # always handy
from pprint import pprint                  # for pretty printing
from urllib.request import urlopen         # for http requests
import jsonschema                          # for JSON Schema validations
from rdflib import Graph, RDF, Namespace   # for handling RDF data
import jq                                  # for JQ transforms
import pyshacl                             # for SHACL validation and entailment

# Convenience function to retrieve remote JSON data
def fetch_json(url: str):
  print(f"Fetching {url}... ", end='')
  with urlopen(url) as f:
    print('ok')
    return json.load(f)
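
`fetch_json` above lets any network or parsing error propagate. As a minimal sketch (not part of the original notebook), a more defensive variant could add a timeout and return `None` on failure. The name `fetch_json_safe` and the 10-second default are assumptions made for illustration.

```python
import json
from urllib.request import urlopen


def fetch_json_safe(url: str, timeout: float = 10.0):
  """Hypothetical variant of fetch_json that returns None instead of raising."""
  try:
    # `timeout` (in seconds) keeps a stalled connection from blocking forever
    with urlopen(url, timeout=timeout) as f:
      return json.load(f)
  except (OSError, ValueError) as e:
    # OSError covers URLError/HTTPError and timeouts;
    # ValueError covers invalid JSON and unknown URL schemes
    print(f"Could not fetch {url}: {e}")
    return None
```

Callers then need to check for `None` before using the result, which may or may not be preferable to an exception depending on how the notebook is used.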

# Loading a Building Blocks register

To load a Building Blocks register, we start from its `register.json` file, and we load its imports recursively.

In [2]:
my_register_url = 'https://ogcincubator.github.io/bblocks-examples/build/register.json'
loaded_registers = {}
pending_registers = deque((my_register_url,))
while pending_registers:
  register_url = pending_registers.popleft()
  if register_url in loaded_registers:
    # Do not load imports more than once
    continue
  # Fetch register.json
  register = fetch_json(register_url)
  loaded_registers[register_url] = register
  # Add new imports to queue
  pending_registers.extend(register.get('imports', []))

# We will be using this later on
my_register = loaded_registers[my_register_url]
# Index bblocks by itemIdentifier
my_bblocks = {bblock['itemIdentifier']: bblock for bblock in my_register['bblocks']}

print('Loaded registers:\n', '\n '.join(loaded_registers.keys()))
print(f'Building blocks in my register ({my_register_url}):\n',
      '\n '.join(sorted(my_bblocks.keys())))

Fetching https://ogcincubator.github.io/bblocks-examples/build/register.json... ok
Loaded registers:
 https://ogcincubator.github.io/bblocks-examples/build/register.json
Building blocks in my register (https://ogcincubator.github.io/bblocks-examples/build/register.json):
 ogc.bbr.examples.feature.externalSchema
 ogc.bbr.examples.feature.geojsonFeature
 ogc.bbr.examples.feature.geojsonFeatureFGLenient
 ogc.bbr.examples.feature.geosparqlFeature
 ogc.bbr.examples.feature.propertySet
 ogc.bbr.examples.observation.vectorObservation
 ogc.bbr.examples.observation.vectorObservationFeature
 ogc.bbr.examples.semantic-uplift.pre-and-post-uplift
 ogc.bbr.examples.transforms.transforms-example
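
The traversal above can also be tried offline. The following sketch wraps the same queue-based loop in a function and runs it against an in-memory set of registers. The URLs, the `fake_registers` dictionary and the `load_registers` helper are hypothetical; only the `imports` key is taken from the notebook.

```python
from collections import deque

# Hypothetical registers keyed by URL; note the import cycle (c imports a)
fake_registers = {
  'https://example.com/a/register.json': {
    'imports': ['https://example.com/b/register.json',
                'https://example.com/c/register.json'],
  },
  'https://example.com/b/register.json': {
    'imports': ['https://example.com/c/register.json'],
  },
  'https://example.com/c/register.json': {
    'imports': ['https://example.com/a/register.json'],
  },
}


def load_registers(start_url: str, fetch) -> dict:
  """Breadth-first load of a register and its imports, each URL only once."""
  loaded = {}
  pending = deque((start_url,))
  while pending:
    url = pending.popleft()
    if url in loaded:
      # Already loaded: this also protects against import cycles
      continue
    register = fetch(url)
    loaded[url] = register
    # Imports are read from the register that was just fetched
    pending.extend(register.get('imports', []))
  return loaded


loaded = load_registers('https://example.com/a/register.json',
                        fake_registers.__getitem__)
print('\n'.join(loaded))
```

Passing the fetch function as a parameter is a design choice for this sketch: it lets the same loop run against `fetch_json` or against test data.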


# Exploring a Building Block

Let us select a Building Block (`ogc.bbr.examples.semantic-uplift.pre-and-post-uplift`) and explore its metadata and resources:

In [3]:
bblock_id = 'ogc.bbr.examples.semantic-uplift.pre-and-post-uplift'
bblock = my_bblocks[bblock_id]
print(f"# {bblock['itemIdentifier']} - {bblock['name']}")
print(bblock['abstract'], '\n')
print('## Metadata properties')
print('', '\n '.join(bblock.keys()))

# ogc.bbr.examples.semantic-uplift.pre-and-post-uplift - Pre and Post Semantic Uplift example
A sample building block to show how semantic uplift can be customized 

## Metadata properties
 itemIdentifier
 name
 abstract
 status
 dateTimeAddition
 itemClass
 version
 dateOfLastChange
 shaclRules
 ldContext
 schema
 sourceSchema
 sourceLdContext
 sourceFiles
 rdfData
 validationPassed
 testOutputs
 documentation


## Validating data with a JSON schema

The building blocks offer JSON schemas both in YAML and JSON format. We can validate a given JSON document against the building block's schema. Let us use an invalid example (notice how the required property `one` is spelled as `One`).

In [4]:
if not bblock.get('schema'):
  print('This bblock has no schema')
  sample_data = None
else:
  print('JSON Schema formats:', ', '.join(bblock['schema'].keys()))
  bblock_schema = fetch_json(bblock['schema']['application/json'])
  print(json.dumps(bblock_schema, indent=2))
  # Sample data adapted from one of the actual bblock examples.
  sample_data = json.loads(
  """
  {
    "One": 1,
    "two": 2,
    "string": "value"
  }
  """)
  try:
    jsonschema.validate(instance=sample_data, schema=bblock_schema) # An exception is shown
    print('Validation finished without errors')
  except jsonschema.exceptions.ValidationError as e:
    print('ValidationError!', file=sys.stderr)
    print(e, file=sys.stderr)

JSON Schema formats: application/yaml, application/json
Fetching https://ogcincubator.github.io/bblocks-examples/build/annotated/bbr/examples/semantic-uplift/pre-and-post-uplift/schema.json... ok
{
  "type": "object",
  "properties": {
    "one": {
      "type": "number",
      "x-jsonld-id": "http://example.com/hasOne"
    },
    "two": {
      "type": "number",
      "x-jsonld-id": "http://example.com/hasTwo"
    },
    "string": {
      "type": "string",
      "x-jsonld-id": "http://example.com/hasString"
    }
  },
  "required": [
    "one"
  ],
  "x-jsonld-extra-terms": {
    "three": "http://example.com/exHasGenThree",
    "doubleString": "http://example.com/hasGenDoubleString"
  },
  "x-jsonld-prefixes": {
    "ex": "http://example.com/"
  }
}


ValidationError!
'one' is a required property

Failed validating 'required' in schema:
    {'properties': {'one': {'type': 'number',
                            'x-jsonld-id': 'http://example.com/hasOne'},
                    'string': {'type': 'string',
                               'x-jsonld-id': 'http://example.com/hasString'},
                    'two': {'type': 'number',
                            'x-jsonld-id': 'http://example.com/hasTwo'}},
     'required': ['one'],
     'type': 'object',
     'x-jsonld-extra-terms': {'doubleString': 'http://example.com/hasGenDoubleString',
                              'three': 'http://example.com/exHasGenThree'},
     'x-jsonld-prefixes': {'ex': 'http://example.com/'}}

On instance:
    {'One': 1, 'string': 'value', 'two': 2}


Now let us fix the sample data and retry validation:

In [5]:
if sample_data:
  if 'one' not in sample_data: # Just in case this block is run more than once
    sample_data['one'] = sample_data.pop('One')
  try:
    jsonschema.validate(instance=sample_data, schema=bblock_schema) # No exception should be thrown
    print('Validation finished without errors')
  except jsonschema.exceptions.ValidationError as e:
    print('ValidationError!', file=sys.stderr)
    print(e, file=sys.stderr)

Validation finished without errors
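
The schema printed earlier carries `x-jsonld-id` and `x-jsonld-extra-terms` annotations. As a rough sketch, the term-to-IRI pairs they declare can be collected with plain dictionary operations. The `terms_from_schema` helper is hypothetical, and it assumes a flat schema like this one. The context document published by the building block remains the authoritative source.

```python
from pprint import pprint

# Relevant parts of the schema shown in the output above
schema = {
  "type": "object",
  "properties": {
    "one": {"type": "number", "x-jsonld-id": "http://example.com/hasOne"},
    "two": {"type": "number", "x-jsonld-id": "http://example.com/hasTwo"},
    "string": {"type": "string", "x-jsonld-id": "http://example.com/hasString"},
  },
  "required": ["one"],
  "x-jsonld-extra-terms": {
    "three": "http://example.com/exHasGenThree",
    "doubleString": "http://example.com/hasGenDoubleString",
  },
}


def terms_from_schema(schema: dict) -> dict:
  """Collect property name -> IRI pairs from the schema annotations."""
  terms = {
    name: prop['x-jsonld-id']
    for name, prop in schema.get('properties', {}).items()
    if 'x-jsonld-id' in prop
  }
  # Extra terms are not schema properties but still have an IRI assigned
  terms.update(schema.get('x-jsonld-extra-terms', {}))
  return terms


terms = terms_from_schema(schema)
pprint(terms)
```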


## Getting extended metadata

All building blocks contain additional metadata that is stored in JSON format. The URL for this metadata resides inside the `documentation` property; for the JSON description, we need the `json-full` entry.

For instance, semantic uplift steps and full examples are embedded into this new metadata.

In [6]:
bblock_full = fetch_json(bblock['documentation']['json-full']['url'])
print('## New metadata keys')
print('', '\n '.join(k for k in bblock_full.keys() if k not in bblock), '\n')
print('### Examples')
for example in bblock_full['examples']:
  print(f" - {example['title']}; Snippet languages:", ', '.join(s['language'] for s in example['snippets']))

Fetching https://ogcincubator.github.io/bblocks-examples/build/generateddocs/json-full/bbr/examples/semantic-uplift/pre-and-post-uplift/index.json... ok
## New metadata keys
 description
 examples
 annotatedSchema
 semanticUplift
 gitRepository
 gitPath 

### Examples
 - Example for uplift; Snippet languages: json, jsonld, ttl
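
The `examples` structure used above (a list of examples, each with a `title` and a list of `snippets` that declare a `language`) is easy to filter. Below is a small sketch run on made-up data. The `examples_with_language` helper and the second sample entry are hypothetical.

```python
# Sample data shaped like bblock_full['examples']; the second entry is made up
sample_examples = [
  {'title': 'Example for uplift',
   'snippets': [{'language': 'json'}, {'language': 'jsonld'}, {'language': 'ttl'}]},
  {'title': 'Hypothetical JSON-only example',
   'snippets': [{'language': 'json'}]},
]


def examples_with_language(examples: list, language: str) -> list:
  """Return the titles of the examples that offer a snippet in `language`."""
  return [
    example['title']
    for example in examples
    if any(s['language'] == language for s in example.get('snippets', []))
  ]


print(examples_with_language(sample_examples, 'ttl'))
print(examples_with_language(sample_examples, 'json'))
```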


## Semantic uplift

If our building block provides a JSON-LD context (and, possibly, semantic uplift steps), we can add it to our object and see the resulting RDF.

In [7]:
if bblock.get('ldContext'):
  def apply_uplift_steps(bblock: dict, stage: str, data):
    for step in bblock.get('semanticUplift', {}).get('additionalSteps', []):
      if step['stage'] == stage:
        step_type = step['type']
        step_code = step.get('code')
        if not step_code and step.get('ref'):
          # Code is not inlined, but stored remotely (URL in `ref`)
          with urlopen(step['ref']) as f:
            step_code = f.read().decode('utf-8')
        if step_type == 'shacl':
          print(f"  -> Applying {stage} {step_type} step")
          shacl_graph = Graph().parse(data=step_code, format='ttl')
          pyshacl.validate(data, shacl_graph, in_place=True, advanced=True)
        elif step_type == 'sparql-update':
          print(f"  -> Applying {stage} {step_type} step")
          data.update(step_code)
        elif step_type == 'sparql-construct':
          print(f"  -> Applying {stage} {step_type} step")
          data = data.query(step_code).graph
        elif step_type == 'jq':
          print(f"  -> Applying {stage} {step_type} step")
          data = jq.compile(step_code).input_value(data).first()
          
    return data

  print('# Input data')
  print(json.dumps(sample_data, indent=2))

  # Apply pre-JSON-LD steps, if any
  uplift_data = apply_uplift_steps(bblock_full, 'pre', sample_data)

  print('\n# After pre steps')
  print(json.dumps(uplift_data, indent=2))
  
  # Fetch JSON-LD context
  jsonld_context = fetch_json(bblock['ldContext'])
  # Add to JSON data - merge both objects (jsonld_context only has `@context` key)
  jsonld_data = json.dumps({
      **jsonld_context,
      **uplift_data,
    }, indent=2)
  # base URL can be changed or omitted altogether, but it does look better when used
  rdf_graph = Graph().parse(data=jsonld_data, format='json-ld', base='http://example.com/bblocks/')

  print('\n# After JSON-LD context')
  print(rdf_graph.serialize())

  # Apply post-JSON-LD steps, if any
  rdf_graph = apply_uplift_steps(bblock_full, 'post', rdf_graph)

  # Output data in RDF Turtle format
  print('\n# After post steps context')
  print(rdf_graph.serialize())
else:
  print('No JSON-LD context found in bblock')

# Input data
{
  "two": 2,
  "string": "value",
  "one": 1
}
  -> Applying pre jq step

# After pre steps
{
  "two": 2,
  "string": "value",
  "one": 1,
  "three": 3,
  "doubleString": "valuevalue"
}
Fetching https://ogcincubator.github.io/bblocks-examples/build/annotated/bbr/examples/semantic-uplift/pre-and-post-uplift/context.jsonld... ok

# After JSON-LD context
@prefix ex: <http://example.com/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] ex:exHasGenThree 3 ;
    ex:hasGenDoubleString "valuevalue" ;
    ex:hasOne 1 ;
    ex:hasString "value" ;
    ex:hasTwo 2 .


  -> Applying post sparql-update step

# After post steps context
@prefix ex: <http://example.com/> .
@prefix ns1: <https://example.net/2/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] ex:exHasGenThree 3 ;
    ex:hasGenDoubleString "valuevalue" ;
    ex:hasOne 1 ;
    ex:hasString "value" ;
    ex:hasTwo 2 ;
    ns1:hasFour 4 ;
    ns1:nineBound false .
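
One detail of the merge `{**jsonld_context, **uplift_data}` used above: if the data already had its own `@context` key, it would silently replace the one from the building block, because later keys win in a dictionary merge. The sketch below shows one possible way to keep both, relying on the fact that JSON-LD accepts an array of contexts. The `attach_context` helper and the trimmed-down context are hypothetical.

```python
# Hypothetical, trimmed-down context document (the real one comes from `ldContext`)
jsonld_context = {'@context': {'one': 'http://example.com/hasOne'}}


def attach_context(context_doc: dict, data: dict) -> dict:
  """Merge a context document into data, keeping any `@context` the data has."""
  merged = {**context_doc, **data}
  if '@context' in data:
    # In a JSON-LD context array, later entries take precedence over earlier ones
    merged['@context'] = [context_doc['@context'], data['@context']]
  return merged


data_with_context = {'@context': {'two': 'http://example.com/hasTwo'}, 'one': 1}

# Naive merge: the context from the building block is lost
naive = {**jsonld_context, **data_with_context}
print(naive['@context'])

# Sketch: both contexts are kept
combined = attach_context(jsonld_context, data_with_context)
print(combined['@context'])
```

Whether the data's own context should take precedence over the building block's is a judgment call; swapping the two array entries reverses the priority.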




## Validating RDF

If the building block contains SHACL validation shapes (either direct or inherited), we can apply those to our RDF data.

SHACL shapes are stored as an object whose keys are the identifiers of the building blocks that declared them (so that inherited shapes can be detected and resolved, if necessary), and whose values are arrays with the URLs of the SHACL documents.
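
To illustrate that structure, here is a sketch that separates the shapes declared by a building block itself from those it inherits. The identifiers, URLs and the `split_shacl_rules` helper are hypothetical; only the shape of the object (identifier mapped to a list of URLs) follows the description above.

```python
# Hypothetical `shaclRules` object: declaring bblock id -> list of document URLs
shacl_rules = {
  'ogc.example.parent': ['https://example.com/parent/rules.shacl'],
  'ogc.example.child': ['https://example.com/child/rules.shacl',
                        'https://example.com/child/more-rules.shacl'],
}


def split_shacl_rules(bblock_id: str, shacl_rules: dict):
  """Return (own, inherited) lists of SHACL document URLs, without duplicates."""
  own = list(dict.fromkeys(shacl_rules.get(bblock_id, [])))
  inherited = []
  for declaring_id, urls in shacl_rules.items():
    if declaring_id == bblock_id:
      continue
    for url in urls:
      if url not in own and url not in inherited:
        inherited.append(url)
  return own, inherited


own, inherited = split_shacl_rules('ogc.example.child', shacl_rules)
print('Own:', own)
print('Inherited:', inherited)
```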

In [8]:
SHACL = Namespace('http://www.w3.org/ns/shacl#')

def shacl_validate(bblock: dict, graph: Graph):
  if bblock.get('shaclRules'):
    shacl_graph = Graph()
    for shacl_bblock_id, shacl_urls in bblock['shaclRules'].items():
      for shacl_url in shacl_urls:
        shacl_graph.parse(shacl_url, format='ttl')
    print('# SHACL shapes')
    print(' -', '\n - '.join(s for s in shacl_graph.subjects(RDF.type, SHACL.NodeShape)))
    conforms, results_graph, results_text = pyshacl.validate(graph, shacl_graph=shacl_graph)
    f = sys.stdout if conforms else sys.stderr
    print('# SHACL result:', conforms, file=f)
    print(results_text, file=f)
  else:
    print('This building block has no SHACL resources')

shacl_validate(bblock_full, rdf_graph)

# SHACL shapes
 - http://example.com/rules#testHasOneIs1
# SHACL result: True
Validation Report
Conforms: True



If we change the value of the `ex:hasOne` predicate to something other than `1`, we can see how validation fails this time:

In [9]:
rdf_graph.update('''
  PREFIX ex: <http://example.com/>
  DELETE { ?s ex:hasOne 1 }
  INSERT { ?s ex:hasOne 2 }
  WHERE { ?s ex:hasOne 1 }
''')
shacl_validate(bblock_full, rdf_graph)

# SHACL shapes
 - http://example.com/rules#testHasOneIs1


# SHACL result: False
Validation Report
Conforms: False
Results (1):
Constraint Violation in HasValueConstraintComponent (http://www.w3.org/ns/shacl#HasValueConstraintComponent):
	Severity: sh:Violation
	Source Shape: [ sh:hasValue Literal("1", datatype=xsd:integer) ; sh:name Literal("hasOne is 1") ; sh:path ex:hasOne ]
	Focus Node: [ ex:exHasGenThree Literal("3", datatype=xsd:integer) ; ex:hasGenDoubleString Literal("valuevalue") ; ex:hasOne Literal("2", datatype=xsd:integer) ; ex:hasString Literal("value") ; ex:hasTwo Literal("2", datatype=xsd:integer) ; ns1:hasFour Literal("4", datatype=xsd:integer) ; ns1:nineBound Literal("false" = False, datatype=xsd:boolean) ]
	Result Path: ex:hasOne
	Message: Node [ ex:exHasGenThree Literal("3", datatype=xsd:integer) ; ex:hasGenDoubleString Literal("valuevalue") ; ex:hasOne Literal("2", datatype=xsd:integer) ; ex:hasString Literal("value") ; ex:hasTwo Literal("2", datatype=xsd:integer) ; ns1:hasFour Literal("4", datatype=xsd:integer) ; ns1:nineB