## Provenance documentation for regional wind hazard creation

This notebook generates a provenance document describing the creation of regional wind hazard layers for use in the National Wind Risk Assessment (2023/24).

Dates, paths and filenames are set manually in the relevant cells. 
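
Where a source is a single file on disk, the date could be read from the file system rather than typed by hand. The following is a minimal sketch, not part of the original workflow: `file_modified_time` is a hypothetical helper, and it returns the last-modified time, which is not necessarily the creation time. For a file geodatabase, which is a directory, the value may not be meaningful.

```python
from datetime import datetime
from pathlib import Path


def file_modified_time(path):
    """Return the last-modified time of `path` as a naive local datetime.

    Hypothetical helper: modification time is only a proxy for the
    creation date recorded in "dcterms:created".
    """
    return datetime.fromtimestamp(Path(path).stat().st_mtime)
```

The result could be passed as a `"dcterms:created"` value in place of a hand-entered `datetime(...)`, after checking that it matches the date the dataset was actually produced.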

In [None]:
from prov.model import ProvDocument
from prov.dot import prov_to_dot
from datetime import datetime

In [None]:
provdoc = ProvDocument()
provdoc.set_default_namespace("")
provdoc.add_namespace("nwra", "http://www.ga.gov.au/hazards")
provdoc.add_namespace("prov", "http://www.w3.org/ns/prov#")
provdoc.add_namespace("xsd", "http://www.w3.org/2001/XMLSchema#")
provdoc.add_namespace("foaf", "http://xmlns.com/foaf/0.1/")
provdoc.add_namespace("void", "http://vocab.deri.ie/void#")
provdoc.add_namespace("dcterms", "http://purl.org/dc/terms/")


### Agents

1. Geoscience Australia
2. Geoscience Australia staff
3. ArcGIS Pro Geoprocessing Model (a "Software Agent")

In [None]:

provdoc.agent(":GeoscienceAustralia",
              {
                  "foaf:name": "Geoscience Australia",
                  "dcterms:type": "foaf:Organization",
                  "foaf:mbox": "hazards@ga.gov.au"
              }
             )

provdoc.agent("nwra:ConvertWindZones",
              {
                  "dcterms:type": "prov:SoftwareAgent",
                  "dcterms:title": "Convert wind zones to raster",
                  "dcterms:description": "Geoprocessing model to convert wind zones to raster layers and apply focal statistics to smooth data",
                  "dcterms:created": datetime(2023, 10, 13)
              })

provdoc.agent(":NHIStaff",
              {
                  "foaf:name": "Craig Arthur",
                  "dcterms:type": "prov:Person",
                  "foaf:mbox": "craig.arthur@ga.gov.au",
              })

### Source datasets

1. AS/NZS 1170.2 (2021) definition
2. Wind loading regions feature class
3. Australian coastline data

In [None]:
e1 = provdoc.entity(":ASNZS1170.2",
                    {"dcterms:title": "AS/NZS 1170.2:2021 Structural design actions - Wind actions",
                     "dcterms:type": "foaf:Document",
                     "prov:location": "https://www.saiglobal.com/online/Product/Index/EPCO2802029297",
                     "dcterms:created": datetime(2021, 7, 30),
                     "dcterms:creator": "Standards Australia/Standards New Zealand"
                     })


e2 = provdoc.entity("nwra:windloadingregions",
                    {
                        "dcterms:title": "as1170windzones",
                        "dcterms:description": "Wind loading regions feature class",
                        "dcterms:type": "Esri file geodatabase feature class",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/data/derived/boundaries/as1170.2/windzones/windzones.gdb",
                        "dcterms:created": datetime(2023, 9, 18),
                        "dcterms:creator": ":GeoscienceAustralia"
                     })

e3 = provdoc.entity(":AustralianCoastlines",
                    {
                        "dcterms:title": "Australian Statistical Geography Standard 2021 - Australian Coastline",
                        "dcterms:type": "Esri file geodatabase feature class",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/data/derived/boundaries/as1170.2/windzones/windzones.gdb",
                        "dcterms:created": datetime(2022, 3, 2),
                        "dcterms:creator": "Australian Bureau of Statistics"
                    })

e4 = provdoc.entity("nwra:windloadingregionswithwindspeed",
                    {
                        "dcterms:title": "as1170windzones",
                        "dcterms:description": "Wind loading regions feature class with return period wind speeds",
                        "dcterms:type": "Esri file geodatabase feature class",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/data/derived/boundaries/as1170.2/windzones/windzones.gdb",
                        "dcterms:created": datetime(2023, 9, 18),
                        "dcterms:creator": ":GeoscienceAustralia"
                    })

e5 = provdoc.entity("nwra:extendedwindloadingregionswithwindspeed",
                    {
                        "dcterms:title": "extended_wind_regions",
                        "dcterms:description": "Wind loading regions feature class extended offshore with return period wind speeds",
                        "dcterms:type": "Esri file geodatabase feature class",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/data/derived/boundaries/as1170.2/windzones/windzones.gdb",
                        "dcterms:created": datetime(2023, 9, 19),
                        "dcterms:creator": ":GeoscienceAustralia"
                    })

e6 = provdoc.entity("nwra:regionalwindrasters",
                    {
                        "dcterms:title": "Regional wind rasters",
                        "dcterms:description": "Raster datasets of regional wind speed for each defined return period",
                        "dcterms:type": "Esri raster dataset",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/projects/acs/2. DATA/1. Work Unit Assessment/NWRA/NWRA.gdb/RP*_region",
                        "dcterms:created": datetime(2023, 10, 13, 9, 28),
                        "dcterms:creator": ":GeoscienceAustralia"
                     })

e7 = provdoc.entity("nwra:smoothedregionalwindrasters",
                    {
                        "dcterms:title": "Smoothed regional wind rasters",
                        "dcterms:description": "Raster datasets of regional wind speed for each defined return period, with smoothing",
                        "dcterms:type": "GeoTIFF dataset",
                        "prov:location": "X:/georisk/HaRIA_B_Wind/projects/acs/2. DATA/1. Work Unit Assessment/NWRA/hazard/regional/RP*_smooth.tif",
                        "dcterms:created": datetime(2023, 10, 13, 9, 28),
                        "dcterms:creator": ":GeoscienceAustralia"
                     })
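
Because the attribute keys are typed by hand, a misspelled key is recorded without complaint. A small check such as the one below can catch this before the entity is created. This is a sketch only: `KNOWN_KEYS` and `unknown_keys` are hypothetical names, and the key list is assumed from the entity attributes used in this notebook.

```python
from datetime import datetime

# Attribute keys used for entities in this notebook (assumed list; extend as needed).
KNOWN_KEYS = {
    "dcterms:title",
    "dcterms:description",
    "dcterms:type",
    "dcterms:created",
    "dcterms:creator",
    "prov:location",
}


def unknown_keys(attributes):
    """Return the attribute keys that are not in KNOWN_KEYS, sorted."""
    return sorted(set(attributes) - KNOWN_KEYS)


# Hypothetical attribute dictionary with a deliberately misspelled key.
example = {
    "dcterms:title": "Example layer",
    "dcterms:craeted": datetime(2023, 9, 18),
}
print(unknown_keys(example))  # prints ['dcterms:craeted']
```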


### Activities

This sets out all the activities performed in the process. 

In [None]:
a1 = provdoc.activity("nwra:WindRegionFeatureClassCreation",
                      endTime=datetime(2022, 3, 2, 21, 4), 
                      other_attributes={
                          "dcterms:title": "AS/NZS 1170.2 wind loading regions feature class creation"
                      })
a2 = provdoc.activity("nwra:WindSpeedAssignment",
                      startTime=datetime(2023, 9, 18, 9, 15),
                      endTime=datetime(2023, 9, 18, 9, 23),
                      other_attributes={
                          "dcterms:title": "Assign design wind speeds to wind regions",
                          "dcterms:description": "Add attributes to the feature class for each defined return period and set the value to the corresponding value from AS/NZS 1170.2: 2021"
                          })
a3 = provdoc.activity("nwra:extendWindZones",
                      startTime=datetime(2023, 9, 18, 9, 33),
                      endTime=datetime(2023, 9, 19, 12, 6),
                      other_attributes={
                          "dcterms:title": "Extend wind loading regions to offshore areas",
                          "dcterms:description": "Create an offshore zone adjacent to onshore wind zones that have the same attributes, to extend the wind zones 50 km offshore"
                              })
a4 = provdoc.activity("nwra:ConvertWindZonesToRaster",
                      startTime=datetime(2023, 10, 13, 9, 28),
                      endTime=datetime(2023, 10, 13, 10, 23),
                      other_attributes={
                          "dcterms:title": "Convert wind zones to raster datasets geoprocessing model",
                          "dcterms:description": "Convert the wind zones to a set of temporary raster datasets that have the value of the return period wind speed"
                          })
a5 = provdoc.activity("nwra:ApplySmoothing",
                      startTime=datetime(2023, 10, 13, 9, 28),
                      endTime=datetime(2023, 10, 13, 10, 23),
                      other_attributes={
                          "dcterms:title": "Apply focal statistics",
                          "dcterms:description": "Apply a smoothing filter with 50 km length scale to regional wind hazard data",
                      })
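
Since the start and end times are entered manually, it may be worth confirming that no activity ends before it starts. The sketch below works on plain `datetime` pairs copied from the cell above rather than on the `prov` objects themselves; `check_activity_times` is a hypothetical helper.

```python
from datetime import datetime


def check_activity_times(activities):
    """Return the names of activities whose start time is after their end time."""
    bad = []
    for name, (start, end) in activities.items():
        # Skip activities where either time was not recorded.
        if start is not None and end is not None and start > end:
            bad.append(name)
    return bad


# (start, end) pairs copied from the activity definitions above.
activities = {
    "nwra:WindSpeedAssignment": (datetime(2023, 9, 18, 9, 15), datetime(2023, 9, 18, 9, 23)),
    "nwra:extendWindZones": (datetime(2023, 9, 18, 9, 33), datetime(2023, 9, 19, 12, 6)),
    "nwra:ConvertWindZonesToRaster": (datetime(2023, 10, 13, 9, 28), datetime(2023, 10, 13, 10, 23)),
    "nwra:ApplySmoothing": (datetime(2023, 10, 13, 9, 28), datetime(2023, 10, 13, 10, 23)),
}
print(check_activity_times(activities))  # prints []
```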


### Associations

Specify the associations between entities, activities and agents.

In [None]:
provdoc.wasDerivedFrom(e2, e1, activity=a1)
provdoc.wasDerivedFrom(e2, e3, activity=a1)
provdoc.wasDerivedFrom(e4, e2, activity=a2)
provdoc.wasDerivedFrom(e5, e4, activity=a3)
provdoc.wasDerivedFrom(e5, e3, activity=a3)
provdoc.wasDerivedFrom(e6, e5, activity=a4)
provdoc.wasDerivedFrom(e7, e6, activity=a5)

provdoc.wasGeneratedBy(e2, a1)
provdoc.wasGeneratedBy(e4, a2)
provdoc.wasGeneratedBy(e5, a3)
provdoc.wasGeneratedBy(e6, a4)
provdoc.wasGeneratedBy(e7, a5)

provdoc.wasAssociatedWith(a1, ":NHIStaff")
provdoc.wasAssociatedWith(a2, ":NHIStaff")
provdoc.wasAssociatedWith(a3, ":NHIStaff")
provdoc.wasAssociatedWith(a4, "nwra:ConvertWindZones")
provdoc.wasAssociatedWith(a5, "nwra:ConvertWindZones")
provdoc.wasAttributedTo(e2, ":NHIStaff")

provdoc.actedOnBehalfOf("nwra:ConvertWindZones", ":NHIStaff", a5)
provdoc.actedOnBehalfOf(":NHIStaff", ":GeoscienceAustralia")


### Print the provenance

Print the provenance information in PROV-N notation for inspection.

In [None]:
print(provdoc.get_provn())

### Write the provenance

Write the provenance information to a PROV-XML file and render a directed graph of the relationships as a PNG image.

In [None]:
dot = prov_to_dot(provdoc, direction="TB", use_labels=True)
dot.write_png('regionalwindhazardprovenance.png')
provdoc.serialize('regionalwindhazardprovenance.xml', format='xml')

### Validate the PROV-XML

Validate the resulting XML file against the PROV-XML schema. If an exception is raised at this point, there is an error in one of the element definitions above. Check the line of the XML file indicated in the exception message.

If no errors are reported, then the XML file is a valid PROV-XML document.

In [None]:
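from lxml import etree

# Local copy of the PROV-XML schema; set this path manually.
PROVXML_SCHEMA = "C:/WorkSpace/prov/prov.xsd"
schema = etree.XMLSchema(etree.parse(PROVXML_SCHEMA))
# Raises AssertionError if the document does not conform to the schema.
schema.assert_(etree.parse('regionalwindhazardprovenance.xml'))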
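
Schema validation confirms the structure of the file but not that every element was written. As a rough complement, the top-level PROV elements can be counted with the standard library. This is a sketch under stated assumptions: `count_prov_elements` is a hypothetical helper, and the sample string is illustrative, not output from this notebook.

```python
import xml.etree.ElementTree as ET
from collections import Counter

PROV_NS = "http://www.w3.org/ns/prov#"


def count_prov_elements(xml_text):
    """Count the direct children of the root element that are in the PROV namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + PROV_NS + "}"
    counts = Counter()
    for child in root:
        if child.tag.startswith(prefix):
            counts[child.tag[len(prefix):]] += 1
    return dict(counts)


# Illustrative PROV-XML fragment with made-up identifiers.
sample = """<prov:document xmlns:prov="http://www.w3.org/ns/prov#"
                           xmlns:ex="http://example.org/">
  <prov:entity prov:id="ex:a"/>
  <prov:entity prov:id="ex:b"/>
  <prov:activity prov:id="ex:c"/>
</prov:document>"""
print(count_prov_elements(sample))  # prints {'entity': 2, 'activity': 1}
```

Applied to the text of `regionalwindhazardprovenance.xml`, the counts would be expected to include seven entities, five activities and three agents, matching the definitions above.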