From 00f6a177c3a506f64f3e4155287f9f8505af4041 Mon Sep 17 00:00:00 2001
From: Dmitry Brizhinev
The methods discussed in this note relate to other deliverables from the SDWWG, notably the Use Cases and Requirements, the Best Practices, the Semantic Sensor Network Ontology, QB4ST and OWL-Time. Over the coming 3-4 months we expect to resolve any differences and update the note accordingly. - + All additional work planned, including that above, is marked as an issue in this Note.
Publishing data on the Web using Linked Data technologies makes it more accessible, easier to discover, and machine-readable. In the context of the rapidly growing availability and importance of earth observation data, this work aims to leverage the Linked Data approach to data publishing to make such data both much more easily usable by non-specialists and much more easily integrated with other Web data in applications. - Linked Data has worked well for multi-dimensional statistical data using the RDF Data Cube [[vocab-data-cube]]. Following this success, Earth Observation imagery can be readily modelled as a Data Cube with the three dimensions of latitude, longitude, and time. This simple conceptualisation and its encoding as Linked Data may be convenient for scientists and consumer app developers everywhere, and especially to statisticians such as those in National Statistics Organisations. + Linked Data has worked well for multi-dimensional statistical data using the RDF Data Cube [[vocab-data-cube]]. Following this success, Earth Observation imagery can be readily modelled as a Data Cube with the three dimensions of latitude, longitude, and time. This simple conceptualisation and its encoding as Linked Data may be convenient for scientists and consumer app developers everywhere, and especially to statisticians such as those in National Statistics Organisations.
- +Satellite imagery is commonly modelled as a multidimensional grid coverage, as discussed in [[sdw-bp]]. The large number of data points that is typical of coverage data such as Landsat imagery means that publishers may be justifiably reluctant to address the size explosion that accompanies converting data to RDF. While such a conversion provides maximum machine-readability, many benefits of Linked Data can be realized with a compromise approach where only the metadata is directly expressed in RDF. Further benefits can be realized by storing voluminous gridded coverage data in more efficient storage representations and using specialised middleware to generate an RDF representation on-the-fly to respond to service requests.
-This document illustrates that approach showing how Earth Observation imagery can be published as Linked Data using the RDF Data Cube vocabulary [[vocab-data-cube]] in concert with other relevant ontologies including the W3C/OGC Semantic Sensor Network ontology (SSN) [[vocab-ssn]], the W3C/OGC Time ontology (Time) [[owl-time]], the W3C Simple Knowledge Organisation System (SKOS) [[skos-reference]], W3C PROV-O [prov-o] and the W3C/OGC QB4ST [[qb4st]]. We show how SPARQL queries can be served through a scalable OGC Discrete Global Grid System for observation data, coupled with a triple store for observational metadata. +
This document illustrates that approach, showing how Earth Observation imagery can be published as Linked Data using the RDF Data Cube vocabulary [[vocab-data-cube]] in concert with other relevant ontologies including the W3C/OGC Semantic Sensor Network ontology (SSN) [[vocab-ssn]], the W3C/OGC Time ontology (Time) [[owl-time]], the W3C Simple Knowledge Organisation System (SKOS) [[skos-reference]], W3C PROV-O [[prov-o]] and the W3C/OGC QB4ST [[qb4st]]. We show how SPARQL queries can be served through a scalable OGC Discrete Global Grid System for observation data, coupled with a triple store for observational metadata.
- +Throughout the document we refer to relevant Use Cases and Requirements of the Spatial Data on the Web Working Group (UCR) [[sdw-ucr]] and Best Practices of the Spatial Data on the Web Working Group (BP) [[sdw-bp]]. Those references may be helpful to provide real-world applications and further rationale for the approach described here. We refer to extracts from a small example for illustration. The complete source file for the example is available as the ANU-LED example.
- - + +The RDF Data Cube [[vocab-data-cube]] is a standard for representing multidimensional data as RDF. +
The RDF Data Cube [[vocab-data-cube]] is a standard for representing multidimensional data as RDF. It is typically used for numerical data that is associated with geographic regions (e.g. suburbs) and classifications (e.g. age, industry, or time periods). Common practice includes using the SKOS vocabulary to define the concepts being reported [Observed property in coverage]. The RDF Data Cube vocabulary allows the publisher to define all the relevant components of their data and the concepts they quantify, - including: + including:
- + :lat a qb:DimensionProperty ; rdfs:subPropertyOf geo:lat . @@ -125,29 +125,29 @@The RDF Data Cube
rdfs:range xsd:dateTime ; qb:concept sdmx-concept:timePeriod . -:dataValue a qb:MeasureProperty ; +:dataPixelValue a qb:MeasureProperty ; rdfs:range xsd:integer ; qb:concept :reflectance ; qb:concept sdmx-concept:obsValue . - + +# in pixels per degree :resolution a qb:AttributeProperty ; - rdfs:range :pixelsPerDegree . - -:pixelsPerDegree a rdfs:Datatype ; - owl:equivalentClass xsd:double . - + rdfs:range xsd:double . +
The QB4ST ontology [[qb4st]] extends the Data Cube for extra power and consistency when describing spatio-temporal aspects of data [Georeferenced spatial data]. Any number of such dimensions can be defined, allowing for 1D, 2D, 3D or 4D coverages [Support for 3D, Time series, 4D model of space-time].
- +:lat a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:lat ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . :long a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:long ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . :time a qb:DimensionProperty, qb4st:TemporalProperty ; rdfs:range xsd:dateTime ; @@ -158,15 +158,15 @@Metadata and data
Traditionally, there is a distinction between data, that is, the observations proper (such as Landsat pixels), and metadata, which adds context to the observations (such as resolution). In Linked Data modelling, this distinction is not strict. However, it is possible to separate the two in a typical Data Cube.
- +The value of an RDF Data Cube component can be attached to each individual observation or to the dataset as a whole. Dataset-wide metadata can therefore be distinguished from the rest of the dataset, because it is attached to the
qb:DataSet
object. This makes it easy to fetch the metadata alone with a simple SPARQL query. This dataset-wide description alone is already a useful (and web-of-data friendly) approach to publishing spatial data - [Spatial metadata]. + [Spatial metadata]. Here we demonstrate BP Describe the positional accuracy of spatial data, BP Include spatial metadata in dataset metadata, and BP Provide geometries on the Web in a usable way. Further, BP Use globally unique persistent HTTP URIs for spatial things is applied at the level of image pixels. We can also see an example of using the PROV ontology [[prov-o]] for earth observation imagery provenance. Alternatively, a lineage ontology that extends the PROV ontology to reflect the lineage and lineage-extended components of ISO 19115 metadata is available.
- +:exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; :instrument :OLI ; @@ -174,20 +174,19 @@- +Metadata and data
:band "4" ; :coverageSpatialDomain "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral ; :coverageTemporalDomain :timeDomain ; - prov:wasGeneratedBy :ANU-led-resampling ; - prov:wasDerivedFrom :AGDC . - + prov:wasGeneratedBy :ANU-led-resampling . + :p1 a :Pixel ; qb:dataSet :exampleDataset ; :lat "90.5556"; :long "41.2444"; :time "2001-10-26T21:32:52"^^xsd:dateTime ; - :dataValue "15"^^xsd:integer ; - :resolution "2.7"^^:pixelsPerDegree ; + :dataPixelValue "15"^^xsd:integer ; + :resolution "2.7"^^xsd:double ; :dggsCell "R00004" ; :bounds "POLYGON((90.37 41.45, 90.74 41.45, 90.74 41.04, 90.37 41.04, 90.37 41.45))"^^ogc:wktLiteral ; prov:wasDerivedFrom :example-tile .The RDF Data Cube also enables much more detailed metadata, like separate provenance for each observation. While it is not practical @@ -203,25 +202,25 @@
Metadata and data
level of image tiles. -:dataValue a qb:MeasureProperty ; - rdfs:range [owl:unionOf(xsd:anyURI xsd:integer)] ; +:dataImageValue a qb:MeasureProperty ; + rdfs:range xsd:anyURI ; qb:concept :reflectance ; qb:concept sdmx-concept:obsValue . - -:s1 a :GridSquare ; + +:R000 a :GridSquare ; qb:dataSet :exampleDataset ; :lat "91.6667"; :long "40.0270"; :time "2001-10-26T21:32:52"^^xsd:dateTime ; - :dataValue <http://www.example.org/led-example-image-R000> ; - :resolution "0.9"^^:pixelsPerDegree ; + :dataImageValue <http://www.example.org/led-example-image-R000> ; + :resolution "0.9"^^xsd:double ; :dggsCell "R000" ; :dggsLevelSquare "3" ; :dggsLevelPixel "4" ; :bounds "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral ; prov:wasDerivedFrom :example-tile .
It is also helpful if the user can easily identify the domain of a coverage, that is, the spatial and temporal area where measurements are made [Spatial metadata]. QB4ST [[qb4st]] does not currently have a term for that, but it might in the future.
- +:exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; :instrument :OLI ; @@ -317,10 +316,9 @@Describing a coverage
:band "4" ; :coverageSpatialDomain "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral ; :coverageTemporalDomain :timeDomain ; - prov:wasGeneratedBy :ANU-led-resampling ; - prov:wasDerivedFrom :AGDC . - - + prov:wasGeneratedBy :ANU-led-resampling . + + :exampleStructure a qb4st:SpatioTemporalDSD ; qb:component :spatialDomainComponent , @@ -331,7 +329,8 @@Describing a coverage
:satelliteComponent , :instrumentComponent , :bandComponent , - :dataComponent , + :dataImageComponent , + :dataPixelComponent , :dggsCellComponent , :dggsLevelSquareComponent , :dggsLevelPixelComponent , @@ -340,7 +339,7 @@Describing a coverage
:spatialDomainComponent a qb4st:SpatialComponentSpecification ; qb:attribute :coverageSpatialDomain . - + :temporalDomainComponent a qb4st:TemporalComponentSpecification ; qb:attribute :coverageTemporalDomain . @@ -362,8 +361,11 @@Describing a coverage
:bandComponent a qb:ComponentSpecification ; qb:attribute :band . -:dataComponent a qb:ComponentSpecification ; - qb:measure :dataValue . +:dataImageComponent a qb:ComponentSpecification ; + qb:measure :dataImageValue . + +:dataPixelComponent a qb:ComponentSpecification ; + qb:measure :dataPixelValue . :dggsCellComponent a qb4st:SpatialComponentSpecification ; qb:dimension :dggsCell . @@ -379,23 +381,25 @@Describing a coverage
:boundsComponent a qb4st:SpatialComponentSpecification ; qb:attribute :bounds . - - - + + + :coverageSpatialDomain a qb:AttributeProperty, qb4st:SpatialProperty ; rdfs:subPropertyOf :bounds . - + :coverageTemporalDomain a qb:AttributeProperty, qb4st:TemporalProperty ; rdfs:range time:DateTimeInterval ; qb:concept sdmx-concept:timePeriod . :lat a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:lat ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . :long a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:long ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . :time a qb:DimensionProperty, qb4st:TemporalProperty ; rdfs:range xsd:dateTime ; @@ -412,13 +416,21 @@Describing a coverage
:band a qb:AttributeProperty ; rdfs:range xsd:integer . -:dataValue a qb:MeasureProperty ; - rdfs:range [owl:unionOf(xsd:anyURI xsd:integer)] ; +:dataImageValue a qb:MeasureProperty ; + rdfs:range xsd:anyURI ; qb:concept :reflectance ; qb:concept sdmx-concept:obsValue . + +:dataPixelValue a qb:MeasureProperty ; + rdfs:range xsd:integer ; + qb:concept :reflectance ; + qb:concept sdmx-concept:obsValue . + +:rHEALPix a qb4st:CRS . :dggsCell a qb4st:SpatialDimension ; - qb4st:crs "rHEALPix WGS84 Ellipsoid" ; + qb4st:crs :rHEALPix ; + qb4st:crslabel "rHEALPix WGS84 Ellipsoid" ; rdfs:range xsd:string ; qb:concept sdmx-concept:refArea . @@ -429,16 +441,17 @@Describing a coverage
rdfs:range xsd:integer . :resolution a qb:AttributeProperty ; - rdfs:range :pixelsPerDegree . + rdfs:range xsd:double . :bounds a qb:AttributeProperty, qb4st:SpatialProperty ; rdfs:subPropertyOf ogc:asWKT ; rdfs:domain :GridSquare ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> ; + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" ; qb:concept sdmx-concept:refArea .
Discrete global grid systems are a family of spatial reference systems that subdivide the Earth's surface into a hierarchy of cells. @@ -446,7 +459,7 @@
The ANU-LED example in this document does not require the use of a DGGS. However, the DGGS has some convenient properties that make it particularly suitable for Linked Data. First, each DGGS cell has a unique identifier, so it is easy to generate natural URIs for each @@ -465,7 +478,7 @@
Data structures other than DGGS are also amenable to these approaches, for example n-dimensional gridded data, whether geospatial or not, and hierarchical structures such as tile sets, octrees and quadtrees.
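The remark about cell identifiers and natural URIs can be illustrated with a small sketch. It assumes rHEALPix-style identifiers, that is, a face letter followed by one digit per refinement level, so that a parent cell's identifier is a prefix of its children's identifiers. The base URI follows the image URI pattern used in the example data; the function names are hypothetical and are not part of the ANU-LED implementation.

```python
# Sketch only: assumes rHEALPix-style identifiers (face letter + one digit per level).
# BASE_URI follows the image URI pattern in the example data; names are hypothetical.
BASE_URI = "http://www.example.org/led-example-image-"


def cell_uri(cell_id: str) -> str:
    """Build a URI for a DGGS cell directly from its identifier."""
    return BASE_URI + cell_id


def ancestors(cell_id: str) -> list[str]:
    """Return identifiers of all coarser cells containing this one, nearest first."""
    return [cell_id[:end] for end in range(len(cell_id) - 1, 0, -1)]


def contains(outer: str, inner: str) -> bool:
    """A cell contains another when its identifier is a prefix of the other's."""
    return inner.startswith(outer)


print(cell_uri("R000"))            # http://www.example.org/led-example-image-R000
print(ancestors("R00004"))         # ['R0000', 'R000', 'R00', 'R0', 'R']
print(contains("R000", "R00004"))  # True
```

Because containment reduces to a string-prefix test under this assumption, a client can relate a pixel-level cell to its enclosing tile without any geometric computation.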
@@ -667,14 +680,14 @@The Semantic Sensor Network ontology [[vocab-ssn]] defines terms for describing satellite sensors used to collect the data [Sensor metadata]. The ANU-LED example @@ -682,13 +695,13 @@
:exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; :instrument :OLI ; :satellite :landsat-8 ; :coverageSpatialDomain "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral . - + :landsat-8 a ssn:Platform ; owl:sameAs cci-platform:plat_landsat_8 . @@ -715,12 +728,11 @@PROV-O
and what individuals and organisations were responsible for those processes. PROV-O descriptions can be attached at the dataset level, and also at the individual observation or tile level to indicate precisely from which source material each observation is derived. - +:exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; - prov:wasGeneratedBy :ANU-led-resampling ; - prov:wasDerivedFrom :AGDC . - + prov:wasGeneratedBy :ANU-led-resampling . + :ANU-led-resampling a prov:Activity ; prov:wasAssociatedWith :DmitryBrizhinev ; prov:used :AGDC . @@ -730,7 +742,7 @@PROV-O
foaf:mbox <mailto:dmitry.brizhinev@anu.edu.au> . :AGDC a prov:Collection ; - prov:wasAssociatedWith :GeoscienceAustralia ; + prov:wasAttributedTo :GeoscienceAustralia ; prov:hadMember :example-tile . :example-tile a prov:Entity ; @@ -738,11 +750,11 @@PROV-O
:GeoscienceAustralia a prov:Agent, prov:Organization . -:s1 a :GridSquare ; +:R000 a :GridSquare ; qb:dataSet :exampleDataset ; :lat "91.6667"; :long "40.0270"; - :dataValue <http://www.example.org/led-example-image-R000> ; + :dataImageValue <http://www.example.org/led-example-image-R000> ; prov:wasDerivedFrom :example-tile .
qb4st:crs
property to identify a CRS definition
[CRS definition, Spatial metadata].
The RDF Data Cube and QB4ST make it easy to define several CRSs and to use them simultaneously, providing clients with several views of the data [Multiple CRSs]. In the example below, a grid square can be identified by the latitude and longitude of its centroid, by its boundary, or by its rHEALPix cell.
-
+
:lat a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:lat ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . :long a qb4st:SpatialDimension ; rdfs:subPropertyOf geo:long ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . - + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . + +:rHEALPix a qb4st:CRS . + :dggsCell a qb4st:SpatialDimension ; - qb4st:crs "rHEALPix WGS84 Ellipsoid" ; + qb4st:crs :rHEALPix ; + qb4st:crslabel "rHEALPix WGS84 Ellipsoid" ; rdfs:range xsd:string . - + :bounds a qb:AttributeProperty, qb4st:SpatialProperty ; rdfs:subPropertyOf ogc:asWKT ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> . - + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" . + :latitudeComponent a qb4st:SpatialComponentSpecification ; qb:dimension :lat . :longitudeComponent a qb4st:SpatialComponentSpecification ; qb:dimension :long . - + :dggsCellComponent a qb4st:SpatialComponentSpecification ; qb:dimension :dggsCell . - + :boundsComponent a qb4st:SpatialComponentSpecification ; qb:attribute :bounds . - -:s1 a :GridSquare ; + +:R000 a :GridSquare ; :lat "91.6667"; :long "40.0270"; :dggsCell "R000" ; @@ -799,8 +817,8 @@GeoSPARQL
It allows for the use of several encodings, including WKT, to describe polygons [Encoding for vector geometry]. The ANU-LED example uses these terms to define the area covered by individual tiles in the coverage, and also to define the entire spatial domain of a dataset, as required for BP Include spatial metadata in dataset metadata and BP Provide geometries on the Web in a usable way. - - + +:exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; :coverageSpatialDomain "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral . @@ -808,14 +826,15 @@GeoSPARQL
:bounds a qb:AttributeProperty, qb4st:SpatialProperty ; rdfs:subPropertyOf ogc:asWKT ; rdfs:domain :GridSquare ; - qb4st:crs <http://www.opengis.net/def/crs/EPSG/0/4326> ; + qb4st:crs <http://epsg.io/4326> ; + qb4st:crslabel "WGS84" ; qb:concept sdmx-concept:refArea . -:s1 a :GridSquare ; +:R000 a :GridSquare ; qb:dataSet :exampleDataset ; :lat "91.6667"; :long "40.0270"; - :dataValue <http://www.example.org/led-example-image-R000> ; + :dataImageValue <http://www.example.org/led-example-image-R000> ; :bounds "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral .
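As a rough illustration of how a client might consume the WKT literals above, the following sketch parses a simple polygon and checks that a pixel's bounds fall inside the dataset's spatial domain. It is a minimal example, not the ANU-LED implementation: it assumes polygons without holes, treats coordinates as opaque x/y pairs without interpreting axis order, and all function names are hypothetical.

```python
import re

# Sketch only: handles simple 'POLYGON((x y, ...))' literals without holes.
# Coordinates are treated as opaque (x, y) pairs; axis order is not interpreted.


def parse_wkt_polygon(wkt: str) -> list[tuple[float, float]]:
    """Parse the single ring of a simple WKT polygon into (x, y) tuples."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.+?)\s*\)\)\s*", wkt)
    if match is None:
        raise ValueError(f"not a simple WKT polygon: {wkt!r}")
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        ring.append((float(x), float(y)))
    return ring


def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def box_contains(outer, inner) -> bool:
    """True when the inner box lies entirely within the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


domain = bounding_box(parse_wkt_polygon(
    "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"))
pixel = bounding_box(parse_wkt_polygon(
    "POLYGON((90.37 41.45, 90.74 41.45, 90.74 41.04, 90.37 41.04, 90.37 41.45))"))

print(domain)                       # (90.0, 38.18, 93.33, 41.87)
print(box_contains(domain, pixel))  # True
```

A bounding-box test suffices here because the example polygons are axis-aligned rectangles; general polygons would need a proper geometry library.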
:reflectance a ssn:Property, skos:Concept ; owl:sameAs sweet:Reflectance ; owl:sameAs cci-dataType:dtype_sr . - + :time a qb:DimensionProperty, qb4st:TemporalProperty ; rdfs:range xsd:dateTime ; qb:concept sdmx-concept:timePeriod . @@ -842,14 +861,17 @@@@ -862,17 +884,17 @@SKOS concepts
:instrument a qb:AttributeProperty ; rdfs:range ssn:Sensor ; qb:concept sdmx-concept:collMethod . - -:dataValue a qb:MeasureProperty ; - rdfs:range [owl:unionOf(xsd:anyURI xsd:integer)] ; + +:dataPixelValue a qb:MeasureProperty ; + rdfs:range xsd:integer ; qb:concept :reflectance ; qb:concept sdmx-concept:obsValue . +:rHEALPix a qb4st:CRS . + :dggsCell a qb4st:SpatialDimension ; - qb4st:crs "rHEALPix WGS84 Ellipsoid" ; + qb4st:crs :rHEALPix ; + qb4st:crslabel "rHEALPix WGS84 Ellipsoid" ; rdfs:range xsd:string ; qb:concept sdmx-concept:refArea .
xsd:dateTime
datatype is sufficient.
-
+
QB4ST defines terms that work well together with OWL-Time.
- +:coverageTemporalDomain a qb:AttributeProperty, qb4st:TemporalProperty ; rdfs:range time:DateTimeInterval ; qb:concept sdmx-concept:timePeriod . - + :time a qb:DimensionProperty, qb4st:TemporalProperty ; rdfs:range xsd:dateTime ; qb:concept sdmx-concept:timePeriod . - + :exampleDataset a qb:DataSet, prov:Entity ; qb:structure :exampleStructure ; :coverageSpatialDomain "POLYGON((90 41.87, 93.33 41.87, 93.33 38.18, 90 38.18, 90 41.87))"^^ogc:wktLiteral ; @@ -881,19 +903,19 @@- +OWL-Time
:timeDomain a time:Interval ; time:hasBeginning :timeBeginning ; time:hasEnd :timeEnd . - + :timeBeginning a time:Instant ; time:inXSDDateTime "2001-10-26T21:32:52"^^xsd:dateTime . :timeEnd a time:Instant ; time:inXSDDateTime "2001-10-26T21:32:52"^^xsd:dateTime . -:s1 a :GridSquare ; +:R000 a :GridSquare ; qb:dataSet :exampleDataset ; :time "2001-10-26T21:32:52"^^xsd:dateTime .
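The relationship between an observation's :time value and the dataset's temporal domain can be checked with a few lines of code. The sketch below is illustrative only: it uses the lexical values from the example, handles only timezone-less xsd:dateTime values of that form, and the function names are hypothetical.

```python
from datetime import datetime

# Sketch only: parses timezone-less xsd:dateTime lexical values as used in the example.


def parse_xsd_datetime(lexical: str) -> datetime:
    """Parse a value such as '2001-10-26T21:32:52' into a datetime."""
    return datetime.fromisoformat(lexical)


def in_interval(instant: datetime, beginning: datetime, end: datetime) -> bool:
    """True when the instant lies within the closed interval [beginning, end]."""
    return beginning <= instant <= end


# Values taken from :timeBeginning, :timeEnd and the :time of :R000.
beginning = parse_xsd_datetime("2001-10-26T21:32:52")
end = parse_xsd_datetime("2001-10-26T21:32:52")
observed = parse_xsd_datetime("2001-10-26T21:32:52")

print(in_interval(observed, beginning, end))  # True
```

In the example the interval's beginning and end coincide, so only an observation at exactly that instant falls within the temporal domain.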