# Web Processing Service (WPS) example <img align="right" src="../Supplementary_data/dea_logo.jpg">

* **Compatibility:** Notebook currently compatible with the `DEA Sandbox` environment only
* **Products used:** 
[ls5_fc_albers](https://explorer.sandbox.dea.ga.gov.au/ls5_fc_albers), 
[ls7_fc_albers](https://explorer.sandbox.dea.ga.gov.au/ls7_fc_albers),
[ls8_fc_albers](https://explorer.sandbox.dea.ga.gov.au/ls8_fc_albers),
[wofs_albers](https://explorer.sandbox.dea.ga.gov.au/wofs_albers),
[mangrove_cover](https://explorer.sandbox.dea.ga.gov.au/mangrove_cover)
* **Special requirements:** Needs `datacube-wps` installed (see [Getting started](#Getting-started))

## Background
The Web Processing Service (WPS) is an Open Geospatial Consortium (OGC) web standard for requesting processing of geospatial data (for example, zonal statistics) over a web API.
This notebook shows how to use and develop Datacube WPS processes.
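For orientation, servers implementing WPS 1.0.0 generally accept key-value-pair requests such as `GetCapabilities` and `DescribeProcess`. The sketch below only builds such URLs and does not contact any server. The endpoint `https://example.com/wps` is a placeholder and `wps_url` is a hypothetical helper, neither of which comes from `datacube-wps`.

```python
from urllib.parse import urlencode

# Placeholder endpoint (an assumption for illustration, not a real service).
WPS_ENDPOINT = "https://example.com/wps"


def wps_url(request, **extra):
    """Build a WPS key-value-pair request URL for the placeholder endpoint."""
    params = {"service": "WPS", "request": request}
    params.update(extra)
    return WPS_ENDPOINT + "?" + urlencode(params)


# List the processes a server offers.
capabilities_url = wps_url("GetCapabilities")

# Ask for the inputs and outputs of one process by its identifier.
describe_url = wps_url("DescribeProcess",
                       version="1.0.0",
                       identifier="FractionalCoverDrill")

print(capabilities_url)
print(describe_url)
```

The rest of this notebook bypasses the web layer and calls the Python classes behind such requests directly.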

A [Datacube WPS](https://github.com/opendatacube/datacube-wps) process requires two parts: an implementation and a configuration.
The implementation is a Python class that extends either `datacube_wps.processes.PolygonDrill` (for polygon statistics), or `datacube_wps.processes.PixelDrill` (for pixel drill).
The configuration of the process can be done externally in a YAML document.

## Description
We start with setting up the `datacube-wps` package for the notebook. Then we go through an example of calling a pre-defined WPS process directly using a Python API, and then we develop an example WPS process ourselves.

## Getting started

First we need to set up the `datacube-wps` package:

In [1]:
!pip install git+https://github.com/opendatacube/datacube-wps.git

Collecting git+https://github.com/opendatacube/datacube-wps.git
  Cloning https://github.com/opendatacube/datacube-wps.git to /tmp/pip-req-build-qfaqywac
  Running command git clone -q https://github.com/opendatacube/datacube-wps.git /tmp/pip-req-build-qfaqywac
Building wheels for collected packages: datacube-wps
  Building wheel for datacube-wps (setup.py) ... done
  Created wheel for datacube-wps: filename=datacube_wps-4.2.1-py3-none-any.whl size=19476 sha256=1d7ab7efbc7df3331ee140d8f718bc8c3542e2fa94f586f9d3aa6e985c3eb5a5
  Stored in directory: /tmp/pip-ephem-wheel-cache-ii12qf89/wheels/88/10/8c/c61c7b7cb5a334da3f6d8a077c96a3f989f194acb1c37d02ea
Successfully built datacube-wps


### Load packages
Now we load the relevant Python packages that are used in this notebook:

In [2]:
import datacube
from datacube.utils.geometry import CRS, Geometry
from datacube.virtual import construct

import datacube_wps
from datacube_wps.processes import PolygonDrill, chart_dimensions
from datacube_wps.processes.fcdrill import FCDrill

import yaml
import xarray
import altair

## Calling an existing WPS process
Let's define the configuration for the Fractional Cover drill process available in `datacube-wps`:

In [3]:
config = yaml.load("""
   # the class that implements the WPS process
   process: datacube_wps.processes.fcdrill.FCDrill
   
   # WPS standard specific metadata
   about:
       identifier: FractionalCoverDrill
       version: '0.3'
       title: Fractional Cover
       abstract: Performs Fractional Cover Polygon Drill
       store_supported: True
       status_supported: True
       geometry_type: polygon

   # input virtual product recipe
   input:
       juxtapose:
           - collate:
                  - product: ls5_fc_albers
                    group_by: solar_day
                    measurements: [BS, PV, NPV]
                  - product: ls7_fc_albers
                    group_by: solar_day
                    measurements: [BS, PV, NPV]
                  - product: ls8_fc_albers
                    group_by: solar_day
                    measurements: [BS, PV, NPV]
           - product: wofs_albers
             measurements: [water]
             group_by: solar_day
             fuse_func: datacube_wps.processes.wofls_fuser

   # altair chart style
   style:
       table:
           columns:
               Bare Soil:
                   units: "%"
                   chartLineColor: "#8B0000"
                   active: True
               Photosynthetic Vegetation:
                   units: "%"
                   chartLineColor: "green"
                   active: True
               Non-Photosynthetic Vegetation:
                   units: "%"
                   chartLineColor: "#dac586"
                   active: True
               Unobservable:
                   units: "%"
                   chartLineColor: "grey"
                   active: False
""", Loader=yaml.CLoader)

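In a virtual product recipe, `collate` combines datasets from several products along time, while `juxtapose` places the measurements of several products side by side. To make the nesting in the `input` entry above easier to see, here is a minimal sketch that walks a recipe and lists the products it references. `list_products` is a hypothetical helper, not part of `datacube`, and it only understands the three keys used in this notebook.

```python
def list_products(recipe):
    """Recursively collect product names from a virtual product recipe dict."""
    if "product" in recipe:
        return [recipe["product"]]
    names = []
    for combinator in ("juxtapose", "collate"):
        for child in recipe.get(combinator, []):
            names.extend(list_products(child))
    return names


# Same nesting as the `input` entry above, with measurement details omitted.
recipe = {
    "juxtapose": [
        {"collate": [
            {"product": "ls5_fc_albers"},
            {"product": "ls7_fc_albers"},
            {"product": "ls8_fc_albers"},
        ]},
        {"product": "wofs_albers"},
    ]
}

products = list_products(recipe)
print(products)
```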
Note that the `input` specifies a virtual product recipe for loading the input data. To create a WPS process we instantiate the class specified by the `process` entry:

In [4]:
fcdrill = FCDrill(config['about'], construct(**config['input']), config['style'])

where we have constructed the virtual product from its recipe before passing it to the `FCDrill` constructor.
To get this object to process the polygon drill, we need to define our polygon of interest:

In [5]:
poly = Geometry({"type": "Polygon",
                 "coordinates": [[[137.82, -30.71],
                                  [138.04, -30.71],
                                  [138.04, -30.57],
                                  [137.82, -30.57],
                                  [137.82, -30.71]]]},
                crs=CRS('EPSG:4326'))

and then we call the `query_handler` method that is responsible for responding to a query:

In [6]:
results = fcdrill.query_handler(time=('2018-09', '2019-03'), feature=poly)

query_handler self: <datacube_wps.processes.fcdrill.FCDrill object at 0x7fc561750a58>
query_handler time: ('2018-09', '2019-03')
query_handler feature: Geometry({'type': 'Polygon', 'coordinates': [[(137.82, -30.71), (138.04, -30.71), (138.04, -30.57), (137.82, -30.57), (137.82, -30.71)]]}, CRS('EPSG:4326'))
byte count for query:  225305136
grouped shape (49,)
process_data self: <datacube_wps.processes.fcdrill.FCDrill object at 0x7fc561750a58>
process_data data: <xarray.Dataset>
Dimensions:  (time: 49, x: 863, y: 666)
Coordinates:
  * x        (x) float64 5.515e+05 5.515e+05 5.515e+05 ... 5.73e+05 5.73e+05
  * y        (y) float64 -3.34e+06 -3.34e+06 -3.34e+06 ... -3.356e+06 -3.356e+06
  * time     (time) datetime64[ns] 2018-09-07T00:44:18 ... 2019-03-27T00:38:06
Data variables:
    BS       (time, y, x) int16 dask.array<chunksize=(1, 666, 863), meta=np.ndarray>
    PV       (time, y, x) int16 dask.array<chunksize=(1, 666, 863), meta=np.ndarray>
    NPV      (time, y, x) int16 dask.arra

As we can see from the logs, `datacube-wps` loaded the input data as a dask-backed `xarray` dataset concurrently using multiple processes.
The result contains the polygon statistics returned by the processing as a pandas `DataFrame` object:

In [7]:
results['data']

| | time | BS | PV | NPV | Unobservable |
|---|---|---|---|---|---|
| 0 | 2018-09-07 00:44:18.000 | 65.42926 | 0.004229 | 17.095818 | 17.470692 |
| 1 | 2018-09-08 00:36:36.500 | 74.150162 | 0.004996 | 23.685539 | 2.159302 |
| 2 | 2018-09-15 00:42:38.500 | 74.51114 | 0.00508 | 2.283838 | 23.199943 |
| 3 | 2018-09-16 00:38:09.500 | 75.276863 | 0.00401 | 8.664869 | 16.054258 |
| 4 | 2018-09-23 00:44:23.000 | 34.120082 | 0.0 | 6.182733 | 59.697185 |
| 5 | 2018-09-24 00:36:15.000 | 77.947147 | 0.002902 | 18.618768 | 3.431182 |
| 6 | 2018-10-02 00:38:17.000 | 25.858086 | 0.000191 | 0.630485 | 73.511239 |
| 7 | 2018-10-09 00:44:31.000 | 67.864753 | 0.001367 | 20.628854 | 11.505026 |
| 8 | 2018-10-10 00:35:51.500 | 19.098616 | 0.0 | 5.570706 | 75.330678 |
| 9 | 2018-10-17 00:41:51.500 | 79.834182 | 0.000593 | 18.630357 | 1.534868 |


as well as a [Vega](https://vega.github.io/vega/) chart implemented by the [altair](https://altair-viz.github.io/) library:

In [8]:
results['chart']

Hover to see tooltips about the underlying data.
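Returning to the table in `results['data']`: the four value columns appear to be percentages of the polygon area (the chart style gives their units as `%`), so each row should sum to roughly 100. The sketch below checks this on the first three rows shown earlier, using only the standard library so it runs without a datacube.

```python
import csv
import io

# First three rows of `results['data']`, copied from the output above.
rows_csv = """time,BS,PV,NPV,Unobservable
2018-09-07 00:44:18.000,65.42926,0.004229,17.095818,17.470692
2018-09-08 00:36:36.500,74.150162,0.004996,23.685539,2.159302
2018-09-15 00:42:38.500,74.51114,0.00508,2.283838,23.199943
"""

columns = ["BS", "PV", "NPV", "Unobservable"]

# Sum the four value columns for each time step.
totals = []
for row in csv.DictReader(io.StringIO(rows_csv)):
    totals.append(sum(float(row[name]) for name in columns))

for total in totals:
    print(round(total, 3))
```

On the actual `DataFrame`, the equivalent check would be `results['data'][['BS', 'PV', 'NPV', 'Unobservable']].sum(axis=1)`.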

## Developing a new WPS process

This example, a polygon drill of mangrove cover, is also available in `datacube-wps`, but here we define it ourselves for illustration.
The implementation overrides two methods, `process_data` and `render_chart`; a third, `render_outputs`, is optional.

In [9]:
class MangroveDrill(PolygonDrill):

    # this is the first of the two methods to be overridden
    # `data` is a dask-backed xarray dataset
    def process_data(self, data):
        data = data.compute(scheduler='processes')

        # some mangrove canopy cover stats
        woodland = data.where(data == 1).count(['x', 'y'])
        woodland = woodland.rename(name_dict={'canopy_cover_class': 'Woodland'})
        open_forest = data.where(data == 2).count(['x', 'y'])
        open_forest = open_forest.rename(
            name_dict={'canopy_cover_class': 'Open Forest'})
        closed_forest = data.where(data == 3).count(['x', 'y'])
        closed_forest = closed_forest.rename(
            name_dict={"canopy_cover_class": 'Closed Forest'})

        final = xarray.merge([woodland, open_forest, closed_forest])
        result = final.to_dataframe()

        # even though time is an index here, we want a flat dataframe
        result.reset_index(inplace=True)
        return result

    # this is the second of the two methods to be overridden
    # `df` is the result of `process_data`
    def render_chart(self, df):
        # these are the default chart dimensions; they can be overridden
        width, height = chart_dimensions(self.style)

        melted = df.melt('time', var_name='Cover Type', value_name='Area')
        melted = melted.dropna()

        # we use the style specified in the config
        style = self.style['table']['columns']
        cover_types = ['Woodland', 'Open Forest', 'Closed Forest']

        chart = altair.Chart(melted,
                             width=width,
                             height=height,
                             title='Percentage of Area - Mangrove Canopy Cover')
        chart = chart.mark_area()
        chart = chart.encode(
            x='time:T',
            y=altair.Y('Area:Q', stack='normalize'),
            color=altair.Color(
                'Cover Type:N',
                scale=altair.Scale(
                    domain=cover_types,
                    range=[style[ct]['chartLineColor'] for ct in cover_types])),
            tooltip=[
                altair.Tooltip(field='time',
                               format='%d %B, %Y',
                               title='Date',
                               type='temporal'), 'Area:Q', 'Cover Type:N'
            ])

        return chart

    # optional: customize the appearance of the WPS CSV output
    def render_outputs(self, df, chart):
        return super().render_outputs(
            df,
            chart,
            is_enabled=True,
            name="Mangrove Cover",
            header=['Woodland', 'Open Forest', 'Closed Forest'])


The configuration would normally go into a catalog alongside those of other processes. Here, however, we load the YAML directly for demonstration.

In [10]:
config = yaml.load("""
   # this is going to be looked up by datacube-wps
   # in this example it is not used
   process: your_module.your_class

   about:
       identifier: Mangrove Cover Drill
       version: '0.2'
       title: Mangrove Cover
       abstract: Performs Mangrove Polygon Drill
       store_supported: True
       status_supported: True
       geometry_type: polygon

   input:
       product: mangrove_cover
       measurements: [canopy_cover_class]

   style:
       table:
           columns:
               Woodland:
                   units: "#"
                   chartLineColor: "#9FFF4C"
                   active: True
               Open Forest:
                   units: "#"
                   chartLineColor: "#5ECC00"
                   active: True
               Closed Forest:
                   units: "#"
                   chartLineColor: "#3B7F00"
                   active: True
""", Loader=yaml.CLoader)

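The `process` entry is a dotted path to the implementing class. How `datacube-wps` resolves it internally is not shown in this notebook. A common way to turn such a string into a class, offered here as an assumption rather than as the library's actual code, is `importlib`. The helper name `resolve_dotted_path` is hypothetical.

```python
from importlib import import_module


def resolve_dotted_path(path):
    """Import `package.module.Name` and return the object called `Name`."""
    module_name, _, attribute = path.rpartition(".")
    module = import_module(module_name)
    return getattr(module, attribute)


# Demonstrated with a standard-library class so the sketch runs anywhere.
cls = resolve_dotted_path("collections.OrderedDict")
print(cls)
```

With `datacube-wps` installed, the same helper could be pointed at `datacube_wps.processes.fcdrill.FCDrill` from the earlier configuration.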
As before, we instantiate the class, and call the `query_handler` method with our desired polygon and time range.

In [11]:
mangrove_drill = MangroveDrill(config['about'], construct(**config['input']), config['style'])

poly = Geometry(
    {
        'type':
            'Polygon',
        'coordinates': [[(143.98956298828125, -14.689881366618762),
                         (144.26422119140625, -14.689881366618762),
                         (144.26422119140625, -14.394778454856146),
                         (143.98956298828125, -14.394778454856146),
                         (143.98956298828125, -14.689881366618762)]]
    }, CRS('EPSG:4326'))


result = mangrove_drill.query_handler(time=('2000', '2010'), feature=poly)

query_handler self: <__main__.MangroveDrill object at 0x7fc560f8f668>
query_handler time: ('2000', '2010')
query_handler feature: Geometry({'type': 'Polygon', 'coordinates': [[(143.98956298828125, -14.689881366618762), (144.26422119140625, -14.689881366618762), (144.26422119140625, -14.394778454856146), (143.98956298828125, -14.394778454856146), (143.98956298828125, -14.689881366618762)]]}, CRS('EPSG:4326'))
byte count for query:  40529016
grouped shape (11,)
query_handler returned {'data':          time  Woodland  Open Forest  Closed Forest
0  2000-01-01      3454        19372          21710
1  2001-01-01      4292        24591          15826
2  2002-01-01      5832        26475          12062
3  2003-01-01     11647        26792           5716
4  2004-01-01      6761        27520          10016
5  2005-01-01      5497        27373          11833
6  2006-01-01      4125        22052          18260
7  2007-01-01      3280        19687          21787
8  2008-01-01      3197        19393

The method returns the data and the chart that we specified.

In [12]:
result['data']

| | time | Woodland | Open Forest | Closed Forest |
|---|---|---|---|---|
| 0 | 2000-01-01 | 3454 | 19372 | 21710 |
| 1 | 2001-01-01 | 4292 | 24591 | 15826 |
| 2 | 2002-01-01 | 5832 | 26475 | 12062 |
| 3 | 2003-01-01 | 11647 | 26792 | 5716 |
| 4 | 2004-01-01 | 6761 | 27520 | 10016 |
| 5 | 2005-01-01 | 5497 | 27373 | 11833 |
| 6 | 2006-01-01 | 4125 | 22052 | 18260 |
| 7 | 2007-01-01 | 3280 | 19687 | 21787 |
| 8 | 2008-01-01 | 3197 | 19393 | 22178 |
| 9 | 2009-01-01 | 4899 | 26428 | 13690 |
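The table holds pixel counts, while the chart is titled "Percentage of Area". The link between the two is `stack='normalize'` in `render_chart`, which rescales the classes at each time step so that they sum to 1. A small sketch of that normalization for the first row of the table:

```python
# Pixel counts for 2000-01-01, copied from the table above.
counts_2000 = {"Woodland": 3454, "Open Forest": 19372, "Closed Forest": 21710}

# Share of each class among all mangrove pixels counted that year.
total_2000 = sum(counts_2000.values())
shares_2000 = {name: count / total_2000
               for name, count in counts_2000.items()}

for name, share in shares_2000.items():
    print("{}: {:.1%}".format(name, share))
```

Note that the shares are relative to the mangrove pixels counted, not to the whole polygon.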


In [13]:
result['chart']
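The division of labour seen in both examples, where `query_handler` passes the loaded data to `process_data` and then hands that result to `render_chart`, can be pictured with a toy stand-in. `ToyDrill` and `SumDrill` are hypothetical classes written for this illustration. The real `PolygonDrill` also loads the data from the datacube and renders the outputs, and both steps are omitted here.

```python
class ToyDrill:
    """Hypothetical stand-in for a drill base class (not datacube-wps code)."""

    def query_handler(self, data):
        # The base class drives the flow; subclasses fill in the two steps.
        df = self.process_data(data)
        chart = self.render_chart(df)
        return {"data": df, "chart": chart}

    def process_data(self, data):
        raise NotImplementedError

    def render_chart(self, df):
        raise NotImplementedError


class SumDrill(ToyDrill):
    """Toy subclass: the 'statistics' are a sum, the 'chart' is a string."""

    def process_data(self, data):
        return {"total": sum(data)}

    def render_chart(self, df):
        return "chart of total={}".format(df["total"])


toy_result = SumDrill().query_handler([1, 2, 3])
print(toy_result)
```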

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** If you need assistance, please post a question on the [Open Data Cube Slack channel](http://slack.opendatacube.org/) or on the [GIS Stack Exchange](https://gis.stackexchange.com/questions/ask?tags=open-data-cube) using the `open-data-cube` tag (you can view previously asked questions [here](https://gis.stackexchange.com/questions/tagged/open-data-cube)).
If you would like to report an issue with this notebook, you can file one on [GitHub](https://github.com/GeoscienceAustralia/dea-notebooks).

**Last modified:** May 2020

**Compatible datacube version:** 

In [14]:
print(datacube.__version__)

1.7+262.g1cf3cea8


## Tags
Browse all available tags on the DEA User Guide's [Tags Index](https://docs.dea.ga.gov.au/genindex.html)