# Example: OWSLib extension for ESGF compute API

This notebook demonstrates a prototype of an [ESGF API](https://github.com/ESGF/esgf-compute-api) client implementation based on [OWSLib](https://github.com/geopython/OWSLib). On the server side, we are using [PyWPS](https://pywps.org/) running a mock ESGF compute process called `pelican_subset`. This subsetting process uses `xarray` to subset input NetCDF files.

For details, please read the [OWSLib-esgfwps](https://owslib-esgfwps.readthedocs.io/en/latest/) documentation.

See also the notebook examples:

https://nbviewer.jupyter.org/github/bird-house/OWSLib-esgfwps/tree/master/examples/notebooks/

You can compare this with notebook examples of the original ESGF compute interface: 

* https://github.com/ESGF/esgf-compute-api
* https://nbviewer.jupyter.org/github/ESGF/esgf-compute-api/tree/master/examples/


<div class="alert alert-block alert-warning">
<b>Disclaimer:</b> This prototype is incomplete. It is meant to show how we can leverage the OGC-related code base to meet ESGF requirements and avoid maintaining all of the code ourselves. That being said, every implementation involved needs improvements and could use additional eyeballs: OWSLib, OWSLib-esgfwps, PyWPS and the ESGF API itself.
</div>

We could also use PyWPS for the WPS service definitions and build a separate ESGF compute library for the processing functionality. An abstract PyWPS process class could then be subclassed to define new ESGF API processes.

See: 
* https://github.com/ESGF/esgf-compute-wps
* https://github.com/bird-house/pelican/blob/master/pelican/processes/wps_esgf_subset.py
* https://pywps.org/
* http://xarray.pydata.org/en/stable/dask.html


### Defining an ESGF API Profile

A common ESGF API WPS profile could be defined using *mixin* classes or *decorators*.
See the examples in this notebook collection:

https://github.com/cehbrecht/jupyterlab-notebooks/tree/master/pywps-profiles

The `pelican_subset` process uses a Python *decorator* called `esgf_api`. See the code:
https://github.com/bird-house/pelican/blob/master/pelican/processes/wps_esgf_subset.py
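
To make the two options concrete, here is a minimal sketch of a shared profile expressed once as a class decorator and once as a mixin. It uses plain Python stand-ins rather than PyWPS classes, and it is not the `esgf_api` decorator from Pelican: the names `esgf_profile`, `EsgfProfileMixin`, `SubsetProcess`, `AverageProcess`, `output_format` and `axes` are hypothetical. Only the input names `domain` and `variable` are taken from this notebook.

```python
# Hypothetical sketch with plain Python stand-ins; no PyWPS import is needed.
PROFILE_INPUTS = ("domain", "variable")  # input names used later in this notebook


def esgf_profile(cls):
    """Class decorator that prepends the shared profile inputs to ``cls.inputs``."""
    own = [name for name in getattr(cls, "inputs", ()) if name not in PROFILE_INPUTS]
    cls.inputs = list(PROFILE_INPUTS) + own
    return cls


class EsgfProfileMixin:
    """Mixin alternative: subclasses declare only their extra inputs."""

    extra_inputs = ()

    @classmethod
    def all_inputs(cls):
        return list(PROFILE_INPUTS) + list(cls.extra_inputs)


@esgf_profile
class SubsetProcess:
    inputs = ["output_format"]  # hypothetical process-specific input


class AverageProcess(EsgfProfileMixin):
    extra_inputs = ("axes",)  # hypothetical process-specific input


print(SubsetProcess.inputs)
print(AverageProcess.all_inputs())
```

With the decorator, the profile is applied to an otherwise ordinary class; with the mixin, it becomes part of the class hierarchy. Either way, each process declares only what is specific to it.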


## WPS client OWSLib/esgfapi

In [None]:
from owslib.wps import WebProcessingService

**ESGF Access Token**

**TODO**: Use OAuth2 access tokens. This can be handled by a security middleware like [Twitcher](https://twitcher.readthedocs.io/en/latest/).

In [None]:
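# Get an OAuth2 access token using the client-credentials grant (client_id/client_secret)
import requests
import urllib3

# Certificate verification is switched off for this test server (verify=False),
# so silence the InsecureRequestWarning messages that urllib3 would otherwise emit.
urllib3.disable_warnings()
url = "https://cp4cds-cn2.dkrz.de/oauth/token?grant_type=client_credentials&client_id={}&client_secret={}".format(
    'a1bba369139442d3858f62a41f4a8450', '9821aec9c4104ae0b8c0e8a6d6721589')
# The token endpoint answers with a JSON document; keep it as a dict
token = requests.get(url, verify=False).json()
token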

In [None]:
# use headers for OAuth bearer token
headers = {'Authorization': 'Bearer {}'.format(token['access_token'])}
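
If the token request fails, the response may not contain an `access_token` key and the cell above would raise a bare `KeyError`. The following sketch wraps the header construction in a small helper with a clearer error message. The helper name `bearer_headers` and the example token values are made up for illustration, and the response shape is assumed to match what the cell above expects.

```python
def bearer_headers(token):
    """Build request headers from a token response dict.

    Assumes the response carries an ``access_token`` key, as the cell above does.
    """
    try:
        access_token = token["access_token"]
    except (KeyError, TypeError):
        raise ValueError("token response has no 'access_token': {!r}".format(token))
    return {"Authorization": "Bearer {}".format(access_token)}


# Made-up token value, for illustration only
example_headers = bearer_headers({"access_token": "abc123", "token_type": "Bearer"})
print(example_headers)
```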

### Get Capabilities

Here we are using a [mock ESGF process](https://github.com/bird-house/pelican/blob/master/pelican/processes/wps_esgf_subset.py) from the `Pelican` test server.

In [None]:
client = WebProcessingService('https://cp4cds-cn2.dkrz.de/ows/proxy/pelican', headers=headers, verify=False)
# client = WebProcessingService('https://bovec.dkrz.de/ows/proxy/pelican', headers=headers, verify=True)
# client = WebProcessingService('http://localhost:5000/wps', headers=headers, verify=True)

In [None]:
for p in client.processes:
    print(p.identifier)

### Describe Process

In [None]:
proc = client.describeprocess(
    'pelican_subset'
)
proc.identifier

In [None]:
for inpt in proc.dataInputs:
    print(inpt.identifier, inpt.dataType)

### WPS Process Inputs

**Domain**

**TODO**: Can we use the WPS bounding box type to describe the domain? Are there other OGC concepts we can use?

In [None]:
from owslib_esgfwps import Domain, Dimension

In [None]:
d0 = Domain(dict(
    time=Dimension(0, 1, crs='indices'),
    lat=Dimension(40, 60, crs='values'),
    lon=Dimension(0, 20, crs='values'),
))

In [None]:
# show json
print(d0.json)
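
For readers without the library at hand, the sketch below builds a comparable JSON document with the standard library only. The layout (an `id` plus one `start`/`end`/`step`/`crs` entry per dimension) is an assumption modeled on the ESGF compute API convention; the exact keys printed by `d0.json` may differ. The helper names `dimension` and `domain`, the `id` value and the default `step` are hypothetical.

```python
import json


def dimension(start, end, crs, step=1):
    """One dimension entry; the key names and the step default are assumptions."""
    return {"start": start, "end": end, "step": step, "crs": crs}


def domain(name, **dimensions):
    """A domain document: an id plus one entry per named dimension."""
    doc = {"id": name}
    doc.update(dimensions)
    return doc


d0_doc = domain(
    "d0",  # hypothetical domain id
    time=dimension(0, 1, crs="indices"),
    lat=dimension(40, 60, crs="values"),
    lon=dimension(0, 20, crs="values"),
)
print(json.dumps(d0_doc, indent=2, sort_keys=True))
```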

In [None]:
# add domain to WPS inputs
from owslib_esgfwps import Domains
inputs = [('domain', Domains([d0]))]

**TODO**: Why use `Domains` and `Variables` (note the *s*)? The WPS protocol already handles the multiplicity of parameters.

**Variable**

In [None]:
from owslib_esgfwps import Variable

**TODO**: Should we use the file transportation layer of PyWPS?

In [None]:
# data files we want to process
files = [
    # OpenDAP, CORDEX EUR-44, tasmax, climate index SU (summer days)
    'http://opendap.knmi.nl/knmi/thredds/dodsC/CLIPC/gerics/climatesignalmaps/EUR-44/tasmax/su_python-2-7-6_GERICS_ens-multiModel-climatesignalmap-rcp85-EUR-44_yr_20700101-20991231_1971-2000.nc',
]


In [None]:
from owslib_esgfwps import Variables

# add them one by one to WPS inputs as Variable
# variable=su (summer days climate index)
su = Variable(uri=files[0], var_name='su')
inputs.append(('variable', Variables([su])))  

In [None]:
# show all WPS inputs
for inp in inputs:
    print(inp[1])

### Execute

In [None]:
from owslib.wps import SYNC
exec = client.execute(proc.identifier, inputs=inputs, mode=SYNC)

In [None]:
exec.isComplete()

In [None]:
# note: `isSucceded` (sic) is the method name as spelled in OWSLib
exec.isSucceded()
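
The request above runs synchronously. For longer-running processes, a common WPS pattern is to submit the request asynchronously and poll the status until the job has finished. The sketch below shows such a polling loop with an upper bound on the number of polls. It runs against a stand-in object (`FakeExecution`, hypothetical) so that it works without a server; the stand-in mimics the `checkStatus`, `isComplete` and `isSucceded` methods of an OWSLib execution object. The helper name `wait_until_complete` and its parameters are assumptions.

```python
class FakeExecution:
    """Stand-in that mimics the polling methods used below.

    It reports completion after ``checks_needed`` status checks.
    """

    def __init__(self, checks_needed):
        self._remaining = checks_needed

    def checkStatus(self, sleepSecs=0):
        self._remaining = max(0, self._remaining - 1)

    def isComplete(self):
        return self._remaining == 0

    def isSucceded(self):
        return self.isComplete()


def wait_until_complete(execution, sleep_secs=0, max_polls=10):
    """Poll until the execution completes or ``max_polls`` is reached.

    Returns True only if the execution completed and succeeded.
    """
    polls = 0
    while not execution.isComplete() and polls < max_polls:
        execution.checkStatus(sleepSecs=sleep_secs)
        polls += 1
    return execution.isComplete() and execution.isSucceded()


finished = wait_until_complete(FakeExecution(checks_needed=3))
timed_out = wait_until_complete(FakeExecution(checks_needed=50), max_polls=5)
print(finished, timed_out)
```

Bounding the number of polls keeps a notebook cell from hanging indefinitely when a job never reaches a final state.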

**Outputs**

**TODO**: Return multiple output files, possibly using Metalink.

See: https://github.com/bird-house/emu/issues/64

In [None]:
# show each output: its reference URL if available, otherwise the inline data
for output in exec.processOutputs:
    print(output.identifier, output.reference or output.data)

**Plot Preview**

In [None]:
print(exec.processOutputs[1].reference)

### Use Output parameter

**TODO**: PyWPS supports [MetaLink](https://pywps.readthedocs.io/en/latest/process.html#returning-multiple-files) to return multiple files. We can also add support for other output formats (like a simple json document). We need to discuss this and make sure it is standard-compliant.

In [None]:
from owslib_esgfwps import Outputs
outputs = Outputs.from_owslib(exec.processOutputs)

In [None]:
outputs.outputs[0].uri
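
If multiple files were returned as a Metalink document, as the TODO above suggests, a client would need to extract the file URLs from it. The sketch below does this for a Metalink 4 (RFC 5854) document using only the standard library. The sample document, the file names, the `example.org` URLs and the helper name `metalink_urls` are made up for illustration; this is not part of OWSLib or OWSLib-esgfwps.

```python
import xml.etree.ElementTree as ET

NS = {"ml": "urn:ietf:params:xml:ns:metalink"}  # Metalink 4 namespace (RFC 5854)

# Made-up sample document listing two output files
SAMPLE = """\
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="subset_1.nc">
    <url>https://example.org/outputs/subset_1.nc</url>
  </file>
  <file name="subset_2.nc">
    <url>https://example.org/outputs/subset_2.nc</url>
  </file>
</metalink>
"""


def metalink_urls(xml_text):
    """Map each file name to its first listed URL."""
    root = ET.fromstring(xml_text)
    urls = {}
    for file_el in root.findall("ml:file", NS):
        urls[file_el.get("name")] = file_el.findtext("ml:url", namespaces=NS)
    return urls


file_urls = metalink_urls(SAMPLE)
for name, url in file_urls.items():
    print(name, url)
```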