## PAVICS Web Processing Services 
PAVICS exposes several Web Processing Service (WPS) endpoints through Birdhouse.
* Each 'bird' groups a set of related processing tools


In [13]:
from owslib.wps import WebProcessingService
import requests
from lxml import etree  
import owslib
owslib.__version__

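def parseStatus(execute):
    """Return the href of the last output reference in the WPS status document."""
    o = requests.get(execute.statusLocation)
    t = etree.fromstring(o.content)
    # Walk to the last child at each level (expected: ProcessOutputs > Output > Reference).
    # Element indexing replaces the deprecated getchildren() calls.
    ref = t[-1][-1][-1].get('{http://www.w3.org/1999/xlink}href')
    return ref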


### One suite of WPS tools for netcdf files resides in 'Hummingbird'
Service metadata is retrieved with a GetCapabilities request; OWSLib sends it when the `WebProcessingService` object is created.
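
For reference, a GetCapabilities request is an ordinary HTTP GET with two query parameters. The sketch below only builds the URL and sends nothing over the network; `capabilities_url` is a hypothetical helper name, not part of OWSLib.

```python
from urllib.parse import urlencode

def capabilities_url(base_url):
    """Build a WPS GetCapabilities URL using key-value-pair encoding."""
    query = urlencode({"service": "WPS", "request": "GetCapabilities"})
    return f"{base_url}?{query}"

# Hummingbird endpoint used throughout this notebook
wps_url = "https://pavics.ouranos.ca/twitcher/ows/proxy/hummingbird/wps"
print(capabilities_url(wps_url))
```

Opening the printed URL in a browser should return the same capabilities document that OWSLib parses, assuming the service is reachable.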

In [14]:
# Hummingbird WPS url
wps_url = 'https://pavics.ouranos.ca/twitcher/ows/proxy/hummingbird/wps'
# connection
wps = WebProcessingService(url=wps_url)
# print wps title
print(wps.identification.title)

Hummingbird 0.5_dev


### Print out info on available processes (from Hummingbird)

In [15]:
for process in wps.processes:
    print('%s \t : %s \n' % (process.identifier, process.abstract))

ncdump 	 : Run ncdump to retrieve NetCDF header metadata. 

spotchecker 	 : Checks a single uploaded or remote dataset against a variety of compliance standards. The dataset is either in the NetCDF format or a remote OpenDAP resource. Available compliance standards are the Climate and Forecast conventions (CF) and project specific rules for CMIP6 and CORDEX. 

cchecker 	 : Runs the IOOS Compliance Checker tool to check datasets against compliance standards. Each compliance standard is executed by a Check Suite, which functions similar to a Python standard Unit Test. A Check Suite runs one or more checks against a dataset, returning a list of Results which are then aggregated into a summary. Development and maintenance for the compliance checker is done by the Integrated Ocean Observing System (IOOS). 

cfchecker 	 : The NetCDF Climate Forcast Conventions compliance checker by CEDA. This process allows you to run the compliance checker to check that the contents of a NetCDF file comply 

### PAVICS/Hummingbird has lots of WPS services
### Let's keep it simple with 'ncdump'  
* Print info on WPS inputs needed

In [16]:
# ncdump
proc_name = 'ncdump'
process = wps.describeprocess(proc_name) # get process info
print(process.abstract)
print("Inputs:")
for inputs in process.dataInputs:
    print(' * ', inputs.identifier)

Run ncdump to retrieve NetCDF header metadata.
Inputs:
 *  dataset
 *  dataset_opendap


#### The only input we need is a dataset URL or its OPeNDAP link
* A simple way to find a test dataset is to browse https://pavics.ouranos.ca/thredds

* Note: PAVICS also has a catalogue WPS, which is covered in later examples

In [17]:
# Example netcdf url to NRCAN daily - tasmin 2013
nc_url = 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/nrcan/nrcan_canada_daily/tasmin/nrcan_canada_daily_tasmin_2013.nc'
print(nc_url)

https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/nrcan/nrcan_canada_daily/tasmin/nrcan_canada_daily_tasmin_2013.nc


#### Create WPS input - Python list

In [18]:
myinputs = []
myinputs.append(('dataset_opendap', nc_url))  # OPeNDAP link of a single NetCDF file (e.g. from a catalogue search) to run ncdump on
print(myinputs)

[('dataset_opendap', 'https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/nrcan/nrcan_canada_daily/tasmin/nrcan_canada_daily_tasmin_2013.nc')]
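
Since `ncdump` accepts either `dataset` or `dataset_opendap`, the choice of identifier can be automated. The following is a minimal sketch, assuming that THREDDS OPeNDAP endpoints can be recognised by `/dodsC/` in their path; `build_ncdump_inputs` is a hypothetical helper, not part of PAVICS or OWSLib.

```python
def build_ncdump_inputs(url):
    """Return the WPS input list for ncdump, picking the identifier from the URL."""
    # Assumption: THREDDS OPeNDAP endpoints contain '/dodsC/' in their path.
    identifier = "dataset_opendap" if "/dodsC/" in url else "dataset"
    return [(identifier, url)]

nc_url = ("https://pavics.ouranos.ca/twitcher/ows/proxy/thredds/dodsC/birdhouse/"
          "nrcan/nrcan_canada_daily/tasmin/nrcan_canada_daily_tasmin_2013.nc")
print(build_ncdump_inputs(nc_url))
```

The result has the same shape as `myinputs`: a list of `(identifier, value)` tuples.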


#### Execute the WPS
The execution is asynchronous, meaning that it does not automatically return the output. The response of the server is only a message saying that the request was accepted. 
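
Because the request is asynchronous, a client usually polls until the job finishes. The sketch below shows one way to do that with a bounded loop. It relies only on the `isComplete()` and `checkStatus(sleepSecs=...)` methods and the `status` attribute of the execution object returned by OWSLib; `wait_for_completion` and `FakeExecution` are hypothetical names, and the fake object exists only so the example runs without a server.

```python
def wait_for_completion(execution, sleep_secs=3, max_polls=20):
    """Poll a WPS execution until it completes or max_polls is reached."""
    polls = 0
    while not execution.isComplete() and polls < max_polls:
        # checkStatus re-reads the status document and updates execution.status
        execution.checkStatus(sleepSecs=sleep_secs)
        polls += 1
    return execution.status


class FakeExecution:
    """Offline stand-in for a real execution object (illustration only)."""

    def __init__(self, polls_needed):
        self.status = "ProcessAccepted"
        self._remaining = polls_needed

    def isComplete(self):
        return self.status in ("ProcessSucceeded", "ProcessFailed", "Exception")

    def checkStatus(self, sleepSecs=0):
        self._remaining -= 1
        if self._remaining <= 0:
            self.status = "ProcessSucceeded"


print(wait_for_completion(FakeExecution(polls_needed=2), sleep_secs=0))
```

With a real service, the object returned by `wps.execute(...)` would be passed in place of `FakeExecution`. The `max_polls` bound keeps a stalled job from blocking the notebook indefinitely.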

In [19]:
print(proc_name)
execute = wps.execute(proc_name, myinputs)

ncdump


In [20]:
from lxml import etree
print(etree.tostring(execute.response).decode())

<wps:ExecuteResponse xmlns:gml="http://www.opengis.net/gml" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsExecute_response.xsd" service="WPS" version="1.0.0" xml:lang="en-US" serviceInstance="https://pavics.ouranos.ca:443/wps?service=WPS&amp;request=GetCapabilities" statusLocation="https://pavics.ouranos.ca:443/wpsoutputs/hummingbird/6ca0018a-bb58-11e8-91d6-0242ac12000d.xml">
  <wps:Process wps:processVersion="4.4.1.1">
    <ows:Identifier>ncdump</ows:Identifier>
    <ows:Title>NCDump</ows:Title>
    <ows:Abstract>Run ncdump to retrieve NetCDF header metadata.</ows:Abstract>
  </wps:Process>
  <wps:Status creationTime="2018-09-18T15:35:09Z">
    <wps:ProcessSucceeded>PyWPS Process NCDump finished</wps:ProcessSucceeded>
  </wps:Status>
  <wps:ProcessOutputs>


#### Get the result
To actually parse the output, we must first make sure that the process completed server-side. 
`execute.checkStatus()` will poll the server and update its response. 

In [21]:
execute.checkStatus()
print("Status: ", execute.status)
print(execute.statusLocation)

Status:  ProcessSucceeded
https://pavics.ouranos.ca:443/wpsoutputs/hummingbird/6ca0018a-bb58-11e8-91d6-0242ac12000d.xml


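Then we can check the actual output of the process, stored as a list in the `processOutputs` attribute. When the output is a reference to a file, its URL is available through the `reference` attribute, and the `retrieveData` method lets us fetch the content of the file. In this notebook the reference is read from the status document with `parseStatus` and downloaded with `requests`; the equivalent OWSLib calls are kept as comments.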
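
`parseStatus` locates the reference by position, which depends on the exact layout of the status document. A namespace-aware lookup is less sensitive to that layout. The sketch below uses the standard-library `xml.etree.ElementTree` so that it runs without a server. The sample document is illustrative only: its output identifier and URL are made up and are not the actual Hummingbird response.

```python
import xml.etree.ElementTree as ET

NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Illustrative status document; identifier and URL are hypothetical.
SAMPLE = """<wps:ExecuteResponse xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <wps:ProcessOutputs>
    <wps:Output>
      <ows:Identifier>output</ows:Identifier>
      <wps:Reference xlink:href="https://example.org/wpsoutputs/nc_dump.txt" mimeType="text/plain"/>
    </wps:Output>
  </wps:ProcessOutputs>
</wps:ExecuteResponse>"""

def output_references(xml_text):
    """Map each output identifier to the href of its Reference element."""
    root = ET.fromstring(xml_text)
    refs = {}
    for output in root.findall("wps:ProcessOutputs/wps:Output", NS):
        identifier = output.findtext("ows:Identifier", namespaces=NS)
        reference = output.find("wps:Reference", NS)
        if reference is not None:
            # Fall back to a plain 'href' in case a server omits the xlink namespace.
            refs[identifier] = reference.get(XLINK_HREF) or reference.get("href")
    return refs

print(output_references(SAMPLE))
```

Against a live job, the text fetched from `execute.statusLocation` would replace `SAMPLE`.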

In [23]:
ref = parseStatus(execute)
print('Output reference :\n*', ref)

r = requests.get(ref)
print('\nNCDUMP results :\n',r.text)

#out = execute.processOutputs[0]
#print("Output reference: ", out.reference)
#data = out.retrieveData()
#print("Data: ", data.decode())

Output reference :
* https://pavics.ouranos.ca:443/wpsoutputs/hummingbird/6ca0018a-bb58-11e8-91d6-0242ac12000d/nc_dump_8MSe6y.txt

NCDUMP results :
 netcdf nrcan_canada_daily_tasmin_2013.nc {
dimensions:
	time = UNLIMITED ; // (365 currently)
	lat = 510 ;
	lon = 1068 ;
	ts = 3 ;
variables:
	float lon(lon) ;
		lon:units = "degrees_east" ;
		lon:long_name = "longitude" ;
		lon:standard_name = "longitude" ;
		lon:axis = "X" ;
		lon:_ChunkSizes = 1068 ;
	float lat(lat) ;
		lat:axis = "Y" ;
		lat:units = "degrees_north" ;
		lat:long_name = "latitude" ;
		lat:standard_name = "latitude" ;
		lat:_ChunkSizes = 510 ;
	short ts(ts) ;
		ts:_FillValue = -32767s ;
		ts:_ChunkSizes = 3 ;
	short time(time) ;
		time:axis = "T" ;
		time:units = "days since 1950-01-01 00:00:00" ;
		time:long_name = "time" ;
		time:standard_name = "time" ;
		time:calendar = "gregorian" ;
		time:_ChunkSizes = 1 ;
	short time_vectors(time, ts) ;
		time_vectors:_ChunkSizes = 1, 3 ;
	float tasmin(time, lat, lon) ;
		tasmin:lo