# Getting Data from DFO Water Levels Service

Example code to access the DFO Water Levels web service,
and information about what data are available there.

Service description: http://www.tides.gc.ca/eng/info/WebServicesWLD

Technical specification (PDF): http://www.tides.gc.ca/docs/Specifications%20-%20Spine%20observation%20and%20predictions%202.0.3%28en%29.pdf

The service uses the [SOAP](https://en.wikipedia.org/wiki/SOAP) protocol.
There are several Python libraries for working with SOAP services.
The one used here is [suds-jurko](https://bitbucket.org/jurko/suds),
a Python 3 port of the (apparently) no longer maintained [suds](http://fedorahosted.org/suds) library.
Docs for `suds` and `suds-jurko` are at https://fedorahosted.org/suds/wiki/Documentation.

You can install `suds-jurko` from the IOOS conda channel with:

    $ conda install -c https://conda.anaconda.org/ioos suds-jurko

or with `pip`:


    $ pip install suds-jurko


In [1]:
from suds.client import Client

The service has 3 entry point URLs:

* SPINE (St. Lawrence water level forecast & interpolation): https://ws-shc.qc.dfo-mpo.gc.ca/spine
* Observations: https://ws-shc.qc.dfo-mpo.gc.ca/observations
* Tide table predictions: https://ws-shc.qc.dfo-mpo.gc.ca/predictions

We're interested in the observations.

In [2]:
obs_url = 'https://ws-shc.qc.dfo-mpo.gc.ca/observations'

We construct a client for the service using the entry point URL
with the query string `wsdl`.
[WSDL](https://en.wikipedia.org/wiki/Web_Services_Description_Language) stands for Web Service Description Language.
A request to the entry point URL with `?wsdl` appended causes the service to return
a description of the data that the service provides and the methods that it provides to
access those data.
The `suds.client.Client` constructor uses that information to build a client object for us
that provides the methods that the service supports.
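As a small sketch of the URL convention described above, the snippet below builds the WSDL URL for each entry point. The `ENTRY_POINTS` dict and the `wsdl_url()` helper are names made up for this illustration, and only the observations entry point is actually exercised in this document; that the other two respond to `?wsdl` in the same way is an assumption.

```python
# Entry point URLs as listed earlier in this document
ENTRY_POINTS = {
    'spine': 'https://ws-shc.qc.dfo-mpo.gc.ca/spine',
    'observations': 'https://ws-shc.qc.dfo-mpo.gc.ca/observations',
    'predictions': 'https://ws-shc.qc.dfo-mpo.gc.ca/predictions',
}


def wsdl_url(entry_point_url):
    """Return the URL that asks a service for its WSDL description.

    Hypothetical helper: it only appends the ``wsdl`` query string.
    """
    return entry_point_url + '?wsdl'


wsdl_urls = {name: wsdl_url(url) for name, url in ENTRY_POINTS.items()}
for name, url in wsdl_urls.items():
    print(name, url)
```

Any of these URLs could then be passed to `suds.client.Client()`, as is done for the observations service in the next cell.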

In [3]:
client = Client(obs_url + '?wsdl')
print(client)


Suds ( https://fedorahosted.org/suds/ )  version: 0.6

Service ( ObservationsService ) tns="http://client.ws.shc.gc.ca"
   Prefixes (3)
      ns0 = "http://client.ws.shc.gc.ca"
      ns1 = "http://schemas.xmlsoap.org/soap/encoding/"
      ns2 = "http://wds.dfo.gc.ca"
   Ports (1):
      (observations)
         Methods (13):
            getBoundaryDate()
            getBoundaryDepth()
            getBoundarySpatial()
            getDataInfo()
            getInfo()
            getLocalizedString(xs:string in0, ArrayOf_xsd_anyType in1)
            getMetadata()
            getMetadataInfo()
            getName()
            getResourceBundle()
            getStatus()
            getVersion()
            search(xs:string dataName, xs:double latitudeMin, xs:double latitudeMax, xs:double longitudeMin, xs:double longitudeMax, xs:double depthMin, xs:double depthMax, xs:string dateMin, xs:string dateMax, xs:int start, xs:int sizeMax, xs:boolean metadata, xs:string metadataSelection, xs:string 

Now we can ask the service to tell us about the data it has available:

In [4]:
client.service.getDataInfo()

[(Metadata){
    name = "wl"
    value = "Water levels at the SINECO stations"
  }, (Metadata){
    name = "sal"
    value = "Salinity at the SINECO stations"
  }, (Metadata){
    name = "temp"
    value = "Temperature at the SINECO stations"
  }, (Metadata){
    name = "atm_pres"
    value = "Atmospheric pressure at the SINECO stations"
  }]

**Spoiler Alert:** The salinity, temperature, and atmospheric pressure data items look exciting
but it turns out that they are not available for the BC stations :-(

We can also ask the service for its metadata:

In [5]:
metadata = client.service.getMetadata()
print(metadata)

[(Metadata){
   name = "station_id_list"
   value = "01970,02330,02780,02985,03057,03075,03100,03110,03248,03280,03300,03335,03345,03353,03360,03365,03424,07277,07594,07654,07735,07786,07917,08408,08976,09338,09341,09348,09354,09850,15330,15520,15540,15660,15780,15930,15975,16005"
 }, (Metadata){
   name = "station_id_position"
   value = "[01970,47.378861,-61.857293][02330,48.997,-64.3805][02780,50.194833,-66.376833][02985,48.478333,-68.513667][03057,47.448833,-70.3655][03075,47.0895,-70.710833][03100,46.9965,-70.808167][03110,46.858167,-71.003333][03248,46.811111,-71.201944][03280,46.6965,-71.572833][03300,46.68116667,-71.877167][03335,46.561,-72.105833][03345,46.500333,-72.245833][03353,46.400333,-72.3795][03360,46.3405,-72.539167][03365,46.2725,-72.619333][03424,48.126333,-69.72975][07277,48.653601,-123.451646][07594,49.105896,-123.303368][07654,49.200077,-122.910377][07735,49.289554,-123.107339][07786,49.340556,-123.231978][07917,49.162788,-123.923523][08408,50.722505,-127.488214]

`getMetadata()` returns a list of `Metadata` objects,
each of which has `name` and `value` attributes.

Let's reduce the metadata to each `Metadata` object's
index in the list and its `name`:

In [6]:
for index, item in enumerate(metadata):
    print(index, item.name)

0 station_id_list
1 station_id_position
2 contact
3 language
4 name
5 abstract
6 reference_date
7 station_id_name_list
8 metadata_selection_accepted
9 max_rows
10 total_nbr_obs


Now let's extract the station ids and names of the BC stations:

In [8]:
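from collections import namedtuple

Station = namedtuple('Station', 'id, name')

# Item 7 of the metadata list is station_id_name_list (see the listing above)
stn_id_names = metadata[7]
# The value looks like '[id;;name][id;;name]...'; strip the outer brackets,
# split on '][', and keep only the BC stations
bc_stns = [stn for stn in stn_id_names.value[1:-1].split('][') if '(BC)' in stn]
print(bc_stns)
# Split each entry into id and name, dropping the trailing ' (BC)' from the name
stns = [Station(stn.split(';;')[0], stn.split(';;')[1][:-5]) for stn in bc_stns]
stns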

['07277;;Patricia Bay (BC)', '07594;;SandHeads (BC)', '07654;;New Westminster (BC)', '07735;;Vancouver (BC)', '07786;;Sandy Cove (BC)', '07917;;Port of Nanaimo (BC)', '08408;;Port Hardy 8310 (BC)', '08976;;Bella Bella (BC)', '09338;;PRPA - Aero Trading (BC)', '09341;;Porpoise Channel East (BC)', '09348;;PRPA - Fairview Terminal (BC)', '09354;;Prince Rupert (BC)', '09850;;Queen Charlotte City (BC)']


[Station(id='07277', name='Patricia Bay'),
 Station(id='07594', name='SandHeads'),
 Station(id='07654', name='New Westminster'),
 Station(id='07735', name='Vancouver'),
 Station(id='07786', name='Sandy Cove'),
 Station(id='07917', name='Port of Nanaimo'),
 Station(id='08408', name='Port Hardy 8310'),
 Station(id='08976', name='Bella Bella'),
 Station(id='09338', name='PRPA - Aero Trading'),
 Station(id='09341', name='Porpoise Channel East'),
 Station(id='09348', name='PRPA - Fairview Terminal'),
 Station(id='09354', name='Prince Rupert'),
 Station(id='09850', name='Queen Charlotte City')]
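The `station_id_position` metadata item shown earlier uses a similar bracketed layout, `[id,latitude,longitude]`, so the station positions can be extracted in much the same way. The following is a sketch under that assumption; `parse_positions()` is a hypothetical helper, and the sample string is a short excerpt of the value printed above.

```python
import re

# One bracketed group: [station id, latitude, longitude]
POSITION_RE = re.compile(r'\[([^,\]]+),([^,\]]+),([^,\]]+)\]')


def parse_positions(value):
    """Return a dict mapping station id to a (latitude, longitude) tuple."""
    return {
        stn_id: (float(lat), float(lon))
        for stn_id, lat, lon in POSITION_RE.findall(value)
    }


# Excerpt of the station_id_position value printed by getMetadata()
sample = '[07277,48.653601,-123.451646][07735,49.289554,-123.107339]'
positions = parse_positions(sample)
print(positions['07735'])
```

Station ids are kept as strings so that leading zeros such as the one in `07735` are preserved.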

## Getting Data

The `search()` method is how we get data from the service.
Its call signature was printed above when we printed the `client`.
Here it is again with some linebreaks added to improve readability:
```
search(
    xs:string dataName,
    xs:double latitudeMin, xs:double latitudeMax,
    xs:double longitudeMin, xs:double longitudeMax,
    xs:double depthMin, xs:double depthMax,
    xs:string dateMin, xs:string dateMax,
    xs:int start, xs:int sizeMax,
    xs:boolean metadata,
    xs:string metadataSelection,
    xs:string order,
)
```

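So, to get the water level at Vancouver (station id 07735) at 18:00 on 3-Nov-2015,
we search the hour ending at 18:00 in descending order and limit the result to a single item: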

In [9]:
client.service.search(
    'wl', 
    -90, 90, 
    -180, 180, 
    0, 0, 
    '2015-11-03 17:00:00', '2015-11-03 18:00:00', 
    1, 1, 
    True, 
    'station_id=07735', 
    'desc',
)

(ResultSet){
   boundaryDate = 
      (BoundaryDate){
         max = "2015-11-03 18:00:00"
         min = "2015-11-03 18:00:00"
      }
   boundaryDepth = 
      (BoundaryDepth){
         max = 0.0
         min = 0.0
      }
   boundarySpatial = 
      (BoundarySpatial){
         latitude = 
            (RealBoundary){
               max = 49.289554
               min = 49.289554
            }
         longitude = 
            (RealBoundary){
               max = -123.107339
               min = -123.107339
            }
      }
   boundaryValue = 
      (StringBoundary){
         max = "4.08"
         min = "4.08"
      }
   data[] = 
      (Data){
         boundaryDate = 
            (BoundaryDate){
               max = "2015-11-03 18:00:00"
               min = "2015-11-03 18:00:00"
            }
         boundaryDepth = 
            (BoundaryDepth){
               max = 0.0
               min = 0.0
            }
         metadata[] = 
            (Metadata){
               name = "

Let's do that again but catch the search result in a variable so that we can pick it apart:

In [10]:
result = client.service.search(
    'wl', -90, 90, -180, 180, 0, 0, 
    '2015-11-03 17:00:00', '2015-11-03 18:00:00', 
    1, 1, True, 'station_id=07735', 'desc')
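Fourteen positional arguments are easy to get out of order, so it may help to wrap them in a small function with named parameters. The sketch below is only an illustration: `build_search_args()` is a hypothetical helper, not part of `suds` or the service, and its defaults simply repeat the values used in the call above.

```python
def build_search_args(station_id, date_min, date_max, data_name='wl',
                      start=1, size_max=1, order='desc'):
    """Return the positional arguments for search() in signature order."""
    return (
        data_name,
        -90, 90,        # latitudeMin, latitudeMax
        -180, 180,      # longitudeMin, longitudeMax
        0, 0,           # depthMin, depthMax
        date_min, date_max,
        start, size_max,
        True,           # metadata
        'station_id=' + station_id,  # metadataSelection
        order,
    )


args = build_search_args(
    '07735', '2015-11-03 17:00:00', '2015-11-03 18:00:00')
print(args)
# With a live client: result = client.service.search(*args)
```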

Summarizing the attribute names in `result`
(and ignoring the "dunder" attributes):

In [11]:
[attr for attr in dir(result) if not attr.startswith('__')]

['boundaryDate',
 'boundaryDepth',
 'boundarySpatial',
 'boundaryValue',
 'data',
 'metadata',
 'size',
 'status']

The `boundary*` attributes provide information about the limits on the data.
`metadata` is metadata about the processing of our search request.
`size` is the number of items that were returned.
`status` is the service status.
So, the only really interesting thing is the `data` attribute.

`result.data` is a list of `Data` objects that contains `result.size` items.
The attribute names of each `Data` object
(again ignoring "dunder" attributes) are:

In [12]:
[attr for attr in dir(result.data[0]) if not attr.startswith('__')]

['boundaryDate', 'boundaryDepth', 'metadata', 'spatialCoordinates', 'value']

So, the water level at Vancouver (station id 07735) at 18:00 on 3-Nov-2015 was:

In [13]:
result.data[0].value

4.08
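When a search returns more than one item, the same attributes can be collected into a list of (time, value) pairs. The sketch below assumes that each `Data` object's `boundaryDate.max` holds the observation time in the `YYYY-MM-DD HH:MM:SS` format shown above, and that `value` can be converted with `float()`. `observations()` is a hypothetical helper, and the `SimpleNamespace` objects stand in for a real search result so that the example runs offline.

```python
from datetime import datetime
from types import SimpleNamespace


def observations(result):
    """Return a list of (datetime, float) pairs from a search result."""
    pairs = []
    for item in result.data:
        # Assumes min and max are equal for a single observation
        when = datetime.strptime(item.boundaryDate.max, '%Y-%m-%d %H:%M:%S')
        pairs.append((when, float(item.value)))
    return pairs


# Stand-in for the ResultSet shown above, reduced to the attributes used here
sample_result = SimpleNamespace(
    size=1,
    data=[
        SimpleNamespace(
            boundaryDate=SimpleNamespace(
                min='2015-11-03 18:00:00', max='2015-11-03 18:00:00'),
            value='4.08',
        ),
    ],
)

print(observations(sample_result))
```

The resulting `datetime` objects carry no time zone information; which time zone the service uses is not established in this document.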