# Example of DOV search methods for groundwater samples (grondwatermonsters)


## Use cases:
* Get groundwater samples in a bounding box
* Get groundwater samples with specific properties
* Get the coordinates of all groundwater samples in Ghent
* Get groundwater samples based on a combination of specific properties
* Get groundwater samples based on a selection of screens (filters)

In [1]:
%matplotlib inline
import inspect, sys

In [None]:
# check that pydov can be imported from the current path
import pydov

## Get information about the datatype 'GrondwaterMonster'

In [None]:
from pydov.search.grondwatermonster import GrondwaterMonsterSearch
gwmonster = GrondwaterMonsterSearch()

A description is provided for the 'GrondwaterMonster' datatype:

In [None]:
print(gwmonster.get_description())

The different fields that are available for objects of the 'GrondwaterMonster' datatype can be requested with the get_fields() method:

In [None]:
fields = gwmonster.get_fields()

# print available fields
for f in fields.values():
    print(f['name'])

You can get more information about a field by requesting it from the fields dictionary:
* *name*: name of the field
* *definition*: definition of this field
* *cost*: currently either 1 or 10, depending on the data source of the field. It indicates the expected time needed to retrieve this field in the output dataframe.
* *notnull*: whether the field is mandatory or not
* *type*: datatype of the values of this field

In [None]:
# print information for a certain field
fields['waarde']
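The metadata keys listed above can also be used to select fields programmatically, for example to separate cheap fields from expensive ones before composing `return_fields`. The sketch below uses a small hand-written dictionary (`sample_fields`) as a stand-in for the result of `get_fields()`; its entries, costs and types are invented for illustration, and `fields_by_cost` is a hypothetical helper, not part of pydov.

```python
# Hypothetical stand-in for gwmonster.get_fields(); names mirror fields used in
# this notebook, but the cost/notnull/type values are invented for illustration.
sample_fields = {
    'pkey_grondwatermonster': {'name': 'pkey_grondwatermonster', 'cost': 1,
                               'notnull': True, 'type': 'string'},
    'gemeente': {'name': 'gemeente', 'cost': 1,
                 'notnull': True, 'type': 'string'},
    'parameter': {'name': 'parameter', 'cost': 10,
                  'notnull': False, 'type': 'string'},
    'waarde': {'name': 'waarde', 'cost': 10,
               'notnull': False, 'type': 'float'},
}


def fields_by_cost(fields, cost):
    """Return the sorted names of all fields with the given cost."""
    return sorted(f['name'] for f in fields.values() if f['cost'] == cost)


cheap_fields = fields_by_cost(sample_fields, 1)
expensive_fields = fields_by_cost(sample_fields, 10)

print('cost 1 :', cheap_fields)
print('cost 10:', expensive_fields)
```

With the real dictionary, the same helper could be called as `fields_by_cost(fields, 1)`.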

If the values of a field are restricted to a specific domain, the possible values are listed under *values*:

In [None]:
# if an attribute can have several values, these are listed under 'values', e.g. for 'parameter'
# (dict.items() is a view in Python 3, so convert it to a list before slicing)
list(fields['parameter']['values'].items())[0:10]

In [None]:
fields['parameter']['values']['NH4']

## Example use cases

### Get groundwater samples in a bounding box

Get data for all the groundwater samples that are geographically located within the bounds of the specified box.

The coordinates are in the Belgian Lambert72 (EPSG:31370) coordinate system and are given in the order of lower left x, lower left y, upper right x, upper right y.

In [None]:
from pydov.util.location import Within, Box

df = gwmonster.search(location=Within(Box(93378, 168009, 94246, 169873)))
df.head()
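When only a point of interest is known, the four bounds can be derived from its Lambert72 coordinates and a search distance in metres. The helper below is a minimal sketch; `box_around` is a hypothetical function written for this notebook, and the centre point is simply the middle of the box used above.

```python
def box_around(x, y, half_width):
    """Return (minx, miny, maxx, maxy) of a square centred on (x, y).

    Coordinates and half_width are expected in metres (EPSG:31370).
    """
    if half_width <= 0:
        raise ValueError('half_width must be positive')
    return (x - half_width, y - half_width, x + half_width, y + half_width)


# square of 1 km by 1 km around the centre of the box used above
bounds = box_around(93812, 168941, 500)
print(bounds)
```

The resulting tuple follows the same order as described above, so it could be unpacked into the location filter as `Within(Box(*bounds))`.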

Using the *pkey* attributes one can request the details of the corresponding *grondwatermonster* in a web browser (only the first five unique records are shown):

In [None]:
for pkey_grondwatermonster in df.pkey_grondwatermonster.unique()[0:5]:
    print(pkey_grondwatermonster)

### Get groundwater samples with specific properties

Besides querying groundwater samples based on their geographic location within a bounding box, we can also search for groundwater samples matching a specific set of properties. For this we can build a query using a combination of the 'GrondwaterMonster' fields and operators provided by the WFS protocol.

A list of possible operators can be found below:

In [None]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes'], inspect.isclass) if 'Property' in i]

In this example we build a query using the *PropertyIsEqualTo* operator to find all groundwater samples that are located in the municipality (gemeente) of 'Leuven':

In [None]:
from owslib.fes import PropertyIsEqualTo

query = PropertyIsEqualTo(
            propertyname='gemeente',
            literal='Leuven')

df = gwmonster.search(query=query)
df.head()

Once again we can use the *pkey_grondwatermonster* as a permanent link to the information of the groundwater samples:

In [None]:
for pkey_grondwatermonster in df.pkey_grondwatermonster.unique()[0:5]:
    print(pkey_grondwatermonster)

We can add the descriptions of the parameter values as an extra column 'parameter_label':

In [None]:
df['parameter_label'] = df['parameter'].map(fields['parameter']['values'])
df[['pkey_grondwatermonster', 'datum_monstername', 'parameter', 'parameter_label', 'waarde', 'eenheid']].head()
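Mapping a column through a dictionary leaves a missing value for every code that has no entry in the dictionary. The sketch below shows this on a toy dataframe and falls back to the raw code in that case. The `labels` dictionary is a hypothetical stand-in for `fields['parameter']['values']`, and 'XYZ' is an invented code.

```python
import pandas as pd

# Hypothetical code-to-label lookup, standing in for fields['parameter']['values']
labels = {'NH4': 'Ammonium', 'Na': 'Natrium'}

# Toy data; 'XYZ' is an invented code without a label
samples = pd.DataFrame({'parameter': ['NH4', 'Na', 'XYZ']})

# codes missing from the dictionary become NaN after map()
mapped = samples['parameter'].map(labels)
unlabelled = mapped.isna().tolist()

# fall back to the raw parameter code where no label is available
samples['parameter_label'] = mapped.fillna(samples['parameter'])
print(samples)
```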

### Get groundwater samples based on a combination of specific properties

Get all groundwater samples in Hamme that have measurements for cations (kationen). After fetching all records, filter the result to keep only the sodium (Na) values.

In [None]:
from owslib.fes import And, PropertyIsEqualTo, PropertyIsLike

query = And([PropertyIsEqualTo(propertyname='gemeente',
                               literal='Hamme'),
             PropertyIsEqualTo(propertyname='kationen',
                               literal='true')
             ])
df_hamme = gwmonster.search(query=query,
                     return_fields=('pkey_grondwatermonster', 'parameter', 'parametergroep', 'waarde', 'eenheid','datum_monstername'))
df_hamme.head()

Note that this initial dataframe contains all parameters, not just the cations. The query only ensures that the result is limited to samples in which at least one cation was analysed. To narrow the selection further, filter the resulting dataframe.

In [None]:
df_hamme = df_hamme[df_hamme.parameter=='Na']
df_hamme.head()
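The same kind of post-filtering extends to several parameters at once with `isin`. The sketch below runs on a small toy dataframe with invented values, so it does not depend on the DOV service. The list of cation codes is an assumption chosen for illustration.

```python
import pandas as pd

# Toy stand-in for a search result; the values are invented
toy = pd.DataFrame({
    'parameter': ['Na', 'K', 'Cl', 'Na', 'Ca'],
    'waarde': [35.0, 4.2, 50.0, 41.0, 80.0],
})

# keep only the rows whose parameter is one of the selected cations
cations = ['Na', 'K', 'Ca']
selection = toy[toy['parameter'].isin(cations)]
print(selection)
```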

## Working with water samples

For further analysis and visualisation of the time series data, we can use the data analysis library [pandas](https://pandas.pydata.org/) and visualisation library [matplotlib](https://matplotlib.org/). 

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

Query the data of a specific filter using its `pkey`:

In [None]:
query = PropertyIsEqualTo(
            propertyname='pkey_filter',
            literal='https://www.dov.vlaanderen.be/data/filter/1991-001040')

df = gwmonster.search(query=query)
df.head()

The date is still stored as a string type. Convert it to a datetime type with the pandas function [`to_datetime`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html), so that the dates can later be used as the row index:

In [None]:
df['datum_monstername'] = pd.to_datetime(df['datum_monstername'])
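If some date strings cannot be parsed, `to_datetime` raises an error by default. The sketch below, on invented toy strings, shows how `errors='coerce'` turns such entries into `NaT` instead. The ISO date format used here is an assumption for the example, not a statement about the format returned by DOV.

```python
import pandas as pd

# Toy date strings; the last entry is deliberately invalid
raw_dates = pd.Series(['2010-03-15', '2011-07-01', 'unknown'])

# unparseable entries become NaT instead of raising an exception
dates = pd.to_datetime(raw_dates, format='%Y-%m-%d', errors='coerce')

print(dates.dtype)
print('unparseable dates:', int(dates.isna().sum()))
```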

For many use cases, it is useful to create a pivoted table showing the value per parameter:

In [None]:
pivot = df.pivot_table(index='datum_monstername', columns='parameter', values='waarde')
pivot
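One detail to keep in mind is that `pivot_table` aggregates duplicate entries: when a parameter occurs more than once on the same date, the values are averaged by default. The sketch below shows this on invented toy data and shows how another aggregation can be chosen with `aggfunc`.

```python
import pandas as pd

# Toy data: NH4 was measured twice on 2010-01-01 (values are invented)
toy = pd.DataFrame({
    'datum_monstername': pd.to_datetime(
        ['2010-01-01', '2010-01-01', '2010-01-01', '2011-01-01']),
    'parameter': ['NH4', 'NH4', 'NO3', 'NH4'],
    'waarde': [0.2, 0.4, 10.0, 0.5],
})

# default aggregation: the mean of duplicate (date, parameter) entries
pivot_mean = toy.pivot_table(index='datum_monstername', columns='parameter',
                             values='waarde')

# alternative aggregation: keep the maximum instead
pivot_max = toy.pivot_table(index='datum_monstername', columns='parameter',
                            values='waarde', aggfunc='max')

print(pivot_mean)
print(pivot_max)
```

Combinations without a measurement, such as NO3 on the second date in this toy data, show up as `NaN` in the pivoted table.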

### Plotting

The default plotting functionality of Pandas can be used:

In [None]:
parameters = ['NO3', 'NO2', 'NH4']
pivot[parameters].plot(style='-', figsize=(12, 5))

## Combine search in filters and groundwater samples

In this example we first search for filters and then retrieve all samples that belong to that selection.
We select the filters of the primary monitoring network (meetnet 1) located in Kalmthout.

In [None]:
from pydov.search.grondwaterfilter import GrondwaterFilterSearch
from pydov.util.query import Join

gfs = GrondwaterFilterSearch()

gemeente = 'Kalmthout'
filter_query = And([PropertyIsLike(propertyname='meetnet',
                       literal='meetnet 1 %'),
                    PropertyIsEqualTo(propertyname='gemeente',
                       literal=gemeente)])

filters = gfs.search(query=filter_query, return_fields=['pkey_filter'])

monsters = gwmonster.search(query=Join(filters, 'pkey_filter'))
monsters.head()

We select a single parameter, keep only the laboratory measurements, and show the trend per location.

In [None]:
parameter = 'NH4'
# keep only the laboratory measurements of the selected parameter
trends_sel = monsters[(monsters.parameter == parameter) & (monsters.veld_labo == 'LABO')]
trends_sel = trends_sel.set_index('datum_monstername')
trends_sel['label'] = trends_sel['gw_id'] + ' F' + trends_sel['filternummer']

# one line per filter; the result is a Series of axes indexed by label
ax = trends_sel.groupby('label')['waarde'].plot(figsize=(12, 5))

plt.title('Long-term trends of %s in monitoring network 1 in %s' % (parameter, gemeente))

# Put a legend to the right of the current axis
# (use .iloc for positional access, since the Series is indexed by label)
box = ax.iloc[0].get_position()
ax.iloc[0].set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.iloc[0].legend(loc='center left', bbox_to_anchor=(1, 0.5))