# Example of DOV search methods for observations (observaties)

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/DOV-Vlaanderen/pydov/master?filepath=docs%2Fnotebooks%2Fsearch_observaties.ipynb)

## Use cases explained below
* Get observations in a bounding box
* Get observations with specific properties
* Get observations in a bounding box based on specific properties
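* Select observations with specific conditions and return the results
* Get observations based on fields not available in the standard output dataframe
* Select observations with extra details
* Get observations with data from the subtype 'ObservatieHerhaling'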

In [1]:
import os
os.environ['PYDOV_BASE_URL'] = 'https://oefen.dov.vlaanderen.be/'

In [2]:
%matplotlib inline
import inspect, sys
import warnings; warnings.simplefilter('ignore')

In [3]:
# check pydov path
import pydov

## Get information about the datatype 'Observatie'

In [4]:
from pydov.search.observatie import ObservatieSearch
observatie = ObservatieSearch()

A description is provided for the 'Observatie' datatype:

In [5]:
observatie.get_description()

'DIT IS EEN TEST'

The different fields that are available for objects of the 'Observatie' datatype can be requested with the get_fields() method:

In [6]:
fields = observatie.get_fields()

# print available fields
for f in fields.values():
    print(f['name'])

id
pkey_observatie
pkey_parent
parameter
parametergroep
observatietype
detectieconditie
resultaat
eenheid
fenomeentijd
resultaattijd
methode
uitvoerder
diepte_van_m
diepte_tot_m
herkomst
opmerking
opdracht
geom


You can get more information about a field by requesting it from the fields dictionary:
* *name*: name of the field
* *definition*: definition of this field
* *cost*: currently this is either 1 or 10, depending on the datasource of the field. It is an indication of the expected time it will take to retrieve this field in the output dataframe.
* *notnull*: whether the field is mandatory or not
* *type*: datatype of the values of this field

In [7]:
fields['diepte_van_m']

{'name': 'diepte_van_m',
 'definition': None,
 'type': 'float',
 'list': False,
 'notnull': False,
 'query': True,
 'cost': 1}
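
The same metadata can be filtered programmatically, for instance to list only the inexpensive fields before choosing output columns. The following is a minimal sketch, assuming a dictionary shaped like the one returned by `get_fields()`. The `sample_fields` entries other than `diepte_van_m` are hypothetical stand-ins, and `cheap_field_names` is an illustrative helper, not part of pydov:

```python
# Hand-written stand-in for the dictionary returned by observatie.get_fields().
# Only 'diepte_van_m' is taken from the output above; the other two entries
# are hypothetical and exist purely to illustrate the filtering.
sample_fields = {
    'diepte_van_m': {'name': 'diepte_van_m', 'type': 'float',
                     'notnull': False, 'query': True, 'cost': 1},
    'hypothetical_cheap_field': {'name': 'hypothetical_cheap_field',
                                 'type': 'string', 'notnull': False,
                                 'query': True, 'cost': 1},
    'hypothetical_expensive_field': {'name': 'hypothetical_expensive_field',
                                     'type': 'string', 'notnull': False,
                                     'query': False, 'cost': 10},
}


def cheap_field_names(fields, max_cost=1):
    """Return the names of all fields whose cost does not exceed max_cost."""
    return [f['name'] for f in fields.values() if f['cost'] <= max_cost]


print(cheap_field_names(sample_fields))
```

Passing the real `fields` dictionary instead of `sample_fields` should work the same way, as long as every entry carries a `cost` key.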

## Example use cases

### Get observations in a bounding box

Get data for all the observations that are geographically located within the bounds of the specified box.

The coordinates are in the Belgian Lambert72 (EPSG:31370) coordinate system and are given in the order of lower left x, lower left y, upper right x, upper right y.
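
When only a centre point is known, the four bounds can be derived from it. This is a small sketch in plain Python; `box_around` is a hypothetical helper (not a pydov function) and it assumes the point and the half width are both expressed in metres in Lambert72:

```python
def box_around(x, y, half_width):
    """Return (min_x, min_y, max_x, max_y) for a square centred on (x, y).

    The result follows the order described above: lower left x, lower left y,
    upper right x, upper right y.
    """
    if half_width <= 0:
        raise ValueError('half_width must be positive')
    return (x - half_width, y - half_width, x + half_width, y + half_width)


bounds = box_around(114002.5, 172312.5, 2.5)
print(bounds)  # (114000.0, 172310.0, 114005.0, 172315.0)
```

The resulting tuple can be unpacked into the `Box` constructor, for example `Box(*bounds)`.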

In [8]:
from pydov.util.location import Within, Box

df = observatie.search(location=Within(Box(114000, 172310, 114005, 172315)), max_features = 10)
df.head()

[000/001] .


Unnamed: 0,pkey_observatie,pkey_parent,fenomeentijd,diepte_van_m,diepte_tot_m,parametergroep,parameter,detectieconditie,resultaat,eenheid,methode,uitvoerder,herkomst
0,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,7.5,8.0,Onderkenning-grondsoort,"Grondsoort volgens ASTM, de beschrijving (ASTM...",,Clayey sand,,Onbekend,VO - Afdeling Geotechniek,LABO
1,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,7.5,8.0,Onderkenningsproeven-korrelverdeling,Korrelverdeling d.m.v. hydrometer/areometer (K...,,,,Hydrometer,VO - Afdeling Geotechniek,LABO
2,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,4.5,4.75,Onderkenning-grondsoort,Grondsoort volgens GEO-BGGG (Grondsoort BGGG),,sterk kalkh. of schelph. zandh. leem,-,Classificatie volgens de norm,VO - Afdeling Geotechniek,LABO
3,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,4.5,4.75,Onderkenning - proeven,Consistentiegrenzen - Uitrolgrens (Consistenti...,,#,%,Onbekend,VO - Afdeling Geotechniek,LABO
4,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,4.75,5.0,Onderkenning-grondsoort,"Grondsoort volgens ASTM, de code (ASTM_code)",,CL,-,Onbekend,VO - Afdeling Geotechniek,LABO


The dataframe contains several observations made at the same location.

Using the *pkey_observatie* field, one can request the details of these observations in a web browser:

In [9]:
for pkey_observatie in set(df.pkey_observatie):
    print(pkey_observatie)

https://oefen.dov.vlaanderen.be/data/observatie/2022-1758836
https://oefen.dov.vlaanderen.be/data/observatie/2022-3762475
https://oefen.dov.vlaanderen.be/data/observatie/2022-6538564
https://oefen.dov.vlaanderen.be/data/observatie/2022-1850256
https://oefen.dov.vlaanderen.be/data/observatie/2022-6538563
https://oefen.dov.vlaanderen.be/data/observatie/2022-2533717
https://oefen.dov.vlaanderen.be/data/observatie/2022-2032153
https://oefen.dov.vlaanderen.be/data/observatie/2023-7514595
https://oefen.dov.vlaanderen.be/data/observatie/2022-3762474
https://oefen.dov.vlaanderen.be/data/observatie/2022-2533716
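
Because each *pkey_observatie* is a URL, the trailing identifier can be split off when a shorter key is convenient, for example as a label. This is a small sketch using only the standard library; `observation_id` is a hypothetical helper, not a pydov function:

```python
from urllib.parse import urlparse


def observation_id(pkey):
    """Return the last path segment of a pkey URL."""
    path = urlparse(pkey).path
    return path.rstrip('/').rsplit('/', 1)[-1]


pkey = 'https://oefen.dov.vlaanderen.be/data/observatie/2022-1758836'
print(observation_id(pkey))  # 2022-1758836
```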


### Get observations with specific properties

Next to querying observations based on their geographic location within a bounding box, we can also search for observations matching a specific set of properties. For this we can build a query using a combination of the 'Observatie' fields and operators provided by the WFS protocol.

A list of possible operators can be found below:

In [10]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes2'], inspect.isclass) if 'Property' in i]

['PropertyIsBetween',
 'PropertyIsEqualTo',
 'PropertyIsGreaterThan',
 'PropertyIsGreaterThanOrEqualTo',
 'PropertyIsLessThan',
 'PropertyIsLessThanOrEqualTo',
 'PropertyIsLike',
 'PropertyIsNotEqualTo',
 'PropertyIsNull',
 'SortProperty']

In this example we build a query using the *PropertyIsEqualTo* operator to find all observations concerning the parameter "Watergehalte (watergehalte)":

In [11]:
from owslib.fes2 import PropertyIsEqualTo

query = PropertyIsEqualTo(propertyname='parameter',
                          literal='Watergehalte (watergehalte)')
df = observatie.search(query=query, max_features = 10)

df.head()

[000/001] .


Unnamed: 0,pkey_observatie,pkey_parent,fenomeentijd,diepte_van_m,diepte_tot_m,parametergroep,parameter,detectieconditie,resultaat,eenheid,methode,uitvoerder,herkomst
0,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2002-10-30,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,234,%,Gewichtsverlies na drogen in droogstoof,MVG - Afdeling Geotechniek,LABO
1,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2011-06-16,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,2205,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO
2,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2013-06-13,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,305,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO
3,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2011-08-31,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,272,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO
4,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2009-06-24,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,240,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO
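
An operator such as *PropertyIsEqualTo* corresponds to a comparison element of the OGC Filter Encoding (FES 2.0) standard used by WFS. The sketch below builds such an element by hand with the standard library, purely to illustrate what the query expresses. owslib and pydov generate the real request, and their exact output may differ in details such as namespace prefixes:

```python
import xml.etree.ElementTree as ET

FES_NS = 'http://www.opengis.net/fes/2.0'
ET.register_namespace('fes', FES_NS)


def equal_to_filter(field, literal):
    """Build a minimal FES 2.0 PropertyIsEqualTo filter as an XML string."""
    root = ET.Element(f'{{{FES_NS}}}Filter')
    comparison = ET.SubElement(root, f'{{{FES_NS}}}PropertyIsEqualTo')
    ET.SubElement(comparison, f'{{{FES_NS}}}ValueReference').text = field
    # The literal ends up as element text, so it is converted to a string.
    ET.SubElement(comparison, f'{{{FES_NS}}}Literal').text = str(literal)
    return ET.tostring(root, encoding='unicode')


filter_xml = equal_to_filter('parameter', 'Watergehalte (watergehalte)')
print(filter_xml)
```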


Once again we can use the *pkey_observatie* as a permanent link to the information of these observations:

In [12]:
for pkey_observatie in set(df.pkey_observatie):
    print(pkey_observatie)

https://oefen.dov.vlaanderen.be/data/observatie/2022-3486622
https://oefen.dov.vlaanderen.be/data/observatie/2022-5672035
https://oefen.dov.vlaanderen.be/data/observatie/2022-4942917
https://oefen.dov.vlaanderen.be/data/observatie/2022-1664432
https://oefen.dov.vlaanderen.be/data/observatie/2022-2392215
https://oefen.dov.vlaanderen.be/data/observatie/2022-2394182
https://oefen.dov.vlaanderen.be/data/observatie/2022-2394056
https://oefen.dov.vlaanderen.be/data/observatie/2022-1664515
https://oefen.dov.vlaanderen.be/data/observatie/2022-1664872
https://oefen.dov.vlaanderen.be/data/observatie/2022-1301311


### Get observations in a bounding box based on specific properties

We can combine a query on attributes with a query on geographic location to get the observations within a bounding box that have specific properties.

The following example requests the observations of the parameter 'Watergehalte (watergehalte)' that have a value (resultaat) greater than or equal to 30 and lie within the given bounding box.

(Note that the datatype of the *literal* parameter should be a string, regardless of the datatype of this field in the output dataframe.)

In [13]:
from owslib.fes2 import PropertyIsGreaterThanOrEqualTo, And

query = And([PropertyIsGreaterThanOrEqualTo(propertyname='resultaat',literal='30'),
            PropertyIsEqualTo(propertyname='parameter', literal='Watergehalte (watergehalte)')])

df = observatie.search(
    location=Within(Box(114000, 172310, 114005, 172315)),
    query=query,
    max_features = 10
    )

df.head()

[000/001] .


Unnamed: 0,pkey_observatie,pkey_parent,fenomeentijd,diepte_van_m,diepte_tot_m,parametergroep,parameter,detectieconditie,resultaat,eenheid,methode,uitvoerder,herkomst
0,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,363,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO
1,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2018-01-09,,,Volumemassa-watergehalte,Watergehalte (watergehalte),,517,%,Gewichtsverlies na drogen in droogstoof,VO - Afdeling Geotechniek,LABO


We can look at these observations in a web browser using their *pkey_observatie*:

In [14]:
for pkey_observatie in set(df.pkey_observatie):
    print(pkey_observatie)

https://oefen.dov.vlaanderen.be/data/observatie/2022-2533717
https://oefen.dov.vlaanderen.be/data/observatie/2022-2761922


### Select observations with specific conditions and return the results

We can limit the columns in the output dataframe by specifying the *return_fields* parameter in our search.

In this example we query all the observations that have a value (resultaat) greater than or equal to 10 for the parameter 'Watergehalte (watergehalte)' and return only that value (resultaat):

In [None]:
query = And([PropertyIsGreaterThanOrEqualTo(propertyname='resultaat',literal='10'),
            PropertyIsEqualTo(propertyname='parameter', literal='Watergehalte (watergehalte)')])

df = observatie.search(query=query,
                       return_fields=('resultaat',),
                       max_features=10)
df.head()

In [None]:
df.describe()

By discarding the observations with a resultaat of 50 or more, we get a different result:

In [None]:
df[df.resultaat.astype(float) < 50.0].describe()

In [None]:
ax = df[df.resultaat.astype(float) < 50.0].astype(float).boxplot()
ax.set_ylabel("Water content (%)");
ax.set_title("Distribution of water content");
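
The `astype(float)` calls above assume that every *resultaat* is numeric. Other parameters return text such as 'Clayey sand' or '#', as seen in the bounding box example, and a direct conversion of those values would raise an error. The following is a minimal sketch of a tolerant conversion in plain Python; `to_float_or_none` is a hypothetical helper, and the sample values are copied from the outputs shown earlier:

```python
def to_float_or_none(value):
    """Convert a result to float, returning None when it is not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


raw_results = ['363', '517', 'Clayey sand', '#', None]
numeric = [to_float_or_none(v) for v in raw_results]
print(numeric)  # [363.0, 517.0, None, None, None]
```

With pandas, `pd.to_numeric(df.resultaat, errors='coerce')` gives a comparable result, with NaN in place of None.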

### Get observations based on fields not available in the standard output dataframe

To keep the output dataframe size acceptable, not all available WFS fields are included in the standard output. However, one can use this information to select observations as illustrated below.

For example, make a selection of the observations that have an 'opdracht':

In [15]:
from owslib.fes2 import Not
from owslib.fes2 import PropertyIsNull

query = Not([PropertyIsNull(propertyname='opdracht')])

df = observatie.search(query=query, max_features = 10,
                   return_fields=('pkey_observatie', 'opdracht'))
df.head()

[000/001] .


Unnamed: 0,pkey_observatie,opdracht
0,https://oefen.dov.vlaanderen.be/data/observati...,Bodemkoolstofmonitoringnetwerk Cmon: staalname...
1,https://oefen.dov.vlaanderen.be/data/observati...,Bodemkoolstofmonitoringnetwerk Cmon leesrechte...
2,https://oefen.dov.vlaanderen.be/data/observati...,Bodemkoolstofmonitoringnetwerk Cmon leesrechte...
3,https://oefen.dov.vlaanderen.be/data/observati...,Bodemkoolstofmonitoringnetwerk Cmon leesrechte...
4,https://oefen.dov.vlaanderen.be/data/observati...,Bodemkoolstofmonitoringnetwerk Cmon leesrechte...


### Select observations with extra details

Extra information about an observation can be retrieved from its XML. In this example we add the fields of `ObservatieDetails` to the output dataframe:

In [16]:
from pydov.search.observatie import ObservatieSearch
from pydov.types.observatie import Observatie, ObservatieDetails

observatie = ObservatieSearch(objecttype=Observatie.with_extra_fields(ObservatieDetails)
                              )

df = observatie.search(max_features=10)
df.head()

[000/001] .
[000/010] cccccccccc


Unnamed: 0,pkey_observatie,pkey_parent,fenomeentijd,diepte_van_m,diepte_tot_m,parametergroep,parameter,detectieconditie,resultaat,eenheid,methode,uitvoerder,herkomst,betrouwbaarheid,geobserveerd_object_type,geobserveerd_object_naam,geobserveerd_object_permkey
0,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/1...,1950-01-01,,,Bodem_chemisch,Sorptie totaal (sorptie_totaal),,2120,meq/100g,Onbekend,,VELD,B,monster,KART_PROF_011E/65_H3_M1,1950-254346
1,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/2...,2016-10-12,7.3,7.5,Onderkenning-grondsoort,Grondsoort volgens GEO-BGGG (Grondsoort BGGG),,weinig kalkh. zandh. leem,-,Classificatie volgens de norm,VO - Afdeling Geotechniek,LABO,B,monster,N003B,2018-202354
2,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/1...,1964-01-01,,,Bodem_fysisch_textuur,Textuurfracties (textuurmeting),,,%,Onbekend,,LABO,B,monster,KART_OPP_091E/069,1964-299792
3,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/bodemloca...,2023-07-17,,,Instrument parameters,Volumetrisch vochtgehalte (Volumetrisch vochtg...,,,%,Onbekend,,VELD,B,bodemlocatie,CN_141230_2021,2021-025676
4,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/monster/1...,1952-12-09,,,Bodem_chemisch,Calciumcarbonaatgehalte (caco3_gehalte),,00,%,Onbekend,,VELD,B,monster,KART_PROF_041E/25_H6_M1,1952-262815


### Get observations with data from the subtype 'ObservatieHerhaling'

The observation search can also return data from a subtype. The example below uses the subtype 'ObservatieHerhaling':



In [17]:
from pydov.search.observatie import ObservatieSearch
from pydov.types.observatie import Observatie, ObservatieHerhaling
from owslib.fes2 import PropertyIsLike

observatie = ObservatieSearch(
    objecttype=Observatie.with_subtype(ObservatieHerhaling))
query = PropertyIsLike(propertyname='pkey_observatie',
                       literal='%2022-11963810%')
df = observatie.search(query=query, max_features = 10)
df.head()

[000/001] .
[000/001] c


Unnamed: 0,pkey_observatie,pkey_parent,fenomeentijd,diepte_van_m,diepte_tot_m,parametergroep,parameter,detectieconditie,resultaat,eenheid,methode,uitvoerder,herkomst,herhaling_aantal,herhaling_minimum,herhaling_maximum,herhaling_standaardafwijking
0,https://oefen.dov.vlaanderen.be/data/observati...,https://oefen.dov.vlaanderen.be/data/bodemdiep...,2022-02-17,,,Bodem_terrein,Strooisellaag of viltlaag - dikte (strooisella...,,10,cm,Cmon staalnameprotocol,,VELD,32,1.0,1.0,0.0
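
The `%` characters in the literal act as wildcards, so the filter matches any *pkey_observatie* that contains `2022-11963810`. Below is a rough local emulation of that matching with a regular expression. It assumes the owslib defaults of `%` for any run of characters and `_` for a single character. The real matching happens on the server, so this only illustrates the semantics, and the full URL used here is constructed for illustration from the URL pattern shown earlier:

```python
import re


def like_to_regex(pattern, wildcard='%', single_char='_'):
    """Translate a LIKE-style pattern into a compiled, anchored regex."""
    parts = []
    for ch in pattern:
        if ch == wildcard:
            parts.append('.*')   # any run of characters
        elif ch == single_char:
            parts.append('.')    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile('^' + ''.join(parts) + '$')


matcher = like_to_regex('%2022-11963810%')
# URL assembled for illustration; the output above shows it only truncated.
pkey = 'https://oefen.dov.vlaanderen.be/data/observatie/2022-11963810'
print(bool(matcher.match(pkey)))                           # True
print(bool(matcher.match(pkey.replace('2022', '2021'))))   # False
```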


## Visualize results

Using Geopandas GeoDataFrame, we can easily display the results of our search on a map.

In [18]:
import geopandas as gpd

query = And([PropertyIsGreaterThanOrEqualTo(propertyname='resultaat',literal='10'),
            PropertyIsEqualTo(propertyname='parameter', literal='Watergehalte (watergehalte)')])
df = observatie.search(query=query,
                   return_fields=('pkey_observatie','resultaat','geom'), max_features = 100)

[000/001] .


In [19]:
gdf = gpd.GeoDataFrame(df, geometry='geom', crs='EPSG:31370')
gdf.explore()