# Example of DOV search methods for the Aardewerk dataset

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/DOV-Vlaanderen/pydov/master?filepath=docs%2Fnotebooks%2Fsearch_aardewerk.ipynb)

Aardewerk is a database containing the description and analysis results of 7,020 soil profiles and 42,592 associated soil horizons, supplemented with 9,281 surface samples. All of them are located within the territories of Flanders and Brussels.
This data was collected during a systematic soil profile study conducted in Belgium between 1949 and 1971.

Most of the data was made accessible in the [DOV viewer](https://www.dov.vlaanderen.be/portaal/?module=verkenner) by translating the Aardewerk database into the DOV-Bodemdatabank as soil locations (bodemlocaties). 

In [1]:
%matplotlib inline

import inspect, sys
import warnings; warnings.simplefilter('ignore')

In [2]:
# check that pydov can be imported
import pydov

## Extract Aardewerk data from DOV
Since this tutorial relies on the manipulation of soil locations (bodemlocaties), it is strongly recommended to follow the 'Example of DOV search methods for soil data (bodemgegevens)' first.

First, we initialize the 'Bodemlocatie' datatype:

In [3]:
from pydov.search.bodemlocatie import BodemlocatieSearch
bodemlocatie = BodemlocatieSearch()

In the DOV-Bodemdatabank, Aardewerk data can be recognized by the prefix 'KART_PROF_' in its name.
Before constructing a query, it is useful to get an overview of the fields available in the 'Bodemlocatie' datatype by using the get_fields() method:

In [4]:
fields = bodemlocatie.get_fields()

# print available fields
for f in fields.values():
    print(f['name'])

naam
pkey_bodemlocatie
type
rapport_bodemlocatie
profielbeschrijving
waarnemingsdatum
doel
x
y
mv_mtaw
Auteurs
Aantal_classificaties
Aantal_opbouwen
erfgoed
Aantal_observaties
Aantal_monsters
bodemstreek
Bodemsite
pkey_bodemsite
Opdrachten
invoerdatum
educatieve_waarde


You can get more information about a field by requesting it from the fields dictionary:

* name: name of the field
* definition: definition of this field
* cost: currently this is either 1 or 10, depending on the datasource of the field. It is an indication of the expected time it will take to retrieve this field in the output dataframe.
* notnull: whether the field is mandatory or not
* type: datatype of the values of this field

In [5]:
fields['naam']

{'name': 'naam',
 'definition': 'De unieke naam van de bodemlocatie.',
 'type': 'string',
 'notnull': False,
 'query': True,
 'cost': 1}
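
As a minimal sketch of how these attributes can be put to use, the snippet below groups field names by their `cost` and lists the fields whose `query` flag is set. It runs on a small hand-written dictionary, `sample_fields`, which is hypothetical: only its structure follows the output above, and the values (including the made-up `illustrative_field`) are for illustration. The text above does not define `query`; reading it as "can be used in a query" is an assumption. In a live session, the result of `bodemlocatie.get_fields()` would take the place of `sample_fields`.

```python
# Hypothetical stand-in for the dictionary returned by get_fields();
# only the structure follows the output above, the values are illustrative.
sample_fields = {
    'naam': {'name': 'naam', 'type': 'string', 'notnull': False,
             'query': True, 'cost': 1},
    'x': {'name': 'x', 'type': 'float', 'notnull': False,
          'query': True, 'cost': 1},
    'illustrative_field': {'name': 'illustrative_field', 'type': 'string',
                           'notnull': False, 'query': False, 'cost': 10},
}


def names_by_cost(fields):
    """Group field names by the value of their 'cost' attribute."""
    grouped = {}
    for field in fields.values():
        grouped.setdefault(field['cost'], []).append(field['name'])
    return grouped


def queryable_names(fields):
    """Return the names of fields whose 'query' attribute is truthy."""
    return [f['name'] for f in fields.values() if f.get('query')]


print(names_by_cost(sample_fields))
print(queryable_names(sample_fields))
```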

### Filter Aardewerk soil locations
Since we know that the names of Aardewerk soil locations start with a specific prefix, we can build a query that selects only these locations.

A list of possible operators can be found below:

In [6]:
[i for i,j in inspect.getmembers(sys.modules['owslib.fes'], inspect.isclass) if 'Property' in i]

['PropertyIsBetween',
 'PropertyIsEqualTo',
 'PropertyIsGreaterThan',
 'PropertyIsGreaterThanOrEqualTo',
 'PropertyIsLessThan',
 'PropertyIsLessThanOrEqualTo',
 'PropertyIsLike',
 'PropertyIsNotEqualTo',
 'PropertyIsNull',
 'SortProperty']

Since we only need to match a partial string in the name, we will build a query using the *PropertyIsLike* operator to find all Aardewerk bodemlocaties.
We use *max_features=10* to limit the results to 10.

In [7]:
from owslib.fes import PropertyIsLike

query = PropertyIsLike(propertyname='naam',
                       literal='KART_PROF_%', wildCard='%')
df = bodemlocatie.search(query=query, max_features=10)

df.head()

[000/010] cccccccccc


| | pkey_bodemlocatie | pkey_bodemsite | naam | type | waarnemingsdatum | doel | x | y | mv_mtaw | erfgoed | bodemstreek | invoerdatum | educatieve_waarde |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | https://www.dov.vlaanderen.be/data/bodemlocati... | | KART_PROF_027W/24 | profielput | 1950-11-28 | bodemprofielen en oppervlaktemonsters karterin... | 134282.0 | 215405.0 | 2.5 | False | Doel | 2019-10-11 | OK |
| 1 | https://www.dov.vlaanderen.be/data/bodemlocati... | | KART_PROF_001E/02 | profielput | 1954-05-31 | bodemprofielen en oppervlaktemonsters karterin... | 158367.0 | 240397.0 | 12.5 | False | Kempen | 2019-10-11 | OK |
| 2 | https://www.dov.vlaanderen.be/data/bodemlocati... | | KART_PROF_007W/40 | profielput | 1955-07-07 | bodemprofielen en oppervlaktemonsters karterin... | 167292.0 | 229988.0 | 19.0 | False | Kempen | 2019-10-11 | OK |
| 3 | https://www.dov.vlaanderen.be/data/bodemlocati... | | KART_PROF_037E/38 | profielput | 1952-02-01 | bodemprofielen en oppervlaktemonsters karterin... | 65969.0 | 207819.0 | 21.0 | False | Zandstreek | 2019-10-11 | OK |
| 4 | https://www.dov.vlaanderen.be/data/bodemlocati... | | KART_PROF_022E/68 | profielput | 1951-03-02 | bodemprofielen en oppervlaktemonsters karterin... | 65480.0 | 213545.0 | 2.0 | False | Oudlandpolders | 2019-10-11 | OK |
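
To make the wildcard semantics of the query concrete, here is a small local sketch of what a LIKE-style pattern with `%` as multi-character wildcard matches. It does not reproduce how owslib or the DOV service evaluate the filter, which happens on the server. The helper `like_to_regex` and the list `names` are hypothetical, and only the multi-character wildcard is handled.

```python
import re


def like_to_regex(pattern, wildcard='%'):
    """Translate a LIKE-style pattern into a compiled regular expression.

    Only the multi-character wildcard is supported in this sketch;
    all other characters are matched literally.
    """
    parts = [re.escape(part) for part in pattern.split(wildcard)]
    return re.compile('.*'.join(parts), re.DOTALL)


# Illustrative names: the first two appear in the output above,
# the third is made up to show a non-matching case.
names = ['KART_PROF_027W/24', 'KART_PROF_001E/02', 'OTHER_LOCATION_01']

matcher = like_to_regex('KART_PROF_%')
# fullmatch requires the whole name to fit the pattern
aardewerk_names = [n for n in names if matcher.fullmatch(n)]
print(aardewerk_names)
```

Because the pattern ends with the wildcard and has none at the start, it acts as a prefix test: only names beginning with `KART_PROF_` are kept.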


As seen in the soil data example, we can use the *pkey_bodemlocatie* as a permanent link to the information of these bodemlocaties:

In [8]:
for pkey_bodemlocatie in set(df.pkey_bodemlocatie):
    print(pkey_bodemlocatie)

https://www.dov.vlaanderen.be/data/bodemlocatie/1951-000825
https://www.dov.vlaanderen.be/data/bodemlocatie/1957-000826
https://www.dov.vlaanderen.be/data/bodemlocatie/1958-000829
https://www.dov.vlaanderen.be/data/bodemlocatie/1950-000834
https://www.dov.vlaanderen.be/data/bodemlocatie/1954-000822
https://www.dov.vlaanderen.be/data/bodemlocatie/1955-000827
https://www.dov.vlaanderen.be/data/bodemlocatie/1955-000823
https://www.dov.vlaanderen.be/data/bodemlocatie/1954-000828
https://www.dov.vlaanderen.be/data/bodemlocatie/1952-000830
https://www.dov.vlaanderen.be/data/bodemlocatie/1952-000824
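
Because each *pkey_bodemlocatie* is a plain URL, its last path segment can serve as a compact identifier, for example as a dictionary key or in a file name. The following is a minimal sketch using only the standard library; `local_id` is a hypothetical helper and not part of pydov, and the two links are copied from the output above.

```python
from urllib.parse import urlparse


def local_id(pkey):
    """Return the last path segment of a DOV permanent link."""
    return urlparse(pkey).path.rstrip('/').split('/')[-1]


# Two permanent links copied from the output above
pkeys = [
    'https://www.dov.vlaanderen.be/data/bodemlocatie/1951-000825',
    'https://www.dov.vlaanderen.be/data/bodemlocatie/1957-000826',
]

ids = [local_id(p) for p in pkeys]
print(ids)
```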
