# Geotechnical data API - Demo with ``owimetadatabase_soilapi``

This notebook demonstrates how the soildata app of ```owimetadatabase``` can be used to retrieve geotechnical data through the API.

To facilitate interaction with the database, the Python package ``owimetadatabase_soilapi`` was developed. It removes the need to form HTTP requests by hand: each query is reduced to a method call with a few descriptive keyword arguments. The data retrieval is performed in the background through the ``requests`` package.

The ``owimetadatabase_soilapi`` package is open-source and can be downloaded or cloned from https://github.com/OWI-Lab/owimetadatabase_soilapi.
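
To give a feel for what the package abstracts away, the sketch below assembles a filtered request URL by hand using only the standard library. The API root, the endpoint name and the query parameter are hypothetical placeholders chosen for illustration. They are not the actual routes of the database.

```python
from urllib.parse import urlencode, urljoin


def build_request_url(api_root, endpoint, **filters):
    """Join the API root and endpoint and append the filters as a query string."""
    url = urljoin(api_root, endpoint)
    query = urlencode(filters)
    return f"{url}?{query}" if query else url


# Hypothetical root and endpoint, for illustration only
example_url = build_request_url(
    "https://example.org/api/soildata/", "surveycampaign/", projectsite="Borssele I")
print(example_url)
```

With the package, the same filter is simply passed as a keyword argument to a method, and the URL construction stays hidden.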

## Library imports

We need to import a few essential libraries first:

   - ```pandas``` for manipulation of tabular data
   - ```owimetadatabase_soilapi``` to interact with the owimetadatabase API
   - ```json``` to handle the JSON data returned by the API
   - ```os``` to retrieve environment variables
   - ```plotly``` for plotting data

In [None]:
import pandas as pd
pd.options.display.max_columns = 200
pd.options.display.max_rows = 1000
from owimetadatabase_soilapi.soil.io import SoilAPI, SOIL_URL_PREFIX
import json
import os
import plotly.express as px

For geotechnical data manipulation, the ```groundhog``` library is used. The modules for soil profiles and PCPT testing are loaded.

In [None]:
from groundhog.general import soilprofile
from groundhog.siteinvestigation.insitutests.pcpt_processing import PCPTProcessing

## API access setup

### Authentication

The API is only accessible for authenticated users. To get a user account, send an email to bruno.stuyts@vub.be with your name, affiliation and use case.

Users will receive an API token which needs to be stored as the environment variable ```OWIMETA_TOKEN```. We can check that the environment variable is not empty. In case of problems, try refreshing the environment variables before running Jupyter. Alternatively, you can assign the value of your token directly to ```TOKEN``` (not recommended for security reasons).

In [None]:
TOKEN = os.getenv('OWIMETA_TOKEN')
bool(TOKEN)  # True when the token is set; avoids printing the secret in the notebook output

We can set up the header of the API requests as follows:

In [None]:
head = {'Authorization': 'Token %s' % (TOKEN)}

With this header, we can authenticate all requests. We will set up the connection to the soil data API by creating an instance of the ``SoilAPI`` class.

In [None]:
soil_api = SoilAPI(api_root=SOIL_URL_PREFIX, header=head)

The ``soil_api`` object can be used for all further interaction with the API.
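
The token check and header construction above can also be wrapped in a small helper that fails early with a clear message when the token is missing. This is a minimal sketch: ``build_auth_header`` is a hypothetical helper name and not part of the package.

```python
def build_auth_header(token):
    """Return the request header for a token, or raise if the token is empty."""
    if not token:
        raise ValueError(
            "No API token found. Set the OWIMETA_TOKEN environment variable.")
    return {"Authorization": f"Token {token}"}


# Placeholder token for illustration; in the notebook, pass os.getenv('OWIMETA_TOKEN')
example_head = build_auth_header("my-secret-token")
```

Failing at this point is easier to diagnose than an authentication error returned by the first API request.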

## Survey campaigns

The survey campaigns carried out at a project site can be retrieved with the ``get_surveycampaigns`` method. The ```projectsite``` argument allows filtering based on project site. Here, we retrieve the geotechnical surveys performed at the Borssele I site.

In [None]:
campaigns = soil_api.get_surveycampaigns(projectsite="Borssele I")['data']
campaigns

We can also retrieve a single survey campaign using the ``get_surveycampaign_detail`` method. As an example, the Borehole investigation campaign is retrieved.

In [None]:
soil_api.get_surveycampaign_detail(projectsite="Borssele I", campaign="Borehole investigation")['data']

## Borehole locations

Determining where the boreholes are located is an essential step in determining the geotechnical data coverage. This data can be retrieved using the ```get_testlocations``` method. Filtering per project site and survey campaign is possible.

In [None]:
borehole_investigation_locations = soil_api.get_testlocations(
    projectsite="Borssele I", campaign="Borehole investigation")['data']
borehole_investigation_locations

The geographical position of these borehole locations can be visualised using the ``plot_testlocations`` method.

In [None]:
soil_api.plot_testlocations(projectsite="Borssele I", campaign="Borehole investigation")

A method (```get_closest_testlocation```) for retrieving test locations in the vicinity of a central point is also available. We can retrieve the test locations within a radius of 1 km around a point with given latitude and longitude.

In [None]:
soil_api.get_closest_testlocation(latitude=51.72, longitude=3.08, radius=1)['data']

We can see that a geotechnical test was also performed in the vicinity of the tested location during the seafloor CPT investigation.
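
To illustrate what a radius search around a point involves, the sketch below filters a list of locations by great-circle (haversine) distance. It assumes a spherical Earth with a mean radius of 6371 km. The locations are made-up examples, and the distance calculation actually used by the API may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(locations, latitude, longitude, radius):
    """Keep the locations that lie within `radius` km of the central point."""
    return [
        loc for loc in locations
        if haversine_km(latitude, longitude, loc["lat"], loc["lon"]) <= radius
    ]


# Hypothetical locations, for illustration only
example_locations = [
    {"title": "A", "lat": 51.721, "lon": 3.081},
    {"title": "B", "lat": 51.800, "lon": 3.200},
]
nearby = within_radius(example_locations, latitude=51.72, longitude=3.08, radius=1)
print([loc["title"] for loc in nearby])
```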

Furthermore, test locations in the vicinity of a profile line can be retrieved (``get_testlocations_profile`` method). We need to specify the latitude and longitude of the start and end points and the width of the search band (in meters). We can create a profile from location BH-WFS1-2A to location BH-WFS1-6 (NW-SE profile) with a 500 m search band on either side of the profile line.

In [None]:
profile_locations = soil_api.get_testlocations_profile(lat1=51.74374, lon1=3.040028, lat2=51.70409, lon2=3.122349, band=500)

We can also plot these locations. Their position along the profile is obvious.

In [None]:
fig = px.scatter_mapbox(profile_locations, lat='northing', lon='easting', hover_name='title',
    hover_data=['title'], zoom=10, height=500)
fig.update_layout(mapbox_style='open-street-map')
fig.show()
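
The idea of a search band around a profile line can be sketched with a little plane geometry. The example below converts latitude and longitude to approximate local east/north offsets in meters and then computes the distance along the profile and the perpendicular offset from it. The flat-Earth projection is an assumption that is reasonable over a few kilometers. The helper names are hypothetical, and the API may implement the search differently.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def to_local_xy(lat, lon, lat0, lon0):
    """Approximate east/north offsets [m] of (lat, lon) relative to (lat0, lon0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def project_on_profile(lat, lon, lat1, lon1, lat2, lon2):
    """Return (distance along profile, perpendicular offset, profile length) in m."""
    qx, qy = to_local_xy(lat, lon, lat1, lon1)
    ex, ey = to_local_xy(lat2, lon2, lat1, lon1)
    length = math.hypot(ex, ey)
    along = (qx * ex + qy * ey) / length        # dot product with the unit direction
    offset = abs(qx * ey - qy * ex) / length    # cross product gives the normal distance
    return along, offset, length


def in_band(lat, lon, lat1, lon1, lat2, lon2, band):
    """True if the point lies between the end points and within `band` m of the line."""
    along, offset, length = project_on_profile(lat, lon, lat1, lon1, lat2, lon2)
    return 0.0 <= along <= length and offset <= band


START = (51.74374, 3.040028)  # start point of the profile used above
END = (51.70409, 3.122349)    # end point of the profile used above
midpoint = ((START[0] + END[0]) / 2, (START[1] + END[1]) / 2)
print(in_band(*midpoint, *START, *END, band=500))
print(in_band(51.80, 3.08, *START, *END, band=500))
```

The distance along the profile is also what one would use as the horizontal axis when drawing a cross-section through the retrieved locations.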

## In-situ test data

In-situ testing returns valuable data on the geotechnical conditions at a site. In-situ test data is stored in ```owimetadatabase``` in unstructured JSON fields to allow rapid retrieval of relevant data. The data has been uploaded using a standard format for common column names (e.g. ```'z [m]'``` for depth below mudline, ```'qc [MPa]'``` for cone tip resistance, ...), which allows rapid processing once the data is retrieved from the database.
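
The benefit of standard column names can be shown with a small, self-contained sketch. The JSON payload below is a made-up example of column-oriented test data and not an actual database record. Because the keys follow the naming convention, the same processing function works for any test stored this way.

```python
import json

# Hypothetical payload; real records contain many more rows and columns
raw_json = '{"z [m]": [0.5, 1.0, 1.5, 2.0], "qc [MPa]": [2.1, 4.8, 9.6, 12.3]}'


def max_qc_in_interval(record, z_top, z_bottom):
    """Return the maximum cone resistance between two depths, or None if no data."""
    values = [
        qc for z, qc in zip(record["z [m]"], record["qc [MPa]"])
        if z_top <= z <= z_bottom
    ]
    return max(values) if values else None


record = json.loads(raw_json)
print(max_qc_in_interval(record, z_top=1.0, z_bottom=2.0))
```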

### In-situ test types

We first need to know which in-situ test types exist in the database. The method ``get_insitutesttypes`` exposes this information.

In [None]:
soil_api.get_insitutesttypes()['data']

### In-situ test summary data

Retrieving full data can make the HTTP requests time out if data is requested for a large number of in-situ tests. To still allow metadata on the in-situ tests to be retrieved, the method ``get_insitutests`` is available. It retrieves only the metadata and not the detailed test results. For example, a listing of all seabed PCPTs can be retrieved. Method arguments can be used for filtering.

In [None]:
borssele_seabed_cpts = soil_api.get_insitutests(projectsite='Borssele I', testtype='Seabed PCPT')['data']
borssele_seabed_cpts.head()

### In-situ test detailed data

#### CPT data

To retrieve the detailed test data for CPTs, the method ``get_cpttest_detail`` can be used. To prevent timeouts, a separate call can be made for each location. For example, retrieving the downhole CPT data for location BH-WFS1-2A happens as follows. The ``'cpt'`` dictionary entry contains the ``groundhog`` ``PCPTProcessing`` object.

In [None]:
bhwfs12a_cpt = soil_api.get_cpttest_detail(
    projectsite='Borssele I', location="BH-WFS1-2A", testtype="Downhole PCPT")['cpt']

Because the returned object is already a ```groundhog``` ```PCPTProcessing``` object, the CPT data can be plotted directly:

In [None]:
bhwfs12a_cpt.plot_raw_pcpt(u2_range=(-0.5, 2.5), u2_tick=0.5)
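
When detailed data is needed for several locations, the one-call-per-location approach mentioned above can be sketched as a simple loop that keeps going when a single request fails. The ``fetch`` argument is any callable that takes a location name. ``fake_fetch`` and the location ``BH-MISSING`` are stand-ins for illustration, so the example runs without API access.

```python
def fetch_per_location(fetch, locations):
    """Call `fetch` once per location and collect results and failures separately."""
    results, failures = {}, {}
    for location in locations:
        try:
            results[location] = fetch(location)
        except Exception as err:
            # Record the problem and continue with the next location
            failures[location] = str(err)
    return results, failures


def fake_fetch(location):
    """Stand-in for an API call; raises for a location that does not exist."""
    if location == "BH-MISSING":
        raise KeyError(location)
    return {"cpt": f"data for {location}"}


results, failures = fetch_per_location(fake_fetch, ["BH-WFS1-2A", "BH-MISSING"])
print(list(results), list(failures))
```

In the notebook, ``fetch`` could for instance be ``lambda loc: soil_api.get_cpttest_detail(projectsite='Borssele I', location=loc, testtype='Downhole PCPT')['cpt']``.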

#### Other in-situ tests

Data from other in-situ test types can be retrieved in a similar fashion with the ``get_insitutest_detail`` method. The data is contained in the ``'rawdata'`` dictionary element. This data can be used in further processing.

In [None]:
bhwfs1_1_spcpt = soil_api.get_insitutest_detail(
    projectsite='Borssele I', location='BH-WFS1-1', testtype="S-PCPT")['rawdata']
bhwfs1_1_spcpt.head()

## Batch lab test data

Batch lab test data is laboratory test data carried out in bulk, often on-board the site investigation vessel. The available test types can be retrieved with the ``get_batchlabtesttypes`` method.

In [None]:
soil_api.get_batchlabtesttypes()['data']

We can retrieve either summary (``get_batchlabtests`` method) or detailed (``get_batchlabtest_detail`` method) data. As an example, we can retrieve all water content batch lab tests for the Borssele I offshore wind farm and then load the test data for location BH-WFS1-2A.

In [None]:
wc_summary = soil_api.get_batchlabtests(
    projectsite="Borssele I", testtype="Water content")['data']
wc_summary

In [None]:
wc_bhwfs1_2a = soil_api.get_batchlabtest_detail(
    projectsite='Borssele I', location="BH-WFS1-2A", testtype="Water content")['rawdata']
wc_bhwfs1_2a.head()

## Sample test data

Data from advanced laboratory tests is stored in the database in the ```sampletest``` table. The API can also be used to access this data.

First, we can retrieve a listing of the samples for a specific borehole in a project using the method ``get_geotechnicalsamples``. Often, only the samples with advanced testing performed on them are included.

In [None]:
samples_bh_wfs1_2a = soil_api.get_geotechnicalsamples(projectsite='Borssele I', location='BH-WFS1-2A')['data']
samples_bh_wfs1_2a.head()

The sample types can be retrieved using the ``get_geotechnicalsampletypes`` method.

In [None]:
soil_api.get_geotechnicalsampletypes()['data']

The available sample test types can be retrieved using the ``get_sampletesttypes`` method.

In [None]:
soil_api.get_sampletesttypes()['data']

Detailed test data can be retrieved when the borehole location, sample and test type are known. Note that multiple tests of the same type can be performed on one sample. In that case, the test title is needed to select a single test.

As an example, we can retrieve the bender element test results on sample W18. First we use the method ``get_sampletests`` to retrieve all bender element tests on this sample:

In [None]:
soil_api.get_sampletests(projectsite="Borssele I", location='BH-WFS1-1', sample="W18", testtype="Bender element")['data']

There is only one bender element test, so the data can be retrieved with the ``get_sampletest_detail`` method, passing the test title through the ``sampletest`` argument:

In [None]:
benderelement_W18_df = soil_api.get_sampletest_detail(
    projectsite="Borssele I", location='BH-WFS1-1', sample="W18", testtype="Bender element", sampletest="CIUc+BE")['rawdata']
benderelement_W18_df

The measured value of small-strain shear modulus $ G_{max} $ can be retrieved as follows. This is the value after the isotropic consolidation stage.

In [None]:
benderelement_W18_df.iloc[0]['Gmax selected [MPa]']
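
A bender element test measures the shear wave velocity $ V_s $ of the specimen, and the small-strain shear modulus follows from $ G_{max} = \rho \cdot V_s^2 $ with $ \rho $ the bulk density. The sketch below applies this relation. The velocity and density values are illustrative assumptions and not results from the database.

```python
def gmax_from_vs(vs, density):
    """Small-strain shear modulus [MPa] from Vs [m/s] and bulk density [kg/m3]."""
    return density * vs ** 2 / 1e6  # Pa converted to MPa


# Illustrative values only: Vs = 200 m/s and a bulk density of 2000 kg/m3
print(gmax_from_vs(vs=200.0, density=2000.0))  # 80.0
```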

## Soil profiles

Soil profile retrieval is relatively straightforward using the API. To prevent API requests from timing out, a ``get_soilprofiles`` method and a ``get_soilprofile_detail`` method are provided for metadata-only and full data retrieval, respectively.

First, we can retrieve the metadata for all soil profiles at the Borssele I site:

In [None]:
soilprofiles = soil_api.get_soilprofiles(projectsite='Borssele I')['data']
soilprofiles.head()

We can create a soil profile for the BH-WFS1-2A location using the ``get_soilprofile_detail`` method. A dictionary is returned in which the ``'soilprofile'`` element is a ``groundhog`` ``SoilProfile`` object.

In [None]:
profile_bh_wfs1_2a = soil_api.get_soilprofile_detail(
    projectsite="Borssele I", location="BH-WFS1-2A", soilprofile='Inferred layering')['soilprofile']
profile_bh_wfs1_2a

We can plot a mini-log of this profile. We need to define a mapping for the soil type:

In [None]:
soiltypecolors = {
    "SAND": 'yellow',
    "CLAY": 'brown',
    "Clayey SAND": 'orange',
    "Silty SAND": '#fcba03'
}

bhwfs12a_minilog = profile_bh_wfs1_2a.plot_profile(parameters=((),), fillcolordict=soiltypecolors)
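
Before plotting, it can be useful to check that every soil type in the profile has an entry in the color mapping. The sketch below does this for a plain list of soil type names. ``missing_soil_colors`` is a hypothetical helper, and the list of layer soil types is made up. In the notebook, the names would come from the soil type column of the profile (assumed here to be called ``'Soil type'``).

```python
def missing_soil_colors(soil_types, colordict):
    """Return the sorted soil types that have no color defined."""
    return sorted(set(soil_types) - set(colordict))


example_colors = {
    "SAND": 'yellow',
    "CLAY": 'brown',
    "Clayey SAND": 'orange',
    "Silty SAND": '#fcba03'
}

# Hypothetical layer soil types; 'Sandy CLAY' has no color in the mapping
layer_soil_types = ["SAND", "CLAY", "Sandy CLAY", "SAND"]
print(missing_soil_colors(layer_soil_types, example_colors))
```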

This soil profile can also be used for plotting of other properties using the ``LogPlot`` from ``groundhog``.

In [None]:
from groundhog.general.plotting import LogPlot

Based on all data sources retrieved above, we can create a plot combining cone tip resistance, water content and shear wave velocity. Note that since we have a downhole CPT, we need to plot each push separately. The CPT data has a ``'Push'`` column which contains the push number. We can loop over all individual pushes and plot a trace for each one.

Other plotting options can be fine-tuned using the Plotly plotting syntax (https://plotly.com/python/creating-and-updating-figures/).

In [None]:
combined_plot = LogPlot(profile_bh_wfs1_2a, no_panels=3, fillcolordict=soiltypecolors)
for _push in bhwfs12a_cpt.data['Push'].unique():
    try:
        _push_data = bhwfs12a_cpt.data[bhwfs12a_cpt.data['Push'] == _push]
        combined_plot.add_trace(
            x=_push_data['qc [MPa]'],
            z=_push_data['z [m]'],
            line=dict(color='black'),
            name='Cone resistance',
            showlegend=False,
            panel_no=1)
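    except Exception as err:
        # Report the push that could not be plotted instead of failing silently
        print(f"Skipping push {_push}: {err}")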
combined_plot.add_trace(
    x=wc_bhwfs1_2a['Water content [%]'],
    z=wc_bhwfs1_2a['Depth [m]'],
    name='Water content',
    marker_symbol='circle-open',
    mode='markers',
    showlegend=False,
    panel_no=2)
combined_plot.add_trace(
    x=bhwfs1_1_spcpt['Vs [m/s]'],
    z=bhwfs1_1_spcpt['z [m]'],
    name='Shear wave velocity',
    marker_symbol='square-open',
    mode='markers',
    showlegend=False,
    panel_no=3)

combined_plot.set_xaxis(title=r'$ q_c \ \text{[MPa]} $', panel_no=1)
combined_plot.set_xaxis(title=r'$ w \ \text{[%]} $', panel_no=2, range=(0, 50))
combined_plot.set_xaxis(title=r'$ V_s \ \text{[m/s]} $', panel_no=3, range=(0, 500))
combined_plot.set_zaxis(title=r'$ z \ \text{[m]} $', range=(80, 0))
combined_plot.show()
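
The reason for plotting one trace per push is that a line trace connects consecutive points, so a single trace would draw a line across the depth gap between two pushes. The sketch below shows the grouping step on plain Python rows with made-up values. It uses the same ``'Push'``, ``'z [m]'`` and ``'qc [MPa]'`` column names as the CPT data above, and ``split_by_push`` is a hypothetical helper name.

```python
from collections import defaultdict


def split_by_push(rows):
    """Group rows by push number, keeping depth and cone resistance per push."""
    pushes = defaultdict(lambda: {"z [m]": [], "qc [MPa]": []})
    for row in rows:
        trace = pushes[row["Push"]]
        trace["z [m]"].append(row["z [m]"])
        trace["qc [MPa]"].append(row["qc [MPa]"])
    return dict(pushes)


# Made-up rows: two pushes separated by a depth gap
example_rows = [
    {"Push": 1, "z [m]": 1.0, "qc [MPa]": 3.2},
    {"Push": 1, "z [m]": 1.5, "qc [MPa]": 4.1},
    {"Push": 2, "z [m]": 4.0, "qc [MPa]": 10.5},
]
traces = split_by_push(example_rows)
print(sorted(traces))
```

Each entry of ``traces`` corresponds to one trace in the log plot, so no line is drawn between the end of one push and the start of the next.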