# Add stratigraphy from GeoTOP to CPT data

This example shows how to add stratigraphic layer boundaries from GeoTOP to CPT data, so that CPT parameters can be aggregated per geological unit, for instance to compute averages. We use a selection of CPTs in the area of the Utrecht Science Park (USP).

We will first import the relevant modules and plot the locations of the CPTs in the [`CptCollection`](../api_reference/cpt_collection.rst).

In [None]:
import seaborn as sns
from matplotlib import pyplot as plt

import geost
from geost.analysis.combine import add_voxelmodel_variable
from geost.bro import GeoTop
from geost.bro.bro_geotop import StratGeotop

cpts = geost.data.cpts_usp()
cpts.header.explore(style_kwds=dict(color="red", weight=6))

## Adding information from a voxelmodel
Any variable from a voxel model can be added to the CPT data. This example shows how to add the stratigraphy from GeoTOP to the CPTs. First, we read GeoTOP directly from the OPeNDAP server for the USP area.

In [None]:
geotop = GeoTop.from_opendap(data_vars=["strat"], bbox=cpts.header.total_bounds)
print(geotop)

As you can see, this prints a [`GeoTop`](../api_reference/bro_geotop.rst) instance along with the dimensions and resolution of GeoTOP. We can use this GeoTop instance to add the variable "strat" to the CPT data, which adds a column "strat" to the data object of the [`CptCollection`](../api_reference/cpt_collection.rst). The function expects a "depth" column in the CPT data, so we first create one from the column "penetration_length". Below we add the variable and print the resulting column:

In [None]:
# A "depth" column is expected; derive it from the penetration length
cpts.data["depth"] = cpts.data["penetration_length"]
cpts = add_voxelmodel_variable(cpts, geotop, "strat")
print(cpts.data["strat"])
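
Conceptually, adding a voxel model variable means finding, for each CPT sample, the voxel that contains it. The sketch below illustrates that idea for a single vertical column of voxels using only the standard library. It is not GeoST's implementation: the function `lookup_unit`, the boundary elevations and the unit codes are all hypothetical, and it assumes depth is measured downward from the surface level.

```python
from bisect import bisect_right

# Hypothetical voxel column: boundary elevations in m, sorted ascending.
# units[i] is the unit code between boundaries[i] and boundaries[i + 1].
boundaries = [-10.0, -6.0, -2.0, 1.0]
units = [3, 2, 1]


def lookup_unit(surface_level, depth, boundaries, units):
    """Return the unit code at `depth` below `surface_level`, or None if outside the column."""
    elevation = surface_level - depth
    idx = bisect_right(boundaries, elevation) - 1
    if idx < 0 or idx >= len(units):
        return None  # sample lies above or below the modelled column
    return units[idx]


# A CPT starting at 0.5 m elevation, sampled at several depths
sample_depths = [0.0, 3.0, 8.0, 12.0]
sample_units = [lookup_unit(0.5, d, boundaries, units) for d in sample_depths]
print(sample_units)
```

The deepest sample falls below the lowest boundary of the column and therefore gets no unit, which is the same situation that produces missing values when a CPT reaches deeper than the model.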

Note that some of the resulting values are "NaN", which occurs where a CPT falls outside of the 3D model extent. In this case, some CPTs reach deeper than the maximum depth of GeoTOP. We can now aggregate any CPT parameter by the new "strat" variable. For example, we can make a boxplot of the cone resistance (column "cone_resistance") per unit:

In [None]:
fig, ax = plt.subplots()
sns.boxplot(ax=ax, data=cpts.data, x="strat", y="cone_resistance", showfliers=False)
ax.tick_params(axis="x", rotation=45)
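
Besides plotting, the same grouping can produce summary statistics per unit. Below is a minimal sketch using pandas `groupby` on a small stand-in table. The values and unit codes are made up for illustration; with the real data one would group `cpts.data` instead, assuming it behaves like a pandas DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for cpts.data with made-up unit codes and values
df = pd.DataFrame(
    {
        "strat": [1100.0, 1100.0, 2010.0, 2010.0, float("nan")],
        "cone_resistance": [1.0, 3.0, 10.0, 14.0, 20.0],
    }
)

# Rows where "strat" is NaN are dropped from the groups by default
summary = df.groupby("strat")["cone_resistance"].agg(["mean", "median", "count"])
print(summary)
```

Samples without a unit (NaN) do not form a group of their own, so they are left out of the summary.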

Each number on the x-axis represents a geological unit. These numbers are of course not very intuitive, and it is hard to tell which geological unit each one refers to. GeoST provides the [`StratGeotop`](../api_reference/geotop_selection.rst) class, which makes it easy to select stratigraphic units or to relabel the unit numbers in the plot above with more meaningful names.

First, we select all the units that have been merged with the CPT data:

In [None]:
units = StratGeotop.select_values(cpts.data["strat"].unique())
units

This returns a list of enum members, one for each unit. Each member has a "name" and a corresponding "value". We can build a dictionary from these and use it to replace the numbers in the "strat" column with the names. Let's print the result and plot the figure again:

In [None]:
replace_dict = {unit.value: unit.name for unit in units}
# Assign the result back instead of using inplace=True on a selected column
cpts.data["strat"] = cpts.data["strat"].replace(replace_dict)
print(cpts.data["strat"])

fig, ax = plt.subplots()
sns.boxplot(ax=ax, data=cpts.data, x="strat", y="cone_resistance", showfliers=False)
ax.tick_params(axis="x", rotation=45)
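
For readers unfamiliar with enums, the name and value mapping used above can be sketched with the standard-library `enum` module. `DemoStrat` and its members are hypothetical stand-ins, not the actual `StratGeotop` definition or real GeoTOP codes.

```python
from enum import Enum


class DemoStrat(Enum):
    """Hypothetical stand-in for a stratigraphy enum."""

    UNIT_A = 1
    UNIT_B = 2
    UNIT_C = 3


# Select the members whose value occurs in the data
found = [2, 3]
units = [unit for unit in DemoStrat if unit.value in found]

# Map numeric codes to names; values without a match are left unchanged
replace_dict = {unit.value: unit.name for unit in units}
labels = [replace_dict.get(value, value) for value in [2, 3, 2, None]]
print(replace_dict)
print(labels)
```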

The x-axis of the plot now shows more meaningful abbreviations of the geological units, which can be looked up in the [stratigraphic nomenclature](https://www.dinoloket.nl/stratigrafische-nomenclator/boven-noordzee-groep). Most units belong to the "Boven-Noordzee Groep" (prefix "NU"). For example, the description of the unit "EC" (Formatie van Echteld) can be found [here](https://www.dinoloket.nl/stratigrafische-nomenclator/formatie-van-echteld).

The [`StratGeotop`](../api_reference/geotop_selection.rst) class can be used for all sorts of selections and groupings a user would like for further analyses.