# Comparing modelled heads to observations

This notebook showcases some methods to compare modelled heads to head observations.

In [None]:
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import xarray as xr

import nlmod

First read in model results from the IJmuiden model (03_local_grid_refinement.ipynb).

In [None]:
model_ws = Path("ijmuiden")
model_name = "IJm_planeten"

# read model dataset
ds = xr.open_dataset(model_ws / f"{model_name}.nc")

# read heads
head = nlmod.gwf.output.get_heads_da(ds)

Compute the groundwater level and plot the result for the first timestep.

In [None]:
# compute the groundwater level in each time step
gwl = nlmod.gwf.output.get_gwl_from_wet_cells(head)
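
As a rough illustration of the idea behind this step: the groundwater level in a cell can be taken as the head in the uppermost layer that is not dry. The sketch below mimics that logic with plain NumPy on a made-up array, assuming dry or inactive cells are stored as NaN. It is not nlmod's implementation, and `head_toy` and its values are hypothetical.

```python
import numpy as np

# Hypothetical heads with shape (layer, cell); NaN marks a dry or inactive cell.
head_toy = np.array(
    [
        [np.nan, 1.2, np.nan],
        [0.8, 1.1, np.nan],
        [0.7, 1.0, np.nan],
    ]
)

wet = ~np.isnan(head_toy)
# argmax returns the index of the first True along the layer axis,
# i.e. the uppermost wet layer (and 0 where no layer is wet).
top_wet_layer = wet.argmax(axis=0)
gwl_toy = np.take_along_axis(head_toy, top_wet_layer[np.newaxis, :], axis=0)[0]
# Cells without any wet layer get no groundwater level.
gwl_toy[~wet.any(axis=0)] = np.nan

print(gwl_toy)
```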

In [None]:
# plot the groundwater level at the first time step
ax = nlmod.plot.map_array(
    gwl.isel(time=0), ds=ds, cmap="RdBu_r", colorbar_label="head [m+NAP]"
)

Load the measurements and plot the locations of the observation wells.

In [None]:
df = pd.read_pickle("./data/20250428_bro_ijmuiden_np1_26_4.pklz", compression="zip")
df.head()

In [None]:
f, ax = nlmod.plot.get_map(nlmod.grid.get_extent(ds), background=True)
ax.plot(df.x, df.y, "ko");

## Get the modelled heads

Get the heads from the cells in which the observation wells are located.

For this we use the `nlmod.layers.get_modellayers_indexer()` function, which takes
a model dataset (defining the model grid) and a dataframe (with the observation
well metadata) as input.

The dataframe must define the x, y locations of the
observation wells, and the top and bottom of the screens. By default it is
assumed these column names follow the hydropandas standard: `x`, `y`,
`screen_top` and `screen_bottom`.
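
If your own metadata uses other column names, one option is to rename them before building the indexer. The sketch below shows this with pandas only. The well names, coordinates and the original column names (`xcoord`, `ycoord`, `filter_top`, `filter_bottom`) are hypothetical, and whether nlmod needs further columns is not covered here.

```python
import pandas as pd

# Hypothetical well metadata with non-standard column names (values are made up).
raw = pd.DataFrame(
    {
        "xcoord": [101500.0, 102250.0],
        "ycoord": [497800.0, 498300.0],
        "filter_top": [-2.0, -12.5],
        "filter_bottom": [-3.0, -13.5],
    },
    index=["well_A", "well_B"],
)

# Rename to the column names that are expected by default.
df_toy = raw.rename(
    columns={
        "xcoord": "x",
        "ycoord": "y",
        "filter_top": "screen_top",
        "filter_bottom": "screen_bottom",
    }
)

# Basic sanity check: every screen top should lie above its bottom.
assert (df_toy["screen_top"] > df_toy["screen_bottom"]).all()
print(df_toy)
```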

In [None]:
idx = nlmod.layers.get_modellayers_indexer(ds, df)
idx

This indexer can be used directly (if no warnings were raised or if
`drop_nan_layers=True`) to obtain the heads in the cells with observation wells.


<div class="alert alert-info" role="alert">

<strong>Note:</strong> If warnings were raised, this means there are
observation wells for which the corresponding model layer could not be
determined (these probably lie above or below the model). In this case the
modellayer is returned as a float array and contains NaNs.

Some post-processing is then needed before the indexer can be used, e.g.
dropping the NaN values:

<code>idx.dropna("name", subset=["layer"])</code>

Additionally, the layer might also have to be renamed to get the layer names
corresponding to the layer indices:

<code>idx["layer"].values = ds["layer"].values[idx["layer"].astype(int)]</code>.
</div>
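
The post-processing described in the note can be tried out on a toy indexer first. The sketch below assumes the indexer is an `xarray.Dataset` with a float `layer` variable along the `name` dimension; the dataset, the well names and the layer names are made up for illustration and do not come from the IJmuiden model.

```python
import numpy as np
import xarray as xr

# Hypothetical layer names and a toy indexer in which one well has no layer.
layer_names = np.array(["layer_a", "layer_b", "layer_c"])
idx_toy = xr.Dataset(
    {"layer": ("name", [0.0, np.nan, 2.0])},
    coords={"name": ["well_A", "well_B", "well_C"]},
)

# Drop the wells for which the layer could not be determined.
idx_clean = idx_toy.dropna("name", subset=["layer"])

# Replace the integer layer positions by the corresponding layer names.
positions = idx_clean["layer"].values.astype(int)
idx_clean["layer"] = ("name", layer_names[positions])

print(idx_clean["name"].values.tolist())
print(idx_clean["layer"].values.tolist())
```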

Try using the indexer to get the modelled heads for each observation well:

In [None]:
hsim = head.sel(**idx)
hsim

Get and plot the result for an arbitrary observation well.

In [None]:
i = 20
hsim.isel(name=i)

In [None]:
hsim.isel(name=i).plot(marker="o", figsize=(10, 3))
# plot observations
df.obs.loc[hsim["name"].values[i]].loc[
    pd.Timestamp(hsim.time[0].item()) : pd.Timestamp(hsim.time[-1].item())
].plot(y="value", ax=plt.gca(), marker="x");

Compute the time-averaged modelled head at each observation well and plot it on a map.

In [None]:
hmean = hsim.mean("time")

f, ax = nlmod.plot.get_map(nlmod.grid.get_extent(ds), background=True)
cm = ax.scatter(
    hmean.x,
    hmean.y,
    s=75,
    c=hmean.values,
    cmap="RdBu_r",
    edgecolors="k",
    linewidths=0.75,
)
cbar = f.colorbar(cm, ax=ax, label="head [m+NAP]", shrink=0.85)

## Interpolating heads

It is also possible to interpolate the heads at the locations of the observation wells. For this we need the original x, y coordinates of the observation wells as well as the layer each well is measuring in. 

To get all this information it can be useful to use `full_output=True` in
`nlmod.layers.get_modellayers_indexer()`. This returns every variable that is
necessary to compute the layer for each observation well. 

<div class="alert alert-info" role="alert">
<strong>Note:</strong> This also returns the model layer for the <code>screen_top</code> and <code>screen_bottom</code> separately, allowing you to identify observation wells spanning multiple layers.
</div>
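
To make the idea of a screen spanning multiple layers concrete, the sketch below assigns a layer to the top and to the bottom of each screen and compares the two. It is a simplified illustration with made-up layer elevations and screens, not nlmod's algorithm; the helper `layer_of` and the convention that an elevation exactly on a layer boundary belongs to the lower layer are assumptions.

```python
import numpy as np

# Hypothetical layer elevations at one location [m+NAP].
top = 2.0
botm = np.array([-5.0, -20.0, -50.0])  # bottoms of layers 0, 1 and 2


def layer_of(z):
    """Return the index of the layer containing elevation z, or -1 if outside."""
    if z > top or z <= botm[-1]:
        return -1
    # Each layer bottom at or above z means z lies one layer deeper.
    return int(np.sum(z <= botm))


# Hypothetical screens as (screen_top, screen_bottom) [m+NAP].
screens = {
    "well_A": (-1.0, -3.0),
    "well_B": (-4.0, -8.0),
    "well_C": (-60.0, -62.0),
}

for name, (z_top, z_bot) in screens.items():
    lay_top, lay_bot = layer_of(z_top), layer_of(z_bot)
    if lay_top == -1 or lay_bot == -1:
        status = "outside the model"
    elif lay_top != lay_bot:
        status = f"spans layers {lay_top} to {lay_bot}"
    else:
        status = f"in layer {lay_top}"
    print(name, status)
```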

In [None]:
idx_full = nlmod.layers.get_modellayers_indexer(ds, df, full_output=True)
idx_full

Now we can use `nlmod.observations.interpolate_points_ds()` to compute the interpolated heads
at each observation well. The first argument is the data array we want to
interpolate. The second argument is a dataset containing information about the
location and layer for each observation well.

We need to pass the correct names for each variable:
 - `x`, `y`: the coordinate names for the locations of the computed heads, the default is `"x"` and `"y"`
 - `xi`, `yi`: the coordinate names of the observation wells in `idx_full`, the default is `"x"` and `"y"`
 - `layer`: the layer dimension, the default is `"layer"`

Our data matches the default so we don't need to adjust anything.
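
To see why an interpolated head can differ from the head of the cell containing the well, the sketch below compares the nearest cell-center value with a bilinear interpolation on a tiny regular grid. This is a generic NumPy illustration with made-up numbers; the interpolation scheme used by nlmod may differ.

```python
import numpy as np

# Hypothetical cell centers of a 2 x 2 patch of a regular grid and their heads.
xc = np.array([0.0, 100.0])
yc = np.array([0.0, 100.0])
head_patch = np.array([[1.0, 2.0], [3.0, 4.0]])  # indexed as [iy, ix]

# Hypothetical observation well location inside the patch.
xi, yi = 25.0, 60.0

# Value of the cell whose center is nearest to the well.
h_nearest = head_patch[np.abs(yc - yi).argmin(), np.abs(xc - xi).argmin()]

# Bilinear interpolation between the four surrounding cell centers.
tx = (xi - xc[0]) / (xc[1] - xc[0])
ty = (yi - yc[0]) / (yc[1] - yc[0])
h_south = (1 - tx) * head_patch[0, 0] + tx * head_patch[0, 1]
h_north = (1 - tx) * head_patch[1, 0] + tx * head_patch[1, 1]
h_interp = (1 - ty) * h_south + ty * h_north

print(h_nearest, h_interp)
```

The closer a well lies to a cell center, the smaller the difference between the two values becomes.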

<div class="alert alert-warning" role="alert">
<strong>Structured grids:</strong> For structured grids the x and y-coordinates returned by <code>nlmod.layers.get_modellayers_indexer()</code> are the coordinates of the cell centers. This way the result can be used directly for indexing a data array. The original locations of the observation wells are stored under <code>"x_obs"</code> and <code>"y_obs"</code>. When using structured grids, make sure to pass the correct coordinate names for <code>xi</code> and <code>yi</code> to the interpolate function.
</div>

In [None]:
hsim_i = nlmod.observations.interpolate_points_ds(head, idx_full)
hsim_i

Compare the interpolated result to the earlier result.

In [None]:
hsim.isel(name=i).plot(marker="o", figsize=(10, 3))
hsim_i.isel(name=i).plot(marker="o", ax=plt.gca());

Plot the location of the observation well in the grid:

In [None]:
obswell = idx_full.isel(name=i)
extent = [obswell.x - 200, obswell.x + 200, obswell.y - 200, obswell.y + 200]
f, ax = nlmod.plot.get_map(extent, background=True, figsize=6)
nlmod.plot.modelgrid(ds, ax=ax)
ax.plot(head.x, head.y, "k.", label="cell centers")
ax.plot(obswell.x, obswell.y, "ro", markersize=10, label=obswell.name.item())
ax.legend(loc=(0, 1), frameon=False, ncol=2, fontsize="small");