# Adding surface water to a model

*D.A. Brakenhoff, Artesia, 2020*
*R.J. Calje, Artesia, 2023*

This example notebook shows how to add surface water defined in a GeoDataFrame to a MODFLOW model using the `nlmod` package.

There are three water boards in the model area, and we download seasonal data about the stage of the surface water for each. In this notebook we perform a steady-state run, in which the stage of the surface water is the mean of the summer and winter stage. For locations without a stage from the water board, we obtain information from a Digital Terrain Model near the surface water features, to estimate a stage. We assign a stage of 0.0 m NAP to the river Lek. The surface water bodies in each cell are aggregated using an area-weighted method and added to the model with the river-package.

In [None]:
import os

import flopy
import matplotlib.pyplot as plt
import rioxarray

import nlmod

In [None]:
nlmod.util.get_color_logger("INFO")
nlmod.show_versions()

## Load data

First we define the extent of our model and subsequently input that information into the convenient methods in `nlmod` to download all the relevant data and create a Modflow6 model.

In [None]:
model_name = "steady"
model_ws = "schoonhoven"
figdir, cachedir = nlmod.util.get_model_dirs(model_ws)
extent = [116_500, 120_000, 439_000, 442_000]

### AHN
Download the Digital Terrain Model of the Netherlands (AHN). To speed up this notebook we download the data at a resolution of 5 meter. We can switch to a resolution of 0.5 meter by changing the identifier to "AHN4_DTM_05m".

In [None]:
fname_ahn = os.path.join(cachedir, "ahn.tif")
if not os.path.isfile(fname_ahn):
    ahn = nlmod.read.ahn.get_ahn4(extent)
    ahn.rio.to_raster(fname_ahn)
ahn = rioxarray.open_rasterio(fname_ahn, mask_and_scale=True)[0]

### Layer 'waterdeel' from BGT
As the source of the location of the surface water bodies we use the 'waterdeel' layer of the Basisregistratie Grootschalige Topografie (BGT). This dataset consists of detailed polygons, maintained by Dutch government agencies (water boards, municipalities and Rijkswaterstaat).

In [None]:
bgt = nlmod.read.bgt.get_bgt(extent)

#### Add minimum surface height around surface water bodies
Get the minimum surface level within 5 meter of each surface water body and add these data to the column 'ahn_min'.

In [None]:
bgt = nlmod.gwf.add_min_ahn_to_gdf(bgt, ahn, buffer=5.0, column="ahn_min")

#### Plot 'bronhouder'
We can plot the column 'bronhouder' from the GeoDataFrame bgt. We see there are three water boards in this area (with codes starting with 'W').

In [None]:
f, ax = nlmod.plot.get_map(extent)
bgt.plot("bronhouder", legend=True, ax=ax)

### Level areas
For these three water boards we download the level areas (peilgebieden): polygons with information about winter and summer stages.

In [None]:
la = nlmod.gwf.surface_water.download_level_areas(
    bgt, extent=extent, raise_exceptions=False
)

#### Plot summer stage
The method `download_level_areas()` generates a dictionary with the names of the water boards as keys and GeoDataFrames as values. Each GeoDataFrame contains the columns `summer_stage` and `winter_stage`. Let's plot the summer stage, together with the location of the surface water bodies.

In [None]:
f, ax = nlmod.plot.get_map(extent)
bgt.plot(color="k", ax=ax)
for wb in la:
    la[wb].plot("summer_stage", ax=ax, vmin=-3, vmax=1, zorder=0)

#### Add stages to bgt-data
We then add the information from these level areas to the surface water bodies.

In [None]:
bgt = nlmod.gwf.surface_water.add_stages_from_waterboards(bgt, la=la)

#### Save the data to use in other notebooks as well
We save the bgt-data to a GeoPackage file, so we can use the data in other notebooks with surface water as well.

In [None]:
fname_bgt = os.path.join(cachedir, "bgt.gpkg")
bgt.to_file(fname_bgt)

#### Change some values in the GeoDataFrame for this model

In [None]:
sfw = bgt
sfw["stage"] = sfw[["winter_stage", "summer_stage"]].mean(axis=1)
# use a water depth of 0.5 meter
sfw["botm"] = sfw["stage"] - 0.5
# set the stage of the Lek to 0.0 m NAP and the botm to -3 m NAP
mask = sfw["bronhouder"] == "L0002"
sfw.loc[mask, "stage"] = 0.0
sfw.loc[mask, "botm"] = -3.0

Take a look at the first few rows. For adding surface water features to a MODFLOW model the following attributes must be present:

- **stage**: the water level (in m NAP)
- **botm**: the bottom elevation (in m NAP)
- **c0**: the bottom resistance (in days)

The `stage` and the `botm` columns are present in our dataset. The bottom resistance `c0` is rarely known, and is usually estimated when building the model. We will add our estimate later on.
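
The stage computed above is the row-wise mean of the winter and summer stage. The following sketch, with made-up values for three hypothetical features, shows how that mean behaves when a stage is missing. Only the column names `winter_stage`, `summer_stage`, `stage` and `botm` come from this notebook. Because pandas skips NaN by default, a feature with only one known stage gets that stage, and a feature without any known stage stays NaN.

```python
import numpy as np
import pandas as pd

# made-up stages (m NAP) for three hypothetical surface water features
demo = pd.DataFrame(
    {
        "winter_stage": [-1.0, -2.0, np.nan],
        "summer_stage": [-0.5, np.nan, np.nan],
    }
)

# row-wise mean; pandas ignores NaN by default (skipna=True)
demo["stage"] = demo[["winter_stage", "summer_stage"]].mean(axis=1)
# same 0.5 m water depth as assumed in this notebook
demo["botm"] = demo["stage"] - 0.5
print(demo)
```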

<div class="alert alert-info">
    
<b>Note:</b>

The NaNs in the dataset indicate that not all parameters are known for each feature. This is not necessarily a problem, but it does mean that some features will not be converted to model input.
   
</div>

Now use `stage` as the column to color the data. Note the missing features caused by the fact that the stage is undefined (NaN).

In [None]:
fig, ax = nlmod.plot.get_map(extent)
sfw.plot(ax=ax, column="stage", legend=True)

## Build model

The next step is to define a model grid and build a model (i.e. create a discretization and define flow parameters). We're keeping the model as simple as possible.

In [None]:
delr = delc = 50.0
start_time = "2021-01-01"

In [None]:
# layer model
layer_model = nlmod.read.get_regis(
    extent, cachedir=cachedir, cachename="layer_model.nc"
)
layer_model

In [None]:
# create a model ds by changing grid of layer_model
ds = nlmod.to_model_ds(layer_model, model_name, model_ws, delr=delr, delc=delc)

# create model time dataset
ds = nlmod.time.set_ds_time(ds, start=start_time, steady=True, perlen=1)

ds

In [None]:
# create simulation
sim = nlmod.sim.sim(ds)

# create time discretisation
tdis = nlmod.sim.tdis(ds, sim)

# create ims
ims = nlmod.sim.ims(sim)

# create groundwater flow model
gwf = nlmod.gwf.gwf(ds, sim)

# Create discretization
dis = nlmod.gwf.dis(ds, gwf)

# create node property flow
npf = nlmod.gwf.npf(ds, gwf)

# Create the initial conditions package
ic = nlmod.gwf.ic(ds, gwf, starting_head=1.0)

# Create the output control package
oc = nlmod.gwf.oc(ds, gwf)

## Add surface water

Now that we have a discretization (a grid, and layer tops and bottoms) we can start processing our surface water shapefile to add surface water features to our model. The method to add surface water starting from a shapefile is divided into the following steps:

1. Intersect surface water shape with grid. This step intersects every feature with the grid so we can determine the surface water features in each cell.
2. Aggregate parameters per grid cell. Each feature within a cell has its own parameters. For MODFLOW it is often desirable to have one representative set of parameters per cell. These representative parameters are calculated in this step.
3. Build stress period data. The results from the previous step are converted to stress period data (generally a list of cellids and representative parameters: `[(cellid), parameters]`) which is used by MODFLOW and flopy to define boundary conditions.
4. Create the Modflow6 package

The steps are illustrated below.

### Intersect surface water shape with grid

The first step is to intersect the surface water shapefile with the grid.

In [None]:
sfw_grid = nlmod.grid.gdf_to_grid(
    sfw, gwf, cachedir=ds.cachedir, cachename="sfw_grid.pklz"
)

Plot the result together with the model grid, colored by `cellid`. It may be a bit hard to see, but each feature is cut by the grid lines.

In [None]:
fig, ax = nlmod.plot.get_map(extent)
sfw_grid.plot(ax=ax, column="cellid")
nlmod.plot.modelgrid(ds, ax=ax, lw=0.2)
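
To make the idea of cutting features by grid lines concrete, here is a simplified sketch in plain Python. It assumes a regular grid and an axis-aligned rectangular feature, whereas `gdf_to_grid` works on the real polygons. The function `intersect_rectangle_with_grid` and the ditch geometry are hypothetical and only serve as illustration.

```python
def intersect_rectangle_with_grid(rect, extent, delr, delc):
    """Split an axis-aligned rectangle over the cells of a regular grid.

    rect and extent are (xmin, xmax, ymin, ymax). Returns a dict that maps
    cellid (row, col) to the overlapping area. Row 0 is the northernmost
    row, as in MODFLOW.
    """
    xmin, xmax, ymin, ymax = extent
    ncol = int(round((xmax - xmin) / delr))
    nrow = int(round((ymax - ymin) / delc))
    areas = {}
    for row in range(nrow):
        # cell bounds in y; rows are numbered from the top of the grid
        cell_ymax = ymax - row * delc
        cell_ymin = cell_ymax - delc
        dy = min(rect[3], cell_ymax) - max(rect[2], cell_ymin)
        if dy <= 0:
            continue
        for col in range(ncol):
            cell_xmin = xmin + col * delr
            cell_xmax = cell_xmin + delr
            dx = min(rect[1], cell_xmax) - max(rect[0], cell_xmin)
            if dx > 0:
                areas[(row, col)] = dx * dy
    return areas


# hypothetical 4 m wide ditch crossing two columns of a 2 x 2 grid of 50 m cells
ditch = (30.0, 80.0, 60.0, 64.0)
pieces = intersect_rectangle_with_grid(ditch, (0.0, 100.0, 0.0, 100.0), 50.0, 50.0)
print(pieces)
```

The single ditch ends up as two pieces, one per cell, and the areas of the pieces add up to the area of the original feature.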

### Aggregate parameters per model cell
The next step is to aggregate the parameters for all the features in one grid cell to obtain one representative set of parameters. First, let's take a look at a grid cell containing multiple features. We take the grid cell that contains the most features.

In [None]:
cid = sfw_grid.cellid.value_counts().index[0]
mask = sfw_grid.cellid == cid
sfw_grid.loc[mask]

We can also plot the features within that grid cell.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
sfw_grid.loc[mask].plot(
    column="identificatie",
    legend=True,
    ax=ax,
    legend_kwds={"loc": "lower left", "ncol": 2, "fontsize": "x-small"},
)
xlim = ax.get_xlim()
ylim = ax.get_ylim()
gwf.modelgrid.plot(ax=ax)
ax.set_xlim(xlim[0], xlim[0] + nlmod.grid.get_delr(ds)[-1] * 1.1)
ax.set_ylim(ylim)
ax.set_title(f"Surface water shapes in cell: {cid}")

Now we want to aggregate the features in each cell to obtain a representative set of parameters (`stage`, `conductance`, `bottom elevation`) to use in the model. There are several aggregation methods. Note that the names of the methods are not representative of the aggregation applied to each parameter. For a full description see the following list:

- `'area_weighted'`
  - **stage**: area-weighted average of stage in cell
  - **cond**: conductance is equal to area of surface water divided by bottom resistance
  - **elev**: the lowest bottom elevation is representative for the cell
- `'max_area'`
  - **stage**: stage is determined by the largest surface water feature in a cell
  - **cond**: conductance is equal to area of all surface water features divided by bottom resistance
  - **elev**: the lowest bottom elevation is representative for the cell
- `'de_lange'`
  - **stage**: area-weighted average of stage in cell
  - **cond**: conductance is calculated using the formulas derived by De Lange (1999).
  - **elev**: the lowest bottom elevation is representative for the cell
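
The first two methods in the list above can be sketched with a pandas `groupby`. This is an illustration under stated assumptions, not the nlmod implementation: the function `aggregate_sketch`, the column names `area` and `cellid` as used here, and all values are made up, and integer cell numbers are used instead of `(row, col)` tuples to keep the sketch short.

```python
import pandas as pd

# made-up pieces of surface water after intersection with the grid;
# "area" in m2, "stage" and "botm" in m NAP, "c0" in days
pieces = pd.DataFrame(
    {
        "cellid": [0, 0, 1],
        "area": [80.0, 20.0, 120.0],
        "stage": [-1.0, -2.0, -1.5],
        "botm": [-1.5, -2.5, -2.0],
        "c0": [1.0, 1.0, 1.0],
    }
)


def aggregate_sketch(df, method="area_weighted"):
    """Illustrative aggregation per cell; not the nlmod implementation."""
    rows = {}
    for cellid, group in df.groupby("cellid"):
        if method == "area_weighted":
            stage = (group["stage"] * group["area"]).sum() / group["area"].sum()
        elif method == "max_area":
            stage = group.loc[group["area"].idxmax(), "stage"]
        else:
            raise ValueError(f"unknown method: {method}")
        rows[cellid] = {
            "stage": stage,
            # conductance (m2/d): surface water area divided by resistance
            "cond": (group["area"] / group["c0"]).sum(),
            # the lowest bottom elevation represents the cell
            "rbot": group["botm"].min(),
            "area": group["area"].sum(),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


celldata_aw = aggregate_sketch(pieces, "area_weighted")
celldata_ma = aggregate_sketch(pieces, "max_area")
print(celldata_aw)
print(celldata_ma)
```

In cell 0 the two methods give a different stage, because the smaller feature pulls the area-weighted average down while `max_area` ignores it. The conductance and bottom elevation are the same for both methods.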
  
Let's try using `area_weighted`. This means the stage is the area-weighted average of all the surface water features in a cell. The conductance is calculated by dividing the total area of surface water in a cell by the bottom resistance (`c0`). The representative bottom elevation is the lowest elevation present in the cell.

In [None]:
try:
    nlmod.gwf.surface_water.aggregate(sfw_grid, "area_weighted")
except ValueError as e:
    print(e)

The function checks whether the requisite columns are defined in the DataFrame. We need to add a column containing the bottom resistance `c0`. Often a value of 1 day is used as an initial estimate.

In [None]:
sfw_grid["c0"] = 1.0  # days
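
A check on required columns like the one that raised the error above could look like the following sketch. The helper `check_required_columns` and its error message are hypothetical and not part of nlmod. The required names are the three attributes listed earlier in this notebook.

```python
# attributes that must be present, as listed earlier in this notebook
REQUIRED_COLUMNS = ("stage", "botm", "c0")


def check_required_columns(columns, required=REQUIRED_COLUMNS):
    """Raise a ValueError naming the required columns that are missing."""
    missing = [name for name in required if name not in columns]
    if missing:
        raise ValueError(f"Missing columns in DataFrame: {missing}")


try:
    # hypothetical column list without the bottom resistance
    check_required_columns(["stage", "botm", "area"])
except ValueError as e:
    message = str(e)
    print(message)
```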

Now aggregate the features.

In [None]:
celldata = nlmod.gwf.surface_water.aggregate(
    sfw_grid, "area_weighted", cachedir=ds.cachedir, cachename="celldata.pklz"
)

Let's take a look at the result. We now have a DataFrame with the cell-id as the index and the three parameters we need for each cell: `stage`, `cond` and `rbot`. The area is also given, but is not needed for the groundwater model.

In [None]:
celldata.head(10)

### Build stress period data

The next step is to take our cell-data and convert it to 'stress period data' for MODFLOW. This is a data format that defines the parameters in each cell in the following format:

```
[[(cellid1), param1a, param1b, param1c],
 [(cellid2), param2a, param2b, param2c],
 ...]
```

The required parameters are defined by the MODFLOW-package used:

- **RIV**: for the river package `(stage, cond, rbot)`
- **DRN**: for the drain package `(stage, cond)`
- **GHB**: for the general-head-boundary package `(stage, cond)`

We're selecting the RIV package. We don't have a bottom (`rbot`) for each reach in `celldata`. Therefore we remove the reaches where `rbot` is NaN (not a number).

In [None]:
new_celldata = celldata.loc[~celldata.rbot.isna()]
print(f"removed {len(celldata)-len(new_celldata)} reaches because rbot is nan")

In [None]:
riv_spd = nlmod.gwf.surface_water.build_spd(new_celldata, "RIV", ds)

Take a look at the stress period data for the river package:

In [None]:
riv_spd[:10]
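
To make the format concrete, the sketch below converts a few made-up cells into stress period data for the RIV package. It assumes that every reach is placed in one fixed layer, which is a simplification. The function `build_riv_spd_sketch` is hypothetical and is not the implementation of `build_spd`.

```python
import math

# made-up aggregated parameters per cell: (row, col) -> stage, cond, rbot
celldata_demo = {
    (0, 0): {"stage": -1.2, "cond": 100.0, "rbot": -2.5},
    (0, 1): {"stage": -1.5, "cond": 120.0, "rbot": float("nan")},
    (1, 1): {"stage": -0.8, "cond": 40.0, "rbot": -1.3},
}


def build_riv_spd_sketch(celldata, layer=0):
    """Return RIV stress period data as [[(layer, row, col), stage, cond, rbot], ...].

    Assumption: every reach is put in a single, fixed layer. Reaches with a
    NaN in any parameter are skipped.
    """
    spd = []
    for (row, col), par in celldata.items():
        values = [par["stage"], par["cond"], par["rbot"]]
        if any(math.isnan(v) for v in values):
            continue
        spd.append([(layer, row, col), *values])
    return spd


spd_demo = build_riv_spd_sketch(celldata_demo)
print(spd_demo)
```

The cell without a river bottom is dropped, just like the reaches removed from `celldata` above.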

### Create RIV package
The final step is to create the river package using flopy.

In [None]:
riv = flopy.mf6.ModflowGwfriv(gwf, stress_period_data=riv_spd)

Plot the river boundary condition to see where rivers were added to the model.

In [None]:
# use flopy plotting methods
fig, ax = plt.subplots(1, 1, figsize=(10, 8), constrained_layout=True)
mv = flopy.plot.PlotMapView(model=gwf, ax=ax, layer=0)
mv.plot_bc("RIV")
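
For context on why the river package needs `rbot` in addition to `stage` and `cond`: the exchange between a river cell and the aquifer is proportional to the difference between stage and head, but it no longer increases once the head drops below the river bottom. A minimal sketch of this standard formulation, with a hypothetical function name and made-up numbers:

```python
def river_flux(head, stage, cond, rbot):
    """Flow from the river into the aquifer (m3/d); negative means drainage."""
    # below the river bottom the head no longer influences the infiltration
    return cond * (stage - max(head, rbot))


# head above the stage: the river drains the aquifer
print(river_flux(head=-1.0, stage=-1.2, cond=100.0, rbot=-2.5))
# head between river bottom and stage: the river infiltrates
print(river_flux(head=-2.0, stage=-1.2, cond=100.0, rbot=-2.5))
# head below the river bottom: infiltration is limited by rbot
print(river_flux(head=-5.0, stage=-1.2, cond=100.0, rbot=-2.5))
```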

## Write + run model

Now write the model simulation to disk, and run the simulation.

In [None]:
nlmod.sim.write_and_run(sim, ds, write_ds=True, script_path="02_surface_water.ipynb")

## Visualize results

To see whether our surface water was correctly added to the model, let's visualize the results. We'll load the calculated heads, and plot them.

In [None]:
head = nlmod.gwf.get_heads_da(ds)

Plot the heads in a specific model layer.

In [None]:
# using nlmod plotting methods
ax = nlmod.plot.map_array(
    head,
    ds,
    ilay=0,
    iper=0,
    plot_grid=True,
    title="Heads top-view",
    cmap="RdBu",
    colorbar_label="head [m NAP]",
)

Plot the heads in a cross-section.

In [None]:
# using flopy plotting methods
col = gwf.modelgrid.ncol // 2

fig, ax = plt.subplots(1, 1, figsize=(10, 3))
xs = flopy.plot.PlotCrossSection(model=gwf, ax=ax, line={"column": col})
qm = xs.plot_array(head[-1], cmap="RdBu")  # last timestep
xs.plot_ibound()  # plot inactive cells
xs.plot_grid(lw=0.25, color="k")
ax.set_ylim(bottom=-150)
ax.set_ylabel("elevation [m NAP]")
ax.set_xlabel("distance along cross-section [m]")
ax.set_title(f"Cross-section along column {col}")
cbar = fig.colorbar(qm, shrink=1.0)
cbar.set_label("head [m NAP]")