# Example: Working with models in Python

The main feature of HydroMT is to facilitate the process of building and analyzing spatial geoscientific models, with a focus on water system models. It does so by automating the workflow that goes from raw data to a complete model instance that is ready to run, and by helping to analyze model results once the simulation has finished.

This notebook will explore how to work with HydroMT models in Python.

In [None]:
import geopandas as gpd

# other imports
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import hydromt

## Available models and components in HydroMT

To find out which models and components are available in your active environment, you can use the global `PLUGINS` variable in hydromt:

In [None]:
# generic model classes
print(f"Model classes: {hydromt.PLUGINS.model_summary()}")
# model components available in this environment
print(f"Model components: {hydromt.PLUGINS.component_summary()}")

Here you see that the core ``model`` and ``example_model`` classes are available, along with many generic components from core.

Apart from ``ConfigComponent``, the generic components do not have methods to easily add data to them. In this notebook, we will therefore use the ``example_model``.

## Model components

HydroMT defines any model through the model-agnostic ``Model`` API. By default, ``Model`` does not contain any components, but these can be added during instantiation. Subclasses of ``Model``, such as ``ExampleModel`` here, will usually already contain several components. This is also the case if you are using a HydroMT plugin such as ``hydromt_wflow``, which defines ``WflowSbmModel`` and ``WflowSedimentModel``.

But back to our ``ExampleModel`` class. Let's see which components make up this model:

In [None]:
from hydromt.model import ExampleModel

model = ExampleModel()
model.components

We see here that ``ExampleModel`` is made up of two components:

- ``config`` for the example model simulation settings file
- ``grid`` for the gridded data of the example model. In this case, you see that the grid is of type ``ExampleGridComponent``. This component contains two methods that we can use to populate our grid with data: ``create_from_region`` and ``add_data_from_rasterdataset``.

## Building a model step-by-step

To fill our model components with data, HydroMT uses **steps** or **setup_ methods**. These methods read input data using the DataCatalog, transform the data using processes (e.g. reprojection, deriving model parameters, etc.) and add the new model data to the right model component.

Here, we only have the following methods to add data to our model:

- config: [update](https://deltares.github.io/hydromt/latest/_generated/hydromt.model.components.ConfigComponent.update.html#hydromt.model.components.ConfigComponent.update)
- grid: [create_from_region](https://github.com/Deltares/hydromt/blob/fbb17de5046e657a6257959c5ee48634eab5adab/hydromt/model/example/example_grid_component.py#L58) and [add_data_from_rasterdataset](https://github.com/Deltares/hydromt/blob/fbb17de5046e657a6257959c5ee48634eab5adab/hydromt/model/example/example_grid_component.py#L156)

We are a little limited here, but if you are using a [plugin](https://deltares.github.io/hydromt/latest/about/plugins.html), check its documentation to get some inspiration!

Let's start populating our model by first creating a grid using ``grid.create_from_region``. This method parses the [HydroMT region option](https://deltares.github.io/hydromt/latest/user_guide/models/model_region.html) to define the geographic region of interest and the grid of the model to build. Once done, our ``ExampleModel`` will have a ``region`` property available.

As the region, let's use a subbasin defined by a point in the Piave basin. We first initialize an ``ExampleModel`` instance in writing mode at a model root folder. Data is sourced from the pre-defined ``artifact_data`` catalog.

In [None]:
root = "tmp_example_model_py"
model = ExampleModel(
    root=root,
    mode="w+",
    data_libs=["artifact_data"],
)

In [None]:
xy = [12.2051, 45.8331]
region = {"subbasin": xy, "uparea": 50}
model.grid.create_from_region(
    region=region,
    res=1000,
    crs="utm",
    hydrography_path="merit_hydro",
    basin_index_path="merit_hydro_index",
)
model.grid.data
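
The cell above passes `crs="utm"` together with a resolution of 1000, which only makes sense in a projected, metre-based coordinate system. A minimal sketch of how a UTM zone can be derived from the outlet point is shown below. It assumes that `"utm"` means "the UTM zone that contains the region", which the notebook does not spell out; the helper `utm_epsg_from_lonlat` is hypothetical and is not part of HydroMT.

```python
import math


def utm_epsg_from_lonlat(lon: float, lat: float) -> int:
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat).

    Hypothetical helper for illustration only. It uses the standard
    6-degree zones and ignores the special zones around Norway and Svalbard.
    """
    # Zones are numbered 1..60, starting at longitude -180.
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx is used for the northern hemisphere, 327xx for the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Outlet point used in the notebook (longitude, latitude in EPSG:4326).
xy = [12.2051, 45.8331]
epsg = utm_epsg_from_lonlat(*xy)
print(epsg)  # 32633, i.e. WGS 84 / UTM zone 33N
```

Comparing this value with `model.crs` after the grid has been created is a quick way to check the assumption.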

In [None]:
# Plot
fig = plt.figure(figsize=(5, 6))
ax = plt.subplot()
# grid mask
model.grid.data["mask"].plot(ax=ax)
# grid vector cells using hydromt.raster.vector_grid method
model.grid.data["mask"].raster.vector_grid().boundary.plot(
    ax=ax, color="black", linewidth=0.1
)
# the outlet point we used to derive region
gdf_xy = gpd.GeoDataFrame(geometry=gpd.points_from_xy(x=[xy[0]], y=[xy[1]]), crs=4326)
gdf_xy.to_crs(model.crs).plot(ax=ax, markersize=40, c="red", zorder=2)

Similarly, we can also populate the ``config`` component using the ``config.update`` method. For HydroMT, config represents the configuration of the model kernel, e.g. the file that sets your model kernel run settings, the list of outputs, etc. For most models, this is usually a text file (for example in .yaml, .ini, .toml or .inp format) that can be organized in sections. Within HydroMT, we use a dictionary to represent each header/setting/value.

Let’s populate our config with some simple settings:

In [None]:
config_data = {
    "header": {"setting": "value"},
    "timers": {"start": "2010-02-05", "end": "2010-02-15"},
}

model.config.update(data=config_data)
model.config.data
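
To make the link between the nested dictionary and a sectioned text file concrete, here is a small sketch that uses only the standard library `configparser` module. The INI format is chosen purely for illustration; the notebook does not state which file format ``ExampleModel`` writes, and this is not how HydroMT serializes its config.

```python
import configparser
import io

# Same nested dictionary as in the cell above: section -> {setting: value}.
config_data = {
    "header": {"setting": "value"},
    "timers": {"start": "2010-02-05", "end": "2010-02-15"},
}

# Each top-level key becomes a [section], each inner key a "setting = value" line.
parser = configparser.ConfigParser()
parser.read_dict(config_data)

buffer = io.StringIO()
parser.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)

# Reading the text back recovers the same dictionary structure.
parsed = configparser.ConfigParser()
parsed.read_string(ini_text)
round_trip = {section: dict(parsed[section]) for section in parsed.sections()}
print(round_trip == config_data)
```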

We can now add data to our ``grid`` component with the method ``grid.add_data_from_rasterdataset``. Let’s add both a DEM map from the ``merit_hydro_ihu`` data source and a landuse map from the ``vito_2015`` dataset to our model grid.

In [None]:
model.grid.add_data_from_rasterdataset(
    raster_data="merit_hydro_ihu",
    variables=["elevtn"],
    reproject_method="bilinear",
)
model.grid.add_data_from_rasterdataset(
    raster_data="vito_2015",
    fill_method="nearest",
    reproject_method="mode",
    rename={"vito": "landuse"},
)

In [None]:
# check which maps are read
print(f"model grid: {list(model.grid.data.data_vars)}")

model.grid.data["elevtn"]

In [None]:
# Plot
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))
# Elevation
model.grid.data["elevtn"].raster.mask_nodata().plot(ax=axes[0], cmap="terrain")
gdf_xy = gpd.GeoDataFrame(
    geometry=gpd.points_from_xy(x=[xy[0]], y=[xy[1]]), crs=4326
).to_crs(model.crs)
gdf_xy.plot(ax=axes[0], markersize=40, c="red", zorder=2)
axes[0].set_title("Elevation")

# VITO landuse
df = pd.read_csv("./legends/vito-label-qgis.txt", header=None, index_col=0)
levels = df.index
colors = (df.iloc[:-1, :4] / 255).values
ticklabs = df.iloc[:-1, 4].values
cmap, norm = mpl.colors.from_levels_and_colors(levels, colors)
ticks = np.array(levels[:-1]) + np.diff(levels) / 2.0

model.grid.data["landuse"].plot(
    ax=axes[1], cmap=cmap, norm=norm, cbar_kwargs=dict(ticks=ticks)
)
gdf_xy.plot(ax=axes[1], markersize=40, c="red", zorder=2)
axes[1].set_title("VITO Landuse")

## Model read & write methods

Once our model is filled with data, we can write it to disk using either the general ``write`` method or component-specific ``component.write`` methods. Similarly, our model can be read back with the general ``read`` method or component-specific ones.

Let’s now write our model into a model root folder.

In [None]:
model.write(components=["grid", "config"])

In [None]:
# print MODEL_ROOT folder
import os


def print_dir(root):
    for path, _, files in os.walk(root):
        print(path)
        for name in files:
            if name.endswith(".xml"):
                continue
            print(f" - {name}")


print_dir(root)

And now let’s read it back in a new ``ExampleModel`` instance:

In [None]:
model2 = ExampleModel(root=root, mode="r")
model2.read(components=["config", "grid"])

In [None]:
# check which grid variables were read
print(f"model grid: {list(model2.grid.data.data_vars)}")

## Building / updating a model with Python

Using the same functionality, it is also possible to build or update a model from Python instead of the command line, using the ``build`` and ``update`` methods. Let’s see how we could rebuild our previous ``ExampleModel`` with the ``build`` method.

First, let’s write a HydroMT build workflow file with the ``ExampleModel`` methods (steps) we want to use.

In [None]:
%%writefile tmp_build_example_model_py.yml

steps:
  - config.update:
      data:
        header.setting: value
        timers.end: "2010-02-15"
        timers.start: "2010-02-05"
  - grid.create_from_region:
      region:
        subbasin: [12.2051, 45.8331]
        uparea: 50
      res: 1000
      crs: utm
      hydrography_path: merit_hydro
      basin_index_path: merit_hydro_index
  - grid.add_data_from_rasterdataset:
      raster_data: merit_hydro_ihu
      variables:
        - elevtn
      reproject_method:
        - bilinear
  - grid.add_data_from_rasterdataset:
      raster_data: vito_2015
      fill_method: nearest
      reproject_method: mode
      rename:
        vito: landuse
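
The ``config.update`` step in this workflow file uses dotted keys such as ``timers.start`` where the earlier Python cell used a nested dictionary. The sketch below shows one way such keys can be expanded into nested sections. It assumes that a dot separates nesting levels; the helper `expand_dotted_keys` is hypothetical and is not HydroMT's own implementation.

```python
def expand_dotted_keys(flat: dict) -> dict:
    """Expand keys like "a.b" into nested dictionaries {"a": {"b": ...}}.

    Hypothetical helper for illustration only.
    """
    nested: dict = {}
    for dotted_key, value in flat.items():
        # Everything before the last dot names the parent sections.
        *parents, leaf = dotted_key.split(".")
        level = nested
        for name in parents:
            # Create the section on first use, then descend into it.
            level = level.setdefault(name, {})
        level[leaf] = value
    return nested


flat = {
    "timers.end": "2010-02-15",
    "timers.start": "2010-02-05",
}
nested = expand_dotted_keys(flat)
print(nested)  # {'timers': {'end': '2010-02-15', 'start': '2010-02-05'}}
```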

And now let’s build our model:

In [None]:
from hydromt.readers import read_workflow_yaml

# First we instantiate ExampleModel with the output folder and use the write mode (build from scratch)
root3 = "tmp_example_model_py1"
model3 = ExampleModel(
    root=root3,
    mode="w+",
    data_libs=["artifact_data"],
)

# Read the workflow file
_, _, build_options = read_workflow_yaml("./tmp_build_example_model_py.yml")

# Now let's build it with the steps read from the workflow file
model3.build(steps=build_options)
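
Conceptually, each entry under ``steps`` names a component and one of its methods (``<component>.<method>``) together with the keyword arguments to pass. The toy sketch below mimics that dispatch with plain Python classes. All class names are hypothetical, the step list is assumed to be a list of single-key dictionaries as in the YAML layout above, and the real ``build`` method may do considerably more than this.

```python
class ToyConfig:
    """Stand-in for a config component: stores settings in a dict."""

    def __init__(self):
        self.data = {}

    def update(self, data):
        self.data.update(data)


class ToyGrid:
    """Stand-in for a grid component: only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def create_from_region(self, region, res):
        self.calls.append(("create_from_region", region, res))


class ToyModel:
    """Stand-in for a model made up of named components."""

    def __init__(self):
        self.components = {"config": ToyConfig(), "grid": ToyGrid()}

    def build(self, steps):
        # Run the steps in order; each step is {"component.method": kwargs}.
        for step in steps:
            for name, kwargs in step.items():
                component_name, method_name = name.split(".", 1)
                method = getattr(self.components[component_name], method_name)
                method(**(kwargs or {}))


steps = [
    {"config.update": {"data": {"timers.start": "2010-02-05"}}},
    {
        "grid.create_from_region": {
            "region": {"subbasin": [12.2051, 45.8331], "uparea": 50},
            "res": 1000,
        }
    },
]

toy = ToyModel()
toy.build(steps)
print(toy.components["config"].data)
print(toy.components["grid"].calls)
```

Keeping the steps as an ordered list matters: the grid has to exist before raster data can be added to it, so the order in the workflow file is the order of execution in this sketch.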

In [None]:
print_dir(root3)

And check that the results are similar to our one-by-one setup earlier:

In [None]:
model3.config.data

In [None]:
# Plot
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))
# Elevation
model3.grid.data["elevtn"].raster.mask_nodata().plot(ax=axes[0], cmap="terrain")
gdf_xy = gpd.GeoDataFrame(
    geometry=gpd.points_from_xy(x=[xy[0]], y=[xy[1]]), crs=4326
).to_crs(model3.crs)
gdf_xy.plot(ax=axes[0], markersize=40, c="red", zorder=2)
axes[0].set_title("Elevation")

# VITO landuse
df = pd.read_csv("./legends/vito-label-qgis.txt", header=None, index_col=0)
levels = df.index
colors = (df.iloc[:-1, :4] / 255).values
ticklabs = df.iloc[:-1, 4].values
cmap, norm = mpl.colors.from_levels_and_colors(levels, colors)
ticks = np.array(levels[:-1]) + np.diff(levels) / 2.0

model3.grid.data["landuse"].plot(
    ax=axes[1], cmap=cmap, norm=norm, cbar_kwargs=dict(ticks=ticks)
)
gdf_xy.plot(ax=axes[1], markersize=40, c="red", zorder=2)
axes[1].set_title("VITO Landuse")