Welcome to the HexWatershed tutorial notebook!

Before running this notebook, we recommend that you are already familiar with the PyFlowline tutorial available at PyFlowline Tutorial.

This tutorial serves as an example of the HexWatershed application using a DGGRID mesh.

For comprehensive documentation of HexWatershed, please visit the HexWatershed Documentation.

For additional information on this application and the DGGRID mesh, please refer to the following publication:

Liao, C., Engwirda, D., Cooper, M., Li, M., and Fang, Y.: Discrete Global Grid System-based Flow Routing Datasets in the Amazon and Yukon Basins, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2023-398, in review, 2024.

If you are running this notebook directly from the Binder platform, then all the dependencies are already installed. Otherwise, you must install the HexWatershed package and its dependencies. Additionally, visualization requires optional dependency packages (refer to the full documentation installation section).

Feel free to modify the notebook to use a different visualization method as needed. Enjoy exploring HexWatershed!

## Preliminaries

First, let's load some Python libraries.

In [2]:
import os
import sys
import json
from pathlib import Path
import shutil
from os.path import realpath
import importlib.util
from shutil import copy2
from datetime import date
import geopandas as gpd
import matplotlib.pyplot as plt

Next, check that the HexWatershed Python package is installed.

In [3]:
#check hexwatershed installation
iFlag_hexwatershed = importlib.util.find_spec("pyhexwatershed")
if iFlag_hexwatershed is not None:
    print('The hexwatershed package is installed.')
else:
    print('The hexwatershed package is not installed.')

/Users/liao313/workspace/python/hexwatershed_tutorial/notebooks/dggrid
/Users/liao313/workspace/python/hexwatershed_tutorial
['/Users/liao313/workspace/python/pyhexwatershed', '/Users/liao313/workspace/python/pyflowline', '/Users/liao313/workspace/python/pyearth', '/Users/liao313/workspace/python/hexwatershed_tutorial/notebooks/dggrid', '/opt/miniconda3/envs/hexwatershed/lib/python312.zip', '/opt/miniconda3/envs/hexwatershed/lib/python3.12', '/opt/miniconda3/envs/hexwatershed/lib/python3.12/lib-dynload', '', '/opt/miniconda3/envs/hexwatershed/lib/python3.12/site-packages', '/Users/liao313/workspace/python/pyhexwatershed', '/Users/liao313/workspace/python/pyflowline', '/Users/liao313/workspace/python/pyearth']


## Setup binary path

In this tutorial, we require two binary programs (dggrid and hexwatershed) to be available in the system path.
Since we are running in Binder JupyterLab, "/home/jovyan/" is used (see the MyBinder documentation).
Tip: If you are running this notebook on your local machine, set this path to the folder where you installed the binaries.

In [4]:

#add compiled binary into the system path or other preferred location
os.environ["PATH"] += os.pathsep + "/home/jovyan/"
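
Before moving on, it can help to confirm that both binaries are actually discoverable. The following is a minimal sketch using only the standard library. The executable names `dggrid` and `hexwatershed` are assumed from the program names mentioned above, and `find_missing_binaries` is a hypothetical helper, not part of pyhexwatershed.

```python
import shutil

def find_missing_binaries(aBinary_names):
    """Return the names that cannot be located on the current PATH."""
    return [sName for sName in aBinary_names if shutil.which(sName) is None]

# executable names are assumptions; adjust them to match your installation
aMissing = find_missing_binaries(['dggrid', 'hexwatershed'])
if aMissing:
    print('Not found on PATH:', ', '.join(aMissing))
else:
    print('Both binaries were found on PATH.')
```

If a name is reported as missing, check the folder appended to PATH in the cell above.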

Now we can import two functions that are used to set up the model configurations.

In [5]:

from pyhexwatershed.configuration.change_json_key_value import change_json_key_value #this function is used to change the value of a key in a json file
from pyhexwatershed.configuration.read_configuration_file import pyhexwatershed_read_configuration_file #this function is used to read the model configuration file

## Data Preparation

Firstly, we'll download a sample dataset from the HexWatershed_data repository. This dataset primarily comprises three spatial datasets: (1) a watershed boundary; (2) a river network; and (3) a raster DEM file.

In [27]:
sPath_notebook = Path().resolve()
print(sPath_notebook)
sPath_parent = str(Path().resolve().parents[1])
print(f"Parent path: {sPath_parent}")

sWorkspace_data = os.path.join( sPath_parent ,  'data', 'yukon' )
if not os.path.exists(sWorkspace_data):
    print(sWorkspace_data)
    os.makedirs(sWorkspace_data)

sWorkspace_input =  os.path.join( sWorkspace_data ,  'input')
if not os.path.exists(sWorkspace_input):
    print(sWorkspace_input)
    os.makedirs(sWorkspace_input)

sWorkspace_output = sWorkspace_data + '/output/' #this is where the output will be stored
if not os.path.exists(sWorkspace_output):
    print(sWorkspace_output)
    os.makedirs(sWorkspace_output)
#create a temp folder to download data
sPath_temp = os.path.join( sPath_parent ,  'data', 'tmp' )
if not os.path.exists(sPath_temp):
    print(sPath_temp)
    os.makedirs(sPath_temp)
else:
    shutil.rmtree(sPath_temp)

# specify the repository's URL
hexwatershed_data_repo = 'https://github.com/changliao1025/hexwatershed_data.git'
# clone the repository
os.system(f'git clone {hexwatershed_data_repo} {sPath_temp}')
sPath_temp_data = os.path.join( sPath_parent ,  'data', 'tmp', 'data','yukon', 'input' )

#copy all the files under the temp data folder using shutil
#check if the destination directory exists, if exists, remove it
if os.path.exists(sWorkspace_input):
    shutil.rmtree(sWorkspace_input)

shutil.copytree(sPath_temp_data, sWorkspace_input)

shutil.rmtree(sPath_temp_data)


/Users/liao313/workspace/python/hexwatershed_tutorial
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon


Cloning into '/Users/liao313/workspace/python/hexwatershed_tutorial/data/tmp'...
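
The cell above repeats the same check-then-create pattern for each folder. As a possible simplification, the sketch below wraps it in one helper built on `pathlib`. `ensure_directory` is a hypothetical name introduced here for illustration, and the demonstration runs in a temporary folder so it does not touch the tutorial data.

```python
import tempfile
from pathlib import Path

def ensure_directory(sPath):
    """Create sPath (and any missing parents) if needed and return it as a Path."""
    pPath = Path(sPath)
    pPath.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return pPath

# demonstration in a throwaway location, mirroring the input/output layout above
with tempfile.TemporaryDirectory() as sFolder:
    pInput = ensure_directory(Path(sFolder) / 'data' / 'yukon' / 'input')
    pOutput = ensure_directory(Path(sFolder) / 'data' / 'yukon' / 'output')
    bCreated = pInput.is_dir() and pOutput.is_dir()
print(bCreated)
```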


## Configuration file

The dataset also includes an example configuration file in JSON format, which HexWatershed uses as its model configuration, along with a second JSON file that configures the basins.

In [None]:

#an example configuration file is provided in the input folder
sFilename_configuration_in = realpath( sWorkspace_input +  '/pyhexwatershed_yukon_dggrid.json' )
sFilename_basins_in = realpath( sWorkspace_input +  '/pyflowline_yukon_basins.json' )
#report whichever of the two configuration files is missing
for sFilename in (sFilename_configuration_in, sFilename_basins_in):
    if not os.path.isfile(sFilename):
        print('This configuration file does not exist: ', sFilename )

print('Finished the data preparation step.')

Let's take a look at the configuration files.
You will notice that the structure and content of the configuration file are similar to those of the PyFlowline configuration file.

In [11]:

#we can take a look at the content of this json file
with open(sFilename_configuration_in, 'r') as pJSON:
    parsed = json.load(pJSON)
    print(json.dumps(parsed, indent=4))

{
    "sFilename_model_configuration": "/qfs/people/liao313/workspace/python/pyhexwatershed/tests/configurations/pyhexwatershed_yukon_mpas.json",
    "sModel": "pyhexwatershed",
    "sRegion": "yukon",
    "sWorkspace_input": "/qfs/people/liao313/workspace/python/pyhexwatershed_icom/data/yukon/input",
    "sWorkspace_output": "/compyfs/liao313/04model/pyhexwatershed/yukon",
    "sJob": "hexwatershed",
    "iFlag_create_mesh": 1,
    "iFlag_mesh_boundary": 1,
    "iFlag_save_mesh": 1,
    "iFlag_simplification": 1,
    "iFlag_intersect": 1,
    "iFlag_resample_method": 2,
    "iFlag_flowline": 1,
    "iFlag_global": 0,
    "iFlag_multiple_outlet": 0,
    "iFlag_use_mesh_dem": 1,
    "iFlag_elevation_profile": 1,
    "iFlag_rotation": 0,
    "iFlag_stream_burning_topology": 1,
    "iFlag_save_elevation": 1,
    "iFlag_dggrid": 1,
    "iResolution_index": 1,
    "iCase_index": 1,
    "iMesh_type": 4,
    "nOutlet": 1,
    "dMissing_value_dem": -9999,
    "dBreach_threshold": 10,
    "dAcc

### Set up case

Now we can set up some keywords/parameters.
Tip: they are not written into the configuration file yet.

In [8]:
#set up some parameters
sMesh_type = 'dggrid' #the dggrid mesh type supported by hexwatershed
sDggrid_type = 'ISEA3H' #a type of dggrid mesh
iCase_index = 1 #a case index for bookmark
iResolution_index = 10 #dggrid resolution index, see dggrid documentation for details.
iFlag_stream_burning_topology = 1 #see hexwatershed documentation for details
iFlag_use_mesh_dem = 0
iFlag_elevation_profile = 0 #reserved for future use

today = date.today()
iYear = today.year
iMonth = today.month
iDay = today.day
print("Today's date:", iYear, iMonth, iDay)
sDate = str(iYear) + str(iMonth).zfill(2) + str(iDay).zfill(2) #the date is also a bookmark to label a simulation


Today's date: 2024 4 16
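
The date label built above with `zfill` can also be produced in a single call. This sketch shows the equivalent `strftime` form; `format_date_label` is a hypothetical helper name used only for illustration.

```python
from datetime import date

def format_date_label(pDate):
    """Return a YYYYMMDD label, e.g. for tagging a simulation."""
    return pDate.strftime('%Y%m%d')

# the date shown in the output above
print(format_date_label(date(2024, 4, 16)))  # prints 20240416
```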


For a DGGRID mesh, sDggrid_type and iResolution_index together determine the mesh resolution. See the DGGRID documentation for details.
We can check the actual resolution using a pyflowline function.

[DGGRID mesh](../../figures/dggrid/dggrid_mesh.png)

In [9]:
#use this function from pyflowline to check the actual spatial resolution
from pyflowline.mesh.dggrid.create_dggrid_mesh import dggrid_find_resolution_by_index
dResolution = dggrid_find_resolution_by_index(sDggrid_type, iResolution_index)
print(dResolution) #unit is meter

31759.6
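
To get a feel for how the resolution index relates to cell size, note that ISEA3H is an aperture-3 grid: cell area shrinks by a factor of about 3 with each resolution step, so a characteristic cell spacing shrinks by about the square root of 3. The sketch below extrapolates from the value printed above using that rule of thumb. `approximate_isea3h_resolution` is a hypothetical helper, not a pyflowline function, and its results are approximations only; use `dggrid_find_resolution_by_index` for the values the model actually uses.

```python
import math

dResolution_reference = 31759.6  # meters, the value reported above for index 10
iResolution_reference = 10

def approximate_isea3h_resolution(iResolution_index):
    """Rough cell spacing in meters, scaled by sqrt(3) per resolution step."""
    iStep = iResolution_reference - iResolution_index
    return dResolution_reference * math.sqrt(3.0) ** iStep

# coarser indices give larger cells, finer indices give smaller cells
for iIndex in (8, 9, 10, 11):
    print(iIndex, round(approximate_isea3h_resolution(iIndex), 1))
```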


[ISEA3H resolution](../../figures/dggrid/isea3h.png)

### Update configuration file

Now, let's update the configuration file with the parameters we set up earlier.
Tip: the following script actually makes a copy of the original configuration file and then modifies the copied file. This is useful if you run multiple simulations simultaneously.

In [19]:
#we want to copy the example configuration file to the output directory
sFilename_configuration_copy= os.path.join( sWorkspace_output, 'pyhexwatershed_configuration_copy.json' )
copy2(sFilename_configuration_in, sFilename_configuration_copy)

#copy the basin configuration file to the output directory as well
sFilename_configuration_basins_copy = os.path.join( sWorkspace_output, 'pyhexwatershed_configuration_basins_copy.json' )
copy2(sFilename_basins_in, sFilename_configuration_basins_copy)

#now switch to the copied configuration file for modification
sFilename_configuration = sFilename_configuration_copy
sFilename_basins = sFilename_configuration_basins_copy

change_json_key_value(sFilename_configuration, 'sWorkspace_output', sWorkspace_output) #output folder
change_json_key_value(sFilename_configuration, 'sFilename_basins', sFilename_basins) #basin configuration file

#change the boundary file
sFilename_mesh_boundary = realpath(os.path.join(sWorkspace_input, 'boundary.geojson'))
change_json_key_value(sFilename_configuration, 'sFilename_mesh_boundary', sFilename_mesh_boundary)
#change the dem file
sFilename_dem = realpath(os.path.join(sWorkspace_input, 'dem.tif'))
change_json_key_value(sFilename_configuration, 'sFilename_dem', sFilename_dem)

/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/input/pyhexwatershed_yukon_dggrid.json
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/
HexWatershed compset is being initialized
The model will use the user provided binary file
The user provided binary file does not exist. The model will use the default binary file
Directory /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002 created successfully
The dggrid binary file does not exist, you need to update this parameter before running the model!
The DEM file does not exist in pyflowline!
The filtered flowline file does not exist!
The filtered flowline file does not exist!
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed_configuration_copy.json
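
Conceptually, each `change_json_key_value` call above amounts to a read-modify-write cycle on the JSON file. The sketch below illustrates that idea with the standard library only. It is not the pyhexwatershed implementation, `update_json_key` is a hypothetical name, and it handles top-level keys only.

```python
import json
import os
import tempfile

def update_json_key(sFilename_json, sKey, value):
    """Read a JSON file, set one top-level key, and write the file back."""
    with open(sFilename_json, 'r') as pFile:
        dConfig = json.load(pFile)
    dConfig[sKey] = value
    with open(sFilename_json, 'w') as pFile:
        json.dump(dConfig, pFile, indent=4)

# demonstration on a throwaway file so the tutorial configuration is untouched
with tempfile.TemporaryDirectory() as sFolder:
    sFilename_demo = os.path.join(sFolder, 'demo.json')
    with open(sFilename_demo, 'w') as pFile:
        json.dump({'sWorkspace_output': 'old'}, pFile)
    update_json_key(sFilename_demo, 'sWorkspace_output', 'new')
    with open(sFilename_demo, 'r') as pFile:
        dResult = json.load(pFile)
print(dResult)
```

Because every call rewrites the file, working on a copy (as the cell above does) keeps the original example configuration intact.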


## Create HexWatershed case object

We can now call the read-configuration function again to re-create the HexWatershed object.

In [21]:
#the read function accepts several keyword arguments that can be used to change the default parameters.

oPyhexwatershed = pyhexwatershed_read_configuration_file(sFilename_configuration,
                    iCase_index_in=iCase_index,iFlag_stream_burning_topology_in=iFlag_stream_burning_topology,
                    iFlag_use_mesh_dem_in=iFlag_use_mesh_dem,
                    iFlag_elevation_profile_in=iFlag_elevation_profile,
                    iResolution_index_in = iResolution_index,
                    sDggrid_type_in=sDggrid_type,
                    sDate_in= sDate, sMesh_type_in = sMesh_type)

/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/
HexWatershed compset is being initialized
The model will use the user provided binary file
The user provided binary file does not exist. The model will use the default binary file
Directory /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002 created successfully
The dggrid binary file does not exist, you need to update this parameter before running the model!
The filtered flowline file does not exist!
The filtered flowline file does not exist!


Besides editing the JSON file with the change_json_key_value function, we can also change some parameters on the fly.
This is useful for testing different parameters without changing the JSON file, especially for individual basins.

In [35]:
#we also need to set the outlet location for the only basin
dLongitude_outlet_degree= -164.47594
dLatitude_outlet_degree= 63.04269
oPyhexwatershed.pPyFlowline.aBasin[0].dThreshold_small_river = dResolution * 5
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('dLongitude_outlet_degree', dLongitude_outlet_degree, iFlag_basin_in= 1)
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('dLatitude_outlet_degree', dLatitude_outlet_degree, iFlag_basin_in= 1)
#remember to update the flowline file
sFilename_flowline = realpath(os.path.join(sWorkspace_input, 'dggrid10/river_networks.geojson') )
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('sFilename_flowline_filter', sFilename_flowline, iFlag_basin_in= 1)
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('iFlag_debug', 0, iFlag_basin_in= 1)

You can check the settings for the single basin as well.

In [24]:
print(oPyhexwatershed.aBasin[0].tojson())

{
    "dAccumulation_threshold": 100000.0,
    "dLatitude_outlet_degree": 0.17099,
    "dLongitude_outlet_degree": -50.71465,
    "dThreshold_small_river": 10000.0,
    "iFlag_correct_flowline_direction": 0,
    "iFlag_dam": 0,
    "iFlag_debug": 1,
    "iFlag_disconnected": 0,
    "iFlag_remove_low_order_river": 0,
    "iFlag_remove_small_river": 0,
    "iMesh_type": 1,
    "lBasinID": 1,
    "lCellID_outlet": -1,
    "sBasinID": "00000001",
    "sFilename_area_of_difference": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/area_of_difference.geojson",
    "sFilename_basin_info": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/basin_info.json",
    "sFilename_confluence_conceptual_info": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/confluence_conceptual_info.json",
 

After the HexWatershed object is re-created, we can set up the model.
This step sets up both the pyflowline and hexwatershed models.
You might receive warnings that some paths are unavailable; this is normal as long as the key inputs are present.

In [36]:
oPyhexwatershed.iFlag_user_provided_binary = 0 #binaries are provided in the system path
oPyhexwatershed.pPyFlowline.iFlag_user_provided_binary = 0
oPyhexwatershed.pyhexwatershed_setup()

Started setting up model
Basin 00000001: initial flowline: /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/input/dggrid10/river_networks.geojson
Found binary at: /Users/liao313/workspace/python/hexwatershed_tutorial/bin/dggrid
Found binary at: /Users/liao313/workspace/python/hexwatershed_tutorial/bin/hexwatershed
Elapsed time: 0.7900 seconds


## Run PyFlowline submodule

Now let's run the pyflowline submodel.

In [37]:
#run step 1
aCell_origin = oPyhexwatershed.pyhexwatershed_run_pyflowline();

Started running pyflowline
Start flowline simplification: 00000001
Basin  00000001  has no dam
Basin  00000001 find flowline vertex
Elapsed time: 0.0041 seconds
Basin  00000001 split flowline
Elapsed time: 0.3807 seconds
Basin  00000001 started correction flow direction
Elapsed time: 0.0264 seconds
-164.46458333333362 63.03124999999984
Basin  00000001 started loop removal
Elapsed time: 0.0069 seconds
Basin  00000001 started update stream order initial
Elapsed time: 0.0078 seconds
Basin  00000001 find flowline confluence
Elapsed time: 0.0004 seconds
Basin  00000001 started stream segment definition
Elapsed time: 0.0000 seconds
Basin  00000001 started confluence definition
Elapsed time: 0.0001 seconds
Basin  00000001 started stream topology definition
Elapsed time: 0.0000 seconds
Basin  00000001 started stream order definition
Elapsed time: 0.0018 seconds
Finish flowline simplification: 00000001
Start mesh generation.
GTiff pDriver IS available.

Resolution is:  31759.6
./run_dggrid.sh
*

## Elevation assignment

After the pyflowline simulation is finished, we can assign elevation information to the mesh cells. 

In [38]:

oPyhexwatershed.pyhexwatershed_assign_elevation_to_cells(dMissing_value_in=-9999);


Started assigning elevation
GTiff pDriver IS available.

Elapsed time: 1.6194 seconds
Started updating outlet
Elapsed time: 0.0003 seconds
PyFlowline started export results
Elapsed time: 0.3954 seconds
Started running hexwatershed
Running on a Unix-based system
Start to run HexWatershed model!
The following arguments are provided:
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/configuration.json
The configuration file is:/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/configuration.json
Finished set up  model at Tue Apr 16 19:46:26 2024

Finished reading data at Tue Apr 16 19:46:26 2024

Finished initialization at Tue Apr 16 19:46:26 2024

This is a local simulation.
This is a local simulation with only one outlet.
 You have assigned the correct outlet mesh ID!
Finished depression filling at Tue Apr 16 19:46:26 2024

Finished flow direction at Tue Apr 16 19:46:26 2024

Finished flow acc

Once the elevation information is added, the outlet cell may turn out to have no valid elevation. In that case, the model attempts to search along the river network automatically for the next potential outlet.

In [None]:
aCell_new = oPyhexwatershed.pyhexwatershed_update_outlet(aCell_origin)
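
To make the idea of searching for the next potential outlet concrete, here is a purely conceptual sketch. It is not the algorithm used by `pyhexwatershed_update_outlet`, whose details are not shown in this tutorial. The data layout (a list of cells ordered from the original outlet upstream, each a dictionary with hypothetical `lCellID` and `dElevation` keys) and the helper `find_valid_outlet` are assumptions made for illustration.

```python
def find_valid_outlet(aCell_along_river, dMissing_value=-9999):
    """Return the ID of the first cell with a valid elevation, or None.

    aCell_along_river is assumed to be ordered from the original outlet
    moving upstream along the river network.
    """
    for dCell in aCell_along_river:
        if dCell['dElevation'] != dMissing_value:
            return dCell['lCellID']
    return None

# the first two cells carry the missing value, so the third becomes the outlet
aCell_demo = [
    {'lCellID': 1, 'dElevation': -9999},
    {'lCellID': 2, 'dElevation': -9999},
    {'lCellID': 3, 'dElevation': 12.5},
]
print(find_valid_outlet(aCell_demo))  # prints 3
```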



After confirming the new outlet, we can export the PyFlowline results, which are necessary inputs for the HexWatershed model.

In [None]:
oPyhexwatershed.pPyFlowline.pyflowline_export()
oPyhexwatershed.pyhexwatershed_export_config_to_json() #this configuration has all the information needed for the hexwatershed binary to run

## Run HexWatershed

Now we can run the HexWatershed model. This step may take a while depending on the size of the watershed and the resolution of the mesh.
It calls the internal hexwatershed binary to run the simulation using the above 'new' configuration file. You will find this file in the output directory.

In [None]:
oPyhexwatershed.pyhexwatershed_run_hexwatershed();

We can now export the HexWatershed results. In the output directory, you will find a subfolder for each basin containing their respective results.

In [None]:

oPyhexwatershed.pyhexwatershed_export()
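
To see which basin subfolders were written, you can list the directories under the case output folder. The sketch below assumes the layout suggested by the basin file paths printed earlier (`<case folder>/hexwatershed/<basin ID>/`). `list_basin_folders` is a hypothetical helper, and the demonstration uses a temporary folder that mimics this layout rather than real model output.

```python
import tempfile
from pathlib import Path

def list_basin_folders(sWorkspace_case):
    """Return sorted names of the subfolders under <case>/hexwatershed."""
    pRoot = Path(sWorkspace_case) / 'hexwatershed'
    if not pRoot.is_dir():
        return []
    return sorted(p.name for p in pRoot.iterdir() if p.is_dir())

# mimic the assumed layout with a single basin folder
with tempfile.TemporaryDirectory() as sFolder:
    (Path(sFolder) / 'hexwatershed' / '00000001').mkdir(parents=True)
    aBasin_folder = list_basin_folders(sFolder)
print(aBasin_folder)  # prints ['00000001']
```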


## Visualization

Now we can visualize the results for individual variables.

In [31]:
#plot the mesh (blue) and the flow direction (red) of the first basin together

file1_path = oPyhexwatershed.sFilename_mesh
file2_path = oPyhexwatershed.aBasin[0].sFilename_flow_direction
gdf1 = gpd.read_file(file1_path)
gdf2 = gpd.read_file(file2_path)
fig, ax = plt.subplots()
gdf1.plot(ax=ax, color='blue')
gdf2.plot(ax=ax, color='red')
plt.show()

In [None]:
oPyhexwatershed.plot( sVariable_in = 'elevation',dData_min_in=0, iFlag_colorbar_in=1)

In [None]:
oPyhexwatershed.plot( sVariable_in = 'flow_direction')

In [None]:
oPyhexwatershed.plot( sVariable_in = 'drainage_area',dData_min_in=0 , iFlag_colorbar_in=1)

In [None]:
oPyhexwatershed.plot( sVariable_in = 'travel_distance',dData_min_in=0, iFlag_colorbar_in=1)

## Congratulations! You have successfully finished a HexWatershed simulation.