This notebook is a HexWatershed tutorial.
It walks through an example HexWatershed application that uses a DGGRID mesh.

The following publication includes a comprehensive application:

Liao, C., Engwirda, D., Cooper, M., Li, M., and Fang, Y.: Discrete Global Grid System-based Flow Routing Datasets in the Amazon and Yukon Basins, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2023-398, in review, 2024.

The full documentation of HexWatershed is hosted at: https://hexwatershed.readthedocs.io

To run this notebook, you must install the HexWatershed package and its dependencies. 
In addition, the visualization requires optional dependency packages (see the installation section of the full documentation).
You can also modify the notebook to use a different visualization method.


First, let's load some Python libraries.

In [2]:
#step 1: load some basic libraries for various operations
import os
import sys
import json
from pathlib import Path
import shutil
from os.path import realpath
import importlib.util
from shutil import copy2
from datetime import date


If any dependency is missing, please install it using conda.
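
Before installing anything, it may help to see which packages are actually missing. The following is a minimal sketch using only the standard library; the helper name find_missing_packages and the package list in the usage lines are illustrative and are not part of HexWatershed.

```python
import importlib.util

def find_missing_packages(aPackage):
    """Return the names in aPackage that cannot be imported in this environment."""
    aMissing = []
    for sPackage in aPackage:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(sPackage) is None:
            aMissing.append(sPackage)
    return aMissing

# illustrative list; adjust it to the packages you need
aMissing_demo = find_missing_packages(["pyhexwatershed", "pyflowline", "pyearth"])
print("Missing packages:", aMissing_demo)
```

An empty list means every listed package can be found by the current Python interpreter.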

In [3]:
#resolve the notebook folder and the tutorial root folder (two levels up)
sPath_notebook = Path().resolve()
print(sPath_notebook)
sPath_parent = str(Path().resolve().parents[1]) 
print(sPath_parent)

#check hexwatershed installation
iFlag_hexwatershed = importlib.util.find_spec("pyhexwatershed") 
if iFlag_hexwatershed is not None:
    print('The hexwatershed package is installed.')
else:
    print('The hexwatershed package is not installed.')




/Users/liao313/workspace/python/hexwatershed_tutorial/notebooks/dggrid
/Users/liao313/workspace/python/hexwatershed_tutorial
['/Users/liao313/workspace/python/pyhexwatershed', '/Users/liao313/workspace/python/pyflowline', '/Users/liao313/workspace/python/pyearth', '/Users/liao313/workspace/python/hexwatershed_tutorial/notebooks/dggrid', '/opt/miniconda3/envs/hexwatershed/lib/python312.zip', '/opt/miniconda3/envs/hexwatershed/lib/python3.12', '/opt/miniconda3/envs/hexwatershed/lib/python3.12/lib-dynload', '', '/opt/miniconda3/envs/hexwatershed/lib/python3.12/site-packages', '/Users/liao313/workspace/python/pyhexwatershed', '/Users/liao313/workspace/python/pyflowline', '/Users/liao313/workspace/python/pyearth']


In [4]:

#add the folder that contains the compiled binary to the system path (adjust the folder to your system)
import os
os.environ["PATH"] += os.pathsep + "/home/jovyan/"
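
Re-running the cell above appends the same folder to PATH again each time. As a minimal sketch, the hypothetical helper below (append_to_path is not part of HexWatershed) appends a folder only once, and shutil.which from the standard library can then confirm whether an executable is visible. The binary name dggrid is taken from the setup output later in this notebook.

```python
import os
import shutil

def append_to_path(sDirectory, pEnv=None):
    """Append sDirectory to PATH once and return the updated PATH string."""
    pEnv = os.environ if pEnv is None else pEnv
    aDirectory = [s for s in pEnv.get("PATH", "").split(os.pathsep) if s]
    if sDirectory not in aDirectory:
        aDirectory.append(sDirectory)
    pEnv["PATH"] = os.pathsep.join(aDirectory)
    return pEnv["PATH"]

# demonstrate on a plain dictionary so the real environment is untouched
pEnv_demo = {"PATH": os.pathsep.join(["/usr/bin", "/bin"])}
append_to_path("/home/jovyan/", pEnv_demo)
append_to_path("/home/jovyan/", pEnv_demo)  # the second call adds nothing
print(pEnv_demo["PATH"])

# shutil.which returns the full path of an executable, or None if it is not found
print(shutil.which("dggrid"))
```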

Now we can import functions within hexwatershed.

In [5]:
#step 3
#load the read configuration function
from pyhexwatershed.change_json_key_value import change_json_key_value
from pyhexwatershed.pyhexwatershed_read_model_configuration_file import pyhexwatershed_read_model_configuration_file

HexWatershed uses a JSON file for configuration; an example JSON file is provided.
The next cell downloads the example data and checks whether the configuration file exists.

In [27]:
print(sPath_parent)
sWorkspace_data = os.path.join( sPath_parent ,  'data', 'yukon' )
if not os.path.exists(sWorkspace_data):
    print(sWorkspace_data)
    os.makedirs(sWorkspace_data)

sWorkspace_input =  os.path.join( sWorkspace_data ,  'input')
if not os.path.exists(sWorkspace_input):
    print(sWorkspace_input)
    os.makedirs(sWorkspace_input)

sPath_bin = os.path.join( sPath_parent ,  'bin' )
if not os.path.exists(sPath_bin):
    print(sPath_bin)
    os.makedirs(sPath_bin)
os.environ["PATH"] += os.pathsep + sPath_bin

#create a temp folder to download data
sPath_temp = os.path.join( sPath_parent ,  'data', 'tmp' )
if not os.path.exists(sPath_temp):
    print(sPath_temp)
    os.makedirs(sPath_temp)
else:
    shutil.rmtree(sPath_temp)

# specify the repository's URL
hexwatershed_data_repo = 'https://github.com/changliao1025/hexwatershed_data.git'
# clone the repository
os.system(f'git clone {hexwatershed_data_repo} {sPath_temp}')
sPath_temp_data = os.path.join( sPath_parent ,  'data', 'tmp', 'data','yukon', 'input' )

#copy all the files under the temp data folder using shutil
# check if the destination directory exists
if os.path.exists(sWorkspace_input):
    # if it does, remove it
    shutil.rmtree(sWorkspace_input)
shutil.copytree(sPath_temp_data, sWorkspace_input)

#an example configuration file is provided in the input folder
sFilename_configuration_in = realpath( sWorkspace_input +  '/pyhexwatershed_yukon_dggrid.json' )
sFilename_basins_in = realpath( sWorkspace_input +  '/pyflowline_yukon_basins.json' )
if not os.path.isfile(sFilename_configuration_in):
    print('This configuration does not exist: ', sFilename_configuration_in )

print('Finished the data preparation step.')

/Users/liao313/workspace/python/hexwatershed_tutorial
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon


Cloning into '/Users/liao313/workspace/python/hexwatershed_tutorial/data/tmp'...
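
The data preparation cell removes the input folder before copying the downloaded files, so a failed clone would leave the workspace without any input. A minimal sketch of a safer copy step follows, assuming a hypothetical helper named replace_directory (not part of HexWatershed) that checks the source folder first. The demonstration uses temporary folders only.

```python
import os
import shutil
import tempfile

def replace_directory(sFolder_source, sFolder_target):
    """Copy sFolder_source to sFolder_target, removing any existing target first."""
    # check the source before deleting anything
    if not os.path.isdir(sFolder_source):
        raise FileNotFoundError("Source folder does not exist: " + sFolder_source)
    if os.path.exists(sFolder_target):
        shutil.rmtree(sFolder_target)
    shutil.copytree(sFolder_source, sFolder_target)
    return sorted(os.listdir(sFolder_target))

# self-contained demonstration with temporary folders
with tempfile.TemporaryDirectory() as sFolder_tmp:
    sSource = os.path.join(sFolder_tmp, "source")
    sTarget = os.path.join(sFolder_tmp, "target")
    os.makedirs(sSource)
    os.makedirs(sTarget)
    with open(os.path.join(sSource, "example.json"), "w") as pFile:
        pFile.write("{}")
    with open(os.path.join(sTarget, "stale.txt"), "w") as pFile:
        pFile.write("old")
    aFile_demo = replace_directory(sSource, sTarget)
    print(aFile_demo)
```

In the notebook, the same call could take sPath_temp_data as the source and sWorkspace_input as the target.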


In [11]:

#we can take a look at the content of this json file
with open(sFilename_configuration_in, 'r') as pJSON:
    parsed = json.load(pJSON)
    print(json.dumps(parsed, indent=4))

{
    "sFilename_model_configuration": "/qfs/people/liao313/workspace/python/pyhexwatershed/tests/configurations/pyhexwatershed_yukon_mpas.json",
    "sModel": "pyhexwatershed",
    "sRegion": "yukon",
    "sWorkspace_input": "/qfs/people/liao313/workspace/python/pyhexwatershed_icom/data/yukon/input",
    "sWorkspace_output": "/compyfs/liao313/04model/pyhexwatershed/yukon",
    "sJob": "hexwatershed",
    "iFlag_create_mesh": 1,
    "iFlag_mesh_boundary": 1,
    "iFlag_save_mesh": 1,
    "iFlag_simplification": 1,
    "iFlag_intersect": 1,
    "iFlag_resample_method": 2,
    "iFlag_flowline": 1,
    "iFlag_global": 0,
    "iFlag_multiple_outlet": 0,
    "iFlag_use_mesh_dem": 1,
    "iFlag_elevation_profile": 1,
    "iFlag_rotation": 0,
    "iFlag_stream_burning_topology": 1,
    "iFlag_save_elevation": 1,
    "iFlag_dggrid": 1,
    "iResolution_index": 1,
    "iCase_index": 1,
    "iMesh_type": 4,
    "nOutlet": 1,
    "dMissing_value_dem": -9999,
    "dBreach_threshold": 10,
    "dAcc

The meaning of these JSON keywords is explained in the PyFlowline documentation: https://pyflowline.readthedocs.io/en/latest/data/data.html#inputs
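
The configuration contains many keywords, so it can be convenient to look at one group at a time. The sketch below filters a dictionary by key prefix; the helper select_keys_by_prefix is hypothetical, and the small dictionary reuses a few values from the output above for illustration.

```python
def select_keys_by_prefix(pConfiguration, sPrefix):
    """Return the entries of pConfiguration whose key starts with sPrefix."""
    return {sKey: value for sKey, value in pConfiguration.items()
            if sKey.startswith(sPrefix)}

# a few entries copied from the configuration printed above
pConfiguration_demo = {
    "sRegion": "yukon",
    "iFlag_create_mesh": 1,
    "iFlag_global": 0,
    "iMesh_type": 4,
    "dBreach_threshold": 10,
}
pFlag_demo = select_keys_by_prefix(pConfiguration_demo, "iFlag_")
print(pFlag_demo)
```

With the notebook variables, select_keys_by_prefix(parsed, "iFlag_") would list all the flags in the example configuration.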


Now we can set up some keywords/parameters

In [8]:
#set up some parameters
sMesh_type = 'dggrid' #the dggrid mesh type supported by hexwatershed
sDggrid_type = 'ISEA3H' #a type of dggrid mesh
iCase_index = 1 #a case index used to label the simulation
iResolution_index = 10 #dggrid resolution index, see dggrid documentation for details.
iFlag_stream_burning_topology = 1 #see hexwatershed documentation for details, also see the publication list.
iFlag_use_mesh_dem = 0
iFlag_elevation_profile = 0

today = date.today()
iYear = today.year
iMonth = today.month
iDay = today.day
print("Today's date:", iYear, iMonth, iDay)
sDate = str(iYear) + str(iMonth).zfill(2) + str(iDay).zfill(2) #the date is also used to label a simulation
sWorkspace_output = sWorkspace_data + '/output/' #this is where the output will be stored

Today's date: 2024 4 16
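
The date string can also be built in one step with strftime. The sketch below additionally reproduces the pattern of the output folder names that appear in this notebook (model name, date, and a three-digit case index); the helper build_case_name is hypothetical, and how the library assembles the name internally is an assumption.

```python
from datetime import date

def build_case_name(sModel, pDate, iCase_index):
    """Join the model name, an 8-digit date, and a zero-padded case index."""
    sDate = pDate.strftime("%Y%m%d")
    return sModel + sDate + "{:03d}".format(iCase_index)

sCase_demo = build_case_name("pyhexwatershed", date(2024, 4, 16), 2)
print(sCase_demo)
```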


In [9]:
#use this function from pyflowline to check the actual spatial resolution 
from pyflowline.mesh.dggrid.create_dggrid_mesh import dggrid_find_resolution_by_index
dResolution = dggrid_find_resolution_by_index(sDggrid_type, iResolution_index)
print(dResolution)  

31759.6
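
ISEA3H is an aperture 3 hexagonal grid, so each increase of the resolution index divides the cell area by 3 and the cell spacing by about the square root of 3. The sketch below uses that ratio to estimate the spacing at a neighboring index from the value printed above. It is only a rough estimate under that assumption, the helper estimate_resolution is hypothetical, and dggrid_find_resolution_by_index remains the source of the values the model actually uses.

```python
def estimate_resolution(dResolution_known, iIndex_known, iIndex_target):
    """Scale a known cell spacing to another ISEA3H resolution index.

    Assumes the spacing shrinks by sqrt(3) per index step (aperture 3).
    """
    nStep = iIndex_target - iIndex_known
    return dResolution_known / (3.0 ** (nStep / 2.0))

# spacing printed above for resolution index 10, in meters
dResolution_10 = 31759.6
for iIndex in [9, 10, 11, 12]:
    print(iIndex, round(estimate_resolution(dResolution_10, 10, iIndex), 1))
```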


In [19]:
#create a temporary hexwatershed object, later on we will modify several parameters
print(sFilename_configuration_in)
oPyhexwatershed = pyhexwatershed_read_model_configuration_file(sFilename_configuration_in,
                    iCase_index_in=iCase_index,iFlag_stream_burning_topology_in=iFlag_stream_burning_topology,
                    iFlag_use_mesh_dem_in=0,
                    iFlag_elevation_profile_in=0,
                    iResolution_index_in = iResolution_index, 
                    sDggrid_type_in=sDggrid_type,
                    sDate_in = sDate, sMesh_type_in= sMesh_type, 
                    sWorkspace_output_in = sWorkspace_output)  

#first, we want to change the output directory, even if the json file might be correct, we change it anyway
sWorkspace_output_old = oPyhexwatershed.sWorkspace_output
#we will copy the example configuration files first, so we won't modify the original files
sFilename_configuration_copy= os.path.join( sWorkspace_output, 'pyhexwatershed_configuration_copy.json' )
#copy the main configuration file to the output directory
copy2(sFilename_configuration_in, sFilename_configuration_copy)
#copy the basin configuration file to the output directory as well

sFilename_configuration_basins_copy = os.path.join( sWorkspace_output, 'pyhexwatershed_configuration_basins_copy.json' )    
copy2(sFilename_basins_in, sFilename_configuration_basins_copy)

#now we can modify these two configuration files without worrying about the original files
sFilename_configuration = sFilename_configuration_copy
print(sFilename_configuration)
sFilename_basins = sFilename_configuration_basins_copy
change_json_key_value(sFilename_configuration, 'sWorkspace_output', sWorkspace_output) #output folder
change_json_key_value(sFilename_configuration, 'sFilename_basins', sFilename_basins) #basin configuration file

#we want to change the boundary file, which is a geojson file
sFilename_mesh_boundary = realpath(os.path.join(sWorkspace_input, 'boundary.geojson')) #boundary to clip mesh
change_json_key_value(sFilename_configuration, 'sFilename_mesh_boundary', sFilename_mesh_boundary) 
sFilename_dem = realpath(os.path.join(sWorkspace_input, 'dem.tif')) #DEM file used to assign elevation to the mesh cells
change_json_key_value(sFilename_configuration, 'sFilename_dem', sFilename_dem) 



/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/input/pyhexwatershed_yukon_dggrid.json
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/
HexWatershed compset is being initialized
The model will use the user provided binary file
The user provided binary file does not exist. The model will use the default binary file
Directory /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002 created successfully
The dggrid binary file does not exist, you need to update this parameter before running the model!
The DEM file does not exist in pyflowline!
The filtered flowline file does not exist!
The filtered flowline file does not exist!
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed_configuration_copy.json
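
Updating one keyword in a copied configuration file amounts to reading the JSON, changing the entry, and writing the file back. The sketch below illustrates that idea with the standard json module. The helper set_json_key is a hypothetical stand-in and is not the implementation of change_json_key_value; the file and values are made up for the demonstration.

```python
import json
import os
import tempfile

def set_json_key(sFilename, sKey, value):
    """Set one top-level key in a JSON file and write the file back."""
    with open(sFilename, "r") as pFile:
        pData = json.load(pFile)
    pData[sKey] = value
    with open(sFilename, "w") as pFile:
        json.dump(pData, pFile, indent=4)
    return pData

# self-contained demonstration on a temporary file
with tempfile.TemporaryDirectory() as sFolder_tmp:
    sFilename_demo = os.path.join(sFolder_tmp, "configuration_copy.json")
    with open(sFilename_demo, "w") as pFile:
        json.dump({"sRegion": "yukon", "sWorkspace_output": "/old/output"}, pFile)
    set_json_key(sFilename_demo, "sWorkspace_output", "/new/output")
    with open(sFilename_demo, "r") as pFile:
        pData_demo = json.load(pFile)
print(pData_demo)
```

Working on a copy, as the notebook does, keeps the original example files unchanged.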


We can now call the function again to re-create the HexWatershed object from the modified configuration file.

In [21]:
#the read function accepts several keyword arguments that can be used to change the default parameters.
#the normal keyword arguments are:
#iCase_index_in: this is an ID to identify the simulation case
#sMesh_type_in: this specifies the mesh type ('dggrid' in this example)
#sDate_in: this specifies the date of the simulation, the final output folder will have a pattern such as 'pyhexwatershed20230901001', where pyhexwatershed is the model, 20230901 is the date, and 001 is the case index.

oPyhexwatershed = pyhexwatershed_read_model_configuration_file(sFilename_configuration,
                    iCase_index_in=iCase_index,iFlag_stream_burning_topology_in=iFlag_stream_burning_topology,
                    iFlag_use_mesh_dem_in=iFlag_use_mesh_dem,
                    iFlag_elevation_profile_in=iFlag_elevation_profile,
                    iResolution_index_in = iResolution_index, 
                    sDggrid_type_in=sDggrid_type,
                    sDate_in= sDate, sMesh_type_in = sMesh_type)  

/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/
HexWatershed compset is being initialized
The model will use the user provided binary file
The user provided binary file does not exist. The model will use the default binary file
Directory /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002 created successfully
The dggrid binary file does not exist, you need to update this parameter before running the model!
The filtered flowline file does not exist!
The filtered flowline file does not exist!


You can review the settings again.

In [22]:
print(oPyhexwatershed.tojson())

{
    "dAccumulation_threshold": 100000.0,
    "dBreach_threshold": 10.0,
    "dLatitude_bot": -20.4,
    "dLatitude_top": 5.5,
    "dLongitude_left": -80.0,
    "dLongitude_right": -50.6,
    "dMissing_value_dem": -9999.0,
    "dResolution_degree": 5000.0,
    "dResolution_meter": 31759.6,
    "iCase_index": 2,
    "iFlag_create_mesh": 1,
    "iFlag_elevation_profile": 0,
    "iFlag_flowline": 1,
    "iFlag_global": 0,
    "iFlag_intersect": 1,
    "iFlag_mesh_boundary": 1,
    "iFlag_multiple_outlet": 0,
    "iFlag_resample_method": 2,
    "iFlag_save_elevation": 1,
    "iFlag_save_mesh": 1,
    "iFlag_simplification": 1,
    "iFlag_stream_burning_topology": 1,
    "iFlag_use_mesh_dem": 0,
    "iFlag_user_provided_binary": 0,
    "iMesh_type": 5,
    "iResolution_index": 10,
    "nOutlet": 1,
    "pPyFlowline": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/pyflowline",
    "sCase": "pyhexwatershed20240416002",
    "sDate": "2024041

Besides changing the JSON file directly, you can also change some of the parameters on the fly.
This is useful for testing different parameters without changing the JSON file, especially for individual basins.

In [35]:
#we also need to set the outlet location for the only basin
dLongitude_outlet_degree= -164.47594
dLatitude_outlet_degree= 63.04269
oPyhexwatershed.pPyFlowline.aBasin[0].dThreshold_small_river = dResolution * 5 
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('dLongitude_outlet_degree', dLongitude_outlet_degree, iFlag_basin_in= 1)
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('dLatitude_outlet_degree', dLatitude_outlet_degree, iFlag_basin_in= 1)
sFilename_flowline = realpath(os.path.join(sWorkspace_input, 'dggrid10/river_networks.geojson') )
oPyhexwatershed.pPyFlowline.pyflowline_change_model_parameter('sFilename_flowline_filter', sFilename_flowline, iFlag_basin_in= 1)
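
Longitude and latitude are easy to swap when they are typed by hand. A minimal sketch of a range check follows; the helper is_valid_outlet is hypothetical, and it only tests the valid ranges for degrees, not whether the point lies on the river network.

```python
def is_valid_outlet(dLongitude_degree, dLatitude_degree):
    """Check that a longitude/latitude pair is within the valid degree ranges."""
    bLongitude = -180.0 <= dLongitude_degree <= 180.0
    bLatitude = -90.0 <= dLatitude_degree <= 90.0
    return bLongitude and bLatitude

# the outlet used in this notebook
print(is_valid_outlet(-164.47594, 63.04269))
# the same values with longitude and latitude swapped
print(is_valid_outlet(63.04269, -164.47594))
```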

You can check the settings for the single basin as well.

In [24]:
print(oPyhexwatershed.aBasin[0].tojson())

{
    "dAccumulation_threshold": 100000.0,
    "dLatitude_outlet_degree": 0.17099,
    "dLongitude_outlet_degree": -50.71465,
    "dThreshold_small_river": 10000.0,
    "iFlag_correct_flowline_direction": 0,
    "iFlag_dam": 0,
    "iFlag_debug": 1,
    "iFlag_disconnected": 0,
    "iFlag_remove_low_order_river": 0,
    "iFlag_remove_small_river": 0,
    "iMesh_type": 1,
    "lBasinID": 1,
    "lCellID_outlet": -1,
    "sBasinID": "00000001",
    "sFilename_area_of_difference": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/area_of_difference.geojson",
    "sFilename_basin_info": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/basin_info.json",
    "sFilename_confluence_conceptual_info": "/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/hexwatershed/00000001/confluence_conceptual_info.json",
 

After the HexWatershed object is re-created, we can set up the model.

In [36]:
#setup the model  
oPyhexwatershed.iFlag_user_provided_binary = 0 
oPyhexwatershed.pPyFlowline.iFlag_user_provided_binary = 0     
oPyhexwatershed.pyhexwatershed_setup()

Started setting up model
Basin 00000001: initial flowline: /Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/input/dggrid10/river_networks.geojson
Found binary at: /Users/liao313/workspace/python/hexwatershed_tutorial/bin/dggrid
Found binary at: /Users/liao313/workspace/python/hexwatershed_tutorial/bin/hexwatershed
Elapsed time: 0.7900 seconds


Before any operation, we can visualize the original or raw flowline dataset. 

In [32]:
iFlag_geopandas = importlib.util.find_spec("geopandas")
if iFlag_geopandas is not None:
    import geopandas as gpd
    import matplotlib.pyplot as plt  
else:
    print('The visualization packages are not installed.')

You can also use QGIS.

The plot function provides a few optional arguments such as map projection and spatial extent. 
By default, the spatial extent is full. 
But you can set the extent to a zoom-in region.
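
One simple way to define a zoom-in region is a box around a point of interest, such as the outlet. The sketch below builds such a box; the helper build_extent and the ordering (minimum longitude, maximum longitude, minimum latitude, maximum latitude) are assumptions for illustration, so check the documentation of the plot function you use for the ordering it expects.

```python
def build_extent(dLongitude_center, dLatitude_center, dHalf_width_degree):
    """Return (lon_min, lon_max, lat_min, lat_max) around a center point."""
    # clamp the box to the valid longitude and latitude ranges
    dLongitude_min = max(dLongitude_center - dHalf_width_degree, -180.0)
    dLongitude_max = min(dLongitude_center + dHalf_width_degree, 180.0)
    dLatitude_min = max(dLatitude_center - dHalf_width_degree, -90.0)
    dLatitude_max = min(dLatitude_center + dHalf_width_degree, 90.0)
    return (dLongitude_min, dLongitude_max, dLatitude_min, dLatitude_max)

aExtent_demo = build_extent(-164.0, 63.0, 2.0)
print(aExtent_demo)

# with a matplotlib axes object named ax, the same box could be applied as:
# ax.set_xlim(aExtent_demo[0], aExtent_demo[1])
# ax.set_ylim(aExtent_demo[2], aExtent_demo[3])
```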

Now let's run the submodel pyflowline.

In [37]:
#run step 1
aCell_origin = oPyhexwatershed.pyhexwatershed_run_pyflowline();

Started running pyflowline
Start flowline simplification: 00000001
Basin  00000001  has no dam
Basin  00000001 find flowline vertex
Elapsed time: 0.0041 seconds
Basin  00000001 split flowline
Elapsed time: 0.3807 seconds
Basin  00000001 started correction flow direction
Elapsed time: 0.0264 seconds
-164.46458333333362 63.03124999999984
Basin  00000001 started loop removal
Elapsed time: 0.0069 seconds
Basin  00000001 started update stream order initial
Elapsed time: 0.0078 seconds
Basin  00000001 find flowline confluence
Elapsed time: 0.0004 seconds
Basin  00000001 started stream segment definition
Elapsed time: 0.0000 seconds
Basin  00000001 started confluence definition
Elapsed time: 0.0001 seconds
Basin  00000001 started stream topology definition
Elapsed time: 0.0000 seconds
Basin  00000001 started stream order definition
Elapsed time: 0.0018 seconds
Finish flowline simplification: 00000001
Start mesh generation.
GTiff pDriver IS available.

Resolution is:  31759.6
./run_dggrid.sh
*

Then we can check the result using a plot.

Similarly, we can zoom in using the extent.

Next, we will create a mesh from the global MPAS mesh.

We can also use a polygon to create a mesh.

Last, we can generate the conceptual flowline.

Now we can overlap the mesh with the flowline.

In [38]:

oPyhexwatershed.pyhexwatershed_assign_elevation_to_cells();


Started assigning elevation
GTiff pDriver IS available.

Elapsed time: 1.6194 seconds
Started updating outlet
Elapsed time: 0.0003 seconds
PyFlowline started export results
Elapsed time: 0.3954 seconds
Started running hexwatershed
Running on a Unix-based system
Start to run HexWatershed model!
The following arguments are provided:
/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/configuration.json
The configuration file is:/Users/liao313/workspace/python/hexwatershed_tutorial/data/yukon/output/pyhexwatershed20240416002/configuration.json
Finished set up  model at Tue Apr 16 19:46:26 2024

Finished reading data at Tue Apr 16 19:46:26 2024

Finished initialization at Tue Apr 16 19:46:26 2024

This is a local simulation.
This is a local simulation with only one outlet.
 You have assigned the correct outlet mesh ID!
Finished depression filling at Tue Apr 16 19:46:26 2024

Finished flow direction at Tue Apr 16 19:46:26 2024

Finished flow acc

In [None]:
aCell_new = oPyhexwatershed.pyhexwatershed_update_outlet(aCell_origin)


In [None]:
oPyhexwatershed.pPyFlowline.pyflowline_export()
oPyhexwatershed.pyhexwatershed_export_config_to_json()


In [None]:
oPyhexwatershed.pyhexwatershed_run_hexwatershed();


In [None]:

oPyhexwatershed.pyhexwatershed_export()


After this, we can save the model output into a json file.

In [31]:
#plot the mesh (blue) and the flow direction (red) on the same axes
#this cell requires geopandas and matplotlib, which are imported in an earlier cell

file1_path = oPyhexwatershed.sFilename_mesh
file2_path = oPyhexwatershed.aBasin[0].sFilename_flow_direction
gdf1 = gpd.read_file(file1_path)
gdf2 = gpd.read_file(file2_path)
fig, ax = plt.subplots()
gdf1.plot(ax=ax, color='blue')
gdf2.plot(ax=ax, color='red')
plt.show()

The content of one of the exported JSON files can be checked:
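
As a minimal sketch, the standard json module is enough for a quick look at an exported file. The helper summarize_json is hypothetical; it reports the number of features for a GeoJSON feature collection and the top-level keys for other JSON objects. The demonstration writes a tiny made-up GeoJSON file rather than reading real model output.

```python
import json
import os
import tempfile

def summarize_json(sFilename):
    """Return a short summary of a JSON or GeoJSON file."""
    with open(sFilename, "r") as pFile:
        pData = json.load(pFile)
    if isinstance(pData, dict) and pData.get("type") == "FeatureCollection":
        aFeature = pData.get("features", [])
        # property names are read from the first feature only
        pProperty = (aFeature[0].get("properties") or {}) if aFeature else {}
        return {"kind": "geojson", "nFeature": len(aFeature),
                "aProperty": sorted(pProperty.keys())}
    if isinstance(pData, dict):
        return {"kind": "object", "aKey": sorted(pData.keys())}
    if isinstance(pData, list):
        return {"kind": "list", "nItem": len(pData)}
    return {"kind": "scalar"}

# self-contained demonstration with a made-up feature collection
pGeojson_demo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"cellid": 1, "elevation": 12.5},
         "geometry": {"type": "Point", "coordinates": [-164.0, 63.0]}},
    ],
}
with tempfile.TemporaryDirectory() as sFolder_tmp:
    sFilename_demo = os.path.join(sFolder_tmp, "demo.geojson")
    with open(sFilename_demo, "w") as pFile:
        json.dump(pGeojson_demo, pFile)
    pSummary_demo = summarize_json(sFilename_demo)
print(pSummary_demo)
```

In the notebook, the same helper could be pointed at one of the file names stored in the basin object, for example the sFilename_basin_info entry shown in the basin settings.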

The flowline associated with the outlet is always assigned a dam flag so that it is preserved.

Congratulations! You have successfully finished a HexWatershed simulation.