<img src='https://repository-images.githubusercontent.com/121802384/c355bb80-7d42-11e9-9e0e-4729609f9fbc' alt='WRF-Hydro Logo' width="15%"/>

# Open Source GIS Pre-Processing Tutorial

This notebook will rely on the WRF-Hydro GIS Pre-processing tools, found here:  
* https://github.com/NCAR/wrf_hydro_gis_preprocessor

## Table of Contents
1. [Create domain boundary shapefile](#1.-Create-domain-boundary-shapefile)<br>
2. [Build GeoTiff raster from a WPS Geogrid file](#2.-Build-GeoTiff-raster-from-a-WPS-Geogrid-file)<br>
3. [Building the hydrologic routing grids, aka the "routing stack"](#3.-Building-the-hydrologic-routing-grids,-aka-the-"routing-stack")<br>
4. [Understanding the outputs](#4.-Understanding-the-outputs)
5. [Examine outputs of GIS pre-processor](#5.-Examine-outputs-of-GIS-pre-processor)
6. [Vizualize the output grids](#6.-Vizualize-the-output-grids)
7. [Optional - Build Non-NWM WRF-Hydro Configurations of the Pocono, PA Test Case](#7.-[Optional]-Build-Non-NWM-WRF-Hydro-Configurations-of-the-Pocono,-PA-Test-Case)

#### First, set the file paths for inputs and outputs

Throughout this exercise, we will use Python variables to store directory paths and other values. However, we will call all GIS pre-processing functionality as though it were on the command line. This is done by prefixing a command with `!`, which tells Jupyter to execute the rest of the line in the system shell.


In this cell, Python variables are created that point to the file paths of the test-case data and an output directory is defined to store the data created by these tools.

In [2]:
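# Create a symlink to the GIS data directory - this step helps support multiple example cases
# (-n replaces an existing GIS_DATA link rather than creating a nested link inside the directory it points to)
! ln -sfn /home/docker/GIS_Training/Pocono_Lambert /home/docker/GIS_Training/GIS_DATA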

# Import python core modules
import os
import shutil

# Set root directory for GIS lesson
gis_data_folder = "/home/docker/GIS_Training"

# Change the directory to the GIS_Training directory and get current working directory
os.chdir(gis_data_folder)
cwd = os.getcwd()

# Set paths to known input and output directories and files
data_folder = os.path.join(cwd, 'GIS_DATA')
in_geogrid = os.path.join(data_folder, 'geo_em.d01.nc')
output_folder = os.path.join(cwd, 'Outputs')

# Clear any outputs from previous runs by deleting (if necessary) and re-creating the output directory
if os.path.exists(output_folder):
    shutil.rmtree(output_folder)
os.mkdir(output_folder)
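
The `!` prefix is a Jupyter feature, so it is not available in a plain Python script. As a rough equivalent, the standard `subprocess` module can run a command built from Python variables. The sketch below is only an illustration and is not part of the WRF-Hydro tools: `run_tool` is a hypothetical helper, and the example calls the Python interpreter itself as a stand-in for a tool so that it runs anywhere.

```python
import subprocess
import sys

def run_tool(args):
    """Run a command given as a list of arguments and return its text output."""
    # check=True raises CalledProcessError if the command exits with an error
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

# Stand-in for a tool call: ask the Python interpreter to echo its argument
message = run_tool([sys.executable, "-c", "import sys; print(sys.argv[1])", "hello"])
print(message.strip())
```

Passing the arguments as a list avoids shell parsing entirely, so paths do not need to be quoted.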

## 1. Create domain boundary shapefile

The tool `Create_Domain_Boundary_Shapefile.py` takes a WRF (WPS) output file, aka "Geogrid file", and creates a polygon shapefile that defines the boundary of the domain as a single rectangular polygon in projected coordinates. The script will read metadata in the geogrid file and the output shapefile will be in the projection of the WRF domain. The unstaggered grid, or "Mass" grid (e.g. "HGT_M" variable), is used as the routing grid domain by WRF-Hydro.
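
To make the idea of a single rectangular boundary concrete, the sketch below computes the four corner coordinates of a regular grid from its upper-left corner, cell size, and dimensions. It only illustrates the geometry and is not the tool's implementation; the function name `domain_corners` and the sample numbers are assumptions chosen for the example.

```python
def domain_corners(x_min, y_max, dx, dy, ncols, nrows):
    """Return the corners of a grid's outer boundary in projected coordinates.

    Corners are ordered upper-left, upper-right, lower-right, lower-left.
    """
    x_max = x_min + ncols * dx
    y_min = y_max - nrows * dy
    return [(x_min, y_max), (x_max, y_max), (x_max, y_min), (x_min, y_min)]

# Hypothetical 16 x 15 cell grid with 1000 m cells
corners = domain_corners(x_min=0.0, y_max=15000.0, dx=1000.0, dy=1000.0,
                         ncols=16, nrows=15)
for corner in corners:
    print(corner)
```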

#### Request help message from the script
This is an example of the syntax for calling the `Create_Domain_Boundary_Shapefile.py` tool on the command line. By following the tool with `-h` or `--help`, we are able to call the help argument which explains the purpose of the tool, shows the different arguments we can use for this tool, as well as the descriptions for each argument. 

When the tool is run in the terminal, it is not necessary to use the exclamation point. In Jupyter Notebook, we can execute command-line syntax using "!". 

In [3]:
%%bash
ls /home/docker/wrf_hydro_gis_preprocessor/wrfhydro_gis
#cp /home/docker/wrf_hydro_gis_preprocessor/wrfhydro_gis/
ls

Build_GeoTiff_From_Geogrid_File.py
Build_Groundwater_Inputs.py
Build_PRJ_From_Geogrid_File.py
Build_Routing_Stack.py
Build_Spatial_Metadata_File.py
Create_Domain_Boundary_Shapefile.py
Create_SoilProperties_and_Hydro2D.py
Create_latitude_longitude_rasters.py
Create_wrfinput_from_Geogrid.py
Examine_Outputs_of_GIS_Preprocessor.py
Forecast_Point_Tools.py
Harmonize_Soils_to_LANDMASK.py
Testing_DEM_interpolation.py
Unused_Code.py
__init__.py
wrfhydro_functions.py
Build_GeoTiff_From_Geogrid_File.py
Build_Groundwater_Inputs.py
Build_PRJ_From_Geogrid_File.py
Build_Routing_Stack.py
Build_Spatial_Metadata_File.py
Create_Domain_Boundary_Shapefile.py
Create_latitude_longitude_rasters.py
Create_wrfinput_from_Geogrid.py
Croton_Lambert
Examine_Outputs_of_GIS_Preprocessor.py
Forecast_Point_Tools.py
GIS_DATA
Outputs
Pocono_Lambert
README.md
Testing_DEM_interpolation.py
Unused_Code.py
__init__.py
__pycache__
jupyter_functions.py
wrfhydro_functions.py


In [4]:
%%bash
ls /home/docker/GIS_Training/GIS_DATA/
ls /home/docker/wrf_hydro_gis_preprocessor/wrfhydro_gis

NED_30m_DEM.tif
Pocono_Lambert
forecast_points.csv
geo_em.d01.nc
lake_shapes
namelist.wps
Build_GeoTiff_From_Geogrid_File.py
Build_Groundwater_Inputs.py
Build_PRJ_From_Geogrid_File.py
Build_Routing_Stack.py
Build_Spatial_Metadata_File.py
Create_Domain_Boundary_Shapefile.py
Create_SoilProperties_and_Hydro2D.py
Create_latitude_longitude_rasters.py
Create_wrfinput_from_Geogrid.py
Examine_Outputs_of_GIS_Preprocessor.py
Forecast_Point_Tools.py
Harmonize_Soils_to_LANDMASK.py
Testing_DEM_interpolation.py
Unused_Code.py
__init__.py
wrfhydro_functions.py



In [6]:
# Execute script on the command-line, requesting tool help (parameter -h)
! python Create_Domain_Boundary_Shapefile.py -h

Script initiated at Fri Apr 25 19:35:48 2025
usage: Create_Domain_Boundary_Shapefile.py [-h] -i IN_NC -o OUT_DIR

This tool takes an WRF Geogrid file and creates a single polygon shapefile
that makes up the boundary of the domain of the M-grid (HGT_M, for example).

options:
  -h, --help  show this help message and exit
  -i IN_NC    Path to WPS geogrid (geo_em.d0*.nc) file or WRF-Hydro
              Fulldom_hires.nc file.
  -o OUT_DIR  Output directory.


----
You will see in the messages above that the tool provides a brief explanation of the expected input and output parameters. This tool requires a geogrid file as input (`-i`) and a directory to write the outputs into (`-o`).

#### Execute the script using command-line syntax
Now that we know what arguments are needed for this tool, we can enter those arguments and run the tool. For this script, we only need to specify the file path to the WPS geogrid file and an output folder to save the result. The result of this tool is a shapefile that shows the geographic boundary of the domain, as defined in the geogrid file. 

When running this tool in Jupyter, we can wrap Python variable names in curly braces, and Jupyter will substitute the variable values when executing the command. This is akin to using an environment variable on the command line. For the sake of repeatability, we also print the full command for reference. This can be copied into the terminal if desired.

In [7]:
# Print information to screen for reference
print('Command to run:\n')
print('python Create_Domain_Boundary_Shapefile.py \\\n\t -i {0} \\\n\t -o {1}\n'.format(in_geogrid, output_folder))

# Run the script with required parameters
! python Create_Domain_Boundary_Shapefile.py -i {in_geogrid} -o {output_folder}

Command to run:

python Create_Domain_Boundary_Shapefile.py \
	 -i /home/docker/gis/GIS_Training/GIS_DATA/geo_em.d01.nc \
	 -o /home/docker/gis/GIS_Training/Outputs

Script initiated at Fri Apr 25 19:35:57 2025
WPS netCDF projection identification initiated...
    Map Projection: Lambert Conformal Conic
    Using MOAD_CEN_LAT for latitude of origin.
    Using Standard Parallel 2 in Lambert Conformal Conic map projection.
    Geo-referencing step completed without error in  0.04 seconds.
    Created projection definition from input NetCDF GEOGRID file.
      ESRI Shapefile driver is available.
  Done producing output vector polygon shapefile in  0.02 seconds
  Output shapefile: /home/docker/gis/GIS_Training/Outputs/geo_em.d01_boundary.shp
Process completed in 0.08 seconds.


----
The messages returned by the tool can be quite useful. You will see the coordinate system information printed to the screen and any other progress messages. 
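
One caveat with variable substitution in `!` commands is that the value is pasted into the shell command as-is, so a path containing spaces would be split into several arguments. A minimal sketch of one way to guard against this uses `shlex.quote` from the standard library. The paths below are hypothetical and are not part of the test case.

```python
import shlex

# Hypothetical paths; the first contains a space
in_file = "/data/my project/geo_em.d01.nc"
out_dir = "/data/outputs"

# Quote each value so the shell treats it as a single argument
command = "python Create_Domain_Boundary_Shapefile.py -i {0} -o {1}".format(
    shlex.quote(in_file), shlex.quote(out_dir))
print(command)

# Splitting the command the way a POSIX shell would recovers the original values
parsed = shlex.split(command)
print(parsed)
```

The quoted values could be stored in Python variables and then substituted into a `!` command in the usual way.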

#### Visualize the domain boundary shapefile created in the above example

Now that the domain boundary shapefile has been created, we want to see where the domain is relative to other features on a map. The next cell creates a map and adds the domain boundary as a layer. Use the map to explore the domain. A swipe feature allows the basemap to be changed between OpenStreetMap and satellite imagery. 

In [8]:
import json
import geopandas
from ipyleaflet import Map, GeoJSON, ScaleControl, FullScreenControl, basemaps, SplitMapControl, basemap_to_tiles, LayersControl
from jupyter_functions import create_map

import warnings
warnings.filterwarnings("ignore")

# Setup display items
boundary_shp = os.path.join(output_folder,'geo_em.d01_boundary.shp')
b_shp = geopandas.read_file(boundary_shp)
b_shp = b_shp.to_crs(epsg=4326)

# Export vector to GeoJSON
b_json = os.path.join(output_folder, 'boundary.json')
b_shp.to_file(b_json, driver='GeoJSON')

# Read GeoJSON
with open(b_json, 'r') as f:
    data = json.load(f)

# Obtain vector center point (latitude, longitude)
x = b_shp.geometry.centroid.x
y = b_shp.geometry.centroid.y
map_center = y[0], x[0]

# Instantiate map object centered on the domain
m = create_map(map_center, 10)

# Create a GeoJSON layer from the domain boundary
geo_json = GeoJSON(data=data, name='Domain boundary')

# Define basemaps to swipe between
right_layer = basemap_to_tiles(basemap=basemaps.OpenStreetMap.Mapnik)
left_layer = basemap_to_tiles(basemap=basemaps.Esri.WorldImagery)

# Setup basemap swipe control
control = SplitMapControl(left_layer=left_layer, right_layer=right_layer)
m.add_control(control)
m.add_layer(geo_json)

# Draw map
m

Map(center=[np.float64(41.149904512199356), np.float64(-75.48088072504373)], controls=(ZoomControl(options=['p…

## 2. Build GeoTiff raster from a WPS Geogrid file

The tool `Build_GeoTiff_From_Geogrid_File.py` is a program to export variables from a WRF-Hydro input file (geogrid or Fulldom_hires) to an output raster format, with all spatial and coordinate system metadata. If a 3-dimensional variable is selected, individual raster bands will be created in the output raster for each index in the 3rd dimension. If a 4-dimensional variable is selected, the first index in the 4th dimension will be selected and the variable will be treated as a 3-dimensional variable as described above.

This tool is handy for performing a quick visualization using GIS or other software to examine the contents of the WRF-Hydro input file and overlay these grids with other geospatial data.

The tool takes three input parameters: an input Geogrid or Fulldom_hires netCDF file (`-i`), a variable name (`-v`), and an output GeoTiff raster file (`-o`) that the tool will create. For this example, we will export the variable "HGT_M", or surface elevation in meters above sea level.
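
The handling of 2D, 3D, and 4D variables described above can be sketched with plain nested lists. This is not the tool's code: the helper names `ndim` and `to_bands` are hypothetical, and the sketch assumes the extra dimensions come first, as in a `(Time, level, south_north, west_east)` layout.

```python
def ndim(values):
    """Count the nesting depth of a rectangular, non-empty nested list."""
    depth = 0
    while isinstance(values, list):
        depth += 1
        values = values[0]
    return depth

def to_bands(values):
    """Return a list of 2D grids (bands) from a 2D, 3D, or 4D nested list."""
    dims = ndim(values)
    if dims == 2:
        return [values]              # a single band
    if dims == 3:
        return list(values)          # one band per index of the extra dimension
    if dims == 4:
        return to_bands(values[0])   # keep only the first index, then treat as 3D
    raise ValueError("expected a 2D, 3D, or 4D variable")

grid = [[1, 2], [3, 4]]
print(len(to_bands(grid)))                          # 2D input
print(len(to_bands([grid, grid, grid])))            # 3D input
print(len(to_bands([[grid, grid], [grid, grid]])))  # 4D input
```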

#### Request help message from the script

In [9]:
# Get script help information
! python Build_GeoTiff_From_Geogrid_File.py -h

Script initiated at Fri Apr 25 19:36:12 2025
usage: Build_GeoTiff_From_Geogrid_File.py [-h] -i IN_NC [-v VARIABLE]
                                          [-o OUT_FILE]

This is a program to export >=2D variables from a WRF-Hydro input file
(geogrid or Fulldom_hires) file to an output raster format, with all spatial
and coordinate system metadata. If a 3-dimensional variable is selected,
individual raster bands will be created in the output raster for each index in
the 3rd dimension. If a 4-dimensional variable is selected, the first index in
the 4th dimension will be selected and the variable will be treated as a
3-dimensional variable described above.

options:
  -h, --help   show this help message and exit
  -i IN_NC     Path to WPS geogrid (geo_em.d0*.nc) file or WRF-Hydro
               Fulldom_hires.nc file.
  -v VARIABLE  Name of the variable in the input netCDF file. default=HGT_M
  -o OUT_FILE  Output GeoTiff raster file.


----
#### Execute the script using command-line syntax

In [10]:
# Define the variable to export to raster
in_var = "HGT_M"

# Define the output raster file using variable name defined above
out_file = os.path.join(output_folder, f'{in_var}.tif')

# Print information to screen for reference
print('Command to run:\n')
print('python Build_GeoTiff_From_Geogrid_File.py \\\n\t -i {0} \\\n\t -v {1} \\\n\t -o {2}\n'.format(in_geogrid, in_var, out_file))

# Run the script with required parameters
! python Build_GeoTiff_From_Geogrid_File.py -i {in_geogrid} -v {in_var} -o {out_file}

Command to run:

python Build_GeoTiff_From_Geogrid_File.py \
	 -i /home/docker/gis/GIS_Training/GIS_DATA/geo_em.d01.nc \
	 -v HGT_M \
	 -o /home/docker/gis/GIS_Training/Outputs/HGT_M.tif

Script initiated at Fri Apr 25 19:36:14 2025
Using default variable name: HGT_M
Input WPS Geogrid or Fulldom file: /home/docker/gis/GIS_Training/GIS_DATA/geo_em.d01.nc
Input netCDF variable name: HGT_M
Output raster file: /home/docker/gis/GIS_Training/Outputs/HGT_M.tif
WPS netCDF projection identification initiated...
    Map Projection: Lambert Conformal Conic
    Using MOAD_CEN_LAT for latitude of origin.
    Using Standard Parallel 2 in Lambert Conformal Conic map projection.
    Geo-referencing step completed without error in  0.03 seconds.
  if LooseVersion(netCDF4.__version__) > LooseVersion('1.4.0'):
    X-dimension: 'west_east'.
    Y-dimension: 'south_north'.
    Reversing order of dimension 'south_north'
    Time dimension found: 'Time'.
      Time dimension size = 1.
    Dimensions and indi

#### View outputs
Now that the tool has completed, we want to take a look at the output. We will create another interactive map and load the data as a layer. 

In [11]:
# Import third-party visualization libraries
import rasterio
from matplotlib import pyplot
from osgeo import gdal
from ipyleaflet import ImageOverlay
from jupyter_functions import cmap_options, show_raster_map

# Create a map object from pre-built function
m2 = create_map(map_center, 10)

# Render the map
m2

Map(center=[np.float64(41.149904512199356), np.float64(-75.48088072504373)], controls=(ZoomControl(options=['p…

In [12]:
# Use pre-built function to render the GeoTiff on the map, already warped to the map's coordinate system
show_raster_map(out_file, m2, b_shp, output_folder)

Above, you will see the elevation grid applied to the map. This grid has a 1 km resolution, so there is not much detail, but it is still useful for checking that the topographic features are in the correct geographic locations according to the basemap. Note that there is no color ramp for reference. This is a limitation of using the web browser instead of a GIS application.

## 3. Building the hydrologic routing grids, aka the "routing stack"

The `Build_Routing_Stack.py` script is a program to build the full set of hydrologically-processed routing grids and additional data required by WRF-Hydro. This is the main utility for performing WRF-Hydro GIS pre-processing. The required inputs are the domain file (WPS geogrid file), desired routing grid resolution as a function of geogrid resolution, and other options and parameter values. The output will be a "routing stack" zip file with WRF-Hydro domain and parameter files.

• Required Parameters:<br>
&emsp;`-i` -WRF/WPS GEOGRID file (geo_em.d0*.nc)<br>
&emsp;`-d` -High-resolution Elevation raster file (Esri GRID, GeoTIFF, VRT, etc.)<br>
&emsp;`-R` -Regridding Factor – nesting relationship of routing:land surface model grid cells<br>
&emsp;`-t` -Minimum basin area threshold (in routing grid cells)<br>
&emsp;`-o` -Output ZIP File containing all script outputs<br>

• Optional Parameters:<br>
&emsp;`--CSV` -Station locations file (.csv)<br>
&emsp;`-b` -Option to mask channel grids not contributing to provided station locations<br>
&emsp;`-r` -Reach based (Muskingum / Muskingum-Cunge) routing option<br>
&emsp;`-l` -Lake Polygons (polygon feature class or .shp)<br>
&emsp;`-O` -OVROUGHRTFAC – Multiplier on Manning's roughness for overland flow. default=1.0<br>
&emsp;`-T` -RETDEPRTFAC – Multiplier on maximum retention depth before flow is routed as overland flow. default=1.0<br>
&emsp;-LKSATFAC – (script global variable) Multiplier on saturated hydraulic conductivity in lateral flow direction. default=1000.0<br>
&emsp;`--starts` -Path to point shapefile or feature class containing channel initiation locations (overrides `-t` parameter)<br>
&emsp;`--gw` -Path to polygon shapefile or feature class containing prescribed groundwater basin locations<br>
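
As a worked example of how the regridding factor (`-R`) and the basin area threshold (`-t`) relate to physical units, the sketch below uses the values from this exercise: a 1000 m geogrid, `-R 4`, and `-t 25`. The function names are hypothetical and the arithmetic is only an illustration of the relationship, not code from the tool.

```python
def routing_cell_size(lsm_cell_size_m, regrid_factor):
    """Routing grid cell size when each LSM cell is split regrid_factor times per side."""
    return lsm_cell_size_m / regrid_factor

def threshold_area_km2(threshold_cells, cell_size_m):
    """Contributing area, in square kilometers, represented by a threshold in cells."""
    return threshold_cells * cell_size_m ** 2 / 1.0e6

cell = routing_cell_size(1000.0, 4)    # routing grid cell size in meters
area = threshold_area_km2(25, cell)    # minimum contributing area in km^2
print(cell, area)
```

With these inputs, the routing grid has 250 m cells, and a threshold of 25 cells corresponds to roughly 1.56 square kilometers of contributing area.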

#### Request help message from the script
This tool has many different parameters and possible configurations. Using the command line, we can take a look at the different arguments that can be used, if they are required or optional, and what their default values are.

In [13]:
! python Build_Routing_Stack.py -h

Script initiated at Fri Apr 25 19:36:41 2025
usage: Build_Routing_Stack.py [-h] -i IN_GEOGRID [--CSV IN_CSV]
                              [-b BASIN_MASK] [-r RB_ROUTING]
                              [-l IN_RESERVOIRS] -d INDEM [-R CELLSIZE]
                              [-t THRESHOLD] [-o OUT_ZIP_FILE]
                              [-O OVROUGHRTFAC_VAL] [-T RETDEPRTFAC_VAL]
                              [--starts CHANNEL_STARTS] [--gw GW_POLYS]

This is a program to perform the full routing-stack GIS pre-processingfor WRF-
Hydro. The inputs will be related to the domain, the desired routing nest
factor, and other options and parameter values. The output will be a routing
stack zip file with WRF-Hydro domain and parameter files.

options:
  -h, --help            show this help message and exit
  -i IN_GEOGRID         Path to WPS geogrid (geo_em.d0*.nc) file [REQUIRED]
  --CSV IN_CSV          Path to input forecast point CSV file [OPTIONAL]
  -b BASIN_MASK         Mask CHANNELGRID vari

----
#### Execute the script using command-line syntax
We will begin by assigning file paths to Python variables, then substitute those values in the command-line syntax.

In [117]:
%%bash
#wget https://www.whiteboxgeo.com/WBT_Linux/WhiteboxTools_linux_amd64.zip
#unzip WhiteboxTools_linux_amd64.zip
#cd WhiteboxTools_linux_amd64/WBT
#export WBT_EXECUTABLE_PATH=/home/docker/WhiteboxTools_linux_amd64/WBT

In [129]:
%%bash
#ls $WBT_EXECUTABLE_PATH/WBT
#mv /home/docker/WhiteboxTools_linux_amd64/WBT .
#ls /home/docker/gis/GIS_Training/WBT

LICENSE.txt
UserManual.txt
img
plugins
readme.txt
settings.json
whitebox_tools
whitebox_tools.py


In [14]:
import os

os.environ["WBT_EXECUTABLE_PATH"] = "/home/docker/gis/GIS_Training/WBT/whitebox_tools"

from whitebox.whitebox_tools import WhiteboxTools

wbt = WhiteboxTools()

Downloading WhiteboxTools pre-compiled binary for first time use ...
Downloading WhiteboxTools binary from https://www.whiteboxgeo.com/WBT_Linux/WhiteboxTools_linux_amd64.zip
Decompressing WhiteboxTools_linux_amd64.zip ...
WhiteboxTools package directory: /home/docker/miniconda3/lib/python3.12/site-packages/whitebox
Downloading testdata ...


In [15]:
import Build_Routing_Stack

# Define script input parameters using python variables
in_geogrid = os.path.join(data_folder, 'geo_em.d01.nc')
lakes = os.path.join(data_folder, 'lake_shapes', 'lakes.shp')
csv = os.path.join(data_folder, 'forecast_points.csv')
in_dem = os.path.join(data_folder, 'NED_30m_DEM.tif')
regrid_factor = 4
routing_cells = 25
out_zip = os.path.join(output_folder, 'pocono_test.zip')

# Print information to screen for reference
print('Command to run:\n')
print('python Build_Routing_Stack.py \\\n\t -i {0} \\\n\t -l {1} \\\n\t --CSV {2} \\\n\t -d {3} \\\n\t -R {4} \\\n\t -t {5} \\\n\t -o {6}\n'.format(in_geogrid, lakes, csv, in_dem, regrid_factor, routing_cells, out_zip))

# Run the script with required parameters
! python Build_Routing_Stack.py -i {in_geogrid} -l {lakes} --CSV {csv} -d {in_dem} -R {regrid_factor} -t {routing_cells} -o {out_zip}

Command to run:

python Build_Routing_Stack.py \
	 -i /home/docker/gis/GIS_Training/GIS_DATA/geo_em.d01.nc \
	 -l /home/docker/gis/GIS_Training/GIS_DATA/lake_shapes/lakes.shp \
	 --CSV /home/docker/gis/GIS_Training/GIS_DATA/forecast_points.csv \
	 -d /home/docker/gis/GIS_Training/GIS_DATA/NED_30m_DEM.tif \
	 -R 4 \
	 -t 25 \
	 -o /home/docker/gis/GIS_Training/Outputs/pocono_test.zip

Script initiated at Fri Apr 25 19:37:11 2025
  Parameter values that have not been altered from script default values:
    Using default basin mask setting: False
    Using default reach-based routing setting: False
    Using default OVROUGHRTFAC parameter value: 1.0
    Using default RETDEPRTFAC parameter value: 1.0
  Values that will be used in building this routing stack:
    Input WPS Geogrid file: /home/docker/gis/GIS_Training/GIS_DATA/geo_em.d01.nc
    Forecast Point CSV file: /home/docker/gis/GIS_Training/GIS_DATA/forecast_points.csv
    Mask CHANNELGRID variable to forecast basins?: False
    Creat

## 4. Understanding the outputs

The `Build_Routing_Stack.py` script creates a Zip archive of output files according to the options provided to the tool. There will be at least four netCDF files. The output Zip file may additionally include shapefiles (.shp and accompanying files) describing the geometry of modeled lakes or the vector stream network. Below is a list of the gridded variables that are created by the `Build_Routing_Stack.py` tool.

Fulldom_hires.nc file. This file stores 2D gridded variables that describe the hydro routing grid:
  + CHANNELGRID - The channel grid. Channel pixels = 0, non-channel pixels = -9999. If the `-b` option is set to TRUE, the output will be masked to the gaged basins provided, where non-gaged channels are given a value of ‘-1’. If lake routing is activated, lake outflow points will be identified by the lake ID value.
  + FLOWACC – Flow accumulation grid. This grid gives the number of contributing cells for each cell in the domain. This grid is provided for convenience and is not read by WRF-Hydro.
  + FLOWDIRECTION – Flow direction grid. This grid gives the direction of flow using the D8 algorithm between each cell and the steepest downslope neighbor according to Jenson and Domingue (1988). The result is an integer grid with values ranging from 1 to 128.
  + frxst_pts – Gage location grid. If a forecast point CSV file is provided, the grid will have a cell identified at the location of each forecast point (gage) in the gage CSV file. If no input CSV gage location file is provided, this grid will be uniform with values of ‘-9999’. Gage pixels are numbered in the same way as the ‘basn_msk’ grid. NoData cells are given a value of ‘-9999’.
  + basn_msk – Forecast basins grid. If a CSV gage location file is provided, catchments are delineated from a point that is up to 3 pixels downstream of the gage coordinates. This distance can be modified by altering the ‘walker’ global variable in the ‘wrfhydro_functions.py’ script. If masking of the ‘CHANNELGRID’ is selected, this layer is the mask. Basins are numbered according to the values in the ‘FID’ field of the input gage CSV file. If no gage location file is provided, this grid will be uniform with values of ‘-9999’.
  + LAKEGRID – The lake grid. If a lake polygon shapefile is provided to the `-l` parameter, this grid will contain ID values for each lake that can be resolved on the routing grid. Otherwise, this grid will be uniform with values of -9999.
  + landuse – This is the same data as the ‘LU_INDEX’ variable in the GEOGRID file, but resampled using Nearest Neighbor assignment to the resolution of the routing grid. This grid is provided for convenience and is not read by WRF-Hydro.
  + LATITUDE – Grid of the latitude at the center of each grid cell, in a geographic coordinate system (WGS84).
  + LINKID – The channel ID grid. This grid provides a unique integer identifier for each channel segment that is defined in the ‘link’ variable of the ‘Route_Link.nc’ file and the ‘STRM_VAL’ field in the ‘streams.shp’ shapefile. The ‘LINKID’ grid will only be created if the option `-r` is TRUE.
  + LONGITUDE – Grid of longitude value at the center of each grid cell, in a geographic coordinate system (WGS84).
  + OVROUGHRTFAC – OVROUGHRTFAC parameter. Currently set to a default of 1.0. This default value may be changed by providing an alternate value to the `-O` parameter.
  + RETDEPRTFAC – RETDEPRTFAC (retention depth multiplier) parameter. Currently set to a default of 1.0. This default value may be changed by providing an alternate value to the `-T` parameter.
  + STREAMORDER – Stream order grid, calculated using the Strahler method (Strahler 1957).
  + TOPOGRAPHY – Elevation grid. The units of elevation are the same as the input elevation raster dataset (`-d INDEM`), which should be specified in meters (m) above sea level (ASL). This grid is derived from the elevation values in the input elevation raster, but has been resampled to the routing grid resolution, and pit-filled to remove depressions.
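
To illustrate how a D8 flow direction value points to a neighboring cell, the sketch below assumes the Esri-style encoding, in which each of the eight directions is a power of two from 1 (east) clockwise to 128 (northeast). That encoding is an assumption made for this example and should be checked against the actual `FLOWDIRECTION` grid; the names `D8_OFFSETS` and `downstream_cell` are hypothetical.

```python
# Assumed Esri-style D8 codes mapped to (row offset, column offset);
# rows increase southward and columns increase eastward
D8_OFFSETS = {
    1: (0, 1),     # east
    2: (1, 1),     # southeast
    4: (1, 0),     # south
    8: (1, -1),    # southwest
    16: (0, -1),   # west
    32: (-1, -1),  # northwest
    64: (-1, 0),   # north
    128: (-1, 1),  # northeast
}

def downstream_cell(row, col, code):
    """Return the (row, col) of the neighbor that a cell drains to."""
    d_row, d_col = D8_OFFSETS[code]
    return row + d_row, col + d_col

print(downstream_cell(5, 5, 1))    # neighbor to the east
print(downstream_cell(5, 5, 64))   # neighbor to the north
```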

Other files:
  + GEOGRID_LDASOUT_Spatial_Metadata.nc  - This is a CF-netCDF format file that provides the spatial metadata associated with the GEOGRID variables, which constitute the LSM grid. By default, no 2-dimensional grids are written to the file. This file may be used by WRF-Hydro for appending geospatial metadata to the land surface model output, if necessary.
  + GWBASINS.nc - This is a 2D netCDF file of the location of groundwater basins, regridded to the LSM grid resolution. NoData cells are given a value of ‘-9999’. This file is by default created using the ‘FullDom LINKID local basins’ method of defining groundwater basins.
  + GWBUCKPARM.nc - The 1D groundwater basin parameter file.
  + LAKEPARM.nc – Lake parameter table. This 1D netCDF format file is created if a lake shapefile is provided as input to the `-l` parameter. The table will contain a record for each lake feature in the Fulldom_hires.nc ‘LAKEGRID’ variable, and contain derived and default parameters for each lake.
  + Route_Link.nc – The reach-based routing parameter file. This 1D netCDF format file is created if the `-r` parameter is TRUE. The file contains a record for each stream segment. The stream segments in this table are also identified by the unique ‘LINKID’ values in the ‘LINKID’ variable in the ‘Fulldom_hires.nc’ file, and values in the ‘STRM_VAL’ field of the output ‘streams.shp’ shapefile. This table contains derived and default stream segment parameters that are calculated based on the vector stream network and topology in the ‘streams.shp’ shapefile.
  + streams.* (ancillary) - Streams shapefile, containing one feature for each stream segment in the domain. This file is meant to accompany the ‘Route_Link.nc’ reach-based routing parameter file and Fulldom_hires.nc ‘LINKID’ variable. The ‘streams’ shapefile is only created when the option `-r` is TRUE. The ‘STRM_VAL’ field is the unique identifier for each stream segment, and corresponds to the ‘link’ variable of the ‘Route_Link.nc’ file and the ‘LINKID’ variable in the ‘Fulldom_hires.nc’ file. The geometry of the stream segments in this shapefile informs many of the parameters in the ‘Route_Link.nc’ file.
  + lakes.* (ancillary) - Lakes shapefile, containing one feature for each reservoir in the simulation domain. This file is meant to accompany the 'LAKEPARM.nc' reservoir parameter file. If `-r TRUE` is used, then Fulldom_hires.nc 'LAKEID' variable will contain -9999 values only. The geometry of reservoir objects informs many of the parameters in 'LAKEPARM.nc' file.
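
Before unpacking a routing stack, it can be handy to list what the Zip archive contains. The sketch below uses only the standard `zipfile` module and groups member names by file extension. The helper `group_members` is hypothetical, and the archive is a small in-memory stand-in with empty placeholder files, not a real tool output.

```python
import io
import os
import zipfile

def group_members(zip_source):
    """Group the file names in a zip archive by lowercase extension."""
    groups = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            ext = os.path.splitext(name)[1].lower()
            groups.setdefault(ext, []).append(name)
    return groups

# Build a small in-memory archive with placeholder files for demonstration
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ["Fulldom_hires.nc", "GWBASINS.nc", "lakes.shp", "lakes.dbf"]:
        archive.writestr(name, b"")
buffer.seek(0)

groups = group_members(buffer)
print(groups)
```

The same function accepts a path on disk, so it could be pointed at the output Zip file created by `Build_Routing_Stack.py`.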


## 5. Examine outputs of GIS pre-processor

The `Examine_Outputs_of_GIS_Preprocessor.py` script is a tool used to create easy-to-use geospatial files for examining the resulting WRF-Hydro input files in a GIS. This script takes the output ZIP file generated by the `Build_Routing_Stack.py` script (executed above; the tool's help message calls it the ProcessGeogrid script) and creates a raster from each 2D variable in each WRF-Hydro input netCDF file. In addition, other data will be extracted, such as any shapefiles and 1D netCDF tables. The input to the tool should be a .zip file that was created using the WRF-Hydro pre-processing tools. The tool will create the output folder if it does not already exist, and write all results to that location.

#### Request help message from the script
This tool has a single input and single output parameter. Using the command line, we can take a look at the arguments.

In [20]:
! python Examine_Outputs_of_GIS_Preprocessor.py -h

Script initiated at Fri Apr 25 19:40:42 2025
usage: Examine_Outputs_of_GIS_Preprocessor.py [-h] -i IN_ZIP -o OUT_FOLDER

This tool takes the output zip file from the ProcessGeogrid script and creates
a raster from each output NetCDF file. The Input should be a .zip file that
was created using theWRF Hydro pre-processing tools. The tool will create the
folder which will contain theresults (out_folder), if that folder does not
already exist.

options:
  -h, --help     show this help message and exit
  -i IN_ZIP      Path to WRF Hydro routing grids zip file.
  -o OUT_FOLDER  Path to output folder.


----
#### Execute the script using command-line syntax
We will define the output directory for the tool to create, then use defined variables to execute the command line tool.

In [21]:
# Define output directory to store GeoTiff output of all routing stack grids
raster_outputs = os.path.join(output_folder, "Raster_Outputs")

# Print information to screen for reference
print('Command to run:\n')
print('python Examine_Outputs_of_GIS_Preprocessor.py \\\n\t -i {0} \\\n\t -o {1}\n'.format(out_zip, raster_outputs))

# Run the script with required parameters
! python Examine_Outputs_of_GIS_Preprocessor.py -i {out_zip} -o {raster_outputs}

Command to run:

python Examine_Outputs_of_GIS_Preprocessor.py \
	 -i /home/docker/gis/GIS_Training/Outputs/pocono_test.zip \
	 -o /home/docker/gis/GIS_Training/Outputs/Raster_Outputs

Script initiated at Fri Apr 25 19:41:00 2025
    File Copied: lakes.dbf
  if LooseVersion(netCDF4.__version__) > LooseVersion('1.4.0'):
  GeoTransform: (1713000.911056803, 1000.0, 0.0, 247068.34176177936, 0.0, -1000.0)
  DX: 1000.0
  DY: 1000.0
  PROJ.4 string: +proj=lcc +lat_0=41.14990234375 +lon_0=-97 +lat_1=30 +lat_2=60 +x_0=0 +y_0=0 +R=6370000 +units=m +no_defs
    File Created: /home/docker/gis/GIS_Training/Outputs/Raster_Outputs/BASIN.tif
    File Copied: LAKEPARM.nc
    File Copied: lakes.shx
    File Copied: lakes.shp
    File Copied: GWBUCKPARM.nc
    File Copied: GEOGRID_LDASOUT_Spatial_Metadata.nc
  GeoTransform: (1713000.911056803, 250.0, 0.0, 247068.34176177936, 0.0, -250.0)
  DX: 250.0
  DY: 250.0
  PROJ.4 string: +proj=lcc +lat_0=41.14990234375 +lon_0=-97 +lat_1=30 +lat_2=60 +x_0=0 +y_0=0 
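
The `GeoTransform` tuples printed by the tool follow the GDAL convention: (x of upper-left corner, cell width, row rotation, y of upper-left corner, column rotation, cell height), where the cell height is negative for a north-up raster. As a small sketch, the hypothetical helper below converts a row and column index into the projected coordinates of the cell center, using the 250 m routing grid values shown in the messages above. It assumes the rotation terms are zero.

```python
def cell_center(geotransform, row, col):
    """Return projected (x, y) at the center of a cell for a north-up raster."""
    x_origin, dx, _, y_origin, _, dy = geotransform  # dy is negative
    x = x_origin + (col + 0.5) * dx
    y = y_origin + (row + 0.5) * dy
    return x, y

# GeoTransform of the routing grid, copied from the tool messages
routing_gt = (1713000.911056803, 250.0, 0.0, 247068.34176177936, 0.0, -250.0)

x, y = cell_center(routing_gt, row=0, col=0)
print(x, y)
```

The upper-left cell center lies half a cell (125 m) east and south of the grid origin.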

## 6. Vizualize the output grids

Finally, we want to take a look at some of the output rasters. Below, we utilize a Jupyter widget to choose each raster from a drop-down menu. Each of the 2D variables in Fulldom_hires.nc (the routing grid netCDF file) will be displayed as a rectangular grid. Take a look at some of the outputs.

In [22]:
import ipywidgets as widgets
from ipywidgets import interact

def see_raster(x):
    # Read the first band, closing the file handle when finished
    with rasterio.open(os.path.join(raster_outputs, f"{x}.tif")) as src:
        band = src.read(1)
    cmap, norm = cmap_options(x)
    if x == 'TOPOGRAPHY':
        pyplot.imshow(band, cmap=cmap,  aspect='auto', norm=norm, interpolation='nearest', vmin=0)
    else:
        pyplot.imshow(band, cmap=cmap,  aspect='auto', norm=norm, interpolation='nearest')
    cbar = pyplot.colorbar()
    
    # Keep the automatic aspect while scaling the image up in size
    fig = pyplot.gcf()
    w, h = fig.get_size_inches()
    fig.set_size_inches(w * 1.75, h * 1.75)
    
    # Show image
    pyplot.show()

in_raster = widgets.Dropdown(
    options=[('Basin', 'BASIN'), ('Basin mask', 'basn_msk'), ('Channel grid', 'CHANNELGRID'), ('Flow accumulation', 'FLOWACC'),
            ('Flow direction', 'FLOWDIRECTION'), ('Forecast points', 'frxst_pts'), ('Lake grid', 'LAKEGRID'),
            ('Land use', 'landuse'), ('Latitude', 'LATITUDE'), ('LKSATFAC', 'LKSATFAC'), ('Longitude', 'LONGITUDE'),
            ('OVROUGHRTFAC', 'OVROUGHRTFAC'), ('RETDEPRTFAC', 'RETDEPRTFAC'), ('Stream order', 'STREAMORDER'),
            ('Topography', 'TOPOGRAPHY')],
    value='FLOWACC',
    description='Variable:')

interact(see_raster, x=in_raster)
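
The dropdown above assumes that every listed variable has a matching GeoTIFF in `raster_outputs`; selecting an entry whose file is missing would make `rasterio.open` fail. One way to guard against that, sketched here with a hypothetical helper named `existing_options`, is to filter the `(label, value)` pairs before building the widget. The demonstration uses a temporary directory with empty placeholder files rather than real rasters.

```python
import os
import tempfile


def existing_options(folder, options):
    """Keep only the (label, value) pairs whose '<value>.tif' exists in folder."""
    return [(label, value) for label, value in options
            if os.path.isfile(os.path.join(folder, f"{value}.tif"))]


# Demonstration with a throwaway directory standing in for raster_outputs
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ("FLOWACC", "TOPOGRAPHY"):
        # Empty placeholder files, not real GeoTIFFs
        open(os.path.join(demo_folder, f"{name}.tif"), "w").close()
    demo_options = [("Flow accumulation", "FLOWACC"),
                    ("Lake grid", "LAKEGRID"),
                    ("Topography", "TOPOGRAPHY")]
    kept = existing_options(demo_folder, demo_options)

print(kept)
```

In the notebook, the filtered list could be passed as `options=` to `widgets.Dropdown`, provided the chosen default `value` is still present.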

We can also look at the raster data projected onto a map. Use the dropdown widget below the map to add rasters to the map as layers, and use the menu in the top right corner of the map to turn layers on and off. Behind the scenes, each model grid is reprojected to Web Mercator so that it can be overlaid on an interactive WMS basemap.
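
For readers curious about the Web Mercator step, here is a minimal sketch of the standard forward formula from longitude/latitude to Web Mercator (EPSG:3857) metres. It is not the code used by the notebook's helper functions, which reproject whole grids from the model's Lambert Conformal projection; the function name `lonlat_to_web_mercator` and the sample coordinates (an illustrative point near the Pocono domain) are assumptions.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert geographic degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Illustrative point near the Pocono, PA domain (assumed values)
x, y = lonlat_to_web_mercator(-75.4809, 41.1499)
print(round(x), round(y))
```

Because Web Mercator stretches distances away from the equator, the reprojected grid cells on the map no longer correspond exactly to the model's native cell size.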

In [23]:
# Create a map object from a pre-built function
m3 = create_map(map_center, 10)    # Create the map
m3                                 # Render the map

In [24]:
def f(x):
    r_path = os.path.join(raster_outputs, f"{x}.tif")
    show_raster_map(r_path, m3, b_shp, output_folder)

in_raster = widgets.Dropdown(
    options=[('Basin', 'BASIN'), ('Forecast point basins', 'basn_msk'), ('Channel grid', 'CHANNELGRID'), ('Flow accumulation', 'FLOWACC'), ('Flow direction', 'FLOWDIRECTION'), 
             ('Forecast points', 'frxst_pts'), ('Lake grid', 'LAKEGRID'), ('Landuse', 'landuse'), ('OVROUGHRTFAC grid', 'OVROUGHRTFAC'), ('RETDEPRTFAC grid', 'RETDEPRTFAC'), 
             ('Stream order', 'STREAMORDER'), ('Topography', 'TOPOGRAPHY')],
    value='TOPOGRAPHY',
    description='Variable:',
)
interact(f, x=in_raster)

### 7. [Optional] Build Non-NWM WRF-Hydro Configurations of the Pocono, PA Test Case

We will run shell commands to create an output directory for each test case. Each cell then runs the WRF-Hydro GIS Pre-processor to build the 'routing stack' files, and a final command unzips the outputs so that WRF-Hydro can use them more readily. There are 4 Non-NWM test-case configurations, all of which can be created using different combinations of arguments to the GIS pre-processor. Currently missing are the Reach-with-Lakes capability and the ability to build wrfinput, hydro2dtbl, and soil_properties files.
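
To make the "different combinations of arguments" idea concrete, the sketch below assembles a `Build_Routing_Stack.py` command from keyword options. It uses only the flags that appear in the cells of this section (`-i`, `-l`, `--CSV`, `-d`, `-R`, `-t`, `-b`, `-r`, `-o`); the helper name `build_command` and the short file names are hypothetical placeholders, not part of the pre-processor.

```python
import shlex


def build_command(geogrid, dem, out_zip, regrid=4, threshold=25,
                  lakes=None, csv=None, mask=False, reach=False):
    """Assemble a Build_Routing_Stack.py argument list from keyword options."""
    cmd = ["python", "Build_Routing_Stack.py", "-i", geogrid]
    if reach:
        cmd += ["-r", "True"]      # reach-based routing (RouteLink) files
    if lakes:
        cmd += ["-l", lakes]       # lake polygon shapefile
    if csv:
        cmd += ["--CSV", csv]      # forecast point CSV file
    cmd += ["-d", dem, "-R", str(regrid), "-t", str(threshold)]
    if mask:
        cmd += ["-b", "True"]      # mask CHANNELGRID to forecast basins
    cmd += ["-o", out_zip]
    return cmd


# Placeholder paths; a gridded, masked, no-lakes combination like case B
cmd_b = build_command(
    geogrid="geo_em.d01.nc", dem="NED_30m_DEM.tif", out_zip="routing_stack.zip",
    csv="forecast_points.csv", mask=True)
print(shlex.join(cmd_b))
```

The returned list could be handed to `subprocess.run(cmd_b, check=True)` instead of using the `!` syntax, assuming the script is in the working directory.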

#### A. Gridded channel routing configuration with reservoirs and forecast points

This is a gridded domain with lakes, gages, a regridding factor of 4, and a stream initiation threshold of 25.
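
As a rough guide to what these two numbers imply, the sketch below converts them into a routing-grid cell size and a contributing area. It assumes a 1000 m geogrid cell (as the `DX` values in the section 5 log suggest) and assumes that the threshold passed with `-t` counts contributing routing-grid cells; the function name is hypothetical.

```python
def stream_threshold_area_km2(geogrid_cell_m, regrid_factor, threshold_cells):
    """Contributing area (km^2) implied by a threshold given in routing-grid cells."""
    routing_cell_m = geogrid_cell_m / regrid_factor   # e.g. 1000 m / 4 = 250 m
    return threshold_cells * routing_cell_m ** 2 / 1e6


# Assumed 1000 m geogrid, regridding factor 4, threshold of 25 cells
area = stream_threshold_area_km2(1000.0, 4, 25)
print(area)  # 1.5625
```

Under these assumptions, a channel begins wherever roughly 1.6 square kilometres of upstream area drains to a cell.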

In [25]:
# Create the output directory and ensure it is empty
! mkdir -p /home/docker/GIS_Training/Outputs/Gridded
! rm -rf /home/docker/GIS_Training/Outputs/Gridded/*

# Run the GIS Pre-processing script (with line-breaks)
! python Build_Routing_Stack.py \
    -i /home/docker/wrf-hydro-training/example_case/Gridded/DOMAIN/geo_em.d01.nc \
    -l /home/docker/GIS_Training/GIS_DATA/lake_shapes/lakes.shp \
    --CSV /home/docker/GIS_Training/GIS_DATA/forecast_points.csv \
    -d /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif \
    -R 4 \
    -t 25 \
    -o /home/docker/GIS_Training/Outputs/Gridded/Gridded_r4_t25_lakes_frxst_mask.zip

# Unzip the directory in-place
! unzip /home/docker/GIS_Training/Outputs/Gridded/Gridded_r4_t25_lakes_frxst_mask.zip -d /home/docker/GIS_Training/Outputs/Gridded

Script initiated at Fri Apr 25 19:41:25 2025
  Parameter values that have not been altered from script default values:
    Using default basin mask setting: False
    Using default reach-based routing setting: False
    Using default OVROUGHRTFAC parameter value: 1.0
    Using default RETDEPRTFAC parameter value: 1.0
  Values that will be used in building this routing stack:
    Input WPS Geogrid file: /home/docker/wrf-hydro-training/example_case/Gridded/DOMAIN/geo_em.d01.nc
    Forecast Point CSV file: /home/docker/GIS_Training/GIS_DATA/forecast_points.csv
    Mask CHANNELGRID variable to forecast basins?: False
    Create reach-based routing (RouteLink) files?: False
    Lake polygon feature class: /home/docker/GIS_Training/GIS_DATA/lake_shapes/lakes.shp
    Input high-resolution DEM: /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif
    Regridding factor: 4
    Stream initiation threshold: 25
    OVROUGHRTFAC parameter value: 1.0
    RETDEPRTFAC parameter value: 1.0
    Input chann
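
The `unzip` step in the cell above can also be done from Python with the standard-library `zipfile` module. This is a minimal sketch, with a hypothetical helper named `unzip_in_place`, demonstrated on a throwaway archive rather than on the real routing stack output.

```python
import os
import tempfile
import zipfile


def unzip_in_place(zip_path):
    """Extract zip_path into its own directory and return the member names."""
    target_dir = os.path.dirname(os.path.abspath(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Demonstration on a throwaway archive; the member content is only a placeholder
with tempfile.TemporaryDirectory() as demo_dir:
    demo_zip = os.path.join(demo_dir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("Fulldom_hires.nc", b"placeholder")
    members = unzip_in_place(demo_zip)
    extracted = os.path.isfile(os.path.join(demo_dir, "Fulldom_hires.nc"))

print(members, extracted)
```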

#### B. Gridded channel routing configuration with forecast points and channel masking

This is a gridded domain without lakes. It contains gages and masked basins, and uses a regridding factor of 4 and a threshold of 25.

In [26]:
# Create the output directory and ensure it is empty
! mkdir -p /home/docker/GIS_Training/Outputs/Gridded_no_lakes
! rm -rf /home/docker/GIS_Training/Outputs/Gridded_no_lakes/*

# Run the GIS Pre-processing script (with line-breaks)
! python Build_Routing_Stack.py \
    -i /home/docker/wrf-hydro-training/example_case/Gridded_no_lakes/DOMAIN/geo_em.d01.nc \
    --CSV /home/docker/GIS_Training/GIS_DATA/forecast_points.csv \
    -d /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif \
    -R 4 \
    -t 25 \
    -b True \
    -o /home/docker/GIS_Training/Outputs/Gridded_no_lakes/Gridded_r4_t25_frxst_mask.zip

# Unzip the directory in-place
! unzip /home/docker/GIS_Training/Outputs/Gridded_no_lakes/Gridded_r4_t25_frxst_mask.zip -d /home/docker/GIS_Training/Outputs/Gridded_no_lakes/

Script initiated at Fri Apr 25 19:41:29 2025
  Parameter values that have not been altered from script default values:
    Using default reach-based routing setting: False
    Using default OVROUGHRTFAC parameter value: 1.0
    Using default RETDEPRTFAC parameter value: 1.0
  Values that will be used in building this routing stack:
    Input WPS Geogrid file: /home/docker/wrf-hydro-training/example_case/Gridded_no_lakes/DOMAIN/geo_em.d01.nc
    Forecast Point CSV file: /home/docker/GIS_Training/GIS_DATA/forecast_points.csv
    Mask CHANNELGRID variable to forecast basins?: True
    Create reach-based routing (RouteLink) files?: False
    Lake polygon feature class: 
    Input high-resolution DEM: /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif
    Regridding factor: 4
    Stream initiation threshold: 25
    OVROUGHRTFAC parameter value: 1.0
    RETDEPRTFAC parameter value: 1.0
    Input channel initiation start point feature class: None
    Input groundwater basin polygons: None
   

#### C. Reach-based channel routing configuration with forecast points

This is a reach-based routing configuration with gages, no masking, and no lakes. It uses a regridding factor of 4 and a threshold of 25.

In [27]:
# Create the output directory and ensure it is empty
! mkdir -p /home/docker/GIS_Training/Outputs/Reach
! rm -rf /home/docker/GIS_Training/Outputs/Reach/*

# Run the GIS Pre-processing script (with line-breaks)
! python Build_Routing_Stack.py \
    -i /home/docker/wrf-hydro-training/example_case/Reach/DOMAIN/geo_em.d01.nc \
    -r True \
    --CSV /home/docker/GIS_Training/GIS_DATA/forecast_points.csv \
    -d /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif \
    -R 4 \
    -t 25 \
    -o /home/docker/GIS_Training/Outputs/Reach/Reach_r4_t25_frxst.zip

# Unzip the directory in-place
! unzip /home/docker/GIS_Training/Outputs/Reach/Reach_r4_t25_frxst.zip -d /home/docker/GIS_Training/Outputs/Reach/

Script initiated at Fri Apr 25 19:41:34 2025
  Parameter values that have not been altered from script default values:
    Using default basin mask setting: False
    Using default OVROUGHRTFAC parameter value: 1.0
    Using default RETDEPRTFAC parameter value: 1.0
  Values that will be used in building this routing stack:
    Input WPS Geogrid file: /home/docker/wrf-hydro-training/example_case/Reach/DOMAIN/geo_em.d01.nc
    Forecast Point CSV file: /home/docker/GIS_Training/GIS_DATA/forecast_points.csv
    Mask CHANNELGRID variable to forecast basins?: False
    Create reach-based routing (RouteLink) files?: True
    Lake polygon feature class: 
    Input high-resolution DEM: /home/docker/GIS_Training/GIS_DATA/NED_30m_DEM.tif
    Regridding factor: 4
    Stream initiation threshold: 25
    OVROUGHRTFAC parameter value: 1.0
    RETDEPRTFAC parameter value: 1.0
    Input channel initiation start point feature class: None
    Input groundwater basin polygons: None
    Output ZIP file: /h
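
Each configuration cell above starts by creating its output directory and emptying it with `mkdir -p` and `rm -rf`. The same preparation can be sketched in Python with `os` and `shutil`; the helper name `reset_directory` is hypothetical. Unlike `rm -rf <dir>/*`, this version also removes hidden files.

```python
import os
import shutil
import tempfile


def reset_directory(path):
    """Create path if needed, then remove everything inside it."""
    os.makedirs(path, exist_ok=True)
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.remove(entry.path)


# Demonstration inside a temporary parent directory
with tempfile.TemporaryDirectory() as parent:
    demo_out = os.path.join(parent, "Outputs", "Demo")
    reset_directory(demo_out)                           # creates nested directories
    open(os.path.join(demo_out, "old.zip"), "w").close()
    os.makedirs(os.path.join(demo_out, "old_dir"))
    reset_directory(demo_out)                           # empties the directory again
    remaining = os.listdir(demo_out)

print(remaining)  # []
```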

You have reached the end of Lesson S2

© UCAR 2025