# Lesson S1 - Subsetting the domain files

## Overview
In this lesson, we will review the subsetting scripts used at NCAR to extract a small part of the domain from the full domain files. We start with an overview of the static domain and parameter files required by the configuration used for Idaho, then briefly discuss and run the subsetting scripts as an example.

**IMPORTANT NOTE:** Before we explain the static files, note that their names are not hardcoded in WRF-Hydro. They are defined in the `namelist.hrldas` and `hydro.namelist` files and can be anything, so you will see different names used by different users. Here, we use the file names provided in the example case to discuss the content of the IWAAs static domain files.

## WRF-Hydro Static files (IWAAs configuration)

The following table lists the static files that are required for a model run.

| Filename | Description | Source | 
| ------------- | ------------- | ------------- | 
| geo_em.d0x.nc | The data required to define the domain and geospatial attributes of a spatially-distributed, or gridded, 1-dimensional (vertical) land surface model (LSM) | geogrid.exe utility in the WRF Preprocessing System (WPS) |
| wrfinput_d0x.nc | File including all necessary fields for the Noah-MP land surface model, but with spatially uniform initial conditions. Users should be aware that the model will likely require additional spin-up time when initialized from this file. | create_Wrfinput.R script |
| Fulldom_hires.nc | High resolution full domain file. Includes all fields specified on the routing grid. | WRF-Hydro GIS Pre-processing toolkit with some custom modification |
| Route_Link.nc | This file contains all the information and required parameters of reaches required for channel routing | Based on the NHDPlus and other custom hydrography datasets |
| spatialweights.nc | netCDF file specifying the weights to map between the land surface grid and the pre-defined groundwater basin boundaries | custom python script | 
| LAKEPARM.nc | Lake parameter table containing lake model parameters for each surface-water reservoir | WRF-Hydro GIS pre-processing toolkit |
| GWBUCKPARM.nc | Groundwater parameter table containing bucket model parameters for each basin | WRF-Hydro GIS pre-processing toolkit |
| hydro2dtbl.nc | Spatially distributed parameter table for lateral flow routing within WRF-Hydro. | create_SoilProperties.R script (will also be automatically generated by WRF-Hydro) | 
| soil_properties.nc | Spatially distributed land surface model parameters | create_SoilProperties.R script | 
| GEOGRID_LDASOUT_Spatial_Metadata.nc | CF-compliant projection and coordinate information for the land surface model grid. Not required, but allows for CF-compliant outputs. | WRF-Hydro GIS Pre-processing toolkit |

These files will be explained in more detail in the upcoming lessons and you will tweak some of them in the training lessons. Here, we briefly talk about the content of each file.

In general, the WRF-Hydro IWAAs configuration proceeds as follows: the 1-dimensional (1D) column land surface model calculates infiltration and exfiltration on a 1-kilometer grid across the land surface (i.e., the LSM grid). These volumes are then disaggregated to a 250-meter “hydro” overland routing grid using a time-step weighted method, and the excess water is routed horizontally across the hydro grid using an overland diffusive wave scheme. This horizontally routed flow is aggregated into catchments at the National Hydrography Dataset (NHD-Plus) version 2.1 Medium Resolution scale and transferred into the channel network, inserting the flow at the top of the reaches for routing downstream. Moving from the hydro grid to the NHD+ catchments is achieved through a mapping process that calculates a spatial weight between each catchment and hydro grid pixel. All the 2D variables are therefore either on the LSM grid (1 km here) or on the hydro grid (250 m here).
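
To make the relationship between the two grids concrete, here is a minimal sketch of moving values from the 1 km LSM grid to the 250 m hydro grid. It assumes a plain uniform copy of each coarse cell into its 4 x 4 fine cells, which is a simplification and not the time-step weighted method used by the model. The array values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical 2 x 2 patch of LSM-grid (1 km) infiltration excess depths, in mm
lsm_excess_mm = np.array([[1.0, 2.0],
                          [3.0, 4.0]])

# Ratio between the LSM grid spacing (1000 m) and the hydro grid spacing (250 m)
agg_factor = 1000 // 250

# Uniform disaggregation (an assumption for illustration): every 250 m cell
# inherits the depth of its parent 1 km cell
hydro_excess_mm = np.kron(lsm_excess_mm, np.ones((agg_factor, agg_factor)))

print(hydro_excess_mm.shape)
print(hydro_excess_mm[0, :])
```

Each 1 km cell becomes a 4 x 4 block on the hydro grid, so a 2 x 2 LSM patch maps to an 8 x 8 hydro patch.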

Geospatial inputs for WRF-Hydro are typically generated using the [WRF Pre-Processing System (WPS)](https://www2.mmm.ucar.edu/wrf/users/wrf_users_guide/build/html/wps.html) to define the domain (geo_em.d0x.nc and wrfinput_d0x.nc) and the WRF-Hydro GIS Pre-processing tools (both [ArcGIS](https://github.com/NCAR/wrf_hydro_arcgis_preprocessor) and [Open-Source](https://github.com/NCAR/wrf_hydro_gis_preprocessor)) to create hydrologically consistent input fields. The IWAAs configuration of WRF-Hydro that we are working with today has been customized in many ways using various input datasets such as NHDPlus, WBD, NED, NLCD, and others.

### Geogrid File (geo_em.d0x.nc)

The data required to define the domain and geospatial attributes of a spatially-distributed, or gridded, 1-dimensional (vertical) land surface model (LSM) are specified in a geogrid (geo_em.d0x.nc) netCDF file. This file is generated by the GEOGRID utility in the [WRF preprocessing system (WPS)](https://www2.mmm.ucar.edu/wrf/users/wrf_users_guide/build/html/wps.html). GEOGRID interpolates land surface terrain, soils, and vegetation data from standard, [readily available data products](https://www2.mmm.ucar.edu/wrf/users/download/get_sources_wps_geog.html). These data are distributed as a geographical input data package via the WRF website. Let's take a look at the content of the file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/geo_em.d0x.nc

This file defines the LSM grid, and has a spatial resolution of 1 km.

We will discuss plotting in the upcoming lessons. For now, we will import a custom interactive plotting widget so that you can explore the gridded 2D and 3D layers within the geogrid file. Run the following cell and select from different geogrid layers using the drop-down widget. Note that for 3D variables, the first instance of the 3rd dimension is chosen for plotting.

In [None]:
# Import necessary plotting and analysis libraries
%matplotlib inline
from ipywidgets import interact
from geospatial_plotting import populate_dropdown, see_geogrid

# Create the widget
interact(see_geogrid, x=populate_dropdown(file_type='geogrid', default_val='HGT_M'))

### Wrfinput file (wrfinput_d0x.nc)

Initial conditions for the land surface, such as soil moisture, soil temperature, and snow states, are prescribed via the wrfinput_d0x.nc file. This netCDF file can be generated in one of two ways: through the real.exe program within WRF, or via an R script (create_Wrfinput.R) distributed on the WRF-Hydro website. When created using the real.exe program in WRF, initial conditions are pulled from existing reanalysis or realtime products (see WRF documentation for data and system requirements). This will typically result in more realistic initial model states. However, the process is somewhat involved and requires the user to obtain additional external datasets. The R script creates a simplified version of the wrfinput (wrfinput_d0x.nc) file that includes all necessary fields for the Noah-MP land surface model, but with spatially uniform initial conditions that are prescribed within the script. It requires only the geogrid file geo_em.d0x.nc as input. Step-wise instructions and detailed requirements are included in documentation distributed with the script. Users should be aware that the model will likely require additional spin-up time when initialized from this file.

Now, check out the content of the file. 

In [None]:
%%bash
ncdump -h ~/DOMAIN/wrfinput_d0x.nc

Now we will explore the gridded variables in the Wrfinput file.

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import see_wrfinput
interact(see_wrfinput, x=populate_dropdown(file_type='wrfinput', default_val='IVGTYP'))

This file is on the LSM grid (1 km spatial resolution) and has an additional dimension for soil layers. IWAAs models the soil as 4 layers with thicknesses of 10 cm, 30 cm, 60 cm, and 1 m from top to bottom.
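
As a quick worked example, the layer thicknesses above can be turned into the depth of each layer bottom and midpoint, which is often useful when interpreting soil moisture or temperature profiles. This is only an illustrative sketch; the variable names are our own and are not read from the wrfinput file.

```python
import numpy as np

# Layer thicknesses from the text, top to bottom, in meters
layer_thickness_m = np.array([0.1, 0.3, 0.6, 1.0])

# Depth of the bottom of each layer below the surface
layer_bottom_m = np.cumsum(layer_thickness_m)

# Depth of the midpoint of each layer below the surface
layer_mid_m = layer_bottom_m - layer_thickness_m / 2

print('Layer bottoms (m):', layer_bottom_m)
print('Layer midpoints (m):', layer_mid_m)
```

The four layers together represent a 2 m soil column.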

### Fulldom File (Fulldom_hires.nc)

This file is on the hydro grid (250 m spatial resolution) and has most of the required fields for the terrain routing. Check out the content of the file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/Fulldom_hires.nc

As mentioned earlier, all the variables in this file are on the hydro grid and therefore have a 250 m spatial resolution. Note that some of the parameters are being calibrated.

#### Hydrologic Routing Grid file parameters (Fulldom_hires.nc)

| Variable Name | Description | Source of the data | 
| ------------- | ------------- | ------------- |
| CHANNELGRID | Channel network grid identifying the location of stream channel grid cells | NHDWaterbodyComID (which matches RouteLink) |
| FLOWDIRECTION | Flow direction grid, which explicitly defines flow directions along the channel network in gridded routing. This variable dictates where water flows into channels from the land surface as well as in the channel. Uses the [Esri flow direction](https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-analyst/how-flow-direction-works.htm) definition. | Based on DEM |
| LAKEGRID | Gridded reservoir locations | Not used | 
| LKSATFAC | Multiplier on lateral hydraulic conductivity (controls anisotropy between vertical and lateral conductivity) (Unitless) |  This parameter is being calibrated|
| OVROUGHRTFAC | Multiplier on Manning's roughness for overland flow | Set to 1.0 |
| RETDEPRTFAC | Multiplier on retention depth limit (Unitless) | This parameter is being calibrated |
| STREAMORDER | Strahler stream order grid identifying the stream order for all channel pixels within the channel network | Calculated from CHANNELGRID and FLOWDIRECTION |
| TOPOGRAPHY | Terrain grid or Digital Elevation Model (DEM) | NED 30m (CONUS) and HydroSheds 90m (oCONUS) |
| IMPERVFRAC | Fraction of impervious surface area | NLCD |
| landuse | [USGS 24-class](https://www2.mmm.ucar.edu/wrf/users/docs/user_guide_v4/v4.4/users_guide_chap3.html#_Land_Use_and) landcover types | NLCD 2016 (CONUS) and MODIS, both reclassified to USGS 24-class |
| LONGITUDE | longitude (WGS84) | WPS |
| LATITUDE | latitude (WGS84) | WPS |
| x | x coordinate of projection (meters) | WRF-Hydro GIS Pre-processing tools |
| y | y coordinate of projection (meters) | WRF-Hydro GIS Pre-processing tools |
| crs | projection coordinate system | WRF-Hydro GIS Pre-processing tools |
| basn_msk | Basin Mask | Not used |
| frxst_pts | Gridded forecast points | Not Used |
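
The FLOWDIRECTION values follow the Esri D8 coding, where each power of two points to one of the eight neighboring cells. The sketch below decodes a code into a neighbor offset. It assumes row 0 is the northern edge of the array, which may not match the row order stored in Fulldom_hires.nc, so check the `y` coordinate before applying it to real data. The function and dictionary names are hypothetical.

```python
# Esri D8 flow direction codes mapped to (row_offset, col_offset).
# Assumption: row 0 is the top (north) of the array.
D8_OFFSETS = {
    1: (0, 1),     # east
    2: (1, 1),     # southeast
    4: (1, 0),     # south
    8: (1, -1),    # southwest
    16: (0, -1),   # west
    32: (-1, -1),  # northwest
    64: (-1, 0),   # north
    128: (-1, 1),  # northeast
}


def downstream_cell(row, col, code):
    """Return the (row, col) of the neighbor that cell (row, col) drains into."""
    d_row, d_col = D8_OFFSETS[code]
    return row + d_row, col + d_col


# A cell at (5, 5) with code 1 drains to its eastern neighbor
print(downstream_cell(5, 5, 1))
```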

Now we will explore the gridded variables in the Fulldom file.

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import see_fulldom
interact(see_fulldom, x=populate_dropdown(file_type='fulldom', default_val='FLOWDIRECTION'))

### Route Link File (Route_Link.nc)

Using a vector-based channel network, the IWAAs configuration (like the NOAA National Water Model) uses a fairly standard implementation of the Muskingum-Cunge (MC) method of hydrograph routing with time-varying parameter estimates. As a one-dimensional explicit scheme, it does not allow for backwater or localized effects. Channel flows are routed from upstream to downstream in a cascaded manner (Gunner and Gorbetch, 1991), assuming negligible backwater effects.
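
To give a feel for this family of methods, here is a minimal sketch of the classic Muskingum update for a single reach with fixed parameters. It is not the WRF-Hydro implementation: Muskingum-Cunge re-estimates the parameters at each time step from the flow and channel geometry, whereas this example keeps `K` and `X` constant. The function names and all numerical values are hypothetical.

```python
def muskingum_coefficients(K, X, dt):
    """Classic Muskingum coefficients for storage constant K [s],
    weighting factor X [-], and time step dt [s]."""
    denom = 2.0 * K * (1.0 - X) + dt
    c0 = (dt - 2.0 * K * X) / denom
    c1 = (dt + 2.0 * K * X) / denom
    c2 = (2.0 * K * (1.0 - X) - dt) / denom
    return c0, c1, c2


def route_reach(inflow, K, X, dt, initial_outflow):
    """Route an inflow hydrograph [m3/s] through one reach."""
    c0, c1, c2 = muskingum_coefficients(K, X, dt)
    outflow = [initial_outflow]
    for t in range(1, len(inflow)):
        q = c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[t - 1]
        outflow.append(q)
    return outflow


# Hypothetical inflow pulse routed with K = 3600 s, X = 0.2, dt = 3600 s
inflow = [10.0, 10.0, 50.0, 30.0, 10.0, 10.0, 10.0]
outflow = route_reach(inflow, K=3600.0, X=0.2, dt=3600.0, initial_outflow=10.0)
print([round(q, 2) for q in outflow])
```

The three coefficients always sum to one, so a steady inflow produces an equal steady outflow, and the routed peak is lower and later than the inflow peak.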

The IWAAs flowline network is built from the National Hydrography Dataset (NHD+ v2.1 dataset). Modifications were made to add external contributing basins (oCONUS) into the model, such as portions of the Columbia and Rio Grande river basins. Gauges are snapped to reaches (links) in the network, and the Routelink file serves as the lookup table for parameters and locations of all reaches and reservoirs. All the required parameters for the channel routing are provided in the route link file. Let's take a look at the content of the file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/Route_Link.nc

There are 35529 reaches modeled in the Idaho full domain. In the table below, all the variables in the RouteLink file are described and the source of the data is provided.

| Variable Name | Description | Source of the data | 
| ------------- | ------------- | ------------- |
| link | Link ID |  NHDPlus COMID where possible |
| from | From Link ID | Not Used |
| to | To Link ID  | NHDPlus flowline connectivity | 
| lon | Longitude of the segment midpoint [degrees east] | Calculated based on flowline geometry |
| lat | Latitude of the segment midpoint [degrees north] | Calculated based on flowline geometry |
| alt | Elevation from the NAD88 datum at start node [m] | NHDPlus MaxElevSmo |
| order | Strahler stream order | NHDPlus |
| Qi | Initial flow in link [m3/s] | Default |
| Musk | Muskingum routing time [s] | Default |
| MusX | Muskingum weighting coefficient | Default |
| Length | Stream segment length [m] | NHDPlus |
| n | Manning’s roughness | Order based, from 0.04 to 0.06 |
| So | Slope [m/m] | NHDPlus |
| ChSlp | Channel side slope [m/m] | Calculated from Tw and BtmWdth |
| BtmWdth | Channel bottom width [m] |  Regression, based on drainage area  |
| NHDWaterbodyComID | Connectivity with reservoir | NHDPlus |
| gages | Identifier for USGS stream gage at this location | NHDPlus and USGS |
| Kchan | Channel conductivity [mm/hr] | Not active |
| ascendingIndex | Index to use for sorting IDs - only in NWM files | Calculated |
| nCC | Compound Channel Manning's n | Default, set to 2 times Manning’s in the channel |
| TopWdthCC | Compound Channel Top Width (m) | Default, set to 3 times TopWdth |
| TopWdth |  Top Width (m) | Regression, based on drainage area |


Now we will visualize the variables present in the RouteLink file. Because each 'reach' has a midpoint latitude and longitude, we can plot the points in 2 dimensions using variables `lat` and `lon`.

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import see_routelink
interact(see_routelink, x=populate_dropdown(file_type='routelink', default_val='link'))

### Lake Parameter File (LAKEPARM.nc)

All the information about the lakes is stored in the lake parameter file. At this point, there is no active management in the model. Each lake/reservoir has an orifice and a weir assigned to it. The input to the lake is routed using level pool routing. In the IWAAs configuration, *we do not simulate any lakes and only perform a naturalized flow simulation*.

The table below provides a brief description of all the variables in the lake parameter file.

| Variable Name | Description | 
| ------------- | ------------- |
| lake_id | NHDWaterbodyComID (which matches RouteLink) |
| LkArea | Area [km2], from NHDPlus |
| LkMxE | Elevation of maximum lake height [m, AMSL] |
| OrificeC | Orifice coefficient (ranges from zero to one) |
| OrificeA | Orifice area [m2] |
| OrificeE | Orifice elevation [m, AMSL] |
| WeirC | Weir coefficient (ranges from zero to one) |
| WeirL | Weir length [m] |
| WeirE | Weir elevation [m, AMSL] |
| ascendingIndex | index to use for sorting IDs (ascending) |
| ifd | Initial fractional depth, as a ratio of how full the lake is |
| lat | Latitude [decimal degrees] |
| lon | Longitude [decimal degrees] |
| crs | coordinate system (WGS84) |
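
To show how the orifice and weir parameters in this table could combine into a lake outflow, here is a minimal sketch based on the textbook orifice and weir equations. The exact formulation inside WRF-Hydro may differ, so treat this only as an illustration. The function name and all parameter values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration [m/s2]


def lake_outflow(water_elev, orifice_c, orifice_a, orifice_e,
                 weir_c, weir_l, weir_e):
    """Outflow [m3/s] from textbook orifice and weir equations (a sketch)."""
    # Orifice flow starts once the water surface is above the orifice elevation
    orifice_head = water_elev - orifice_e
    q_orifice = 0.0
    if orifice_head > 0.0:
        q_orifice = orifice_c * orifice_a * math.sqrt(2.0 * G * orifice_head)

    # Weir flow starts once the water surface is above the weir elevation
    weir_head = water_elev - weir_e
    q_weir = 0.0
    if weir_head > 0.0:
        q_weir = weir_c * weir_l * weir_head ** 1.5

    return q_orifice + q_weir


# Hypothetical lake: orifice at 100 m, weir crest at 105 m
for elev in (99.0, 102.0, 106.0):
    q = lake_outflow(elev, orifice_c=0.1, orifice_a=1.0, orifice_e=100.0,
                     weir_c=0.4, weir_l=10.0, weir_e=105.0)
    print('Water elevation {0} m -> outflow {1:.3f} m3/s'.format(elev, q))
```

Below the orifice there is no outflow, between the orifice and the weir only the orifice contributes, and above the weir crest both terms add up.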


### Spatial Weight File (spatialweights.nc)

This file specifies the weights to map between the hydrologic routing grid and the pre-defined groundwater basin boundaries. This file is used in the subsetting of the files. Let's take a look at the content of the file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/spatialweights.nc

Note that there are 24052 polygon IDs in this file, which matches the number of NHD basins and, consequently, the number of groundwater basins in this domain.

| Variable Name | Description | 
| ------------- | ------------- |
| polyid | ID of polygon | 
| IDmask | Polygon ID (polyid) associated with each record |
| overlaps | Number of intersecting polygons |
| weight | Fraction of intersecting polygon(polyid) intersected by poly2 |
| regridweight | Fraction of intersecting polyid(overlapper) intersected by polygon(polyid) |
| i_index | Index in the x dimension of the raster grid (starting with 1,1 in the LL corner) |
| j_index | Index in the y dimension of the raster grid (starting with 1,1 in the LL corner) |

#### Explore the spatial weight file

To conservatively map gridded values onto unstructured meshes (basin boundaries), we apply a weight to each grid cell for each basin in the domain. The sum of the grid-cell values multiplied by these weights is the area-weighted basin average from the grid. Using this regridding process, we can apply values to the basins from the grid. This is the mechanism used to move volumes from the grid into the vector-based catchment and reach network used in the IWAAs configuration of WRF-Hydro.
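
The weighting idea can be sketched with a few made-up numbers before opening the real file. The example below assumes the `weight` values for one basin sum to one and that `i_index` and `j_index` are 1-based, as described in the table above. It also assumes row 0 of the array is the first row in the y direction. The grid, indices, and weights are hypothetical.

```python
import numpy as np

# Hypothetical 3 x 3 routing-grid field (for example, a depth in mm)
grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

# Hypothetical records for one basin: 1-based x (i) and y (j) indices and weights
i_index = np.array([1, 2, 2])
j_index = np.array([1, 1, 2])
weight = np.array([0.5, 0.25, 0.25])

# Look up the grid value under each record (convert to 0-based indices)
values = grid[j_index - 1, i_index - 1]

# Area-weighted basin average: sum of value times weight
basin_mean = np.sum(weight * values)
print('Area-weighted basin average:', basin_mean)
```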

In [None]:
import xarray as xr
weightfile = xr.open_dataset('~/DOMAIN/spatialweights.nc')

# Isolate one basin polygon by its polyid value
polyid = weightfile['polyid'].data[0]

# Subset the weight file to just the selected polygon by ID
weightfile2 = weightfile.where(weightfile['polyid']==polyid, drop=True)
weightfile2 = weightfile2.where(weightfile['IDmask']==polyid, drop=True)

# Print some information about this polygon
print('Examining polygon with ID = {0}'.format(polyid))
print('Found {0} grid cells that contribute to polygon {1}'.format(int(weightfile2['overlaps'].data[0,0]), polyid))
print('Sum of the weights for polygon {0}: {1}'.format(polyid, weightfile2['weight'].data.sum()))
print('Sum of the regridweights for polygon {0}: {1}'.format(polyid, weightfile2['regridweight'].data.sum()))

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import visualize_weights
fulldom = xr.open_dataset('~/DOMAIN/Fulldom_hires.nc')
visualize_weights(r'/home/docker/DOMAIN/spatialweights.nc', 
                  basinID=polyid, 
                  nrows=len(fulldom['y']), 
                  ncols=len(fulldom['x']), 
                  spatialweight=True, 
                  regridweight=False, 
                  trimRaster=False)
fulldom.close()

### Groundwater Bucket Parameter File (GWBUCKPARM.nc)

This file contains the groundwater parameters governing the behavior of the bucket model parameterization for each groundwater/baseflow basin specified within the model domain. In IWAAs, the groundwater basins match the NHD basins. Two of the variables in this file have been calibrated in IWAAs. Let's take a look at the content of the GWBUCKPARM.nc file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/GWBUCKPARM.nc

There are 24052 groundwater basins in the domain, which matches the number of NHD basins being modeled in this domain.

| Variable Name | Description | Calibrated| 
| ------------- | ------------- | ------------- | 
| Basin | Basin monotonic ID (1...n) | - |
| Coeff | Bucket model coefficient | - |
| expon | Exponent controlling rate of bucket drainage as a function of depth | Yes |
| Zmax | Maximum groundwater bucket depth (mm) | Yes |
| Zinit | Initial depth of water in the bucket model | No, set to 10 for all the basins | 
| Area_sqkm | Basin area in square kilometers | - | 
| ComID | NHDCatchment FEATUREID (NHDFlowline ComID) | - |
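
To illustrate how `Coeff`, `expon`, and `Zmax` interact, here is a minimal sketch of an exponential storage-discharge relationship of the kind used for conceptual groundwater buckets. The functional form is our assumption for illustration and should be checked against the WRF-Hydro technical description. Spilling when the bucket is over-full is not handled, and the function name and parameter values are hypothetical.

```python
import math


def bucket_outflow(z, coeff, expon, zmax):
    """Exponential bucket discharge (a sketch): coeff * (exp(expon * z / zmax) - 1).

    z and zmax are bucket depths in the same units (for example mm).
    """
    # Keep the depth within the physical range of the bucket
    z = min(max(z, 0.0), zmax)
    return coeff * (math.exp(expon * z / zmax) - 1.0)


# Hypothetical parameters: outflow grows quickly as the bucket fills
for depth in (0.0, 10.0, 25.0, 50.0):
    q = bucket_outflow(depth, coeff=0.5, expon=3.0, zmax=50.0)
    print('Depth {0:5.1f} -> outflow {1:.3f}'.format(depth, q))
```

A larger `expon` makes the outflow respond more sharply as the bucket approaches `Zmax`, which is one reason these two parameters are common calibration targets.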

### Soil Properties File (soil_properties.nc)

This file contains all the spatially varying soil and vegetation parameters. Some of the variables in this file have been calibrated. Let's take a look at the content of the soil properties file.

In [None]:
%%bash 
ncdump -h ~/DOMAIN/soil_properties.nc

Note that this file is on the coarse 1 km resolution grid. Below we summarize all the variables with a short description of each parameter. Note that for the parameters that have the soil layer as a dimension, the same parameter value has been used for all the soil layers.

| Variable Name | Description | Calibrated| 
| ------------- | ------------- | ------------- |
| AXAJ |Tension water distribution inflection parameter | Yes|
|BXAJ| Tension water distribution shape parameter| Yes|
|XXAJ| Free water distribution shape parameter | Yes|
| bexp | Beta parameter | Yes |
| cwpvt | Empirical canopy wind parameter | Yes |
| dksat | Saturated soil hydraulic conductivity | Yes |
| dwsat | Saturated soil hydraulic diffusivity | No |
| hvt | Top of vegetation canopy [m] | No |
| imperv| Impervious fraction | No |
| mfsno | Snowmelt m parameter | Yes |
| mp | Slope of conductance to photosynthesis relationship | Yes |
| psisat | Saturated soil matric potential | No |
| quartz | Soil quartz content | No |
| refdk | Parameter in the surface runoff parameterization. A reference for soil conductivity | No |
| refkdt | Parameter in the surface runoff parameterization. A soil infiltration parameter | No |
| slope | Slope index | Yes |
| smcdry | Dry soil moisture threshold where direct evaporation from the top layer ends | No |
| smcmax | Saturated value of soil moisture [volumetric] | Yes |
| smcref | Reference soil moisture (field capacity) [volumetric] | No |
| smcwlt | Wilting point soil moisture [volumetric] | No |
| vcmx25 | Maximum rate of carboxylation at 25 C [umol CO2/m2/s] | Yes |
| rsurfexp | Exponent in the resistance equation for soil evaporation | Yes |
| rsurfsnow | Surface resistance for snow [s/m] | No |
| scamax | Maximum fractional snow covered area [0-1] | No |
| snowretfac | Snowpack water release timescale factor (1/s) | No |
| ssi | Liquid water holding capacity for snowpack (m3/m3) | No |
| tau0 | Snow related parameter - tau0 from Yang97 eqn. 10a | No |


Now we will explore the gridded variables in the Soil Properties file.

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import see_soil_properties
interact(see_soil_properties, x=populate_dropdown(file_type='soil_properties', default_val='AXAJ'))

### Hydro 2D File (hydro2dtbl.nc)

This file contains all the spatially varying parameters for the lateral routing. Some of the variables in this file have been calibrated. Let's take a look at the content of the hydro 2D file for the IWAAs configuration.

In [None]:
%%bash
ncdump -h ~/DOMAIN/hydro2dtbl.nc

Note that the variables are on the coarse 1 km resolution grid and are all 2D variables. The table below lists the variables with a short description of each. Three of these parameters have been calibrated.

| Variable Name | Description | Calibrated in V2.0 | 
| ------------- | ------------- | ------------- | 
| SMCMAX1 | Maximum volumetric soil moisture [m3/m3] | Yes, this is the same as `smcmax` in the soil properties file |
| SMCREF1 | Reference volumetric soil moisture [m3/m3] | No |
| SMCWLT1 | Wilting point volumetric soil moisture [m3/m3] | No |
| OV_ROUGH2D | Overland flow roughness coefficient | No | 
| LKSAT | Lateral saturated soil hydraulic conductivity [m/s] | Yes, this is the same as `dksat` in the soil properties file |
| NEXP | Exponent in the decay function for lateral Ksat over depth | Yes |

Now we will explore the gridded variables in the Hydro 2D file.

In [None]:
%matplotlib inline

# Import widget and plotting libraries for enhanced interactivity
from geospatial_plotting import see_hydro2D
interact(see_hydro2D, x=populate_dropdown(file_type='hydro2D', default_val='LKSAT'))

### Spatial Metadata File (GEOGRID_LDASOUT_Spatial_Metadata.nc)

This file contains projection and coordinate information for the land surface model grid. It has three variables: x and y on the coarse 1 km resolution grid, and crs, which indicates the projection coordinate system for the output files. This file is optional, but if present, the [Climate and Forecast Conventions (CF)](https://cfconventions.org/) metadata will be appended to the WRF-Hydro outputs, allowing many client applications to interpret the spatial aspects of the data. For hydro routing grid outputs, the Fulldom_hires.nc file already contains this information.

In [None]:
%%bash
ncdump -h ~/DOMAIN/GEOGRID_LDASOUT_Spatial_Metadata.nc

## How to subset the domain files to a smaller domain

There are two ways to subset the domains:
1. Subsetting using the bounding box coordinates
2. Subsetting by tracing upstream of a gage or a comID

### Requirements

R (rwrfhydro, ncdf4, sp, raster, stringr, plyr libraries)

NCO (ncks, and ncatted commands are used)

### Subset using bounding box coordinates

This was the first approach to subsetting the domain, developed by Aubrey Dugger. It is the methodology adopted by CUAHSI to provide a web service for cutting out NWM files. For this method, the user provides x and y coordinates in the GEOGRID file projection system, which is currently Lambert Conformal Conic. The relative indices (west, east, south, north) in the original GEOGRID (coarse grid) and Fulldom (fine grid) files are then calculated. These indices are used directly to clip the GEOGRID, Wrfinput, Fulldom, soil properties, and hydro 2D files. Then all the links and catchments that fall within the rectangular domain (clip) are identified and used to subset the Routelink, spatial weight, and groundwater bucket files.
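
The index calculation can be sketched in a few lines of Python. The actual subsetting scripts are written in R, so this is only an illustration of the idea and not a port of them. It assumes 0-based inclusive indices, coordinates stored in ascending order, and an aggregation factor of 4 between the coarse and fine grids. The function names and coordinate values are hypothetical.

```python
import numpy as np


def bbox_to_indices(x_coords, y_coords, x_min, x_max, y_min, y_max):
    """Return 0-based inclusive (west, east, south, north) indices of the
    coarse-grid cell centers that fall inside the bounding box."""
    x_in = np.where((x_coords >= x_min) & (x_coords <= x_max))[0]
    y_in = np.where((y_coords >= y_min) & (y_coords <= y_max))[0]
    return int(x_in[0]), int(x_in[-1]), int(y_in[0]), int(y_in[-1])


def coarse_to_fine(west, east, south, north, dxy=4):
    """Convert inclusive coarse-grid indices to inclusive fine-grid indices."""
    return (west * dxy, (east + 1) * dxy - 1,
            south * dxy, (north + 1) * dxy - 1)


# Hypothetical 1 km grid: cell centers in projected meters
x_coords = np.arange(500.0, 10000.0, 1000.0)
y_coords = np.arange(500.0, 8000.0, 1000.0)

coarse = bbox_to_indices(x_coords, y_coords, 2000.0, 5000.0, 1000.0, 3000.0)
fine = coarse_to_fine(*coarse)
print('Coarse (LSM) indices:', coarse)
print('Fine (routing) indices:', fine)
```

Deriving the fine-grid indices from the coarse ones keeps the two clipped grids aligned, so every retained 1 km cell is covered by exactly 16 routing cells.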


### Subset by tracing upstream of a gage or a comID

The second method to subset the domain files is to provide the gage ID or the comId of the outlet reach of the basin. This method therefore only works for basins that have a single outlet. One could modify the script to work for multiple outlet locations, if required.

Here, the user provides a list of gageIds/comIds. If a gageId/comId does not exist in the Routelink file, the script gives a warning and continues with the ones that do exist in the Routelink file, providing one cutout per reach (outlet point of the basin).

First, we trace all the links/reaches above the outlet that drain to the outlet reach. The upstream tracing is performed using three supplementary files that are derived from the route link file. The three indexing files are unique to the original route link file, and if the original route link file changes, these files should be regenerated. (These indexing files were created by James McCreight.)
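
The tracing step can be illustrated with a plain breadth-first search over the `link` and `to` variables of the route link file. This is a simplified stand-in for the precomputed indexing files used by the actual scripts and would be slower on a large network. The function name and the small example network are hypothetical.

```python
from collections import defaultdict, deque


def trace_upstream(link, to, outlet):
    """Return the set of link IDs that drain to `outlet`, including itself."""
    # Invert the downstream pointers: for each link, list the links flowing into it
    upstream = defaultdict(list)
    for link_id, to_id in zip(link, to):
        upstream[to_id].append(link_id)

    basin = {outlet}
    queue = deque([outlet])
    while queue:
        current = queue.popleft()
        for up_id in upstream.get(current, []):
            if up_id not in basin:
                basin.add(up_id)
                queue.append(up_id)
    return basin


# Hypothetical network: links 1 and 2 flow into 3; links 3 and 4 flow into 5;
# a `to` value of 0 marks the network outlet
link = [1, 2, 3, 4, 5]
to = [3, 3, 5, 5, 0]
print(sorted(trace_upstream(link, to, outlet=3)))
print(sorted(trace_upstream(link, to, outlet=5)))
```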

After we have all the contributing reaches, we will find out the bounding box around the basin with the help of spatial weight file. The Spatial weight file has all the pixels that cover a comId (NHD basin).
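
A minimal sketch of this bounding-box step is shown below. It assumes the `IDmask`, `i_index`, and `j_index` variables have already been read from the spatial weight file into NumPy arrays. The arrays here are made up, and the function name is hypothetical.

```python
import numpy as np


def basin_bounding_box(id_mask, i_index, j_index, basin_ids):
    """Return (i_min, i_max, j_min, j_max) of the routing-grid pixels that
    belong to any of the given basin IDs."""
    keep = np.isin(id_mask, list(basin_ids))
    return (int(i_index[keep].min()), int(i_index[keep].max()),
            int(j_index[keep].min()), int(j_index[keep].max()))


# Hypothetical spatial weight records: one row per (basin, pixel) pair
id_mask = np.array([10, 10, 20, 20, 30])
i_index = np.array([3, 4, 5, 5, 9])
j_index = np.array([7, 8, 8, 9, 1])

# Basins 10 and 20 were found by the upstream trace; basin 30 is outside
print(basin_bounding_box(id_mask, i_index, j_index, {10, 20}))
```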

After the indices are derived, the rest of the procedure will be similar to subsetting using the bounding box coordinates. All the indices used in the subsetting are written into a default text file named "params.txt". The user can change the naming of this file.

Finally, a script will be prepared for the FORCING subsetting. The user might need to modify this script to match the names of the forcing dimensions.

### How to create the indexing files:

As mentioned earlier, subsetting requires three extra files. An R function in the rwrfhydro library creates them. The following lines of R code will create the three files in the directory where the Routelink file resides. Note: this can be a time-consuming process. For the CONUS domain, it needs 32 cores and approximately 12 hours of wall-clock time.

`library(rwrfhydro) # This is the R package developed by the WRF-Hydro team`

`routlinkFile <- "PATH/TO/Route_Link.nc/File"`

`ReExpressRouteLink(routlinkFile, parallel = TRUE)`


## Script for subsetting the example case 

All the scripts used for providing the RFC specific cutouts are distributed with the course materials. We have placed the script used in providing the example case in this training in the docker image. Let's take a look at the supplementary directory first.

In [None]:
%%bash 
ls ~/wrf-hydro-training/lessons/subsetting_scripts/

`subset_domain.R` is the main script that calls `Utils_ReachFiles.R` script. `Utils_ReachFiles.R` has some of the utility functions. Let's check out the content of the subsetting script quickly.

In [None]:
%%bash 
head -n 100 ~/wrf-hydro-training/lessons/subsetting_scripts/subset_domain.R

## How to run the subsetting script

The user is required to provide the following:
* The path to your NEW subset domain file directory
* List of gageIds or comIds; the script will create one clipped basin for each gageId/comId
* Path to the ORIGINAL (full extent) domain files (for Fulldom, GEOGRID, Wrfinput, Routelink, spatial weight, groundwater bucket parameter, lake parameter and soil parameter files and ...)
* Path to the indexing files (downstreamReExpFile, upstreamReExpFile and reIndFile)
* dxy : the multiplier between routing grid and LSM grid (4 for IWAAs)

In [None]:
%%bash
mkdir -p ~/wrf-hydro-training/output/subsetting/
cd ~/wrf-hydro-training/lessons/subsetting_scripts/
Rscript subset_domain.R

The script will create one folder for each cutout basin. The folder name is the USGS gage identifier if a gageId was provided, otherwise the comId of the outlet NHD reach. Now let's check the content of the output directory.

In [None]:
%%bash 
ls ~/wrf-hydro-training/output/subsetting/13010065/

We will check out the content of these files in the upcoming lessons.

© UCAR 2023