The goal of reference.catchments is to build simple, valid catchments for the NHDPlus.
These will serve as the national reference catchments in the USGS geospatial fabric, and will be central to the NOAA Nextgen modelling hydrofabric.
- GDAL (ogr2ogr is used for file conversions)
- mapshaper (used for polygon simplification; it is not enough to use rmapshaper)
Simple: a simple polygon does not intersect itself and has no holes.
Valid: Validity is most important for polygons, which define bounded areas and require a good deal of structure. Some of the rules of polygon validity are:
- Polygon rings must close.
- Rings that define holes should be inside rings that define exterior boundaries.
- Rings may not self-intersect (they may neither touch nor cross themselves).
- Rings may not touch other rings, except at a point.
- Elements of multi-polygons may not touch each other.
MULTI vs POLYGON: A POLYGON is a shape with a closed exterior composed of lines. A MULTIPOLYGON is a collection of POLYGONs. A multipolygon catchment representation violates the expected assumption of one flowline to one catchment divide.
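The definitions above can be turned into a quick screening check. The following is a minimal sketch, assuming catchments are available as GeoJSON-like geometry dictionaries; the function names are hypothetical and are not part of this project. It flags multipart geometries, holes, and unclosed rings, and it does not test for self-intersection.

```python
def ring_is_closed(ring):
    """A ring closes when its first and last coordinates are identical."""
    return len(ring) >= 4 and tuple(ring[0]) == tuple(ring[-1])


def check_catchment_geometry(geom):
    """Return a list of problems found in a GeoJSON-like geometry dict."""
    if geom["type"] == "Polygon":
        polygons = [geom["coordinates"]]
    elif geom["type"] == "MultiPolygon":
        polygons = geom["coordinates"]
    else:
        return ["not a polygon"]

    problems = []
    # More than one part breaks the one flowline to one catchment assumption.
    if len(polygons) > 1:
        problems.append("multipart")
    # In GeoJSON, every ring after the first one is a hole.
    if any(len(rings) > 1 for rings in polygons):
        problems.append("has holes")
    if any(not ring_is_closed(r) for rings in polygons for r in rings):
        problems.append("unclosed ring")
    return problems
```

An empty list means the geometry passed these three checks only; a full validity test needs a geometry library.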
This project is set up as a series of scripts contained in the workflow directory. They can be run in the following order to tackle the summarized tasks:
- This sets the configuration variables for execution. These include:
  - The EPA S3 bucket
  - The local base directory to build the output folder structure
  - The desired output CRS (set to that of the FACFDC DEM files)
  - The percentage of removable points to retain in simplification (see here)
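As an illustration of how these settings fit together, here is a minimal sketch in Python (the project's own configuration lives in config.R). The class name, field names, and placeholder values are assumptions made for this example; only the numbered directory names come from the workflow described in this document.

```python
from dataclasses import dataclass
from pathlib import Path

# Directory names used by the workflow steps described in this document.
STEP_DIRS = [
    "01_EPA_downloads",
    "02_Catchments",
    "03_cleaned_catchments",
    "04_simplified_catchments",
    "05_reference_catchments",
]


@dataclass(frozen=True)
class Config:
    """Hypothetical container for the four configuration variables."""
    epa_bucket: str       # name of the EPA S3 bucket (placeholder below)
    base_dir: Path        # local base directory for all outputs
    output_crs: str       # CRS of the FACFDC DEM files (placeholder below)
    keep_fraction: float  # share of removable points kept when simplifying

    def __post_init__(self):
        if not 0 < self.keep_fraction <= 1:
            raise ValueError("keep_fraction must be in (0, 1]")


def make_dirs(cfg):
    """Create the output folder structure and return the created paths."""
    paths = [Path(cfg.base_dir) / name for name in STEP_DIRS]
    for p in paths:
        p.mkdir(parents=True, exist_ok=True)
    return paths
```

A call might look like `make_dirs(Config("<bucket>", Path("out"), "<CRS of the FACFDC DEM files>", 0.2))`, where every value shown is a placeholder.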
- This preprocessing step identifies the topology list of VPUs and documents which ones touch each other.
- It creates the vpu_topology.csv file in data/
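The idea of recording which VPUs touch can be sketched as follows. This is an approximation under stated assumptions: it compares bounding boxes rather than true VPU boundaries, and the CSV column names (VPU1, VPU2) are hypothetical, since the actual layout of vpu_topology.csv is not described here.

```python
import csv
from itertools import combinations


def boxes_touch(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes share at least a point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def vpu_topology(boxes):
    """Return sorted pairs of VPU ids whose bounding boxes touch."""
    return [
        (u, v)
        for u, v in combinations(sorted(boxes), 2)
        if boxes_touch(boxes[u], boxes[v])
    ]


def write_topology(pairs, path):
    """Write one row per touching pair; the header names are an assumption."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["VPU1", "VPU2"])
        writer.writerows(pairs)
```

Bounding boxes overestimate adjacency, so a real implementation would test the VPU polygons themselves.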
- This file scans the EPA S3 bucket for the NHDCatchment zip files.
- If they don't yet exist locally, these are downloaded and unzipped into the 01_EPA_downloads/ directory, which is created when sourcing config.R.
- The shapefiles distributed by EPA are then converted to geopackages (GPKG) in the 02_Catchments directory.
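The shapefile-to-GPKG conversion relies on ogr2ogr. As a minimal sketch, the helper below only builds the command line (it does not run it) using the standard `-f GPKG` output-format option; the function name and the example file name are hypothetical.

```python
from pathlib import Path


def shp_to_gpkg_cmd(shp, out_dir):
    """Build an ogr2ogr command that converts one shapefile to a GeoPackage."""
    shp = Path(shp)
    dst = Path(out_dir) / (shp.stem + ".gpkg")
    # ogr2ogr takes the destination first, then the source.
    return ["ogr2ogr", "-f", "GPKG", str(dst), str(shp)]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where GDAL is installed.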
- This file works over the raw EPA geopackages and runs a cleaning algorithm that explodes and re-dissolves fragments in the catchment fabric to ensure simple, valid polygon geometries are created.
- The output files are written to 03_cleaned_catchments
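To make the "explode" half of this step concrete, here is a minimal sketch that is not the project's algorithm. It assumes GeoJSON-like geometries keyed by a catchment id, splits each MULTIPOLYGON into its parts, and separates the largest part from the smaller fragments. How fragments are re-dissolved into neighbouring catchments is not specified here, so the sketch stops at identifying them. All names are hypothetical.

```python
def ring_area(ring):
    """Unsigned shoelace area of a closed ring (first point == last point)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(rings):
    """Exterior ring area minus the area of any holes."""
    return ring_area(rings[0]) - sum(ring_area(r) for r in rings[1:])


def explode(features):
    """Split {id: geometry} into {id: (main_part, [fragments])}."""
    result = {}
    for fid, geom in features.items():
        if geom["type"] == "MultiPolygon":
            parts = list(geom["coordinates"])
        else:
            parts = [geom["coordinates"]]
        # Largest part first; everything else is a candidate fragment.
        parts.sort(key=polygon_area, reverse=True)
        result[fid] = (parts[0], parts[1:])
    return result
```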
- Just as the NHD releases a Catchment and a CatchmentSP (simplified) layer, we also release a simplified layer that smooths the "grid" pattern from the edges created during the raster-to-polygon conversion.
- We run mapshaper over the individual VPU files to do this, as mapshaper has a hard 2GB limit on file output size. While the mapshaper (Visvalingam) algorithm is topology preserving within the input class, it cannot and does not preserve topology at the VPU boundaries.
- The output files are written to 04_simplified_catchments
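For readers unfamiliar with Visvalingam simplification, the sketch below shows the basic idea on a single open line: repeatedly drop the interior point whose triangle with its two neighbours has the smallest area, until only the requested share of removable points remains. It is a plain, unoptimized illustration and not mapshaper's implementation; the function names and the `keep_fraction` parameter are assumptions for this example.

```python
import math


def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs(a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1])) / 2.0


def visvalingam(points, keep_fraction):
    """Simplify an open line, keeping a fraction of its removable points.

    The two endpoints are never removed, so the removable points are the
    interior ones.
    """
    pts = list(points)
    removable = max(len(pts) - 2, 0)
    n_keep = math.ceil(keep_fraction * removable)
    while len(pts) - 2 > n_keep:
        # Effective area of every interior point given its current neighbours.
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        del pts[areas.index(min(areas)) + 1]
    return pts
```

Because each input is simplified on its own, a boundary shared by two VPUs can lose different points on each side, which is the source of the gaps and overlaps at VPU boundaries.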
- A separate algorithm is developed to fill the gaps and overlaps created when simplifying VPUs individually.
- The output files are written to 05_reference_catchments
- Merge VPUs into a single national reference fabric (stored in data/)
- Run one last cleaning algorithm to ensure complete, seamless coverage
- If authentication permits, send the national geopackage to ScienceBase
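One way to merge per-VPU geopackages into a single file is to append them to one layer with ogr2ogr. The sketch below is an assumption about how this could be done, not a description of the project's merge script: it only builds the commands, using the standard `-update`, `-append`, and `-nln` options, and the layer name and function name are hypothetical.

```python
def merge_cmds(vpu_files, national, layer="reference_catchments"):
    """Build ogr2ogr commands that stack VPU files into one GPKG layer."""
    cmds = []
    for i, src in enumerate(vpu_files):
        cmd = ["ogr2ogr", "-f", "GPKG"]
        if i > 0:
            # After the first file, open the output for update and append to it.
            cmd += ["-update", "-append"]
        cmd += [national, src, "-nln", layer]
        cmds.append(cmd)
    return cmds
```

Each command can then be run in order with `subprocess.run(cmd, check=True)`; the final cleaning pass and the ScienceBase upload are separate steps not covered by this sketch.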