Topology-preserving simplification for geospatial infrastructure line networks.
netsimplify simplifies GeoDataFrames of LineString / MultiLineString features (utility networks, road networks, drainage lines) while guaranteeing that the simplified output has identical network topology to the input — same connected components, same junction count, same shared endpoints.
It works by:
- Building a planar graph of the input network to find all shared vertices (junctions, T-junctions, dead-ends shared by two features).
- Running a constrained Douglas-Peucker algorithm that pins shared vertices — they cannot be moved or removed.
- Validating post-simplification topology automatically.
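The first two steps above can be sketched in plain Python. This is a minimal illustration, not netsimplify's implementation: the helper names (`shared_vertices`, `douglas_peucker`, `simplify_pinned`) are hypothetical. It treats a vertex as shared only when two features hold exactly equal coordinates, and does not handle an endpoint that lands mid-segment on another feature.

```python
from collections import Counter
from math import hypot

Point = tuple[float, float]


def shared_vertices(lines: list[list[Point]]) -> set[Point]:
    """Vertices that appear in more than one feature (exact coordinate match)."""
    counts = Counter()
    for line in lines:
        counts.update(set(line))  # count each feature at most once per vertex
    return {pt for pt, n in counts.items() if n > 1}


def _distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Distance from point p to the segment a-b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points: list[Point], tolerance: float) -> list[Point]:
    """Plain recursive Douglas-Peucker; always keeps the first and last point."""
    if len(points) < 3:
        return list(points)
    idx, dmax = 0, -1.0
    for i in range(1, len(points) - 1):
        d = _distance_to_segment(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: idx + 1], tolerance)
    right = douglas_peucker(points[idx:], tolerance)
    return left[:-1] + right


def simplify_pinned(line: list[Point], pinned: set[Point], tolerance: float) -> list[Point]:
    """Split the line at pinned vertices, simplify each piece, then rejoin."""
    interior_pins = [i for i in range(1, len(line) - 1) if line[i] in pinned]
    cuts = [0] + interior_pins + [len(line) - 1]
    out = [line[0]]
    for start, end in zip(cuts, cuts[1:]):
        out.extend(douglas_peucker(line[start : end + 1], tolerance)[1:])
    return out


# A main line with a branch attached at one of its interior vertices.
main = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.2), (3.0, 0.05), (4.0, 0.0)]
branch = [(2.0, 0.2), (2.0, 1.0)]

pinned = shared_vertices([main, branch])
unconstrained = douglas_peucker(main, 0.5)
constrained = simplify_pinned(main, pinned, 0.5)
```

Because every piece starts and ends at a pinned vertex or a line endpoint, and Douglas-Peucker never drops the ends of its input, pinned vertices survive unchanged. In this example `unconstrained` loses the junction at `(2.0, 0.2)` while `constrained` keeps it.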
Standard GIS simplification tools (geopandas' `GeoSeries.simplify()`, PostGIS `ST_SimplifyPreserveTopology`) operate on each feature independently. On a connected line network this produces gaps, dangling nodes, and broken junctions wherever two features previously shared a vertex. There is no maintained Python library that pins cross-feature shared vertices during simplification.
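The failure mode is easy to reproduce with shapely alone. The sketch below uses made-up coordinates: a branch line ends on an interior vertex of a main line, and each feature is simplified on its own, as a per-feature tool would do.

```python
from shapely.geometry import LineString

# The branch starts exactly on an interior vertex of the main line.
main = LineString([(0, 0), (1, 0.05), (2, 0.2), (3, 0.05), (4, 0)])
branch = LineString([(2, 0.2), (2, 1)])

gap_before = main.distance(branch)  # the two features touch

# Per-feature simplification: each geometry is processed in isolation.
main_simplified = main.simplify(0.5)
branch_simplified = branch.simplify(0.5)

gap_after = main_simplified.distance(branch_simplified)
print(gap_before, gap_after)
```

The main line loses the vertex the branch was attached to, so a gap opens between two features that used to be connected. shapely's default `preserve_topology=True` does not prevent this, because it only considers the geometry being simplified.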
Source signals:
- GIS Stack Exchange (41 votes, no accepted answer): "Generalizing polygon file while maintaining topology in QGIS"
- r-spatial/sf GitHub issue #381: `st_simplify` topology preservation does not cover shared edges across features
- arXiv 1912.03032, "Topology-Preserving Terrain Simplification": algorithm with no Python implementation
What no maintained alternative does today:
topojson (Python) handles polygon arc-sharing. geopandas.simplify_coverage() handles polygon coverage topology. Neither enforces line-network endpoint/junction topology — the invariant that shared endpoints between two LineString features remain coincident after simplification. mapshaper addresses this in JavaScript but has no Python API.
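The endpoint/junction invariant can be written down as a small check. The sketch below is an illustration under simplifying assumptions (exact coordinate equality, features given as coordinate lists). The function names are hypothetical and it is not the check that `validate=True` runs.

```python
from collections import defaultdict


def endpoint_groups(lines):
    """Map each endpoint shared by two or more features to those feature indices."""
    groups = defaultdict(set)
    for i, line in enumerate(lines):
        groups[line[0]].add(i)
        groups[line[-1]].add(i)
    return {pt: ids for pt, ids in groups.items() if len(ids) > 1}


def component_count(lines):
    """Count connected components, joining features that share any vertex."""
    parent = list(range(len(lines)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    owner = {}  # vertex -> index of the first feature seen using it
    for i, line in enumerate(lines):
        for pt in line:
            if pt in owner:
                parent[find(i)] = find(owner[pt])
            else:
                owner[pt] = i
    return len({find(i) for i in range(len(lines))})


def same_topology(before, after):
    """True if shared endpoints and component count are unchanged."""
    return (
        endpoint_groups(before) == endpoint_groups(after)
        and component_count(before) == component_count(after)
    )


a = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)]
b = [(2.0, 0.0), (3.0, 0.1), (4.0, 0.0)]
c = [(2.0, 0.0), (2.0, 1.0)]
before = [a, b, c]

# Interior vertices dropped, shared endpoint (2, 0) untouched.
good = [[(0.0, 0.0), (2.0, 0.0)], [(2.0, 0.0), (4.0, 0.0)], c]
# Same, but feature b's start point has drifted off the junction.
broken = [[(0.0, 0.0), (2.0, 0.0)], [(2.0, 0.01), (4.0, 0.0)], c]

print(same_topology(before, good), same_topology(before, broken))
```

Comparing shared endpoints catches a junction that has drifted apart, and comparing component counts catches a feature that has been cut off from the rest of the network.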
```bash
pip install netsimplify
```

Requires Python ≥ 3.10 and GDAL (via geopandas).

```python
import geopandas as gpd
from netsimplify import simplify_network

gdf = gpd.read_file("water_mains.gpkg")
simplified = simplify_network(gdf, tolerance=0.5)  # tolerance in CRS units
simplified.to_file("water_mains_simplified.gpkg")
```

CLI:

```bash
netsimplify water_mains.gpkg water_mains_simplified.gpkg --tolerance 0.5
```

See `docs/` or `help(netsimplify.simplify_network)`.
| Parameter | Type | Description |
|---|---|---|
| `gdf` | `GeoDataFrame` | Input (Multi)LineString features |
| `tolerance` | `float` | Simplification tolerance in CRS units |
| `preserve_topology` | `bool` | Pin shared vertices (default `True`) |
| `validate` | `bool` | Verify topology post-simplification (default `True`) |
| `snap_tolerance` | `float` | Pre-snap near-coincident vertices (default `0`) |
Returns a copy of gdf with simplified geometries.
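To illustrate what pre-snapping near-coincident vertices means, here is a naive sketch. It assumes a greedy first-seen-wins rule and a quadratic scan, and `snap_vertices` is a hypothetical helper. How netsimplify applies `snap_tolerance` internally is not documented here and may differ.

```python
from math import hypot


def snap_vertices(lines, snap_tolerance):
    """Move each vertex onto the first earlier vertex within snap_tolerance."""
    representatives = []  # coordinates that later vertices may snap onto
    snapped = []
    for line in lines:
        new_line = []
        for x, y in line:
            for rx, ry in representatives:
                if hypot(x - rx, y - ry) <= snap_tolerance:
                    x, y = rx, ry
                    break
            else:
                representatives.append((x, y))
            new_line.append((x, y))
        snapped.append(new_line)
    return snapped


# Two digitised lines whose ends almost, but not exactly, meet.
lines = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(1.0004, 0.0003), (1.0, 1.0)],
]
snapped = snap_vertices(lines, snap_tolerance=0.001)
unchanged = snap_vertices(lines, snap_tolerance=0)
```

After snapping, the second line starts exactly at `(1.0, 0.0)`, so an exact-match junction search would treat the two features as connected. With a tolerance of `0` only identical coordinates match and the input comes back unchanged.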
Benchmark on a synthetic 100×100 grid utility network (20,200 LineString features, ~100k vertices, generated in-memory — equivalent workload to a mid-size city water distribution dataset):
| Mode | Time (s) | Memory (MB) | Vertex reduction |
|---|---|---|---|
| `preserve_topology=True` | ~4.2 | ~180 | 68% |
| `preserve_topology=False` | ~0.6 | ~80 | 68% |
Topology-preserving mode is ~7× slower due to graph construction; still tractable for city-scale networks. Run pytest tests/test_benchmark.py --benchmark-only -v to reproduce.
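The benchmark figures can be sanity-checked with a small generator. This is a sketch, not the generator in `tests/test_benchmark.py`: it assumes one feature per edge between adjacent grid nodes and five vertices per feature. The second figure is an assumption chosen to land near the quoted ~100k vertices.

```python
def grid_network(cells=100, vertices_per_edge=5):
    """Return one coordinate list per edge of a cells x cells grid."""
    steps = vertices_per_edge - 1
    lines = []
    for i in range(cells + 1):
        for j in range(cells + 1):
            if i < cells:  # horizontal edge from node (i, j) to (i + 1, j)
                lines.append([(i + s / steps, float(j)) for s in range(steps + 1)])
            if j < cells:  # vertical edge from node (i, j) to (i, j + 1)
                lines.append([(float(i), j + s / steps) for s in range(steps + 1)])
    return lines


lines = grid_network()
n_features = len(lines)                        # 2 * 101 * 100 edges
n_vertices = sum(len(line) for line in lines)  # 5 vertices per edge
slowdown = 4.2 / 0.6                           # ratio of the two timings

print(n_features, n_vertices, round(slowdown, 1))
```

A 100×100 cell grid has 101×101 nodes and 2·101·100 = 20,200 edges, which matches the feature count. The edges generated here are perfectly straight, so a real benchmark would need some perturbation of the interior vertices to produce a 68% reduction rather than removing every interior vertex.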
- Input must be (Multi)LineString. Polygon networks are not supported.
- Very large tolerances may trigger `TopologyError` if no junction-preserving simplification is geometrically possible; reduce tolerance.
- Performance is O(n log n) in feature count for graph construction; O(n·k) for simplification where k is average vertex count.
- CRS must be a projected (metric) system for tolerance in metres; geographic CRS tolerances are in degrees.
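To see why the CRS matters, it helps to convert a degree tolerance into metres. The sketch below uses the common approximation of about 111 km per degree of latitude, with east-west distances shrinking by the cosine of the latitude. The constant and the helper name are illustrative, and this is no substitute for reprojecting.

```python
from math import cos, radians

METRES_PER_DEGREE = 111_320  # approximate length of one degree of latitude


def degree_tolerance_in_metres(tolerance_deg, latitude_deg):
    """Rough (north-south, east-west) size in metres of a tolerance in degrees."""
    north_south = tolerance_deg * METRES_PER_DEGREE
    east_west = north_south * cos(radians(latitude_deg))
    return north_south, east_west


# The quick-start tolerance of 0.5, applied to data left in a geographic CRS
# at 60 degrees latitude.
north_south, east_west = degree_tolerance_in_metres(0.5, 60.0)
print(round(north_south), round(east_west))
```

A tolerance of 0.5 that means half a metre in a projected CRS becomes tens of kilometres in a geographic one, and it is not even the same distance in both directions. Reprojecting first with `GeoDataFrame.to_crs()` avoids both problems.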
```bibtex
@software{tasleem_netsimplify_2026,
  author = {Tasleem, Daud},
  title = {netsimplify: topology-preserving simplification for geospatial infrastructure line networks},
  year = {2026},
  url = {https://github.com/daudee215/netsimplify},
  version = {0.1.0}
}
```

MIT © Daud Tasleem