Problem
fresh has primitives for snapping points to FWA streams (frs_point_snap) and for finding upstream/downstream features per segment (frs_network_features), but no primitive for matching two point datasets along the network within a distance threshold.
Concrete use case driving this: link's bcfp parity layer needs to reproduce bcfp's 02_pscis_streams_150m.sql at smnorris/bcfishpass@v0.7.14-125-g6e9cf1c (current tunnel state, bcfishpass.log.model_run_id=121 rebuilt 2026-05-05) -- match PSCIS crossings to modelled crossings within 100m instream distance on the same stream, then keep the nearest PSCIS per modelled crossing. link's lnk_pipeline_crossings currently lacks this layer, so modelled crossings duplicate PSCIS positions in the working schema. This cascades into more than 1000 false-positive anthropogenic barriers in BULK alone (see link's research/bcfp_table_map.md).
The operation generalizes beyond PSCIS:
- field-assessed crossings <-> user-added crossings deduplication
- observations <-> habitat confirmation points
- any "merge two point datasets on the same FWA network" workflow
Proposed primitive
frs_point_match(conn, table_a, table_b, table_to, distance_max, ...):
- Both input tables must already be snapped to FWA (carry blue_line_key, downstream_route_measure -- typically via frs_point_snap upstream).
- Output table has columns from table_a plus a <table_b_id_col> linking column populated where a match within distance_max exists.
- Matching is on same blue_line_key + instream distance <= distance_max (computed from downstream_route_measure deltas).
- Dedup: each table_b row links to at most one table_a row -- the closest one (DISTINCT ON (table_b_id) ORDER BY distance ASC).
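The matching and dedup rules above can be illustrated in memory with base R. This is only a sketch of the intended semantics, not fresh's implementation: the function name point_match_sketch, the toy data, and the id_b output column are all made up for illustration, and the real primitive would run in the database.

```r
# Toy stand-ins for two FWA-snapped point tables (all values are invented).
pscis <- data.frame(
  id = c(1, 2, 3),
  blue_line_key = c(100, 100, 200),
  downstream_route_measure = c(50, 120, 10)
)
modelled <- data.frame(
  id = c(11, 12),
  blue_line_key = c(100, 200),
  downstream_route_measure = c(100, 500)
)

# Hypothetical helper, named for this example only.
point_match_sketch <- function(a, b, distance_max) {
  # Candidate pairs: points on the same blue_line_key only.
  pairs <- merge(a, b, by = "blue_line_key", suffixes = c("_a", "_b"))
  # Instream distance as the absolute difference in route measure.
  pairs$distance <- abs(pairs$downstream_route_measure_a -
                          pairs$downstream_route_measure_b)
  pairs <- pairs[pairs$distance <= distance_max, , drop = FALSE]
  # Dedup: per table_b row keep only the closest table_a row,
  # mirroring DISTINCT ON (table_b_id) ORDER BY distance ASC.
  pairs <- pairs[order(pairs$id_b, pairs$distance), , drop = FALSE]
  pairs <- pairs[!duplicated(pairs$id_b), , drop = FALSE]
  # Left-join the surviving links back onto table_a (one row per link).
  links <- pairs[, c("id_a", "id_b"), drop = FALSE]
  merge(a, links, by.x = "id", by.y = "id_a", all.x = TRUE)
}

matched <- point_match_sketch(pscis, modelled, distance_max = 100)
```

In this toy data, PSCIS points 1 and 2 are both within 100 m of modelled point 11, and only the closer one (2, at 20 m) keeps the link. PSCIS point 3 shares a blue_line_key with modelled point 12 but is 490 m away, so it stays unmatched. A table_a row that happens to be nearest to several table_b rows would appear once per link here; the issue does not specify that case.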
Signature
frs_point_match(
  conn,
  table_a,                # schema-qualified, points to match FROM
  table_b,                # schema-qualified, points to match TO
  table_to,               # schema-qualified destination
  distance_max,           # max instream distance (metres)
  table_a_id_col = "id",
  table_b_id_col = "id"
)
Returns conn invisibly. Side effect: drops + recreates table_to with table_a's columns plus the linking ID column.
Network-position columns (blue_line_key, downstream_route_measure) are hard-coded to the FWA convention -- every FWA-snapped point table in the bcfp ecosystem uses these names. Per-side overrides (like frs_network_features got in fresh#204) can be added later if a real divergence appears.
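One way the signature could map onto SQL is sketched below as an R function that only builds the statement as a string. It is an assumption-laden illustration, not the proposed implementation: the helper name point_match_sql and the `<table_b_id_col>_b` naming of the linking column are hypothetical, identifiers are interpolated without quoting, and nothing is executed against a database.

```r
# Hypothetical helper: returns SQL text only; it does not touch a connection.
point_match_sql <- function(table_a, table_b, table_to, distance_max,
                            table_a_id_col = "id", table_b_id_col = "id") {
  stopifnot(is.numeric(distance_max), length(distance_max) == 1)
  sprintf(
"DROP TABLE IF EXISTS %1$s;
CREATE TABLE %1$s AS
WITH nearest AS (
  -- one row per table_b point: its closest table_a point within range
  SELECT DISTINCT ON (b.%5$s)
    a.%4$s AS a_id,
    b.%5$s AS b_id
  FROM %2$s a
  JOIN %3$s b
    ON a.blue_line_key = b.blue_line_key
   AND abs(a.downstream_route_measure - b.downstream_route_measure) <= %6$s
  ORDER BY b.%5$s,
           abs(a.downstream_route_measure - b.downstream_route_measure) ASC
)
-- linking column name (<table_b_id_col>_b) is an assumption of this sketch
SELECT a.*, n.b_id AS %5$s_b
FROM %2$s a
LEFT JOIN nearest n ON a.%4$s = n.a_id;",
    table_to, table_a, table_b, table_a_id_col, table_b_id_col,
    format(distance_max, scientific = FALSE)
  )
}

# Example with made-up table names.
sql <- point_match_sql("working.pscis", "working.modelled",
                       "working.pscis_matched", distance_max = 100)
cat(sql)
```

The blue_line_key and downstream_route_measure names are hard-coded in the string, in line with the FWA convention described above. A real implementation would also need proper identifier quoting before running anything like this.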
Why fresh, not link
This is a generic FWA-primitive operation. fresh already owns:
- frs_point_snap -- point <-> stream snap
- frs_network_features -- segment <-> feature dnstr/upstr
frs_point_match rounds out the point-handling family -- point <-> point along the network. Adding it in fresh makes the primitive available to link, bcfishpass-comparable tooling, and any future packages working with point-on-network data. Name uses singular point to match frs_point_snap (the closest existing analog).
Acceptance
- frs_point_match(...) produces output byte-identical to bcfp's 02_pscis_streams_150m.sql output (at smnorris/bcfishpass@v0.7.14-125-g6e9cf1c) for a test WSG (after stream-name filtering, which is bcfp-specific and stays in link's caller, not the primitive)
Out of scope
- Stream-name matching (bcfp does this for PSCIS-specific reasons via gnis_name <-> stream_name regex matching). Caller can layer that on top -- primitive provides just the network-distance match.
- Multi-stream matching (matching across wscode_ltree subtrees) -- single blue_line_key is enough for the parity case.
- Bidirectional dedup variants (matching pairs both ways) -- DISTINCT ON (table_b_id) is enough.
- Per-side column-name overrides (table_*_blk_col etc.) -- add when a real divergence appears.