BT/ST accessible_km over-credit: access segmentation drops the accessibility frontier #223

Description

@NewGraphEnvironment

Summary

link's accessible_km over-credits BT (and, to a lesser extent, ST) relative to the tunnel-free bcfp reference (fresh.streams_vw_bcfp). The per-WSG over-credit is +23.6% (FINA) and about +40% (PCEA); every WSG has link ≥ bcfp, and the magnitude scales monotonically with the species access_gradient_max (CO 0.15 clean ≤0.27% → ST 0.20 minor → BT 0.25 material).

The root cause is a segmentation defect, not an access-decision defect. At co-located segment positions link and bcfp agree on BT accessibility at 99.99% (link_only = link-accessible & bcfp-blocked = 0 km). The gap is entirely that link's stream segments don't break at the per-species gradient/falls barriers, so a single segment straddles the accessibility frontier and the whole segment — including the blocked reach above the barrier — is labelled accessible.

The access-decision barrier set is correct and already contains the frontier barrier. The fix is purely to make the streams break at those positions (as bcfp does) and reclassify. No barrier is added or moved; no length is clipped.

Figure (blk 359209845, BT): link keeps segment [3391, 7998] accessible; bcfp breaks at 3835 and blocks [3835, 7998] (4163 m).

The code path

  1. R/lnk_pipeline_run.R:216-224 — the access set barriers_<sp>_access (post-override natural barriers ∪ user_definite) is passed to lnk_pipeline_access(). This set is correct — it contains the frontier barrier.
  2. R/lnk_pipeline_prepare.R:566-594 builds barriers_<model> = class-filtered gradient_barriers_raw ∪ falls, then :592 runs fresh::frs_barriers_minimal() on it, and :637-651 unions the results into gradient_barriers_minimal — the stream break source.
  3. fresh/R/frs_barriers_minimal.R:114-134 — that reduction DELETEs every barrier that has another barrier downstream on the same flow path, keeping only the downstream-most position per path. Correct for an access DECISION; wrong as a SEGMENTATION source — it strips the interior break points that separate the accessible reach below a barrier from the blocked reach above.
  4. R/lnk_pipeline_access.R:155-165 runs frs_network_features(direction = "downstream") per segment; :356 / :364 label a segment blocked only when a barrier sits downstream of its downstream route measure. A segment that straddles the frontier has its downstream measure below the barrier → labelled accessible.
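
The labelling rule in step 4 can be illustrated with a toy base-R sketch. This is not the lnk_pipeline_access() implementation: `is_accessible` is a hypothetical helper, it looks at a single blue line only, and treating a barrier that sits exactly at the segment's downstream measure as blocking is an assumption of the sketch.

```r
# Toy version of the step-4 rule on ONE blue line (measures increase upstream).
# is_accessible() is a hypothetical helper, not part of link or fresh.
is_accessible <- function(seg_downstream_measure, barrier_measures) {
  # Blocked only when some barrier sits at or below the segment's downstream
  # measure (the inclusive comparison is an assumption of this sketch).
  !any(barrier_measures <= seg_downstream_measure)
}

frontier <- 3834.78  # frontier barrier on blk 359209845

# Unbroken segment [3391, 7998]: its downstream measure is below the barrier,
# so the rule never sees the barrier and the whole segment reads accessible.
straddling <- is_accessible(3391, frontier)
print(straddling)  # TRUE
```

A segment that started at the barrier instead would evaluate to blocked under the same rule, which is why the missing break, not the rule, produces the over-credit.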

The author already knew this exact tension — for the orphan break source, R/lnk_pipeline_prepare.R:606-611 says:

Critically — DO NOT run frs_barriers_minimal on the orphan set. Minimal reduction keeps only the most-downstream-most blocking position per flow path; that's correct semantics for ACCESS barriers, but orphans are segmentation positions only. We want every detected position to split the network, not just the downstream-most one. Use the raw set directly as the orphan break table.

The same reasoning applies to the per-species gradient/falls break source — but the minimal reduction was not skipped there. That is the bug.

Decisive evidence — blk 359209845 (FINA, BT)

Three distinct barrier sets on this one blue line:

| set | what | count on this blk | frontier 3834.78 present? |
|---|---|---|---|
| working_fina.barriers_bt | prep, pre-minimal (gradient ≥ 0.25 ∪ falls) | 16 | yes |
| gradient_barriers_minimal (via barriers_bt_min) | frs_barriers_minimal reduction | 0 | no |
| barriers_bt_access | the access-decision set | 16 | yes |

Why the minimal set is empty here: frs_barriers_minimal pruned all 16 tributary barriers because two BT barriers on the parent mainstem blk 359572348 (wscode 200.948755, measures 1684183.02 / 1706109.93) are downstream of the confluence per fwa_upstream(). But those parent barriers are overridden out of barriers_bt_access — so the segmentation frontier and the access frontier disagree, and the tributary is left as one unbroken segment.
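
A toy sketch of that disagreement, assuming a single flow path and invented distances. The column `dist_to_outlet`, the barrier ids, and the count of three tributary barriers are hypothetical stand-ins for the real fwa_upstream() logic and the 16 barriers on this blk.

```r
# Toy flow path: two parent-mainstem barriers downstream of a confluence and
# three tributary barriers upstream of it. All distances are invented.
barriers <- data.frame(
  id             = c("parent_1", "parent_2", "trib_1", "trib_2", "trib_3"),
  on_tributary   = c(FALSE, FALSE, TRUE, TRUE, TRUE),
  dist_to_outlet = c(100, 200, 350, 500, 650),
  stringsAsFactors = FALSE
)

# Minimal reduction as described in this issue: drop every barrier that has
# another barrier downstream on the same flow path, keeping the downstream-most.
minimal <- barriers[barriers$dist_to_outlet == min(barriers$dist_to_outlet), ]

# Access set: the parent barriers are overridden out, the tributary ones stay.
access <- barriers[barriers$on_tributary, ]

trib_breaks_minimal <- sum(minimal$on_tributary)  # break positions left on the tributary
trib_breaks_access  <- sum(access$on_tributary)   # barriers the access decision still uses
print(c(minimal = trib_breaks_minimal, access = trib_breaks_access))
```

In the toy, the reduced set leaves no break position on the tributary while the access set still holds all of its barriers, which is the same shape as the 0 vs 16 counts in the table above.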

Result:

| source | segment | length | label |
|---|---|---|---|
| link | [3391, 7998] | 4607 m | access_bt = 1 (accessible, whole) |
| bcfp | [3391, 3835] | 444 m | accessible |
| bcfp | [3835, 7998] | 4163 m | blocked |

Over-credit on this ONE blue line = 4163 m. Aggregated, the finer-only blocked km is the parity gap:

| WSG | link − bcfp accessible_km | above-frontier km (explained) |
|---|---|---|
| FINA | +1436 | 1438 (100%) |
| PCEA | +2020 | 2021 (100%) |
| PARS | +234 | 238 (97%) |

Why it stayed hidden at 99%+ parity

  • habitat km is low-gradient-gated, so the steep over-credited reach above a 25% barrier carries ~0 habitat km — the habitat rollups never saw it (only 6.4% of link BT-accessible km lies above 0.25; this is not a broad leak, it's a granularity accuracy gap concentrated in steep tails).
  • mapping_code parity is an INNER merge on segment position (R/lnk_compare_mapping_code.R:229) and count-weighted (:276), so bcfp's finer-only segment is dropped from the comparison. accessible_km is the first metric to independently sum both full segmentations — so it's the first to surface the gap.
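
The second point can be reproduced on the blk 359209845 segments with base R. This is a sketch only: keying the merge on the segment's downstream measure (`from`) is an assumption about what "segment position" means, and the data frames are hand-built from the table above rather than read from the database.

```r
# Segments on blk 359209845 as reported in this issue (measures in metres).
link <- data.frame(from = 3391, to = 7998, label = "accessible",
                   stringsAsFactors = FALSE)
bcfp <- data.frame(from = c(3391, 3835), to = c(3835, 7998),
                   label = c("accessible", "blocked"),
                   stringsAsFactors = FALSE)

# Inner merge on segment position (merge() defaults to all = FALSE).
# Assumption: the position key is the downstream measure only.
inner <- merge(link, bcfp, by = "from", suffixes = c("_link", "_bcfp"))

# Only the co-located position survives, and the labels agree there;
# bcfp's finer-only blocked segment starting at 3835 is not in the result.
agree <- inner$label_link == inner$label_bcfp

# Summing each segmentation independently exposes the gap instead.
link_km <- sum(link$to - link$from) / 1000
bcfp_km <- sum((bcfp$to - bcfp$from)[bcfp$label == "accessible"]) / 1000
gap_km  <- link_km - bcfp_km
print(c(rows = nrow(inner), gap_km = gap_km))
```

The merged comparison reports full agreement on one row, while the independent sums differ by the 4163 m blocked reach.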

Fix (separate branch)

Feed the full per-species barrier positions (the pre-minimal barriers_<model> set — equivalently the contents of barriers_<sp>_access) into the stream break source, exactly as the orphan path already does, so streams break at every gradient/falls position. Then the existing downstream-barrier reclassification (lnk_pipeline_access.R:356/:364) is correct with no further change.

We break and reclassify like bcfp — no length clipping in the rollup, no barrier added or moved.
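
A minimal base-R sketch of break-then-reclassify on one segment, assuming a single blue line and the rounded frontier measure 3835. `break_and_label` is a hypothetical helper for illustration; the real pipeline breaks streams in R/lnk_pipeline_break.R and labels them in R/lnk_pipeline_access.R.

```r
# Hypothetical helper: split one segment [from, to] at every barrier measure
# strictly inside it, then label each piece by the downstream-barrier rule.
break_and_label <- function(from, to, barrier_measures) {
  inside <- sort(unique(barrier_measures[barrier_measures > from &
                                           barrier_measures < to]))
  starts <- c(from, inside)
  ends   <- c(inside, to)
  data.frame(
    from     = starts,
    to       = ends,
    length_m = ends - starts,
    # 1 = accessible: no barrier at or below the piece's downstream measure
    access   = vapply(starts,
                      function(s) as.integer(!any(barrier_measures <= s)),
                      integer(1))
  )
}

pieces <- break_and_label(3391, 7998, barrier_measures = 3835)
print(pieces)
```

The two pieces are 444 m (access 1) and 4163 m (access 0), and their lengths still sum to 4607 m, so nothing is clipped. With the full pre-minimal set the same segment would break at every barrier measure inside it, not just the frontier.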

Validation

References


Verified this session (code trace + live DB, 2026-07-03)

The fix is isolated and safe

  • frs_barriers_minimal is used exactly once in link (R/lnk_pipeline_prepare.R:592) — only for the per-model segmentation source.
  • gradient_barriers_minimal's only consumer is R/lnk_pipeline_break.R:110 (the gradient_minimal break source). It is segmentation-only — classify uses a full gradient set, access uses barriers_<sp>_access.
  • barriers_<sp>_access (the access decision) is built by a separate path: lnk_barriers_unify() → R/lnk_barriers_views.R:171 creates it as an anti-join over the unified post-override barriers. It does not derive from gradient_barriers_minimal or the _min tables. So changing the segmentation source cannot disturb the access decision, which already holds the frontier barrier.

The fix (surgical, one file)

R/lnk_pipeline_prepare.R:592-593 — stop minimal-reducing the per-model set for segmentation; union the raw barriers_<model> (gradient ∪ falls) into gradient_barriers_minimal instead of the reduced barriers_<model>_min:

# before
fresh::frs_barriers_minimal(conn, from = model_tbl, to = min_tbl)
minimal_tbls <- c(minimal_tbls, min_tbl)
# after
minimal_tbls <- c(minimal_tbls, model_tbl)   # raw positions = every break, like the orphan path

This mirrors the orphan treatment the author already documents at :606-611 ("we want every detected position to split the network, not just the downstream-most one") and matches bcfp. The existing lnk_pipeline_access downstream-barrier check (:356/:364) then reclassifies the above-frontier reach as blocked. Break + reclassify — no clipping, no barrier added or moved.

Design decision: keep the table name gradient_barriers_minimal for now (add a clarifying comment that it's no longer minimal-reduced); a separate follow-up issue will rename it (e.g. gradient_barriers_break) once this fix is confirmed, so the misleading name doesn't persist.

Live pre-fix fingerprint — FINA (proves the bug reproduces)

Queried on local fwapg (:5432), current persisted state:

  • working_fina.gradient_barriers_minimal on blk 359209845 = empty (no break at the frontier 3834.78).
  • fresh.streams on blk 359209845 FINA: id_segment 4218 = [3391, 7998] = 4607 m as ONE segment, straddling the 3835 barrier (adjacent segments 4215–4217 break normally below 3391).
  • BT accessible_km FINA: link 7520.7 km vs bcfp 6085.2 km = +23.59% (link joins fresh.streams to fresh.streams_access on the full PK, access_bt IN (1,2); bcfp = fresh.streams_vw_bcfp where barriers_bt_dnstr = '').
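
The FINA percentage follows directly from the two totals, assuming the over-credit is expressed relative to bcfp:

```r
# FINA BT accessible_km totals quoted above (km).
link_km <- 7520.7
bcfp_km <- 6085.2

gap_km  <- link_km - bcfp_km        # absolute over-credit
gap_pct <- 100 * gap_km / bcfp_km   # over-credit relative to bcfp

print(round(gap_km, 1))   # 1435.5
print(round(gap_pct, 2))  # 23.59
```

The 1435.5 km gap matches the +1436 reported for FINA in the per-WSG table.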

After the fix, blk 359209845 should break at ~3835 and the segment starting there should carry access_bt = 0, dropping FINA link BT accessible_km toward 6085 km.

Validation targets (FINA / PARS / PCEA all persisted + in fresh.streams_vw_bcfp)

  1. BT accessible_km converges to bcfp (FINA was +23.59%, PCEA ~+40%).
  2. Salmon (CO) accessible_km stays ≤0.27% (no regression).
  3. Habitat + mapping_code parity holds or improves vs the 99.66% baseline.
  4. Unit check: segment at measure 3835 on blk 359209845 has access_bt = 0.
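
Targets 1, 2 and 4 could be expressed as simple checks, sketched below. `overcredit_pct` and `check_targets` are hypothetical helpers, the 1% BT tolerance is an assumption (the issue only says "converges"), the CO inputs are placeholders rather than measured values, and target 3 (habitat and mapping_code parity) is left out.

```r
# Hypothetical helper: relative over-credit of link vs bcfp, in percent.
overcredit_pct <- function(link_km, bcfp_km) {
  100 * (link_km - bcfp_km) / bcfp_km
}

# Hypothetical helper: one logical per validation target.
check_targets <- function(bt_link_km, bt_bcfp_km, co_link_km, co_bcfp_km,
                          access_bt_at_3835,
                          bt_tol_pct = 1,      # assumed tolerance, not from the issue
                          co_tol_pct = 0.27) { # CO bound stated in the issue
  c(
    bt_converges    = abs(overcredit_pct(bt_link_km, bt_bcfp_km)) <= bt_tol_pct,
    co_no_regress   = abs(overcredit_pct(co_link_km, co_bcfp_km)) <= co_tol_pct,
    frontier_blocks = access_bt_at_3835 == 0
  )
}

# Pre-fix FINA BT totals from this issue; CO values are placeholders.
pre_fix <- check_targets(7520.7, 6085.2, 100, 100, access_bt_at_3835 = 1)
print(pre_fix)
```

With the pre-fix inputs, the BT convergence and frontier checks fail, which is the expected starting point before the fix branch lands.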

Handoff state

  • Fix branch 223-access-segmentation-frontier created off origin/main (upstream-to-main tracking removed for safety).
  • PWF scaffolded + baseline-committed on that branch: planning/active/{task_plan,findings,progress}.md.
  • The proof PNG + full root-cause writeup live on the 221-… branch (research/blk359209845_bt_accessible_km.png, research/accessible_km_divergence.md); they rebase onto this fix later.
