Skip to content

Wire barriers_definite_control into lnk_barrier_overrides #44

@NewGraphEnvironment

Description

@NewGraphEnvironment

Problem

bcfishpass pairs user_barriers_definite.csv with a control table user_barriers_definite_control.csv that marks specific positions as non-overridable via barrier_ind = TRUE. bcfishpass's access SQL excludes controlled positions from the observation-based override step. This captures the real-world semantics that some barriers — concrete dams, long impassable falls, diversions — should never be reopened by the presence of upstream historical observations. A known fish-blocking dam is the canonical case: tall, currently impassable, irrelevant how many fish of any species were noted upstream in historical records.

link's current pipeline does not wire this check into lnk_barrier_overrides. Controlled positions in barriers_definite_control.csv can therefore be overridden by observations just like any other natural barrier. This is a simplification from bcfishpass and is the wrong default for any pipeline applied to jurisdictions with named large barriers.

This must be fixed. Both to replicate bcfishpass behaviour in the "bcfishpass" config, and because the same wiring applies to any future variant (newgraph default, custom) that needs "these barriers are known absolutes."

What's already in place

  • lnk_barrier_overrides(conn, barriers, ..., control = NULL, ...) already accepts a control parameter. The documented contract: "Schema-qualified table of barrier controls with columns blue_line_key, downstream_route_measure, barrier_ind. Barriers in this table with barrier_ind = TRUE cannot be overridden."
  • The "bcfishpass" config bundle already declares barriers_definite_control: overrides/user_barriers_definite_control.csv in config.yaml.
  • .lnk_pipeline_prep_load_aux already loads the control CSV into <schema>.barriers_definite_control per-AOI when the manifest key is present.
  • A separate use of the same control table (filtering gradient barriers with barrier_ind = FALSE out of the raw detection) is already wired into .lnk_pipeline_prep_gradient.

The only gap is .lnk_pipeline_prep_overrides calling lnk_barrier_overrides(...) without passing control.

Design — manifest key as convention

lnk_barrier_overrides()'s control parameter is fully generic at the API level. The DB table name <schema>.barriers_definite_control is a convention tied to the manifest key:

overrides:
  barriers_definite_control: overrides/<whatever-filename>.csv

Any config bundle — bcfishpass, a future default variant, a custom one — activates the control semantics by including that manifest key. Any CSV filename works. No code changes needed per variant. Same pattern as other override roles (modelled_fixes, pscis_barrier_status, barriers_definite).

This keeps the control surface inside the config bundle where it belongs, not in the pipeline code.

Scope

  1. Pre-flight verification — read lnk_barrier_overrides's control-handling block. Confirm it actually excludes barrier_ind = TRUE positions from the generated override set. Negative test: barrier_ind = FALSE or no control row → override is generated normally. If the block is a stub or has the logic inverted, fix in the same PR.
  2. .lnk_pipeline_prep_overrides — probe information_schema.tables for barriers_definite_control in the run schema. When present, pass as control to lnk_barrier_overrides. Same guard pattern already used for the habitat table. Expected delta: ~5 lines.
  3. Tests — positive case (controlled position + upstream observations above threshold → override NOT in skip list) and negative case (same barrier without a control row → override IS in skip list). Reproducibility: two tar_make() runs after the change produce bit-identical rollups.
  4. tar_make() rerun on all four WSGs. Rollup numbers will shift by some amount because the current rollup is overriding positions that bcfishpass does not. Direction should be toward bcfishpass reference, not away.
  5. Update artifacts regardless of magnitude
    • research/bcfishpass_comparison.md rollup numbers + any % diff changes on BT rearing and other species
    • vignettes/reproducing-bcfishpass.Rmd — regenerate the two per-WSG pivot tables via data-raw/vignette_reproducing_bcfishpass.R and the two per-WSG sub_ch*.rds map artifacts. Commit the resulting .rds files.
    • NEWS.md — version bump to 0.6.0, entry noting the behaviour change.
  6. Research doc: add a short paragraph documenting what the control mechanism does, how it's wired, and that it generalises to any config bundle declaring the manifest key.

Risks

  • Numeric shift — current rollup drifts from the post-fix rollup by an unknown amount. All four WSGs currently within 5% of bcfishpass; should end up closer. If a shift pushes some rows outside 5%, that would indicate a real defect in one of the config files or the control logic — investigate, don't accept.
  • lnk_barrier_overrides control block may be stubbed — read first, plan concretely after reading. If the existing logic is sound, change is trivial. If broken, rewrite needed.
  • Reproducibility across runs must remain bit-identical post-fix.
  • Downstream regressions in other callers of lnk_barrier_overrides — unlikely since the parameter is NULL by default and was always NULL before, but verify by running the full test suite.

Execution

  • Branch from main. Plan in planning/active/ via /planning-init + PWF.
  • /code-check on the staged diff before every commit.
  • PR with SRED tag (NewGraphEnvironment/sred-2025-2026#24).
  • Verify the three artifact updates (research doc, vignette tables, map .rds regeneration) land in the same PR.

Cross-references

Versions at start

  • fresh: 0.14.0
  • link: main (0.5.0, target 0.6.0)
  • bcfishpass: ea3c5d8
  • fwapg: Docker (FWA 20240830)

Amendment 2026-04-23 — per-species control application

Finding

Post-Phase-2 tar_make() rolled 11–22pp away from bcfishpass on ADMS and BABL (BULK/ELKR unchanged — they have no TRUE control row with upstream observations). Root cause: bcfishpass applies the control filter per-species, only in model_access_ch_cm_co_pk_sk.sql and model_access_st.sql. BT, WCT, CT/DV/RB models have no control filter in their obs_upstr CTE. My Phase-2 implementation applies the filter across every species in the params loop — over-applying relative to bcfishpass.

Biology

The control table encodes an anadromous-barrier semantic. Residents (BT, WCT, CT, DV, RB) routinely inhabit reaches upstream of falls that block anadromous species: they don't need ocean access, and post-glacial connectivity at headwaters populated many upper basins before the present channel dropped below its late-Pleistocene profile. A fall that blocks CH/SK today doesn't imply that BT upstream are "errors."

So the fix is not "always apply control" nor "never apply control" — it's species-level opt-in, with defaults matching bcfishpass.

Scope addition

  1. New column in inst/extdata/configs/bcfishpass/parameters_fresh.csv: observation_control_apply (boolean).
    Defaults:

    • TRUE — CH, CM, CO, PK, SK, ST (matches bcfishpass's control-applying models)
    • FALSE — BT, WCT (bcfishpass does not apply control)
    • NA — CT, DV, RB (no observation_threshold either — the flag is inapplicable)
  2. lnk_barrier_overrides() — gate the NOT EXISTS control clause per-species inside the existing loop. When params$observation_control_apply[i] is TRUE, emit the clause; otherwise, no clause. Docstring updated: "honours the control table for species where observation_control_apply = TRUE."

  3. Tests — species with flag TRUE produces the NOT EXISTS clause; species with flag FALSE or NA does not.

  4. Direction of next rerun — ADMS/BABL should recover. BT/WCT/ST numbers on ADMS/BABL should return to roughly pre-fix levels (BT ~ +2 to +4%); CH/CM/CO/PK/SK rows should sit slightly closer to bcfishpass than pre-fix (since the filter is now actually applied for these species, where previously it was not applied at all).

Why this column lives in parameters_fresh.csv

  • It's an observation-handling parameter, not a habitat-classification rule. parameters_fresh.csv already owns observation_threshold, observation_date_min, observation_buffer_m, observation_species.
  • Rules YAML stays focused on classification (gradient/width/edge_type) — unchanged.
  • The cfg$overrides$barriers_definite_control manifest key remains the table-level contract (is the control CSV declared?). The new column is the application-level contract (does this species honour it?). Two concerns, two locations.

Revised commit plan

Phase 1 (committed: d1a7109) — lnk_barrier_overrides filter semantics fix.
Phase 2 (committed: 53bedbd) — wire control through .lnk_pipeline_prep_overrides, manifest-gated.
Phase 2a (new) — add observation_control_apply column to parameters_fresh.csv, gate the NOT EXISTS clause per-species in lnk_barrier_overrides, tests.
Phase 3 — tar_make() rerun, reproducibility check.
Phase 4 — artifact regen (research doc, vignette, NEWS, version bump).
Phase 5 — /code-check, PR.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions