Wire CABD waterfalls + 4 edit CSVs into the falls source

## Problem

link's falls source is `fresh::system.file("extdata", "falls.csv", package = "fresh")` — a static 3,295-row CSV. bcfp pulls falls from `cabd.waterfalls` + applies four edit CSVs. The static fresh extract is missing falls that exist in CABD, including known dam-coincident falls in WSGs we'd expect to compare:

| Site | WSG | In fresh `falls.csv`? | In bcfp output (via CABD)? |
|---|---|---|---|
| Elk Falls | CAMB | yes (354154635 / 7662) | yes |
| Stave Falls | LFRA | **no** | yes |
| Alouette Falls | LFRA | **no** | yes |
| Ladore Falls | (Vancouver Island) | **no** | yes |
| Strathcona Dam falls | (Vancouver Island) | **no** | yes |
| John Hart falls | (Vancouver Island) | **no** | yes |
| Coquitlam / Seymour / Jordan / Capilano | LFRA / nearby | **no** | yes |

Where these falls exist in bcfp's output, link's habitat models classify habitat *above* the natural fall as accessible — over-crediting habitat that bcfp correctly excludes. This is the actual mechanism that bites the bcfp-bundle parity rollup on Vancouver Island and Lower Fraser. The "dams" framing was a red herring — bcfp's per-species habitat models are dam-blind (verified, see link#83 body); the natural falls *at* dam sites are what bcfp uses, via CABD.

## bcfp's load_falls.sql, mapped

`bcfp/model/01_access/sql/load_falls.sql` — single transaction, four CABD-edit CSVs:

```sql
-- 1. Pull CABD waterfalls, exclude false positives, snap blkey via xref
with cabd as (
  select w.cabd_id as falls_id, blk.blue_line_key, st_transform(w.geom, 3005) as geom
  from cabd.waterfalls w
  left outer join bcfishpass.cabd_exclusions x on w.cabd_id = x.cabd_id        -- exclusions.csv
  left outer join bcfishpass.cabd_blkey_xref blk on w.cabd_id = blk.cabd_id    -- blkey_xref.csv
  where x.cabd_id is null
),

-- 2. Snap to nearest stream segment within 65 m, choose closest
matched as ( ...lateral join on fwa_stream_networks_sp... ),

-- 3. Apply passability override; convert passability_status_code = 1 → barrier_ind = TRUE
cabd_pts as (
  select n.*, cabd.fall_name_en as falls_name, cabd.fall_height_m as height_m,
         case when coalesce(u.passability_status_code, cabd.passability_status_code) = 1
              then true else false end as barrier_ind
  from matched n
  inner join cabd.waterfalls cabd on n.falls_id = cabd.cabd_id
  left outer join bcfishpass.cabd_passability_status_updates u on n.falls_id = u.cabd_id  -- passability_updates.csv
)

insert into bcfishpass.falls (...)
  select * from cabd_pts
  union all
  -- 4. Add hand-curated additions (additions.csv where feature_type = 'waterfalls')
  select ... from bcfishpass.cabd_additions p where p.feature_type = 'waterfalls';
```

## CABD edit CSVs — what each one does + current row counts

All four edit CSVs are **shared between dams and falls** — they key on `cabd_id` which spans both `cabd.waterfalls` and `cabd.dams`. They're used by `load_dams.sql` and `load_falls.sql` symmetrically.

| CSV | Function | Rows | Falls relevance |
|---|---|---|---|
| `cabd_exclusions.csv` | Drop specific cabd_ids (false positives in upstream CABD) | **12** | Likely some falls in this set |
| `cabd_passability_status_updates.csv` | Override `passability_status_code` from CABD | **12** | Likely some falls |
| `cabd_blkey_xref.csv` | Override the blkey snap when default lateral snap picks wrong stream | **1** | Tiny |
| `cabd_additions.csv` | Add features not in CABD; filtered by `feature_type` (`dams` / `waterfalls`) | **4 dams (US placeholders only); 0 waterfalls** | Zero impact on falls |

None are deprecated; all four are actively wired in bcfp's `01_access` pipeline. `cabd_additions` for waterfalls is currently empty, so for the falls path the edit layer is just exclusions + passability + (one) blkey.

The big lift is the `cabd.waterfalls` source pull — likely tens of thousands of rows in the upstream CABD database — not the ~30 edit rows on top.

## Proposed approach

Three components, each gated:

1. **Source pull**: a script that reads `cabd.waterfalls` (DB join via the tunnel? or a download from CABD's public release? — needs decision based on what NGE's licensing path looks like) and writes to `<schema>.falls_cabd` mirroring bcfp's intermediate.

2. **Edit application**: ingest the four CSVs into `<schema>.cabd_exclusions`, `cabd_blkey_xref`, `cabd_passability_status_updates`, `cabd_additions` via `lnk_load_overrides`. Apply them in the same SQL pattern as `load_falls.sql` to produce the final `<schema>.falls`.

3. **Cutover**: drop the static `fresh::falls.csv` consumption in link's pipeline. `prep_load_aux` reads from the DB-built `<schema>.falls` instead.

The 4 edit CSVs go into `inst/extdata/configs/{bcfishpass,default}/overrides/`, sourced from bcfp upstream — same redistribution pattern as `user_barriers_definite_control.csv` and the other bcfp-sourced files. Bundle config declares them in `files:` for `lnk_load_overrides` to pick up.

## Verification — find the bite, prove the fix

Before closing this issue we want to demonstrate the fix on a specific case. Pre-flight WSG candidates (multiple known dam-coincident natural falls):

- **CAMP** (Campbell River system, Vancouver Island) — Elk Falls / John Hart / Strathcona Dam area
- **NTOL / SQAM** (Vancouver Island other)
- **LFRA** — Stave Falls, Alouette Falls, Ruskin

Verification steps:

1. **Pre-fix snapshot**: query bcfp tunnel for `bcfishpass.falls` rows in CAMP (or whichever), get the named-falls inventory.
2. **Confirm gap on link**: query link's pre-fix `<schema>.falls` for the same WSG, list which named falls are absent.
3. **Apply fix**: wire CABD source pull + edit layer per above.
4. **Post-fix verify**: confirm the previously-absent named falls now appear in `<schema>.falls` with matching `(blue_line_key, downstream_route_measure)` to bcfp's coordinates (within rounding tolerance).
5. **Rollup confirm**: re-run the WSG comparison, confirm the parity rollup tightens — segments now correctly inaccessible above each newly-added fall.

Won't close the issue until #5 is logged.

## Out of scope

- Dams data (tracked separately — see link#? CABD dams ingestion, this PR's sibling)
- CABD release version pin / refresh cadence — open question, file follow-up after this lands
- `frs_falls_*` function family in fresh (if the source-pull SQL ends up generic enough to belong there) — defer naming until prototype works on link side first

## Reference

- bcfp `model/01_access/sql/load_falls.sql` — implementation reference
- bcfp `data/cabd_*.csv` — the 4 edit CSVs
- `research/bcfishpass_comparison.md` § "2026-04-29 — parity claim retracted" — names this gap
- link#83 — per-species dam methodology (sibling concern, separate scope)

Relates to NewGraphEnvironment/sred-2025-2026#24


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wire CABD waterfalls + 4 edit CSVs into the falls source #102

Problem

bcfp's load_falls.sql, mapped

CABD edit CSVs — what each one does + current row counts

Proposed approach

Verification — find the bite, prove the fix

Out of scope

Reference

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Site	WSG	In fresh `falls.csv`?	In bcfp output (via CABD)?
Elk Falls	CAMB	yes (354154635 / 7662)	yes
Stave Falls	LFRA	no	yes
Alouette Falls	LFRA	no	yes
Ladore Falls	(Vancouver Island)	no	yes
Strathcona Dam falls	(Vancouver Island)	no	yes
John Hart falls	(Vancouver Island)	no	yes
Coquitlam / Seymour / Jordan / Capilano	LFRA / nearby	no	yes

CSV	Function	Rows	Falls relevance
`cabd_exclusions.csv`	Drop specific cabd_ids (false positives in upstream CABD)	12	Likely some falls in this set
`cabd_passability_status_updates.csv`	Override `passability_status_code` from CABD	12	Likely some falls
`cabd_blkey_xref.csv`	Override the blkey snap when default lateral snap picks wrong stream	1	Tiny
`cabd_additions.csv`	Add features not in CABD; filtered by `feature_type` (`dams` / `waterfalls`)	4 dams (US placeholders only); 0 waterfalls	Zero impact on falls

Wire CABD waterfalls + 4 edit CSVs into the falls source #102

Description

Problem

bcfp's load_falls.sql, mapped

CABD edit CSVs — what each one does + current row counts

Proposed approach

Verification — find the bite, prove the fix

Out of scope

Reference

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions