Problem
link's falls source is fresh::system.file("extdata", "falls.csv", package = "fresh") — a static 3,295-row CSV. bcfp pulls falls from cabd.waterfalls + applies four edit CSVs. The static fresh extract is missing falls that exist in CABD, including known dam-coincident falls in WSGs we'd expect to compare:
| Site |
WSG |
In fresh falls.csv? |
In bcfp output (via CABD)? |
| Elk Falls |
CAMB |
yes (354154635 / 7662) |
yes |
| Stave Falls |
LFRA |
no |
yes |
| Alouette Falls |
LFRA |
no |
yes |
| Ladore Falls |
(Vancouver Island) |
no |
yes |
| Strathcona Dam falls |
(Vancouver Island) |
no |
yes |
| John Hart falls |
(Vancouver Island) |
no |
yes |
| Coquitlam / Seymour / Jordan / Capilano |
LFRA / nearby |
no |
yes |
Where these falls exist in bcfp's output, link's habitat models classify habitat above the natural fall as accessible — over-crediting habitat that bcfp correctly excludes. This is the actual mechanism that bites the bcfp-bundle parity rollup on Vancouver Island and Lower Fraser. The "dams" framing was a red herring — bcfp's per-species habitat models are dam-blind (verified, see link#83 body); the natural falls at dam sites are what bcfp uses, via CABD.
bcfp's load_falls.sql, mapped
bcfp/model/01_access/sql/load_falls.sql — single transaction, four CABD-edit CSVs:
-- 1. Pull CABD waterfalls, exclude false positives, snap blkey via xref
with cabd as (
select w.cabd_id as falls_id, blk.blue_line_key, st_transform(w.geom, 3005) as geom
from cabd.waterfalls w
left outer join bcfishpass.cabd_exclusions x on w.cabd_id = x.cabd_id -- exclusions.csv
left outer join bcfishpass.cabd_blkey_xref blk on w.cabd_id = blk.cabd_id -- blkey_xref.csv
where x.cabd_id is null
),
-- 2. Snap to nearest stream segment within 65 m, choose closest
matched as ( ...lateral join on fwa_stream_networks_sp... ),
-- 3. Apply passability override; convert passability_status_code = 1 → barrier_ind = TRUE
cabd_pts as (
select n.*, cabd.fall_name_en as falls_name, cabd.fall_height_m as height_m,
case when coalesce(u.passability_status_code, cabd.passability_status_code) = 1
then true else false end as barrier_ind
from matched n
inner join cabd.waterfalls cabd on n.falls_id = cabd.cabd_id
left outer join bcfishpass.cabd_passability_status_updates u on n.falls_id = u.cabd_id -- passability_updates.csv
)
insert into bcfishpass.falls (...)
select * from cabd_pts
union all
-- 4. Add hand-curated additions (additions.csv where feature_type = 'waterfalls')
select ... from bcfishpass.cabd_additions p where p.feature_type = 'waterfalls';
CABD edit CSVs — what each one does + current row counts
All four edit CSVs are shared between dams and falls — they key on cabd_id which spans both cabd.waterfalls and cabd.dams. They're used by load_dams.sql and load_falls.sql symmetrically.
| CSV |
Function |
Rows |
Falls relevance |
cabd_exclusions.csv |
Drop specific cabd_ids (false positives in upstream CABD) |
12 |
Likely some falls in this set |
cabd_passability_status_updates.csv |
Override passability_status_code from CABD |
12 |
Likely some falls |
cabd_blkey_xref.csv |
Override the blkey snap when default lateral snap picks wrong stream |
1 |
Tiny |
cabd_additions.csv |
Add features not in CABD; filtered by feature_type (dams / waterfalls) |
4 dams (US placeholders only); 0 waterfalls |
Zero impact on falls |
None are deprecated; all four are actively wired in bcfp's 01_access pipeline. cabd_additions for waterfalls is currently empty, so for the falls path the edit layer is just exclusions + passability + (one) blkey.
The big lift is the cabd.waterfalls source pull — likely tens of thousands of rows in the upstream CABD database — not the ~30 edit rows on top.
Proposed approach
Three components, each gated:
-
Source pull: a script that reads cabd.waterfalls (DB join via the tunnel? or a download from CABD's public release? — needs decision based on what NGE's licensing path looks like) and writes to <schema>.falls_cabd mirroring bcfp's intermediate.
-
Edit application: ingest the four CSVs into <schema>.cabd_exclusions, cabd_blkey_xref, cabd_passability_status_updates, cabd_additions via lnk_load_overrides. Apply them in the same SQL pattern as load_falls.sql to produce the final <schema>.falls.
-
Cutover: drop the static fresh::falls.csv consumption in link's pipeline. prep_load_aux reads from the DB-built <schema>.falls instead.
The 4 edit CSVs go into inst/extdata/configs/{bcfishpass,default}/overrides/, sourced from bcfp upstream — same redistribution pattern as user_barriers_definite_control.csv and the other bcfp-sourced files. Bundle config declares them in files: for lnk_load_overrides to pick up.
Verification — find the bite, prove the fix
Before closing this issue we want to demonstrate the fix on a specific case. Pre-flight WSG candidates (multiple known dam-coincident natural falls):
- CAMP (Campbell River system, Vancouver Island) — Elk Falls / John Hart / Strathcona Dam area
- NTOL / SQAM (Vancouver Island other)
- LFRA — Stave Falls, Alouette Falls, Ruskin
Verification steps:
- Pre-fix snapshot: query bcfp tunnel for
bcfishpass.falls rows in CAMP (or whichever), get the named-falls inventory.
- Confirm gap on link: query link's pre-fix
<schema>.falls for the same WSG, list which named falls are absent.
- Apply fix: wire CABD source pull + edit layer per above.
- Post-fix verify: confirm the previously-absent named falls now appear in
<schema>.falls with matching (blue_line_key, downstream_route_measure) to bcfp's coordinates (within rounding tolerance).
- Rollup confirm: re-run the WSG comparison, confirm the parity rollup tightens — segments now correctly inaccessible above each newly-added fall.
Won't close the issue until #5 is logged.
Out of scope
- Dams data (tracked separately — see link#? CABD dams ingestion, this PR's sibling)
- CABD release version pin / refresh cadence — open question, file follow-up after this lands
frs_falls_* function family in fresh (if the source-pull SQL ends up generic enough to belong there) — defer naming until prototype works on link side first
Reference
- bcfp
model/01_access/sql/load_falls.sql — implementation reference
- bcfp
data/cabd_*.csv — the 4 edit CSVs
research/bcfishpass_comparison.md § "2026-04-29 — parity claim retracted" — names this gap
- link#83 — per-species dam methodology (sibling concern, separate scope)
Relates to NewGraphEnvironment/sred-2025-2026#24
Problem
link's falls source is
fresh::system.file("extdata", "falls.csv", package = "fresh")— a static 3,295-row CSV. bcfp pulls falls fromcabd.waterfalls+ applies four edit CSVs. The static fresh extract is missing falls that exist in CABD, including known dam-coincident falls in WSGs we'd expect to compare:falls.csv?Where these falls exist in bcfp's output, link's habitat models classify habitat above the natural fall as accessible — over-crediting habitat that bcfp correctly excludes. This is the actual mechanism that bites the bcfp-bundle parity rollup on Vancouver Island and Lower Fraser. The "dams" framing was a red herring — bcfp's per-species habitat models are dam-blind (verified, see link#83 body); the natural falls at dam sites are what bcfp uses, via CABD.
bcfp's load_falls.sql, mapped
bcfp/model/01_access/sql/load_falls.sql— single transaction, four CABD-edit CSVs:CABD edit CSVs — what each one does + current row counts
All four edit CSVs are shared between dams and falls — they key on
cabd_idwhich spans bothcabd.waterfallsandcabd.dams. They're used byload_dams.sqlandload_falls.sqlsymmetrically.cabd_exclusions.csvcabd_passability_status_updates.csvpassability_status_codefrom CABDcabd_blkey_xref.csvcabd_additions.csvfeature_type(dams/waterfalls)None are deprecated; all four are actively wired in bcfp's
01_accesspipeline.cabd_additionsfor waterfalls is currently empty, so for the falls path the edit layer is just exclusions + passability + (one) blkey.The big lift is the
cabd.waterfallssource pull — likely tens of thousands of rows in the upstream CABD database — not the ~30 edit rows on top.Proposed approach
Three components, each gated:
Source pull: a script that reads
cabd.waterfalls(DB join via the tunnel? or a download from CABD's public release? — needs decision based on what NGE's licensing path looks like) and writes to<schema>.falls_cabdmirroring bcfp's intermediate.Edit application: ingest the four CSVs into
<schema>.cabd_exclusions,cabd_blkey_xref,cabd_passability_status_updates,cabd_additionsvialnk_load_overrides. Apply them in the same SQL pattern asload_falls.sqlto produce the final<schema>.falls.Cutover: drop the static
fresh::falls.csvconsumption in link's pipeline.prep_load_auxreads from the DB-built<schema>.fallsinstead.The 4 edit CSVs go into
inst/extdata/configs/{bcfishpass,default}/overrides/, sourced from bcfp upstream — same redistribution pattern asuser_barriers_definite_control.csvand the other bcfp-sourced files. Bundle config declares them infiles:forlnk_load_overridesto pick up.Verification — find the bite, prove the fix
Before closing this issue we want to demonstrate the fix on a specific case. Pre-flight WSG candidates (multiple known dam-coincident natural falls):
Verification steps:
bcfishpass.fallsrows in CAMP (or whichever), get the named-falls inventory.<schema>.fallsfor the same WSG, list which named falls are absent.<schema>.fallswith matching(blue_line_key, downstream_route_measure)to bcfp's coordinates (within rounding tolerance).Won't close the issue until #5 is logged.
Out of scope
frs_falls_*function family in fresh (if the source-pull SQL ends up generic enough to belong there) — defer naming until prototype works on link side firstReference
model/01_access/sql/load_falls.sql— implementation referencedata/cabd_*.csv— the 4 edit CSVsresearch/bcfishpass_comparison.md§ "2026-04-29 — parity claim retracted" — names this gapRelates to NewGraphEnvironment/sred-2025-2026#24