frs_habitat: watershed group habitat pipeline #61

@NewGraphEnvironment

Problem

There is currently no way to run the fresh habitat pipeline (extract, break, classify, categorize) across all species in a watershed group (WSG). The vignette demonstrates the pattern for one species on a subbasin; scaling to a full WSG with all species needs orchestration.

Baseline timing (BULK, 32472 segments, 7 species)

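| Species | Total | Access | Classify | Categorize |
|---------|-------|--------|----------|------------|
| BT      | 162s  | 120s   | 34s      | 1s         |
| CH      | 180s  | 130s   | 43s      | 1s         |
| CO      | 207s  | 163s   | 36s      | 1s         |
| PK      | 173s  | 135s   | 32s      | 1s         |
| RB      | 163s  | 121s   | 35s      | 2s         |
| SK      | 167s  | 126s   | 33s      | 1s         |
| ST      | 155s  | 111s   | 37s      | 1s         |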

Total: about 20 min (1207s summed over the 7 species). Access barriers (frs_break_find in gradient mode) account for 70-80% of each species' run time.
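The totals quoted above can be re-derived from the timing table. This is a minimal base R check; the data frame is hand-copied from the table and its column names are only illustrative.

```r
# Timings in seconds, copied from the baseline table above.
timing <- data.frame(
  species = c("BT", "CH", "CO", "PK", "RB", "SK", "ST"),
  total   = c(162, 180, 207, 173, 163, 167, 155),
  access  = c(120, 130, 163, 135, 121, 126, 111)
)

# Sequential wall time across all species, in minutes.
total_min <- sum(timing$total) / 60

# Share of each species' run spent on access barriers, in percent.
access_share <- round(100 * timing$access / timing$total)

total_min            # about 20.1
range(access_share)  # 72 79
```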

Key optimization: deduplicate access barriers

Species share access_gradient_max thresholds:

  • 0.15: CH, CO, PK, SK (4 species, 1 gradient scan)
  • 0.25: BT, RB, CT, DV (4 species, 1 gradient scan)
  • 0.20: ST (1 species, 1 gradient scan)

Falls barriers are species-independent. Computing access once per unique threshold, instead of once per species, cuts the bottleneck for BULK from 7 gradient scans to 3. Estimated total for BULK: 8-10 min.
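The grouping step can be sketched in base R. The thresholds are the ones listed in this issue; the variable names are illustrative, and in the package the values would presumably come from the species parameters rather than a hard-coded vector.

```r
# access_gradient_max per species, as listed above (illustrative vector).
access_gradient_max <- c(
  CH = 0.15, CO = 0.15, PK = 0.15, SK = 0.15,
  BT = 0.25, RB = 0.25, CT = 0.25, DV = 0.25,
  ST = 0.20
)

# Species timed in the BULK baseline.
bulk_species <- c("BT", "CH", "CO", "PK", "RB", "SK", "ST")

# One group per unique threshold; each group needs a single gradient scan.
thr    <- access_gradient_max[bulk_species]
groups <- split(names(thr), thr)

length(bulk_species)  # 7 scans when run per species
length(groups)        # 3 scans when run per unique threshold
groups
```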

Proposed functions

  • frs_access(conn, table, threshold, to) - gradient breaks + falls at one threshold
  • frs_species(conn, species_code, base_tbl, breaks_tbl, params) - copy base, apply pre-computed breaks, classify, categorize
  • frs_habitat(conn, wsg) - orchestrator: extract base, group species by threshold, fan out, cleanup

Output tables mirror bcfishpass: working.streams_co, working.streams_bt, etc. Tables are per species because break points create different segment geometries for each threshold group.
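To make the intended flow concrete, here is a rough sketch of how the orchestrator might wire the pieces together. It uses stand-in functions with no database access: `access_stub()`, `species_stub()`, `habitat_plan()` and the `breaks_###` table names are hypothetical and are not the proposed API, only the ordering of steps is the point.

```r
# Stand-in for frs_access(): returns a hypothetical breaks table name.
access_stub <- function(threshold) {
  sprintf("breaks_%03d", round(threshold * 100))
}

# Stand-in for frs_species(): reports which breaks table feeds which output.
species_stub <- function(species_code, breaks_tbl) {
  sprintf("working.streams_%s <- %s", tolower(species_code), breaks_tbl)
}

# Stand-in for frs_habitat(): access once per threshold, then per species.
habitat_plan <- function(thresholds) {
  uniq   <- sort(unique(thresholds))
  breaks <- vapply(uniq, access_stub, character(1))
  names(breaks) <- as.character(uniq)
  vapply(
    names(thresholds),
    function(sp) species_stub(sp, breaks[[as.character(thresholds[[sp]])]]),
    character(1)
  )
}

plan_out <- habitat_plan(c(CO = 0.15, CH = 0.15, BT = 0.25, ST = 0.20))
plan_out
```

In this sketch CO and CH reuse the same breaks table, which is the deduplication described earlier.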

Parallelism (furrr)

Once access barriers are pre-computed, per-species work takes about 40s (copy + classify + categorize). These runs are independent, since each writes its own output table, so they can be parallelized with furrr across species within a WSG, or across WSGs for multi-group runs.
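A minimal sketch of the fan-out, assuming the future and furrr packages are installed. `run_species()` is a hypothetical placeholder for the per-species step and does no real work. One assumption worth flagging: database connection objects generally cannot be shared across worker processes, so each worker would likely need to open and close its own connection.

```r
library(future)
library(furrr)

# Two background R sessions; the worker count is an arbitrary choice here.
plan(multisession, workers = 2)

species <- c("CH", "CO", "PK", "SK")

# Hypothetical per-species worker. A real one would connect to the database,
# copy the base table, apply breaks, classify, categorize, then disconnect.
run_species <- function(sp) {
  paste0("working.streams_", tolower(sp))
}

# Each species is processed independently; results come back in input order.
out_tables <- future_map_chr(species, run_species)

# Return to sequential execution when done.
plan(sequential)

out_tables
```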

Supporting pieces (already on feature/extract-where)

  • frs_wsg_species() - lookup species and bcfishpass view names per WSG from bundled wsg_species_presence.csv
  • frs_extract(where=) - column filter for fast WSG extraction (see #60, "Add where parameter to frs_extract()")
