Merged
1 change: 1 addition & 0 deletions .claude/visibility
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
public
497 changes: 185 additions & 312 deletions CLAUDE.md

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions docker/docker-compose.yml
Expand Up @@ -33,6 +33,9 @@ services:
-c max_parallel_workers_per_gather=8
-c max_parallel_workers=14
-c max_worker_processes=16
-c random_page_cost=1.1
-c effective_io_concurrency=200
-c temp_buffers=64MB
volumes:
- ./postgres-data:/var/lib/postgresql/data
ports:
Expand Down
20 changes: 20 additions & 0 deletions docker/tuning.md
Expand Up @@ -21,6 +21,26 @@ Tuning notes for the local Docker fwapg instance. Settings in `docker-compose.ym
| `max_parallel_workers` | 14 | Leave 2 cores for OS/Docker |
| `max_worker_processes` | 16 | Match core count |
| `shm_size` | 36gb | Must exceed shared_buffers (Docker constraint) |
| `random_page_cost` | 1.1 | SSD: random reads ≈ sequential. Default 4.0 (spinning rust) over-prices index scans, biases planner toward seq scans |
| `effective_io_concurrency` | 200 | SSD/NVMe: hundreds of concurrent requests. Default 1 under-issues prefetch on bitmap heap scans |
| `temp_buffers` | 64MB | Per-session temp-table memory. Default 8MB spills working tables (per-WSG segments, breaks, scratch) to disk gratuitously on a 128 GB box |

## SSD assumption

All NewGraph hosts run on SSD (M4 NVMe, M1 Colima virtiofs over APFS,
cypher DO block storage). The `random_page_cost` and
`effective_io_concurrency` values above bias the planner toward index
scans for segment-keyed lookups (the
`WHERE blue_line_key = … AND drm <= …` against `streams_breaks` hot
path in link's pipeline); `temp_buffers` keeps per-WSG working tables
in RAM. Out-of-the-box PostgreSQL defaults (4.0 / 1 / 8MB) are
calibrated for spinning rust and are wrong for any modern host. Don't
revert these to the stock values without a spinning-disk reason.
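
As an illustrative session-level A/B on the hot-path query shape —
table and column names as above, the `blue_line_key` literal and the
index are hypothetical, and whether the plan actually flips depends on
table statistics:

```sql
-- Hypothetical A/B: same query under spinning-rust vs SSD costing.
-- With the stock value the planner tends toward a seq scan; at 1.1 it
-- should prefer an index on (blue_line_key, drm) if one exists.
SET random_page_cost = 4.0;
EXPLAIN SELECT * FROM streams_breaks
WHERE blue_line_key = 354132308 AND drm <= 1200;

SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM streams_breaks
WHERE blue_line_key = 354132308 AND drm <= 1200;

RESET random_page_cost;
```

`SET` here is session-scoped, so this experiment never disturbs other
connections.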

**M1/cypher hosts use a docker-compose override file.** Compose merges
override `command:` lists by **replacing** the base, not by appending —
so any `-c` flags added here must also be added to the M1/cypher
override (tracked in the rtj repo). Forgetting this silently leaves
the 32 GB hosts on PostgreSQL defaults.
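
A minimal sketch of the replace-not-append semantics (file names and
the `shared_buffers` values are hypothetical; the real override is
tracked in the rtj repo):

```yaml
# Base (docker-compose.yml) — full command list:
services:
  db:
    command: >
      postgres
      -c shared_buffers=32GB
      -c random_page_cost=1.1
      -c effective_io_concurrency=200
      -c temp_buffers=64MB
---
# Override (e.g. docker-compose.override.yml on M1/cypher).
# This `command:` REPLACES the base list wholesale — every -c flag
# from the base must be repeated here, with host-scaled memory values.
services:
  db:
    command: >
      postgres
      -c shared_buffers=8GB
      -c random_page_cost=1.1
      -c effective_io_concurrency=200
      -c temp_buffers=64MB
```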

## Scaling for other machines

Expand Down
110 changes: 66 additions & 44 deletions planning/active/findings.md
@@ -1,63 +1,85 @@
# Findings — fresh#158
# Findings — pg tuning: SSD planner-cost defaults (#199)

## Diagnosis (2026-05-01)
## Issue context

bcfishpass's per-species rear rule has an inline OR clause:
### Problem

```sql
(cw.channel_width >= t.rear_channel_width_min OR
(s.stream_order_parent >= 5 AND s.stream_order = 1))
```
`fresh/docker/docker-compose.yml` doesn't set `random_page_cost` or
`effective_io_concurrency`, so postgres falls through to PostgreSQL
defaults of `4.0` and `1` — both calibrated for spinning rust. All
NewGraph hosts run on SSD (M4 NVMe, M1 Colima virtiofs over APFS,
cypher DO block storage). With `random_page_cost=4`, the planner
systematically biases away from index scans for segment-keyed lookups
in link's pipeline (`WHERE blue_line_key = … AND drm <= …` against
`streams_breaks` is the hot path).

at [`bcfishpass/model/02_habitat_linear/sql/load_habitat_linear_bt.sql`](https://github.com/smnorris/bcfishpass/blob/main/model/02_habitat_linear/sql/load_habitat_linear_bt.sql) lines 89–96 (BT — same pattern in CH/CO/ST/WCT). Direct order-1 tributaries of order-5+ mainstems get credited as rearing **even when cw < rear_min**.
Verified on M4 + M1 + cypher today (2026-05-04) — all three show
`random_page_cost=4`, `effective_io_concurrency=1`, `temp_buffers=8MB`.

Biology: small tribs of large rivers support juvenile rearing despite small FWA-measured channel width — parent supplies flow / temperature / access; cool tributary water mixes at confluence; backwater + off-channel habitat near the mouth is high-value.
### Proposed

fresh has no implementation. link's `dimensions.csv::rear_stream_order_bypass = no` for all species in both bundles (correctly anticipating fresh has nothing to read). Provincial parity baseline (link 0.20.1) shows this gap on HORS / COLR / KHOR / CLRH / etc — Class B in `link/research/provincial_parity_2026_05_01.md`.
Add to the `db.command` block in base `docker-compose.yml`:

## Function design (per issue body)
```yaml
-c random_page_cost=1.1
-c effective_io_concurrency=200
-c temp_buffers=64MB
```
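
One way to confirm the flags are live after a container restart — this
uses the standard `pg_settings` catalog, nothing repo-specific:

```sql
-- Expect 1.1 / 200 / 64MB. Note pg_settings reports temp_buffers in
-- 8kB pages, so 64MB shows as 8192.
SELECT name, setting, source
FROM pg_settings
WHERE name IN ('random_page_cost',
               'effective_io_concurrency',
               'temp_buffers');
```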

`frs_order_child(conn, table, habitat, species, label = "rearing", parent_order_min = 5, child_order_min = NULL, child_order_max = NULL, distance_max = NULL)`
Document in `fresh/docker/tuning.md` alongside the existing memory
rationale: SSD cost values, why they matter for link's segment-heavy
SQL.

Post-classification UPDATE:
### Verification

```sql
UPDATE habitat
SET <label> = TRUE
FROM streams s
WHERE habitat.id_segment = s.id_segment
AND habitat.species_code = '<species>'
AND habitat.accessible = TRUE
AND habitat.<label> IS NOT TRUE
AND s.stream_order = s.stream_order_max -- direct child
AND s.stream_order_parent >= <parent_order_min> -- of large river
AND s.stream_order >= <child_order_min> -- if set
AND s.stream_order <= <child_order_max> -- if set
AND (<distance_max> IS NULL OR
s.downstream_route_measure <= <distance_max>);
```
Post-merge, M4 reruns one of link's provincial-style runs. Compare
per-WSG wall times in `data-raw/logs/<TS>_per_wsg_times.csv` against
today's baseline (2026-05-04 `default_extrabreaks` run,
`data-raw/logs/provincial_default_extrabreaks/<TS>_per_wsg_times.csv`).
Acceptance: median per-WSG wall time reduced by ≥ 10 % at unchanged
segment count.

### Cross-ref

Companion rtj issue for the M1/cypher override file — the override's
`command:` list REPLACES the base under docker-compose's merge
semantics, so settings added here don't propagate to the 32 GB hosts.
Same change must land in both.

Direct-child filter: `s.stream_order = s.stream_order_max` ensures we stop at the order-change point (once the segment's order would exceed `stream_order_max`, you're no longer on the direct-child reach). Captures "small tributary directly into a large parent."
## Setting-by-setting rationale

`accessible = TRUE` guard: never adds rearing on segments above a definite barrier.
### `random_page_cost = 1.1`

`<label> IS NOT TRUE` guard: idempotent + additive — already-classified segments untouched.
PostgreSQL default is `4.0` — the cost ratio for a random vs
sequential page read on spinning rust (~4× slower for random I/O).
On SSD, random reads are nearly as cheap as sequential. PostgreSQL
docs and tuning consensus recommend `1.1` to `1.5` for SSD. With the
default, the planner systematically over-prices index scans and
prefers seq scans even when an index would be faster — exactly the
wrong bias for segment-keyed lookups.

## Pre vs post cluster — design decision (post-cluster wins)
### `effective_io_concurrency = 200`

bcfishpass embeds the bypass in the rule predicate (pre-cluster). Issue body's analysis: post-cluster is cleaner because:
- Bypassed segments don't need to pass connectivity (they're connected to the large parent by definition; the biology of the bypass IS the connectivity)
- `accessible = TRUE` already gates barrier-blocked reaches
- Post-cluster matches the parametric form (function takes WSG/species and applies, not a rule predicate that needs SQL grammar in fresh)
- Same end-state numbers as bcfp on the BCFP-parity caller-side defaults
PostgreSQL default is `1`. The setting tells PostgreSQL how many
concurrent I/O requests the storage can usefully serve, and drives
prefetch depth on bitmap heap scans. SSDs (especially NVMe) handle
hundreds of concurrent requests trivially — `200` is the standard
recommendation for SSD. At `1`, essentially no prefetch is issued and
bitmap-scan throughput is left on the table.
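
Worth noting (standard PostgreSQL, not used here since all hosts are
all-SSD): the setting can also be scoped per tablespace rather than
globally, if a tablespace ever lands on slower storage —

```sql
-- Per-tablespace override; the global value from docker-compose
-- remains the default for everything else.
ALTER TABLESPACE pg_default SET (effective_io_concurrency = 200);
```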

## Caller defaults (link side, separate PR)
### `temp_buffers = 64MB`

- bcfishpass bundle: `frs_order_child(species, parent_order_min = 5)` (defaults) for BT/CH/CO/ST/WCT
- default bundle: methodology-pending. Could ship same as bcfp first, tune later
PostgreSQL default is `8MB` — per-session memory for temporary tables.
Bumping to `64MB` lets short-lived temp/working tables live in RAM
instead of spilling to disk. fresh/link pipelines build per-WSG temp
tables (segments, breaks, working scratch) that are read repeatedly
within a session; spilling these to disk is gratuitous on a dev box
with 128 GB RAM.
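
One standard-PostgreSQL caveat when experimenting per-session: the
value can only be changed before the session first touches a temp
table; afterwards it is fixed for the session's lifetime. Sketch only
— the table and `blue_line_key` literal are made up:

```sql
-- Must run before any temp-table access in this session.
SET temp_buffers = '128MB';
CREATE TEMP TABLE scratch_segments AS  -- hypothetical working table
SELECT * FROM streams_breaks WHERE blue_line_key = 354132308;
```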

## Versions at start
## Repo state

- fresh main: 253abf2 (0.26.0)
- link main: 9643be5 (0.21.0)
- bcfishpass: 440bc1e
- `docker/docker-compose.yml:21-35` — current `db.command` block has
memory + parallelism flags but no I/O cost flags
- `docker/tuning.md` — has "Settings rationale" table and a "Scaling
for other machines" recipe; new flags are global (not RAM-scaled),
no scaling change needed
24 changes: 19 additions & 5 deletions planning/active/progress.md
@@ -1,7 +1,21 @@
# Progress — fresh#158
# Progress — pg tuning: SSD planner-cost defaults (#199)

## Session 2026-05-01
## Session 2026-05-04

- Branch `158-frs-order-child` from main HEAD `253abf2` (post-fresh#191 release).
- PWF baseline.
- Next: write `R/frs_order_child.R` per issue body spec, mock SQL tests.
- Plan-mode exploration — phases approved by user
- Archived stale top-level planning files (#177) and stale active/ (#158)
with outcome READMEs in commit `80170b4`
- Created branch `199-pg-tuning-ssd-planner-cost-defaults` off main (HEAD `80170b4`)
- Scaffolded PWF baseline from issue #199 with approved phases
- Phase 1 complete: added `random_page_cost=1.1`, `effective_io_concurrency=200`,
`temp_buffers=64MB` to `db.command`. Restarted local DB; all three values
verified live via `SHOW`.
- Phase 2 complete: added three rows to the Settings rationale table in
`docker/tuning.md`, plus an "SSD assumption" section noting the
M1/cypher override-file caveat (override `command:` replaces base, so
same flags must land in the rtj-tracked override).
- Phase 3 complete: pushed branch, opened PR #200 with `Closes #199`.
Manual code-check on the docker/ diff — clean (pure ASCII in the YAML
block, no secrets, indentation matches surrounding `-c` flags, DB
came up after restart and live `SHOW` confirmed all three values).
- Next: merge, archive PWF, companion rtj-side change (separate work).
91 changes: 46 additions & 45 deletions planning/active/task_plan.md
@@ -1,45 +1,46 @@
# Task Plan — fresh#158: frs_order_child

## Phase 1: Setup
- [x] Branch `158-frs-order-child` from main (HEAD `253abf2`)
- [ ] PWF baseline (task_plan, findings, progress)

## Phase 2: Code change — fresh-side
- [ ] New `R/frs_order_child.R` exporting `frs_order_child(conn, table, habitat, species, label = "rearing", parent_order_min = 5, child_order_min = NULL, child_order_max = NULL, distance_max = NULL)`
- [ ] SQL per the [issue spec](https://github.com/NewGraphEnvironment/fresh/issues/158): post-classification UPDATE that adds `<label> = TRUE` to segments where `accessible IS TRUE`, `<label> IS NOT TRUE`, `s.stream_order = s.stream_order_max`, `s.stream_order_parent >= parent_order_min`, optionally bounded by `child_order_min/max` and `distance_max`
- [ ] roxygen docstring with parameters, examples, biology rationale
- [ ] `devtools::document()` clean

## Phase 3: Tests
- [ ] `tests/testthat/test-frs_order_child.R`: SQL shape (mock `.frs_db_execute`)
- default invocation emits the canonical SQL with parent_order_min=5
- `child_order_min/max` bounds appear when set
- `distance_max` adds `downstream_route_measure <= ...` clause
- `accessible IS TRUE` and `<label> IS NOT TRUE` guards always present
- `species` substituted correctly
- [ ] `devtools::test(filter = "frs_order_child")` clean

## Phase 4: Code-check
- [ ] `/code-check` on staged diff

## Phase 5: Release
- [ ] DESCRIPTION: 0.26.0 → 0.27.0
- [ ] NEWS.md: 0.27.0 entry
- [ ] PR with `Fixes #158`
- [ ] Merge, tag

## Phase 6: Link follow-up
- [ ] Bump fresh dep 0.26.0 → 0.27.0
- [ ] dimensions.csv: flip `rear_stream_order_bypass = yes` for BT/CH/CO/ST/WCT in bcfishpass bundle
- [ ] Default-bundle TBD per methodology decision (issue mentions parametric `distance_max` could differ)
- [ ] `lnk_rules_build` already emits `channel_width_min_bypass` field but fresh doesn't read it yet — wire to call `frs_order_child` per-species after classify
- [ ] Verify HORS BT closes from -7.68% to within ±1%
- [ ] 15-WSG distributed re-run

## Verification

- HORS BT rearing_stream: -7.68% → expected within ±1%
- HORS CH/CO/ST: similar closure expected
- COLR/KHOR/CLRH WCT: similar closure expected (provincial Class B)
- bcfishpass-bundle other WSGs: bit-identical (function only fires where `rear_stream_order_bypass: yes` is set)
- default-bundle: TBD per methodology
# Task: pg tuning — SSD planner-cost defaults in docker-compose.yml (#199)

`docker/docker-compose.yml` doesn't set `random_page_cost` or
`effective_io_concurrency`, so postgres falls through to PostgreSQL
defaults of `4.0` and `1` — both calibrated for spinning rust. All
NewGraph hosts run on SSD (M4 NVMe, M1 Colima virtiofs over APFS,
cypher DO block storage). With `random_page_cost=4`, the planner
systematically biases away from index scans for segment-keyed lookups
in link's pipeline (`WHERE blue_line_key = … AND drm <= …` against
`streams_breaks` is the hot path).

Verified on M4 + M1 + cypher (2026-05-04) — all three show
`random_page_cost=4`, `effective_io_concurrency=1`, `temp_buffers=8MB`.

## Phase 1: Add SSD planner-cost flags to docker-compose.yml
- [x] Append to `db.command` block in `docker/docker-compose.yml`:
- `-c random_page_cost=1.1`
- `-c effective_io_concurrency=200`
- `-c temp_buffers=64MB`
- [x] Restart local Docker DB (`docker compose down && docker compose up -d db`)
- [x] Verify via `SHOW random_page_cost; SHOW effective_io_concurrency; SHOW temp_buffers;`
(verified live: `random_page_cost=1.1`, `effective_io_concurrency=200`, `temp_buffers=64MB`)

## Phase 2: Document in tuning.md
- [x] Add three rows to the "Settings rationale" table
(`random_page_cost`, `effective_io_concurrency`, `temp_buffers`) with
SSD justification per setting
- [x] Add a short "SSD assumption" note: all NewGraph hosts run on SSD
(M4 NVMe, M1 Colima virtiofs over APFS, cypher DO block storage),
these defaults bias the planner toward index scans for segment-keyed
lookups in link's pipeline
- [x] Cross-reference the companion rtj issue for the M1/cypher override
file — override `command:` REPLACES base under docker-compose merge
semantics; same change must land in both

## Phase 3: Verify and PR
- [x] `/code-check` on the diff (manual; clean)
- [x] Push branch, open PR with `Closes #199` (PR #200)
- [x] Note benchmark verification is a post-merge step (≥ 10 % median
  per-WSG wall-time reduction at unchanged segment count) — owner runs from
  M4 against `data-raw/logs/provincial_default_extrabreaks/<TS>_per_wsg_times.csv`
  baseline (2026-05-04). Documented in PR test-plan.

## Validation
- [ ] PWF checkboxes match landed work
- [ ] `/planning-archive` on completion
@@ -0,0 +1,7 @@
## Outcome

Dropped `format` and `long_value_col` parameters from `frs_habitat_overlay()`. Function now accepts a single canonical shape (row-per-(segment × species) with per-habitat indicator columns), parameterized by `species_col`, `by`, and `habitat_types`. Callers with non-canonical sources transform first (SQL view, R pivot, or link's `lnk_ingest_bcfishpass()`). Bridge mode retained.

## Closed By

PR #176.
File renamed without changes.
File renamed without changes.
File renamed without changes.
7 changes: 7 additions & 0 deletions planning/archive/2026-05-issue-158-frs-order-child/README.md
@@ -0,0 +1,7 @@
## Outcome

Added `frs_order_child()` — post-classification UPDATE that flags direct-children of large-order parent streams (`stream_order = stream_order_max` AND `stream_order_parent >= parent_order_min`). Wired into the rules YAML via `channel_width_min_bypass` block so per-species rearing classifiers can opt in. Closes the BT/CO/CH/ST/WCT rearing-stream gap on bcfishpass-bundled WSGs (HORS, COLR, KHOR, CLRH, etc.).

## Closed By

PR #193 (initial impl), with follow-ups #195 and #196 (SQL fixes for the missing/restored `stream_order_max` column via CTE).
63 changes: 63 additions & 0 deletions planning/archive/2026-05-issue-158-frs-order-child/findings.md
@@ -0,0 +1,63 @@
# Findings — fresh#158

## Diagnosis (2026-05-01)

bcfishpass's per-species rear rule has an inline OR clause:

```sql
(cw.channel_width >= t.rear_channel_width_min OR
(s.stream_order_parent >= 5 AND s.stream_order = 1))
```

at [`bcfishpass/model/02_habitat_linear/sql/load_habitat_linear_bt.sql`](https://github.com/smnorris/bcfishpass/blob/main/model/02_habitat_linear/sql/load_habitat_linear_bt.sql) lines 89–96 (BT — same pattern in CH/CO/ST/WCT). Direct order-1 tributaries of order-5+ mainstems get credited as rearing **even when cw < rear_min**.

Biology: small tribs of large rivers support juvenile rearing despite small FWA-measured channel width — parent supplies flow / temperature / access; cool tributary water mixes at confluence; backwater + off-channel habitat near the mouth is high-value.

fresh has no implementation. link's `dimensions.csv::rear_stream_order_bypass = no` for all species in both bundles (correctly anticipating fresh has nothing to read). Provincial parity baseline (link 0.20.1) shows this gap on HORS / COLR / KHOR / CLRH / etc — Class B in `link/research/provincial_parity_2026_05_01.md`.

## Function design (per issue body)

`frs_order_child(conn, table, habitat, species, label = "rearing", parent_order_min = 5, child_order_min = NULL, child_order_max = NULL, distance_max = NULL)`

Post-classification UPDATE:

```sql
UPDATE habitat
SET <label> = TRUE
FROM streams s
WHERE habitat.id_segment = s.id_segment
AND habitat.species_code = '<species>'
AND habitat.accessible = TRUE
AND habitat.<label> IS NOT TRUE
AND s.stream_order = s.stream_order_max -- direct child
AND s.stream_order_parent >= <parent_order_min> -- of large river
AND s.stream_order >= <child_order_min> -- if set
AND s.stream_order <= <child_order_max> -- if set
AND (<distance_max> IS NULL OR
s.downstream_route_measure <= <distance_max>);
```

Direct-child filter: `s.stream_order = s.stream_order_max` ensures we stop at the order-change point (once the segment's order would exceed `stream_order_max`, you're no longer on the direct-child reach). Captures "small tributary directly into a large parent."

`accessible = TRUE` guard: never adds rearing on segments above a definite barrier.

`<label> IS NOT TRUE` guard: idempotent + additive — already-classified segments untouched.

## Pre vs post cluster — design decision (post-cluster wins)

bcfishpass embeds the bypass in the rule predicate (pre-cluster). Issue body's analysis: post-cluster is cleaner because:
- Bypassed segments don't need to pass connectivity (they're connected to the large parent by definition; the biology of the bypass IS the connectivity)
- `accessible = TRUE` already gates barrier-blocked reaches
- Post-cluster matches the parametric form (function takes WSG/species and applies, not a rule predicate that needs SQL grammar in fresh)
- Same end-state numbers as bcfp on the BCFP-parity caller-side defaults

## Caller defaults (link side, separate PR)

- bcfishpass bundle: `frs_order_child(species, parent_order_min = 5)` (defaults) for BT/CH/CO/ST/WCT
- default bundle: methodology-pending. Could ship same as bcfp first, tune later

## Versions at start

- fresh main: 253abf2 (0.26.0)
- link main: 9643be5 (0.21.0)
- bcfishpass: 440bc1e
@@ -0,0 +1,7 @@
# Progress — fresh#158

## Session 2026-05-01

- Branch `158-frs-order-child` from main HEAD `253abf2` (post-fresh#191 release).
- PWF baseline.
- Next: write `R/frs_order_child.R` per issue body spec, mock SQL tests.