frs_habitat_overlay: drop format, accept canonical shape only (#177) #176

Merged
NewGraphEnvironment merged 2 commits into main from overlay-species-col
Apr 27, 2026
@NewGraphEnvironment (Owner) commented Apr 27, 2026

Summary

Closes #177.

Drops format and long_value_col parameters from frs_habitat_overlay(). Accepts only the canonical source-table shape: one row per (segment × species), with join keys + species column + one indicator column per habitat type. Indicator coercion is universal — accepts integer 1, text 'true'/'t'/'1' (case + whitespace insensitive), boolean.
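
To make the coercion rule concrete, here is a minimal Python sketch of the predicate as the commit message describes it: cast to text, trim, lowercase, then test membership in 'true', 't', '1'. The function name `is_truthy_indicator` is hypothetical and exists only for illustration; the real check runs as SQL inside frs_habitat_overlay(), and this model only assumes that a boolean casts to the text 'true' or 'false'.

```python
def is_truthy_indicator(value):
    """Model of the indicator coercion: text-cast, trim, lowercase, match."""
    if value is None:
        # SQL NULL is treated as falsy.
        return False
    if isinstance(value, bool):
        # Assumption: mirrors a boolean-to-text cast ('true' / 'false').
        text = "true" if value else "false"
    else:
        text = str(value)
    return text.strip().lower() in {"true", "t", "1"}


# Values the PR lists as truthy and falsy.
truthy = [1, "1", "true", " TRUE ", "t", "T", True]
falsy = [None, 0, "false", "f", "", False]
```

Every element of `truthy` should return `True` and every element of `falsy` should return `False` under this model.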

Why drop, not extend

PR #176's first attempt added a species_col parameter to bolt the new bcfishpass shape onto the existing format = "wide" dispatch. On review, that added to the API's complexity rather than fixing the design. The earlier format = c("wide", "long") enum was scoped for two shapes that have no current production consumers:

  • format = "wide" per-species-suffixed (spawning_sk, rearing_sk) — was scoped for direct reads of bcfishpass.streams_habitat_known; never integrated
  • format = "long" (habitat_type rows + habitat_ind indicator) — was link's pre-2026-04-26 path; bcfishpass moved to a different shape on 2026-04-26

Carrying flexibility for two shapes with zero consumers is YAGNI. Drop them; rebuild when a real second consumer with a different shape appears.

What stays parameterized

| param | what it controls |
| --- | --- |
| from, to, bridge | which tables |
| species | which species to process (NULL = all in target) |
| species_col | name of the column in source carrying the species code (default "species_code") |
| habitat_types | which habitat columns to overlay |
| by | join keys (default c("blue_line_key", "downstream_route_measure")) |

Bridge mode (3-way range-containment join) is orthogonal to source shape — unchanged.

Non-canonical sources (transform first)

Callers with non-canonical sources (legacy long format, wide-suffix layout) transform first via a SQL view, R pivot, or upstream adapter, then call overlay against the canonical-shape view. The docstring's third example shows the long-to-canonical pivot. Shape-translation lives with the consumer; fresh stays a thin SQL adapter.
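
As a language-neutral illustration of the long-to-canonical transform, the sketch below collapses long-format rows (one row per habitat type, with a habitat_ind indicator) into one row per (segment × species) with one indicator column per habitat type. It is a Python model, not the docstring's example: the function name `pivot_long_to_canonical` is hypothetical, the habitat types "spawning" and "rearing" are assumed from the column names mentioned above, and habitat_ind is assumed to be a boolean already.

```python
def pivot_long_to_canonical(rows, habitat_types=("spawning", "rearing")):
    """Collapse long rows into one dict per (segment, species)."""
    canonical = {}
    for row in rows:
        key = (
            row["blue_line_key"],
            row["downstream_route_measure"],
            row["species_code"],
        )
        # First time we see this (segment, species): start with all flags False.
        out = canonical.setdefault(key, {
            "blue_line_key": key[0],
            "downstream_route_measure": key[1],
            "species_code": key[2],
            **{h: False for h in habitat_types},
        })
        # Promote the matching habitat column when the long row is flagged.
        if row["habitat_type"] in habitat_types and row["habitat_ind"]:
            out[row["habitat_type"]] = True
    return list(canonical.values())


long_rows = [
    {"blue_line_key": 1, "downstream_route_measure": 0.0,
     "species_code": "SK", "habitat_type": "spawning", "habitat_ind": True},
    {"blue_line_key": 1, "downstream_route_measure": 0.0,
     "species_code": "SK", "habitat_type": "rearing", "habitat_ind": False},
    {"blue_line_key": 1, "downstream_route_measure": 0.0,
     "species_code": "CO", "habitat_type": "rearing", "habitat_ind": True},
]
canonical_rows = pivot_long_to_canonical(long_rows)
```

Here the three long rows reduce to two canonical rows, one for SK and one for CO. In practice the same reshaping would live in a SQL view or R pivot owned by the consumer.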

Test plan

  • devtools::test(filter = "frs_habitat_overlay"): 51 PASS (was 67 with bolt-on)
  • devtools::test() full suite: 815 PASS, 0 FAIL (was 831 with the now-removed paths)
  • /code-check: Clean. All caller-controlled identifiers validated, NULL-safe coercion, additive guard parenthesised.
  • Integration tests cover: integer indicators, text 'true'/'t'/'1' indicators, additive guard (already-TRUE rows untouched), custom species_col, custom by, bridge mode
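
The additive guard named in the test plan can be read as the following rule, sketched here in Python under stated assumptions. The function name `apply_overlay` is hypothetical, and the behaviour for falsy sources (leave the target value as it was, including NULL) is an interpretation of "additive", not something this PR spells out.

```python
def apply_overlay(target_flag, source_is_truthy):
    """Return the target habitat flag after one overlay pass.

    target_flag: True, False, or None (NULL) as currently stored.
    source_is_truthy: result of the indicator coercion on the source row.
    """
    if target_flag is True:
        # Guard: rows that are already TRUE are never touched.
        return True
    if source_is_truthy:
        # The overlay only ever promotes a row to TRUE.
        return True
    # Assumption: a falsy source leaves the existing value unchanged.
    return target_flag
```

Under this reading the overlay can never flip a TRUE row back to FALSE, which is the property the integration tests check.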

Coordination

  • fresh ships 0.22.0 with this change (breaking, pre-1.0)
  • link picks up fresh 0.22.0 and updates the lnk_pipeline_classify call site (separate link PR after this merges; bumps to link 0.12.0 in the same release window)
  • crate's registry-driven lnk_ingest_bcfishpass() (canonicalize-at-ingest) lands as a follow-up and does not block this PR

Related

🤖 Generated with Claude Code

@NewGraphEnvironment changed the title from "frs_habitat_overlay: species_col for bcfishpass authoritative CSV shape" to "WIP — Refactor overlay format dispatch to species_layout × habitat_layout (#177)" on Apr 27, 2026
@NewGraphEnvironment marked this pull request as draft April 27, 2026 14:45
NewGraphEnvironment and others added 2 commits April 27, 2026 09:03
Issue #177 retargeted from "layout decomposition" to the simpler
"drop format, hard-code canonical shape with col-name params"
scope per design discussion. PR #176 force-push pending.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pivot from PR #176's species_col bolt-on to the simpler scope agreed in
the design discussion: drop format and long_value_col entirely; keep
column-name params for genuine flexibility; require source to be in the
canonical shape (one row per (segment x species) with per-habitat
indicator columns).

Breaking changes (pre-1.0, single consumer = link):
- Removed: format = c("wide", "long"), long_value_col = "habitat_ind"
- Removed: per-species-suffixed wide layout dispatch (spawning_sk etc.)
- Removed: long-format dispatch (habitat_type rows + habitat_ind)
- New required path: species_col (default "species_code")

What stays parameterized:
- species_col, by, habitat_types — all column names customizable
- bridge (3-way range-containment join) — orthogonal to source shape
- Universal indicator coercion: lower(trim(<col>::text)) IN ('true','t','1')
  matches integer 1, text 'true'/'t'/'1', boolean TRUE; everything
  else is falsy (incl. NULL, 0, 'false', 'f', '')

Non-canonical sources (legacy long format, wide-suffix layout) are
recoverable via a SQL view that pivots into canonical shape; the
docstring's third example shows the long-to-canonical pivot for
reference.

Tests rewritten for canonical shape:
- Drop wide-suffix and long-format paths
- Canonical-shape integration tests cover integer/text/boolean
  indicators, additive guard, custom species_col, custom by, bridge
- 51 PASS in overlay tests (was 67 with the bolt-on); full fresh suite
  815 PASS, 0 FAIL (was 831 with the now-removed paths).

Code-check: clean. No SQL injection vectors, NULL-safe coercion,
additive guard parenthesised.

Coordinated link release (0.12.0) updates the call site in
lnk_pipeline_classify (separate PR after this merges).

Closes #177

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@NewGraphEnvironment changed the title from "WIP — Refactor overlay format dispatch to species_layout × habitat_layout (#177)" to "frs_habitat_overlay: drop format, accept canonical shape only (#177)" on Apr 27, 2026
@NewGraphEnvironment marked this pull request as ready for review April 27, 2026 16:49
@NewGraphEnvironment merged commit f577aff into main Apr 27, 2026
1 check passed
@NewGraphEnvironment deleted the overlay-species-col branch April 27, 2026 17:24
Development

Successfully merging this pull request may close these issues.

frs_habitat_overlay: drop format param, accept only canonical shape
