frs_habitat_overlay: drop format, accept canonical shape only (#177)#176
Merged
NewGraphEnvironment merged 2 commits intomainfrom Apr 27, 2026
Merged
frs_habitat_overlay: drop format, accept canonical shape only (#177)#176NewGraphEnvironment merged 2 commits intomainfrom
NewGraphEnvironment merged 2 commits intomainfrom
Conversation
Pivot from PR #176's species_col bolt-on to the simpler scope agreed in the design discussion: drop format and long_value_col entirely; keep column-name params for genuine flexibility; require source to be in the canonical shape (one row per (segment x species) with per-habitat indicator columns). Breaking changes (pre-1.0, single consumer = link): - Removed: format = c("wide", "long"), long_value_col = "habitat_ind" - Removed: per-species-suffixed wide layout dispatch (spawning_sk etc.) - Removed: long-format dispatch (habitat_type rows + habitat_ind) - New required path: species_col (default "species_code") What stays parameterized: - species_col, by, habitat_types — all column names customizable - bridge (3-way range-containment join) — orthogonal to source shape - Universal indicator coercion: lower(trim(::text)) IN ('true','t','1') matches integer 1, text 'true'/'t'/'1', boolean TRUE; everything else falsy (incl. NULL, 0, 'false', 'f', '') Non-canonical sources (legacy long format, wide-suffix layout) are recoverable via a SQL view that pivots into canonical shape; the docstring's third example shows the long-to-canonical pivot for reference. Tests rewritten for canonical shape: - Drop wide-suffix and long-format paths - Canonical-shape integration tests cover integer/text/boolean indicators, additive guard, custom species_col, custom by, bridge - 51 PASS in overlay tests (was 67 with the bolt-on); full fresh suite 815 PASS, 0 FAIL (was 831 with the now-removed paths). Code-check: clean. No SQL injection vectors, NULL-safe coercion, additive guard parenthesised. Coordinated link release (0.12.0) updates the call site in lnk_pipeline_classify (separate PR after this merges). Closes #177 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d5a61b0 to
7538e76
Compare
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #177.
Drops
formatandlong_value_colparameters fromfrs_habitat_overlay(). Accepts only the canonical source-table shape: one row per (segment × species), with join keys + species column + one indicator column per habitat type. Indicator coercion is universal — accepts integer1, text'true'/'t'/'1'(case + whitespace insensitive), boolean.Why drop, not extend
PR #176's first attempt added a
species_colparameter to bolt the new bcfishpass shape onto the existingformat = "wide"dispatch. On review, that compounded the API rather than fixing the design. The earlierformat = c("wide", "long")enum was scoped for two shapes that have no current production consumers:format = "wide"per-species-suffixed (spawning_sk,rearing_sk) — was scoped for direct reads ofbcfishpass.streams_habitat_known; never integratedformat = "long"(habitat_typerows +habitat_indindicator) — was link's pre-2026-04-26 path; bcfishpass moved to a different shape on 2026-04-26Carrying flexibility for two shapes with zero consumers is YAGNI. Drop them; rebuild when a real second consumer with a different shape appears.
What stays parameterized
from,to,bridgespeciesspecies_col"species_code")habitat_typesbyc("blue_line_key", "downstream_route_measure"))Bridge mode (3-way range-containment join) is orthogonal to source shape — unchanged.
Non-canonical sources (transform first)
Callers with non-canonical sources (legacy long format, wide-suffix layout) transform first via a SQL view, R pivot, or upstream adapter, then call overlay against the canonical-shape view. The docstring's third example shows the long-to-canonical pivot. Shape-translation lives with the consumer; fresh stays a thin SQL adapter.
Test plan
devtools::test(filter = "frs_habitat_overlay"): 51 PASS (was 67 with bolt-on)devtools::test()full suite: 815 PASS, 0 FAIL (was 831 with the now-removed paths)/code-check: Clean. All caller-controlled identifiers validated, NULL-safe coercion, additive guard parenthesised.Coordination
0.22.0with this change (breaking, pre-1.0)0.22.0and updateslnk_pipeline_classifycall site (separate link PR after this merges; bumps to link 0.12.0 in the same release window)lnk_ingest_bcfishpass()(canonicalize-at-ingest) lands as a follow-up — not blocking this unblockRelated
link/comms/crate/20260427_fresh_bcfishpass_csv_consumers.md🤖 Generated with Claude Code