Config CSV provenance + pipeline run stamps (close the drift loop) #40

## Problem

Pipeline outputs drift silently when underlying inputs change: CSV syncs, fwapg refreshes, bcfishobs updates. On 2026-04-22 a 0.4 pp shift in the BT rearing diff vs bcfishpass looked like a refactor regression. The legacy script run against the same DB produced identical numbers, so the drift came entirely from environment state changes between the earlier comparison run (2026-04-15) and the 2026-04-22 run. Without a stamp of all inputs at run time, "what changed?" is unanswerable.

This issue is about closing that loop end-to-end: every config CSV carries provenance; every pipeline run emits a stamp; drift between any two runs is diffable from their stamps alone.

## Proposed Solution

Two layers:

### 1. Config-bundle provenance (at rest)

Extend `inst/extdata/configs/<name>/config.yaml` with a `provenance` section per synced file:

```yaml
provenance:
  overrides/user_modelled_crossing_fixes.csv:
    source: https://github.com/smnorris/bcfishpass
    path: data/user_modelled_crossing_fixes.csv
    upstream_sha: ea3c5d8
    synced: 2026-04-13
    checksum: sha256:ab12cd34...
  rules.yaml:
    generated_from: dimensions.csv
    generated_by: lnk_rules_build
    generator_sha: 8266b52
  dimensions.csv:
    source: link (hand-authored)
    upstream_sha: 8266b52
    synced: 2026-04-13
```

`lnk_config()` reads this and exposes `cfg$provenance`. A `lnk_config_verify()` helper recomputes the sha256 of each provenanced file on load and warns if any file has drifted from its stored checksum.
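A minimal sketch of the verification step, to make the intended behaviour concrete. It is not the package's implementation: `verify_provenance_sketch()` and its `root` argument are hypothetical names, the `digest` package is an assumed dependency, and skipping entries that carry no `checksum` (such as `rules.yaml` above) is an assumption rather than a decided rule.

```r
# Hypothetical sketch of what lnk_config_verify() could do.
# Assumes the 'digest' package is installed.
verify_provenance_sketch <- function(provenance, root) {
  # Assumption: only entries with a stored checksum are verified.
  has_sum <- vapply(provenance, function(p) !is.null(p$checksum), logical(1))
  files <- names(provenance)[has_sum]

  rows <- lapply(files, function(f) {
    stored <- sub("^sha256:", "", provenance[[f]]$checksum)
    path <- file.path(root, f)
    # file = TRUE tells digest() to hash the file contents, not the path string.
    actual <- if (file.exists(path)) {
      digest::digest(path, algo = "sha256", file = TRUE)
    } else {
      NA_character_
    }
    data.frame(
      file = f, stored = stored, actual = actual,
      drifted = is.na(actual) || !identical(stored, actual)
    )
  })

  if (length(rows) == 0) {
    return(data.frame(file = character(0), stored = character(0),
                      actual = character(0), drifted = logical(0)))
  }

  out <- do.call(rbind, rows)
  if (any(out$drifted)) {
    warning("Config files drifted from stored checksum: ",
            paste(out$file[out$drifted], collapse = ", "))
  }
  out
}
```

Returning a data frame rather than only warning keeps the result inspectable, so a run stamp could later embed the per-file verification outcome.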

### 2. Run stamps (at run)

Every pipeline invocation emits a run-stamp object recording:

- `cfg$provenance` (the "at rest" state of every input CSV)
- Software versions + git SHAs: link, fresh, bcfishpass, fwapg
- DB snapshot fingerprints: bcfishobs row count, `fwa_stream_networks_sp` last-vacuum time or relfilenode, bcfishpass reference row counts per species
- AOI + schema + break_order + any `species =` override
- Start/end timestamps, elapsed per phase
- The resulting comparison tibble (if a reference was provided)

Implementation options:

- **Expand `lnk_stamp()` (#24)** from "report appendix markdown" into "runtime reproducibility record." The report-appendix flavour becomes one rendering of the same underlying stamp object (`as.markdown(stamp)`).
- Emit the run stamp as the return value of a forthcoming `lnk_pipeline_run()` wrapper (not built yet; pipelines are currently composed explicitly via `lnk_pipeline_*` phase calls).
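As a rough illustration of the "plain list plus one markdown rendering" idea, the sketch below assembles a stamp and renders it. All names here (`stamp_sketch()`, `stamp_as_markdown()`, the field names, the `pkgs` and `db` arguments) are hypothetical and do not reflect the current `lnk_stamp()` signature; DB fingerprints are passed in as an already-collected list, and per-phase timings are left out.

```r
# Hypothetical sketch: a run stamp as a plain, inspectable list.
stamp_sketch <- function(provenance, aoi, schema,
                         pkgs = c("stats", "utils"),
                         db = list(), started = Sys.time()) {
  list(
    provenance = provenance,
    # Installed version of each named package; in real use this would be
    # the pipeline's own dependencies rather than base packages.
    software = lapply(stats::setNames(pkgs, pkgs),
                      function(p) as.character(utils::packageVersion(p))),
    r_version = R.version.string,
    db = db,  # e.g. row counts gathered by the caller
    run = list(aoi = aoi, schema = schema),
    started = format(started, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
  )
}

# One possible rendering of the same object, as lines of markdown.
stamp_as_markdown <- function(stamp) {
  c(
    "## Run stamp",
    sprintf("- started: %s", stamp$started),
    sprintf("- aoi: %s", stamp$run$aoi),
    sprintf("- schema: %s", stamp$run$schema),
    "### Software",
    sprintf("- %s: %s", names(stamp$software), unlist(stamp$software)),
    "### Inputs",
    sprintf("- %s", names(stamp$provenance))
  )
}
```

A verification script could then call `writeLines(stamp_as_markdown(stamp))` at the top of its log while keeping the list itself for later comparison.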

## Scope for a first PR

1. Add a `provenance` block to `inst/extdata/configs/bcfishpass/config.yaml` for every file currently tracked. Backfill with the smnorris/bcfishpass SHA recorded in the research doc (`ea3c5d8`, synced 2026-04-13).
2. Add `cfg$provenance` to the `lnk_config()` return.
3. Add `lnk_config_verify(cfg)` — recomputes sha256 of every provenanced file, reports drift.
4. Expand `lnk_stamp()` (reusing #24's scope) to produce a runtime-stamp list that merges `cfg$provenance` with runtime software + DB snapshot info.
5. Wire the stamp into the top of `data-raw/compare_bcfishpass.R` output — every verification log starts with a stamp dump.

## Non-goals

- Full diff-viewer tool — just capturing the data; diffing two stamps is a later concern.
- CSV auto-sync from upstream — that's a cron/maintenance issue, not a library one.
- Machine-readable schema that all downstream packages must consume — keep the stamp inspectable as a plain list.

## Cross-refs

- Supersedes the narrow scope of #24 (report-appendix) by making that one render of this broader stamp.
- Feeds PR 2 of #38 — the `_targets.R` target function should return `(diff_tibble, stamp)` so `tar_read(rollup)` carries full lineage.
- Related to the "always stamp verification logs" memory added on 2026-04-22.

## Versions

- fresh: 0.14.0
- link: 0.3.0 (on branch `38-targets-pipeline`)
- bcfishpass: ea3c5d8
- fwapg: Docker (FWA 20240830, channel_width synced from tunnel 2026-04-13)

