data-raw/snapshot_bcfp.sh: manual snapshot of bcfp deps from public sources (v0.31.1)#145

Merged
NewGraphEnvironment merged 2 commits into main from 137-data-raw-manual-snapshot-of-bcfp-depende on May 8, 2026
@NewGraphEnvironment
Owner

Summary

New data-raw/snapshot_bcfp.sh shell script. Loads bcfp dependencies into a local Postgres from public sources only (no SSH tunnel, no DB pg_dump). Prepares the local fwapg for lnk_pipeline_crossings() (link#138, in flight) and parity comparisons.

What it loads

| Table | Source | Loader |
|---|---|---|
| `whse_fish.pscis_*` (4 BCDC tables) | BCDC catalogue | Python `bcdata bc2pg --refresh` |
| `cabd.dams` | CABD's public GeoJSON API (same URL bcfp's `jobs/load_weekly` uses) | `ogr2ogr` |
| `fresh.modelled_stream_crossings` | bchamp `modelled_stream_crossings.gpkg.zip` | `curl` + `ogr2ogr` |
| `bcfishobs.observations` | bchamp `observations.parquet` (same artifact bcfp's `jobs/load_observations` consumes) | `ogr2ogr /vsicurl/...` |
| `fresh.crossings_bcfp` / `fresh.streams_bcfp` (optional, `--with-bcfp-views`) | `s3://newgraph/bcfishpass.{crossings,streams}_vw.fgb.zip` (anonymous read) | `aws s3 cp` + `ogr2ogr` |
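
For orientation, the loader column can be sketched as a dry-run script that only prints the commands. This is not the content of `snapshot_bcfp.sh`. The tool names, `--refresh`, `/vsicurl/`, and the S3 object name come from the table above; the `DATABASE_URL` default, the `example.org` URL, the `/tmp` path, the `-overwrite` flag, and every function name are assumptions made for illustration. PSCIS table names are taken as script arguments because the PR only gives the `whse_fish.pscis_*` pattern.

```shell
# Dry-run sketch: prints the load commands instead of running them.
# ASSUMPTIONS (not from the PR): DATABASE_URL default, OBS_URL placeholder,
# the /tmp target path, the -overwrite flag, and all function names.
DATABASE_URL="${DATABASE_URL:-postgresql://postgres@localhost:5432/fwapg}"
OBS_URL="https://example.org/observations.parquet"  # hypothetical placeholder

# BCDC tables: one bcdata bc2pg call per table, with --refresh as in the PR.
cmd_bc2pg() {
  printf 'bcdata bc2pg %s --db_url %s --refresh\n' "$1" "$DATABASE_URL"
}

# Remote file read through GDAL's /vsicurl/ virtual filesystem.
# $1 = target schema.table, $2 = source URL.
cmd_ogr_remote() {
  printf 'ogr2ogr -f PostgreSQL PG:%s /vsicurl/%s -nln %s -overwrite\n' \
    "$DATABASE_URL" "$2" "$1"
}

# Anonymous S3 download: --no-sign-request skips credential lookup.
cmd_s3_fetch() {
  printf 'aws s3 cp %s %s --no-sign-request\n' "$1" "$2"
}

# PSCIS table names come from the command line (none are hard-coded here).
for table in "$@"; do cmd_bc2pg "$table"; done
cmd_ogr_remote bcfishobs.observations "$OBS_URL"
cmd_s3_fetch s3://newgraph/bcfishpass.crossings_vw.fgb.zip /tmp/crossings_vw.fgb.zip
```

Printing rather than executing keeps the sketch safe to run without a database or network access. Reading Parquet through `ogr2ogr` also requires a GDAL build that includes the Parquet driver.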

Stamps data-raw/logs/bcfp_baselines.csv with the bcfp build identifier from s3://fresh-bc/bcfishpass/log.json via lnk_baseline_append(lnk_bucket_log()).

Documentation

data-raw/README.md: new `## Bootstrap` section covering prereqs (CLI tools + `pip install bcdata`), a quick-start invocation, the output schema list, and a pointer to lnk_pipeline_crossings() (#138) as the consumer.
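
A prerequisite check of the kind the Bootstrap section implies might look like the sketch below. The tool list is inferred from the loaders named in this PR, and `missing_tools` is a hypothetical helper, not something the README or script is known to define.

```shell
# Sketch of a prerequisite check. ASSUMPTION: the tool list is inferred
# from the loaders named in this PR; missing_tools is a hypothetical helper.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v exits non-zero when the tool is not on PATH
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the single leading space left by the concatenation above
  printf '%s\n' "${missing# }"
}

absent="$(missing_tools bcdata ogr2ogr curl aws)"
if [ -n "$absent" ]; then
  echo "missing prerequisites: $absent" >&2
fi
```

Reporting every missing tool at once, instead of stopping at the first, avoids repeated install-and-retry cycles on a fresh machine.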

Test plan

Closes #137.

Relates to NewGraphEnvironment/sred-2025-2026#24

NewGraphEnvironment and others added 2 commits May 8, 2026 06:57
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ublic sources

Loads everything lnk_pipeline_crossings() (#138) and parity comparisons need
into a local fwapg without a tunnel:

- BCDC PSCIS (4 tables) via Python `bcdata bc2pg --refresh`
- CABD dams via ogr2ogr from CABD's public GeoJSON API
- bchamp modelled_stream_crossings.gpkg.zip via curl + ogr2ogr
- bchamp observations.parquet via ogr2ogr /vsicurl/... (matches bcfp's
  jobs/load_observations -- the canonical observations source)
- Optional `--with-bcfp-views`: pulls Simon's bcfp output views
  (crossings_vw, streams_vw) from s3://newgraph for parity comparison
- Stamps data-raw/logs/bcfp_baselines.csv with the bcfp build identifier
  via lnk_baseline_append(lnk_bucket_log())

Documented in data-raw/README.md under a new ## Bootstrap section.

DESCRIPTION 0.31.0 -> 0.31.1 (patch -- data-raw + docs only, no R API change).
Tests no-op pass (808 PASS / 0 FAIL).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@NewGraphEnvironment NewGraphEnvironment merged commit 473146f into main May 8, 2026
1 check passed
@NewGraphEnvironment NewGraphEnvironment deleted the 137-data-raw-manual-snapshot-of-bcfp-depende branch May 8, 2026 14:05
Development

Successfully merging this pull request may close these issues.

data-raw: manual snapshot of bcfp dependencies into local fresh schema
