feat(flowsheet): explicit metadata_status enum lifecycle (BS#891) #1004
Merged
Replaces the implicit two-column state machine ({metadata_attempt_at,
artwork_url/discogs_url}) with an explicit `metadata_status` enum column
on flowsheet. Five states: pending, enriching, enriched_match,
enriched_no_match, failed_no_retry. Backfill recipe in
docs/flowsheet-metadata-status-backfill.md.
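To make the five-state lifecycle concrete, here is a minimal TypeScript sketch of the states and a transition check. The state names come from this PR; the transition table is an assumption inferred from the state descriptions (for example, that `enriching` can fall back to `pending` on a transient failure), and `canTransition` is a hypothetical helper, not part of the service code.

```typescript
// The five states named in this PR, in the order they are listed.
const METADATA_STATUSES = [
  "pending",
  "enriching",
  "enriched_match",
  "enriched_no_match",
  "failed_no_retry",
] as const;

type MetadataStatus = (typeof METADATA_STATUSES)[number];

// ASSUMPTION: this table is inferred from the state descriptions in the PR,
// not copied from the enrichment consumer.
const ALLOWED_TRANSITIONS: Record<MetadataStatus, readonly MetadataStatus[]> = {
  pending: ["enriching"],
  enriching: ["enriched_match", "enriched_no_match", "failed_no_retry", "pending"],
  enriched_match: [],
  enriched_no_match: [],
  failed_no_retry: [],
};

// Hypothetical helper: true when moving from `from` to `to` is allowed by the table above.
function canTransition(from: MetadataStatus, to: MetadataStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

An explicit table like this keeps the lifecycle reviewable in one place, which is the main thing the implicit two-column encoding lacked.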
Companion `enriching_since timestamptz` column carries the race-claim
timestamp that Epic C C2 (BS#892) will use; the stale-claim recovery sweep
in Epic C C6 (BS#896) will apply a 60s TTL against it. Two partial indexes
ship with the migration: pending-sweep (id-keyed) and stale-enriching
(enriching_since-keyed). Both must be built CONCURRENTLY on prod first
per the 0070/0074 pattern; the migration's IF NOT EXISTS clauses make
the apply a no-op when the prebuilt indexes are present.
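The claim and the 60s stale-claim TTL can be illustrated in memory. This is a sketch under stated assumptions: `FlowsheetRow` is a trimmed-down hypothetical shape, the function names are made up, and reverting a stale claim to `pending` is an assumption about what the recovery sweep does. The real claim would be a conditional UPDATE in the database, not an object mutation.

```typescript
// Hypothetical row shape; only the two new columns matter here.
interface FlowsheetRow {
  id: number;
  metadata_status:
    | "pending"
    | "enriching"
    | "enriched_match"
    | "enriched_no_match"
    | "failed_no_retry";
  enriching_since: Date | null;
}

const STALE_CLAIM_TTL_MS = 60 * 1000; // the 60s TTL named in this PR

// Claim a row only if it is still pending. Returns true when this caller won the claim.
function tryClaim(row: FlowsheetRow, now: Date): boolean {
  if (row.metadata_status !== "pending") return false;
  row.metadata_status = "enriching";
  row.enriching_since = now;
  return true;
}

// A claim is stale once it has been held for longer than the TTL.
function isStaleClaim(row: FlowsheetRow, now: Date): boolean {
  return (
    row.metadata_status === "enriching" &&
    row.enriching_since !== null &&
    now.getTime() - row.enriching_since.getTime() > STALE_CLAIM_TTL_MS
  );
}

// ASSUMPTION: stale claims revert to pending. Returns the ids that were reverted.
function recoverStaleClaims(rows: FlowsheetRow[], now: Date): number[] {
  const reverted: number[] = [];
  for (const row of rows) {
    if (isStaleClaim(row, now)) {
      row.metadata_status = "pending";
      row.enriching_since = null;
      reverted.push(row.id);
    }
  }
  return reverted;
}
```

The stale check reads only `metadata_status` and `enriching_since`, which is why an index keyed on `enriching_since` and restricted to `enriching` rows is enough for the sweep.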
Migration is DDL-only — backfill SQL is documented but not inlined.
Per the bulk-update playbook, the 2.6M-row UPDATE runs in batched ops sessions
with synchronous_commit=off, paired with ANALYZE per BS#934.
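As an illustration of what "batched" can mean here, the sketch below splits an id range into inclusive windows that an ops session could feed to one UPDATE at a time. Batching by id range and the batch size used in the usage note are assumptions; the actual recipe is the one in docs/flowsheet-metadata-status-backfill.md, and `planIdBatches` is a hypothetical helper.

```typescript
// One inclusive id window for a single batched UPDATE.
interface IdBatch {
  fromId: number;
  toId: number;
}

// Split [minId, maxId] into consecutive windows of at most `batchSize` ids.
function planIdBatches(minId: number, maxId: number, batchSize: number): IdBatch[] {
  if (batchSize <= 0) throw new Error("batchSize must be positive");
  const batches: IdBatch[] = [];
  for (let from = minId; from <= maxId; from += batchSize) {
    batches.push({ fromId: from, toId: Math.min(from + batchSize - 1, maxId) });
  }
  return batches;
}
```

With a dense id range of 2.6M and a hypothetical batch size of 50,000, this yields 52 windows, each short enough to commit on its own instead of holding one long transaction.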
V2 wire format now projects metadata_status onto track rows so iOS
(WXYC/wxyc-ios-64#287, #270) can branch on it without falling back to
the proxy-fetch path for already-enriched rows.
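A rough sketch of the projection and the client-side branch follows. The entry shapes are trimmed-down and hypothetical, `projectToV2` stands in for the relevant part of `transformToV2`, and treating both `enriched_match` and `enriched_no_match` as "already enriched" is an assumption about how a client might branch.

```typescript
type MetadataStatus =
  | "pending"
  | "enriching"
  | "enriched_match"
  | "enriched_no_match"
  | "failed_no_retry";

// Hypothetical database-side entry; the real row has many more columns.
interface FlowsheetEntry {
  id: number;
  entry_type: string;
  metadata_status: MetadataStatus;
}

// Hypothetical wire shape; metadata_status is present only on track rows.
interface V2Entry {
  id: number;
  entry_type: string;
  metadata_status?: MetadataStatus;
}

// Project metadata_status onto track rows and leave it off everything else.
function projectToV2(entry: FlowsheetEntry): V2Entry {
  const projected: V2Entry = { id: entry.id, entry_type: entry.entry_type };
  if (entry.entry_type === "track") {
    projected.metadata_status = entry.metadata_status;
  }
  return projected;
}

// ASSUMPTION: both enriched states count as already enriched for the client.
function isAlreadyEnriched(status: MetadataStatus | undefined): boolean {
  return status === "enriched_match" || status === "enriched_no_match";
}
```

A client can then render inline metadata when `isAlreadyEnriched` is true and keep the proxy-fetch path for the remaining track rows.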
Schema-source tests pin the migration <-> schema.ts <-> projection contract.
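A schema-source test reads the migration text and asserts on its shape rather than running it. The sketch below shows the idea with hypothetical helpers and a hypothetical SQL sample: the enum values in the sample follow the order listed in this PR, but the real migration file is not reproduced here, and the comment stripping is deliberately naive.

```typescript
// Hypothetical sample; stands in for the migration file's contents.
const SAMPLE_MIGRATION_SQL = `
-- Prebuild on prod first: CREATE INDEX CONCURRENTLY IF NOT EXISTS ...
CREATE TYPE wxyc_schema.metadata_status_enum AS ENUM(
  'pending', 'enriching', 'enriched_match', 'enriched_no_match', 'failed_no_retry'
);
`;

// Remove block and line comments (naive: ignores comment markers inside string literals).
function stripSqlComments(sql: string): string {
  return sql.replace(/\/\*[\s\S]*?\*\//g, "").replace(/--.*$/gm, "");
}

// Extract the enum labels, in declaration order, for the given type name.
function parseEnumValues(sql: string, typeName: string): string[] {
  const escaped = typeName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `CREATE TYPE\\s+${escaped}\\s+AS\\s+ENUM\\s*\\(([^)]*)\\)`,
    "i"
  );
  const match = pattern.exec(stripSqlComments(sql));
  if (!match || match[1] === undefined) return [];
  return match[1]
    .split(",")
    .map((value) => value.trim().replace(/^'|'$/g, ""))
    .filter((value) => value.length > 0);
}

// True when CONCURRENTLY appears outside comments, which the DDL must avoid.
function hasConcurrentlyInDdl(sql: string): boolean {
  return /\bCONCURRENTLY\b/i.test(stripSqlComments(sql));
}

// True when an UPDATE appears outside comments, i.e. the backfill was inlined.
function hasInlineBackfill(sql: string): boolean {
  return /\bUPDATE\b/i.test(stripSqlComments(sql));
}
```

Stripping comments first matters because the CONCURRENTLY statements are kept in the migration's comment block, so a plain substring search would flag a migration that is actually compliant.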
Schema constraint shape report: no schema or migration changes detected in this PR
Code review
Found 2 issues:
Backend-Service/shared/database/src/schema.ts, lines 749 to 774 in 576c3b5
Backend-Service/docs/flowsheet-metadata-status-backfill.md, lines 4 to 14 in 576c3b5
Backend-Service/docs/migrations.md, lines 102 to 106 in 576c3b5
…k path
Reviewer-flagged drift in cross-references:
- C6 (cron predicate flip to metadata_status='pending') is #895, not #896. #896 is C7 (CDC delivery verification). Five surfaces corrected: schema.ts comments, migration SQL comment, migrations.md, runbook body and Related section.
- Migration SQL comment referenced docs/runbooks/flowsheet-metadata-status-backfill.md but the file is at docs/flowsheet-metadata-status-backfill.md (no runbooks/ subdir exists).
- Also: prettier format fix on docs/flowsheet-metadata-status-backfill.md (markdown table alignment) so CI's format:check passes.
Migration SQL hash refrozen via scripts/freeze-migration-hashes.mjs after the comment-only edit.
This was referenced May 22, 2026
Summary
Replaces the implicit two-column state machine (`metadata_attempt_at` + populated-column inspection) with an explicit `metadata_status` enum on `flowsheet`. Five states encode the full enrichment lifecycle:
- `pending`: never tried OR transient failure (retry-eligible). Default on every new row.
- `enriching`: a consumer instance has claimed this row and is mid-LML-call. Set when Epic C C2 ([C2] Build CDC-driven enrichment consumer (separate worker) #892) flips a row from `pending` to `enriching` via the idempotent claim. `enriching_since` carries the claim timestamp.
- `enriched_match`: LML returned full Discogs metadata; populated columns are authoritative.
- `enriched_no_match`: LML succeeded with no Discogs match; only the synthesized YouTube/Bandcamp/SoundCloud search URLs are populated (post-#873 fallback path; #873 is "Live LML metadata enrichment silently fails on most cold-cache lookups (5 s timeout, no catch-arm fallback)").
- `failed_no_retry`: exceeded the retry budget; terminal.

Companion `enriching_since timestamptz` carries the race-claim timestamp; the C6 (#896) recovery sweep uses a 60s TTL against it to revert stuck rows.

Migration shape
DDL-only. The `CASE`-derived backfill from existing state is documented but not inlined: a 2.6M-row UPDATE inside the migration transaction would block writes for the duration. Backfill recipe at `docs/flowsheet-metadata-status-backfill.md`.
- `CREATE TYPE wxyc_schema.metadata_status_enum AS ENUM(...)`
- `ALTER TABLE flowsheet ADD COLUMN metadata_status ... NOT NULL DEFAULT 'pending'`: constant-default ADD COLUMN is metadata-only on PG11+; no row rewrite, no AccessExclusiveLock beyond the catalog update.
- `ALTER TABLE flowsheet ADD COLUMN enriching_since timestamptz`: nullable, no default.
- `IF NOT EXISTS` (same pattern as 0070 / 0074):
  - `flowsheet_metadata_status_pending_idx` on `(id)` WHERE `entry_type='track' AND artist_name IS NOT NULL AND metadata_status='pending'`: covers the C6 sweep.
  - `flowsheet_metadata_status_enriching_stale_idx` on `(enriching_since)` WHERE `metadata_status='enriching'`: covers the stale-claim recovery sweep.

Ops note: Build both indexes CONCURRENTLY on prod before merging. The migration's `IF NOT EXISTS` clauses make the apply a no-op when the prebuilt indexes are present. Statements for the CONCURRENTLY build are in the migration's comment block.

V2 wire format
`transformToV2` now projects `metadata_status` onto track rows. iOS branches on it to decide whether to render inline metadata or fall back to the proxy-fetch path (WXYC/wxyc-ios-64#287 already shipped the decoder; #270 will ship the consumer logic).

`metadata_attempt_at` stays as a historical marker: once Epic C C6 (#896) flips the cron predicate to `metadata_status = 'pending'`, it's no longer used for control flow. The `metadata_attempt_at` partials in `schema.ts` and the corresponding cron query are not removed in this PR; they get dropped alongside the cron flip.

Test plan
- `tests/unit/database/schema.flowsheet-metadata-status.test.ts`: 14 schema-source tests pin migration + `schema.ts` to the documented enum order, the constant-default ADD COLUMN shape, the two partial-index predicates, the no-CONCURRENTLY-in-DDL rule, and the no-inline-backfill rule.
- `tests/unit/services/flowsheet.transformToV2.metadata-status.test.ts`: 6 unit tests covering all 5 enum values projected onto track rows + the non-track exclusion regression guard.
- `node scripts/validate-migrations.mjs` clean.
- `npm run lint` introduces no new warnings/errors on the touched files.
- `npm run format:check` clean.

Critical-path next steps (out of scope for this PR)
- `docs/flowsheet-metadata-status-backfill.md`.
- `metadata_attempt_at_pending_*` partials.

Related
`enriched_no_match` reachability requirement), Suggest endpoints regress to 5s timeouts after mojibake migration #863 (missing ANALYZE on touched tables) #934 (post-bulk-update `ANALYZE` rule)