Skip to content

feat(assertion): detect library/artists artist_name drift#1302

Merged
jakebromberg merged 1 commit into
mainfrom
bs-1092-trigram-drift-assertion
Jun 3, 2026
Merged

feat(assertion): detect library/artists artist_name drift#1302
jakebromberg merged 1 commit into
mainfrom
bs-1092-trigram-drift-assertion

Conversation

@jakebromberg
Copy link
Copy Markdown
Member

Closes #1092. Slot 3 of #1279.

Summary

The existing soft-warning assertion on the catalog-search path covered the NULL leak path that 503'd catalog search on 2026-04-30, but only the NULL case. The same denormalization can drift quietly: any write to artists.artist_name that bypasses migration 0060's cascade trigger (ad-hoc admin SQL, a future logical-replication consumer, an admin UI touching artists directly) leaves library.artist_name stale. The catalog list view (library_artist_view's INNER JOIN) shows the new name; trigram search reads the denorm and silently returns nothing — invisible from Sentry until a DJ surfaces it.

This adds checkLibraryArtistNameDrift, a cheap indexed JOIN with IS DISTINCT FROM, memoized once per process, that surfaces a Sentry warning (tags tool=library-search, step=drift-check) with a sampled set of offending library_ids when drift > 0. Same soft-warning contract as the NULL check — never throws to the consumer; search continues to serve under any degraded state. The wrapper checkLibraryArtistNameHealth fans out to both checks via Promise.all so the existing fire-and-observe call sites in library.service.ts exercise both observability paths post-startup.

Files

  • apps/backend/services/library-artist-name-assertion.service.ts — add checkLibraryArtistNameDrift, fan out from the existing checkLibraryArtistNameHealth wrapper, extract runNullCheck from the prior monolithic runCheck. Module-level doc updated to describe the two-probe shape.
  • tests/unit/services/library-artist-name-assertion.test.ts — new checkLibraryArtistNameDrift describe (7 cases: clean, dirty with sample IDs, string-bigint coercion, missing sample_ids, memoization, cache-on-error, soft-warning contract); existing checkLibraryArtistNameHealth cases updated to mock both legs.
  • tests/mocks/database.mock.ts — expose artists.id and artists.artist_name for the JOIN in the new query.

Test plan

  • npm run test:unit — 2684 pass (16 in this file, up from 8)
  • npm run lint — 0 errors (478 pre-existing warnings unchanged)
  • npm run typecheck — clean
  • npx prettier --check <touched files> — clean
  • Remote CI green

…1092)

The existing assertion sweep covered the NULL leak path that 503'd catalog search on 2026-04-30, but only the NULL case. The same denormalization can drift quietly: any write to artists.artist_name that bypasses migration 0060's cascade trigger (ad-hoc admin SQL, a future logical-replication consumer, an admin UI touching artists directly) leaves library.artist_name stale. The catalog list view shows the new name via the JOIN; trigram search reads the denorm and silently returns nothing — invisible from Sentry until a DJ surfaces it.

Adds checkLibraryArtistNameDrift: a cheap indexed JOIN with IS DISTINCT FROM, memoized once per process, that surfaces a Sentry warning (tags tool=library-search, step=drift-check) with a sampled set of offending library_ids when drift > 0. Same soft-warning contract as the NULL check — never throws to the consumer; search continues to serve under any degraded state. The wrapper checkLibraryArtistNameHealth fans out to both checks via Promise.all so a single fire-and-observe call exercises both observability paths post-startup.
@jakebromberg jakebromberg merged commit 070e8cc into main Jun 3, 2026
5 checks passed
jakebromberg added a commit that referenced this pull request Jun 3, 2026
… doesn't 503 search (closes BS#1310)

Wraps both inner checks in `Promise.allSettled` so an inner-check rejection (e.g. drift query exceeding DB_STATEMENT_TIMEOUT_MS under load) does not propagate to catalog search callers — same failure shape as the 2026-04-30 OPN incident (BS#685), reintroduced via the larger query surface of the drift leg added in PR #1302 (BS#1092).

Also removes the `.catch(() => { _xPromise = null })` storm-pattern: a failed check now memoizes its rejection for the lifetime of the process, so a persistent timeout condition does not trigger a re-run of the expensive JOIN on every subsequent search call. Operator restart is the retry mechanism, matching the pre-PR #1302 behavior of the NULL check. Direct callers of `checkLibraryArtistNameDrift` still see the raw rejection (and re-await returns the memoized rejected promise) so opt-in consumers retain visibility; the wrapper absorbs both. Docstring updated to match.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

library_artist_view and trigram-search drift on ad-hoc rename: assertion only checks NULL, not value drift

1 participant