feat(mediorum): bound ops table via dormant cleanup, gap signal, and opt-in retention #277
Closed
RolfAris wants to merge 1 commit into
…opt-in retention

The mediorum CRUD ops table grows monotonically and is the largest relation on a stock-spec validator (docs.openaudio.org documents 200 GB disk; observed ~84 GB ops at val001). This change makes the table sustainable by addressing three concerns in one PR.

## 1. One-time dormant-table cleanup (default on)

At startup, drop ops rows for CRUD-registered tables whose newest op is older than `OPENAUDIO_MEDIORUM_DORMANT_OPS_THRESHOLD` (default 90d, clamped up to a 24h safety floor). The motivating case is `qm_audio_analyses`: producer code paths for that table no longer write at the historical cadence; observed last op on val001 is 2025-11-07 (~6 months stale at default threshold).

- Only tables registered via `RegisterModels` are eligible. An unregistered orphan (a table whose Go model was removed in a prior PR) is left alone.
- Deletes are batched (10k rows per statement) and bounded by `ulid < cutoffULID`. The cutoff bound is load-bearing: if a producer writes a new op between dormancy classification and a later batch, that op sits above the cutoff and is never touched.
- Idempotent: re-running on an already-cleaned table is a no-op.
- Opt out with `OPENAUDIO_MEDIORUM_KEEP_DORMANT_OPS=true`.
- Future-maintainer note in code: a new CRUD table added with a low write cadence (e.g. quarterly metrics) belongs outside CRUDR or behind a non-default threshold.

## 2. Sweep-handler retention-gap signal (default on, wire-compatible)

This is a correctness fix independent of the retention work itself. Today's `serve_crud.go` + `client.go` pair has a latent silent-skip: when a peer's cursor `after=` is below the server's smallest available ULID, the server returns ops starting just above its floor, and the client sets its cursor to `ops[len(ops)-1].ULID`, silently jumping past the gap with no signal. On any long-running fleet the cursor table already has stalled cursors from peers offline for months or years that this path is silently feeding incomplete history.

- Server: when `after != "" && after < min(ulid)`, set `X-Mediorum-Retention-Gap: true` and `X-Mediorum-Available-Min-Ulid` response headers. Response body is unchanged ops array, so older binaries see no protocol difference and continue with the existing silent-skip behavior; only upgraded clients benefit.
- Client: detect the header, log an explicit operator-visible event, persist the cursor advance to the advertised floor *before* applying any ops (so a crash mid-apply doesn't leave us stuck below the gap), then continue applying the response.
- Client: validate the advertised ulid. A candidate that fails to parse or decodes to a time more than 30 minutes ahead of the local wall clock is rejected; otherwise a hostile or misconfigured peer could silence one of our sweep streams with a forged far-future ulid.
- `crudr.Stats().SweepGapAdvances` increments only on successful cursor persist, so the counter matches durable state.

## 3. Opt-in ongoing retention sweep

Gated entirely by `OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS`. Unset (the default) means archive mode: the goroutine starts, sees the unset config, blocks on `ctx.Done()`, and never deletes. Operators who want bounded ops storage set the env to a positive integer.

The cutoff respects the cursor-floor invariant: no op whose ulid is greater than the slowest reachable peer's cursor (minus a safety margin) may be deleted. Empty cursors block all deletion (a peer that has never advanced is the most conservative cursor). The self-cursor row (if any) is skipped.

Configuration (defaults shown):

| Env | Default | Purpose |
|---|---|---|
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS` | unset | enable sweep at N days |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_SWEEP_INTERVAL` | `1h` | per-tick cadence |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_BATCH_LIMIT` | `10000` | rows per batch |
| `OPENAUDIO_MEDIORUM_OPS_RETENTION_CURSOR_MARGIN` | `1h` | safety floor below slowest cursor |
| `OPENAUDIO_MEDIORUM_DORMANT_OPS_THRESHOLD` | `90d` | dormancy window (clamped >= 24h) |
| `OPENAUDIO_MEDIORUM_KEEP_DORMANT_OPS` | `false` | opt out of one-time cleanup |

Each tick loops up to 10 batches so a backlogged node makes measurable progress without monopolizing the DB (upper bound 100k rows/tick at defaults, 2.4M rows/day at 1h cadence, which exceeds the observed ~1.1M ops/day write rate).

`Crudr.DryRunRetention` returns the same plan without executing any DELETE; operators can call it from a debug endpoint or audius-ctl subcommand before flipping retention on.

## Compatibility

- Default behavior is unchanged. A node that doesn't set `OPENAUDIO_MEDIORUM_OPS_RETENTION_DAYS` keeps every op it has today except dormant-table sediment.
- The gap signal is wire-compatible: response body is unchanged, only two new optional headers. Older binaries ignore them and behave as before.

## Tests

`go test -race -count=10 ./pkg/mediorum/crudr/...` is green (~56s wall-clock, no flakes):

- Dormant cleanup: opt-out, idempotency, active-table protection, unregistered-table protection, context cancellation, race-guard (new op above cutoff during cleanup), batched large-table deletion, threshold-floor clamping.
- Gap signal: server-side classification at/above/below min, empty-after-is-not-a-gap, empty-ops-table, stub round-trip, real `httptest.Server` end-to-end, double-tick suppression, hostile far-future ulid rejected, malformed ulid rejected, IsValidGapULID table tests.
- Retention sweep: disabled-is-no-op, cursor floor, slow peer pins deletion to its cursor minus margin, empty cursor blocks all deletion, safety margin honored, batch limit, multi-table sweep, no-peers age-cutoff-only, self-cursor skipped, ancient cursor pins everything, per-tick max-batches cap, concurrent sweep + delete.
- DryRunRetention: dormant-only preview, retention-skip-on-empty-cursor.
Superseded by #304, which is the same scope as a single squashed commit with the cursor-invariant correctness fixes folded in.