fix(backend): clean stuck manifest backlog#2257
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a bounded PL/pgSQL hard-delete function for old soft-deleted app_versions, an admin Bun/TypeScript script to audit and iteratively call that function, trigger adjustments to avoid throwing on meta-update errors, and SQL/unit tests validating bounded batching and manifest cascade. ChangesManifest Backlog Cleanup System
Sequence Diagram(s)sequenceDiagram
participant Script as Admin Script
participant DB as Postgres
participant Fn as delete_old_deleted_versions
participant Manifest as public.manifest
Script->>DB: pre-cleanup audit queries (size, vacuum, bucket counts)
Script->>Fn: CALL public.delete_old_deleted_versions()
Fn->>DB: DELETE bounded rows -> cascade deletes Manifest rows
DB-->>Script: remaining eligible count
Script->>DB: optional VACUUM (ANALYZE) public.manifest
Script->>DB: post-cleanup audit queries
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Merging this PR will not alter performance
Comparing Footnotes
|
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/on-version-update-cleanup.unit.test.ts (1)
164-208: ⚡ Quick winUse
it.concurrent()for new test cases in this file to enable parallel test execution.The two test cases at lines 164 and 187 should be converted to concurrent form per the coding guidelines for test files.
♻️ Proposed change
- it('still deletes manifest rows when version metadata is missing', async () => { + it.concurrent('still deletes manifest rows when version metadata is missing', async () => { @@ - it('resets manifest counters after deleting manifest entries', async () => { + it.concurrent('resets manifest counters after deleting manifest entries', async () => {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/on-version-update-cleanup.unit.test.ts` around lines 164 - 208, Convert the two synchronous tests that call deleteIt (the ones referencing manifestEntries with ids 456 and 789) to concurrent tests by replacing it(...) with it.concurrent(...); specifically update the test declarations that use the descriptions 'still deletes manifest rows when version metadata is missing' and 'resets manifest counters after deleting manifest entries' so they call it.concurrent and keep the same test body and uses of deleteIt, createContext, createVersion, manifestEntries, and expectations intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/cleanup_stuck_manifest_backlog.ts`:
- Around line 204-208: The DB client config currently disables TLS certificate
validation by setting ssl: shouldUseSsl(databaseUrl) ? { rejectUnauthorized:
false } : undefined; update this so certificate validation is enabled: either
remove the rejectUnauthorized override or set it to true when
shouldUseSsl(databaseUrl) is true. Modify the Client initialization (the Client
constructor call that uses application_name, connectionString, ssl and
databaseUrl) so that ssl is either omitted when not needed or set to {
rejectUnauthorized: true } when shouldUseSsl(databaseUrl) returns true.
---
Nitpick comments:
In `@tests/on-version-update-cleanup.unit.test.ts`:
- Around line 164-208: Convert the two synchronous tests that call deleteIt (the
ones referencing manifestEntries with ids 456 and 789) to concurrent tests by
replacing it(...) with it.concurrent(...); specifically update the test
declarations that use the descriptions 'still deletes manifest rows when version
metadata is missing' and 'resets manifest counters after deleting manifest
entries' so they call it.concurrent and keep the same test body and uses of
deleteIt, createContext, createVersion, manifestEntries, and expectations
intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c2008313-2130-4d55-812f-56f97cc2f7f9
📒 Files selected for processing (6)
package.jsonscripts/cleanup_stuck_manifest_backlog.tssupabase/functions/_backend/triggers/on_version_update.tssupabase/migrations/20260513003104_manifest_cleanup_health.sqlsupabase/tests/16_test_retention.sqltests/on-version-update-cleanup.unit.test.ts
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub. |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
supabase/tests/16_test_retention.sql (1)
296-305: 💤 Low valueDocument the batch size dependency in the test assertion.
The test expects exactly 5 remaining versions after
delete_old_deleted_versions()runs, which depends on a batch size of 500 defined in the migration. While the batch size is documented in the migration's function comment, the test file itself has no explanation for this hard-coded expectation. Add a comment above the assertion clarifying the relationship: "Test expects 505 initial versions minus 500-version batch limit = 5 remaining".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@supabase/tests/16_test_retention.sql` around lines 296 - 305, Add a short explanatory comment above the assertion that documents the batch-size dependency: note that the test starts with 505 initial versions and the migration's batch limit (used by delete_old_deleted_versions) is 500, so the expectation is 505 - 500 = 5 remaining versions; reference the relevant symbols in the comment (delete_old_deleted_versions, remaining_batch_versions, retention_batch_targets) so future readers understand why the hard-coded 5 is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@supabase/tests/16_test_retention.sql`:
- Around line 296-305: Add a short explanatory comment above the assertion that
documents the batch-size dependency: note that the test starts with 505 initial
versions and the migration's batch limit (used by delete_old_deleted_versions)
is 500, so the expectation is 505 - 500 = 5 remaining versions; reference the
relevant symbols in the comment (delete_old_deleted_versions,
remaining_batch_versions, retention_batch_targets) so future readers understand
why the hard-coded 5 is used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8099ccf7-b74a-48b3-b442-32cf418461f1
📒 Files selected for processing (3)
supabase/functions/_backend/triggers/on_version_update.tssupabase/tests/16_test_retention.sqltests/on-version-update-cleanup.unit.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/on-version-update-cleanup.unit.test.ts
|
|
There is still an object-lifecycle gap here: That is fine only if the soft-delete trigger has already removed every manifest object. But this PR is explicitly for a stuck/backlogged cleanup path, and I would either make the backlog script collect candidate |
KCDaemon
left a comment
There was a problem hiding this comment.
Rechecked latest head (4a3669f). This still has an object-lifecycle gap in the apply path.
admin:cleanup-stuck-manifest-backlog --apply calls public.delete_old_deleted_versions(), and that hard-deletes old app_versions; the related manifest rows are then removed by FK cascade. That clears the database backlog, but it does not call the R2/delete path for those manifest objects. After the cascade, the s3_path rows are gone, so there is no durable source left for a later storage cleanup retry or for the deleted-bundle read guard to identify those paths.
The new unit tests cover the normal deleteManifest() function path, but the backlog cleanup script bypasses that path and only calls SQL. To make this safe, the apply job needs to delete/enqueue the manifest objects before removing the rows, or persist a tombstone/path table until storage/cache cleanup is complete. Please add a regression for the actual cleanup script/function path, not just the regular version-delete path. The PR is also merge-conflicted (DIRTY), so keep blocked.



Summary (AI generated)
app_versions_metano longer blocks manifest row cleanup.delete_old_deleted_versions()migration so hard-delete cascades run in 500-version batches.admin:cleanup-stuck-manifest-backlogto audit and clear old stuck manifest rows, then vacuum analyzepublic.manifest.Motivation (AI generated)
public.manifestcan grow when old soft-deleted bundle cleanup is blocked or backlogged. The database should keep the same retention policy while providing a bounded operational path to remove old stuck rows safely.Business Impact (AI generated)
This reduces production database storage pressure and keeps bundle cleanup predictable without changing customer-facing bundle retention behavior. It also gives operators an auditable script to clean existing backlog after deployment.
Test Plan (AI generated)
bun lint:backendbunx eslint --no-ignore scripts/cleanup_stuck_manifest_backlog.tsbunx vitest run tests/on-version-update-cleanup.unit.test.tsPGSSLMODE=disable bunx supabase test db supabase/tests/00-supabase_test_helpers.sql supabase/tests/16_test_retention.sql --db-url postgresql://postgres:postgres@127.0.0.1:54562/postgresbun test:backendbun run admin:cleanup-stuck-manifest-backlog --helpGenerated with AI
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation