fix(cron): schedule missing GDPR retention and orphan cleanup jobs#863
fix(cron): schedule missing GDPR retention and orphan cleanup jobs#8632witstudios merged 5 commits intomasterfrom
Conversation
Two cron endpoints were fully implemented but never scheduled, causing expired personal data and orphaned files to accumulate indefinitely. - Add retention-cleanup at 1 AM UTC daily (sessions, tokens, versions, backups, permissions, monitoring) - Add orphaned file cleanup at 5 AM UTC Sundays (zero-reference file records + physical files) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 14 minutes and 57 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdded two UTC cron jobs (daily retention cleanup and weekly orphaned-files cleanup); orphaned-files cleanup now only deletes DB records for files successfully physically deleted; removed security_audit_log retention logic and related tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Cron as Cron (container)
participant Web as Web API (web:3000)
participant Processor as Processor Service
participant DB as Database
Cron->>Web: GET /api/cron/cleanup-orphaned-files
Web->>DB: query orphaned file records -> orphanIds
loop for each orphanId
Web->>Processor: request physical delete (storagePath)
alt Processor responds OK
Processor-->>Web: 200 OK
Web->>DB: mark id as safeToDelete (collect)
else Processor fails / error / malformed path
Processor-->>Web: non-OK / exception
Web-->>Web: record failure (exclude from safeToDelete)
end
end
alt safeToDelete not empty
Web->>DB: delete file records WHERE id IN safeToDelete
DB-->>Web: returning deleted ids
else no safeToDelete
Web-->>DB: skip deletions (dbDeleted = 0)
end
Web-->>Cron: HTTP response (summary/log)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docker/cron/crontab`:
- Around line 42-44: The scheduled endpoint /api/cron/cleanup-orphaned-files
currently removes DB records even when physical file deletion fails; open the
route handler in cleanup-orphaned-files/route.ts and change the flow so DB
deletions are conditional on successful physical deletes: for each orphaned
record attempt the physical delete with retries/backoff (or mark and enqueue for
retry), collect any failures and abort DB row removal when any physical delete
fails, or mark failed records with a persistent "delete_failed" state instead of
removing them; additionally return a non-2xx/error status when any physical
deletes failed so the cron caller can see the failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
…type Remove security_audit_log from retention cleanup — the tamper-evident hash chain requires infinite retention since the chain verifier has no gap handling and treats missing entries as tampering. Update endgame prototype to reflect the two newly scheduled cron jobs (retention-cleanup at 1am, orphan-cleanup Sundays 5am) and promote orphan cleanup status from amber to green. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address CodeRabbit review: if processor is down and physical file deletion fails, the DB record is now preserved so it retries on the next weekly run instead of creating a permanent storage leak. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Re: CodeRabbit review — non-atomic orphan delete path Good catch. Fixed in a8489b5 — orphans whose physical file deletion fails are now excluded from the DB deletion batch. They'll be rediscovered as orphans on the next weekly run and retried automatically. Specifically in
This prevents the permanent storage leak scenario where a transient processor failure + DB record deletion = unreferenceable physical files. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@prototypes/pagespace-endgame/src/components/panes/CompliancePane.tsx`:
- Around line 175-180: Update the text in the CompliancePane component (the
"Orphaned file cleanup" paragraph) to clarify that DB records are removed only
when the physical file deletion succeeds; replace the phrase "then removes DB
records" with wording such as "and, if the physical deletion succeeds, removes
the corresponding DB records" so the description matches the cron route's
conditional DB deletion behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ba86b7cf-32b1-46ee-a1dc-c673af0d01c4
📒 Files selected for processing (6)
apps/web/src/app/api/cron/cleanup-orphaned-files/route.tspackages/lib/src/compliance/retention/monitoring-retention.test.tspackages/lib/src/compliance/retention/monitoring-retention.tspackages/lib/src/compliance/retention/retention-engine.test.tsprototypes/pagespace-endgame/src/components/panes/CompliancePane.tsxprototypes/pagespace-endgame/src/components/panes/DatabasePane.tsx
Update CompliancePane wording to reflect that DB records are only removed for orphans whose physical file deletion succeeded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Re: CodeRabbit review — orphan cleanup wording Fixed in 70641cd — updated the CompliancePane prototype text to say "attempts physical deletion via the processor service, then removes DB records only for files whose physical deletion succeeded" to match the conditional behavior. |
…l behavior Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Merge origin/master — both the calendar-trigger cron (this branch) and the orphaned-file-cleanup cron (#863) are kept. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update for #865 (Redis export rate limit), #866 (activity log PII exclusion), #867 (activity chain serialization), #868-870 (audit service wiring), #863 (GDPR cron jobs), #861 (password auth removed). Fix code review comments: AI usage log deletion is explicit call not FK cascade; note shared-page assistant messages survive account deletion; resolve P2 #8 and #9 contradictions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update compliance doc and prototype panes to reflect implemented fixes DSAR export, message hard-delete, AI log purge on account deletion, and audit chain verification are now implemented — update stale gap claims. Note in-progress work on pu/hash-chain-pii and pu/export-rate-limit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: reflect merged PRs and fix stale review comments Update for #865 (Redis export rate limit), #866 (activity log PII exclusion), #867 (activity chain serialization), #868-870 (audit service wiring), #863 (GDPR cron jobs), #861 (password auth removed). Fix code review comments: AI usage log deletion is explicit call not FK cascade; note shared-page assistant messages survive account deletion; resolve P2 #8 and #9 contradictions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: fix stale password auth reference, add userId:null caveat Section 7.1 referenced "local email+password auth" — password auth was removed in #861; on-prem now uses magic links + passkeys. GdprPane message deletion cards now note shared-page assistant messages with userId: null may survive account deletion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(prototype): restructure panes so resolved items live in Current Resolved items were sitting in Gaps/End Game with "Fixed"/"Done" labels, breaking the narrative flow. Now: - Current: hash chain integrity, distributed rate limit, message hard-delete all live in their natural subsections - Gaps: only genuine gaps remain (cookie consent, data residency, SIEM, audit coverage, agent trails) - End Game: only future work (no "Done" items cluttering the roadmap) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
/api/cron/retention-cleanupand/api/cron/cleanup-orphaned-files) were fully implemented and authenticated but never scheduled in the Docker crontab — expired personal data and orphaned files accumulated indefinitelysecurity_audit_logfrom retention cleanup — the tamper-evident SHA-256 hash chain requires infinite retention since the chain verifier treats any gap as tampering (no gap-aware mode)Changes
docker/cron/crontabpackages/lib/src/compliance/retention/monitoring-retention.tscleanupSecurityAuditLog()for infinite hash chain retentionpackages/lib/src/compliance/retention/monitoring-retention.test.tspackages/lib/src/compliance/retention/retention-engine.test.tsapps/web/src/app/api/cron/cleanup-orphaned-files/route.tsprototypes/pagespace-endgame/src/components/panes/DatabasePane.tsxprototypes/pagespace-endgame/src/components/panes/CompliancePane.tsxTest plan
/var/log/cron/retention-cleanup.logand/var/log/cron/orphan-cleanup.logpopulate after schedule fires🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Behavior Changes