chore(key_value): prune expired entries from the key-value store #40663
The metastore-backed key-value store records an `expires_on` timestamp for entries written with a timeout (for example, the metastore cache backend). Unlike cache backends that evict on read, the metastore does not remove rows on its own, so expired entries accumulate in the `key_value` table over time. Add a `KeyValuePruneCommand` that deletes entries whose expiry has passed, expose it as a `prune_key_value` Celery task mirroring the existing log/task prune tasks, and add a commented-out beat schedule entry following the same convention used for `prune_logs` and `prune_tasks`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
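For orientation, here is a minimal sketch of what the opt-in beat schedule entry might look like once uncommented. The task name `prune_key_value`, the `max_rows_per_run` kwarg, and the daily-at-midnight schedule come from this PR; the cap value and the exact layout of the surrounding `CeleryConfig` class are assumptions made for illustration, not a copy of `superset/config.py`.

```python
from celery.schedules import crontab


class CeleryConfig:
    # Other Celery settings (broker, imports, ...) are omitted from this sketch.
    beat_schedule = {
        "prune_key_value": {
            "task": "prune_key_value",
            # Run once a day at midnight.
            "schedule": crontab(minute=0, hour=0),
            # Hypothetical cap: this value is an assumption, not taken from the PR.
            "kwargs": {"max_rows_per_run": 10000},
        },
    }
```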
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #40663 +/- ##
==========================================
- Coverage 63.94% 63.92% -0.03%
==========================================
Files 2658 2659 +1
Lines 143011 143116 +105
Branches 32866 32881 +15
==========================================
+ Hits 91454 91491 +37
- Misses 49994 50057 +63
- Partials 1563 1568 +5
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@rusackas I think adding a purge job is fine, but the |
ee03d89 to 2367a44
Capture a single cutoff timestamp before selection and re-apply the expiry predicate on the batched DELETE so an entry refreshed between selection and deletion is not pruned. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
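The race this commit closes can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module and a toy `key_value` table rather than Superset's SQLAlchemy models, so `prune_expired`, its parameters, and the schema are hypothetical stand-ins and not the actual `KeyValuePruneCommand`. The point is the shape: capture one cutoff, select a batch of ids oldest-first, then repeat the expiry predicate in the `DELETE` so a row refreshed in between is skipped.

```python
import sqlite3
from datetime import datetime


def prune_expired(conn, batch_size=1000, max_rows_per_run=None, now=None):
    """Delete expired rows in batches and return how many were deleted."""
    # Capture the cutoff once so every batch compares against the same instant.
    cutoff = (now or datetime.now()).isoformat(timespec="microseconds")
    deleted = 0
    while max_rows_per_run is None or deleted < max_rows_per_run:
        limit = batch_size
        if max_rows_per_run is not None:
            limit = min(limit, max_rows_per_run - deleted)
        # Oldest-first, deterministic selection; rows without expiry are ignored.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM key_value "
                "WHERE expires_on IS NOT NULL AND expires_on <= ? "
                "ORDER BY expires_on, id LIMIT ?",
                (cutoff, limit),
            )
        ]
        if not ids:
            break
        placeholders = ",".join("?" * len(ids))
        # Re-apply the expiry predicate: a row refreshed after selection now
        # has a later expires_on and is therefore left alone by the DELETE.
        cur = conn.execute(
            f"DELETE FROM key_value WHERE id IN ({placeholders}) "
            "AND expires_on IS NOT NULL AND expires_on <= ?",
            (*ids, cutoff),
        )
        conn.commit()
        deleted += cur.rowcount
    return deleted
```

Timestamps are stored here as fixed-width ISO-8601 strings so that string comparison matches chronological order; that is a property of this toy schema, not a claim about how the metastore stores `expires_on`.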
Bito Automatic Review Skipped – PR Already Merged
SUMMARY
The metastore-backed key-value store records an `expires_on` timestamp for entries written with a timeout, for example the `SupersetMetastoreCache` backend used by `filter_state` and `explore_form_data`. Unlike cache backends that evict on read (e.g. Redis), the SQL metastore does not remove rows on its own, so once an entry's TTL passes the row simply stays in the `key_value` table. Over time these expired rows accumulate and the table only grows. This was raised by a security audit (data-retention / automatic-deletion, ASVS 14.2.7) noting that expired-entry cleanup was only triggered opportunistically on write.
This adds routine housekeeping to clean them up:
- `KeyValuePruneCommand` (`superset/key_value/commands/prune.py`) deletes entries whose `expires_on` is in the past, in batches, mirroring the existing `LogPruneCommand` / `TaskPruneCommand` shape (batched `IN`-clause deletes, optional `max_rows_per_run` cap, oldest-first deterministic ordering, progress logging). Entries with no expiry (`expires_on IS NULL`) are left untouched.
- `prune_key_value` Celery task (`superset/tasks/scheduler.py`) wraps the command, mirroring the existing `prune_logs` / `prune_tasks` tasks.
- A commented-out beat schedule entry in `superset/config.py`, following the exact convention already used for `prune_logs` and `prune_tasks` (opt-in, daily at midnight, with a `max_rows_per_run` kwarg).
The expiry comparison uses naive `datetime.now()` to stay consistent with how the metastore cache writes `expires_on` (`datetime.now() + timedelta(...)`) and how `KeyValueEntry.is_expired()` already compares it.
Documentation & consistency follow-ups
In addition to the scheduled prune, this PR clarifies and tightens the surrounding contract:
- `superset/daos/key_value.py`: documented that `create_entry` intentionally does not purge expired rows. Callers writing a keyed entry with an `expires_on` must call `delete_expired_entries(resource)` once up front. Purging is deliberately hoisted out of `create_entry` so that a transaction creating many entries pays the cleanup cost only once (not per insert), and so that an expired entry still occupying the same key/uuid doesn't fail the unique constraint. Also documented that `upsert_entry` overwrites (so it needs no prior purge) and that `update_entry` raises when no entry exists.
- `superset/key_value/commands/prune.py`: the prune query's expiry comparison changed from `<` to `<=` so the scheduled prune matches `KeyValueEntry.is_expired()` and `KeyValueDAO.delete_expired_entries()` exactly. The three eager-purge call sites (metastore cache `add()`, distributed-lock acquire, OAuth2 PKCE) were verified to already purge before keyed-TTL creates as required.
- `SECURITY.md`: added an Out of Scope note that the continued presence of expired key-value / metastore-cache entries not yet purged is not a vulnerability. Such entries are excluded from reads once expired, purged opportunistically on write, and removed in bulk by the scheduled `prune_key_value` task. This is an eventual-cleanup property, not a security boundary.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Not applicable.
TESTING INSTRUCTIONS
Unit tests at `tests/unit_tests/key_value/prune_test.py` cover: expired rows deleted, non-expired and no-expiry rows retained, empty-store no-op, and the `max_rows_per_run` cap.
To enable the scheduled prune in a deployment, uncomment the `prune_key_value` block in the `CeleryConfig.beat_schedule` in `superset/config.py`.
ADDITIONAL INFORMATION
🤖 Generated with Claude Code
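As a reading aid for the DAO contract described in the summary (purge once up front, then create), here is a rough sketch using the standard-library `sqlite3` module. The schema, the unique constraint on `(resource, key)`, and the functions below are toy stand-ins that only borrow the names `create_entry` and `delete_expired_entries`; they are assumptions for illustration and not Superset's `KeyValueDAO` API.

```python
import sqlite3
from datetime import datetime, timedelta

# Toy schema (assumed): one row per (resource, key), with an optional expiry.
SCHEMA = """
CREATE TABLE key_value (
    id INTEGER PRIMARY KEY,
    resource TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT,
    expires_on TEXT,
    UNIQUE (resource, key)
)
"""


def _ts(moment):
    # Fixed-width ISO-8601 text so string comparison is chronological.
    return moment.isoformat(timespec="microseconds")


def delete_expired_entries(conn, resource, now):
    """Remove every expired row for one resource (uses <=, as in the PR)."""
    conn.execute(
        "DELETE FROM key_value WHERE resource = ? "
        "AND expires_on IS NOT NULL AND expires_on <= ?",
        (resource, _ts(now)),
    )


def create_entry(conn, resource, key, value, expires_on):
    """Insert a row; intentionally does NOT purge expired rows first."""
    conn.execute(
        "INSERT INTO key_value (resource, key, value, expires_on) "
        "VALUES (?, ?, ?, ?)",
        (resource, key, value, _ts(expires_on)),
    )


def create_many(conn, resource, items, ttl, now):
    """Purge once up front, then create every entry in one transaction."""
    delete_expired_entries(conn, resource, now)
    for key, value in items:
        create_entry(conn, resource, key, value, now + ttl)
    conn.commit()
```

In this toy model, calling `create_entry` directly for a key still occupied by an expired row raises `sqlite3.IntegrityError`, while `create_many` succeeds because the single up-front purge has already freed the key.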