Skip to content

chore(key_value): prune expired entries from the key-value store#40663

Merged
villebro merged 4 commits into
masterfrom
chore/prune-expired-metastore-cache
Jun 2, 2026
Merged

chore(key_value): prune expired entries from the key-value store#40663
villebro merged 4 commits into
masterfrom
chore/prune-expired-metastore-cache

Conversation

@rusackas
Copy link
Copy Markdown
Member

@rusackas rusackas commented Jun 2, 2026

SUMMARY

The metastore-backed key-value store records an expires_on timestamp for entries written with a timeout — for example, the SupersetMetastoreCache backend used by filter_state and explore_form_data. Unlike cache backends that evict on read (e.g. Redis), the SQL metastore does not remove rows on its own, so once an entry's TTL passes the row simply stays in the key_value table. Over time these expired rows accumulate and the table only grows. This was raised by a security audit (data-retention / automatic-deletion, ASVS 14.2.7) noting that expired-entry cleanup was only triggered opportunistically on write.

This adds routine housekeeping to clean them up:

  • KeyValuePruneCommand (superset/key_value/commands/prune.py) deletes entries whose expires_on is in the past, in batches, mirroring the existing LogPruneCommand / TaskPruneCommand shape (batched IN-clause deletes, optional max_rows_per_run cap, oldest-first deterministic ordering, progress logging). Entries with no expiry (expires_on IS NULL) are left untouched.
  • prune_key_value Celery task (superset/tasks/scheduler.py) wraps the command, mirroring the existing prune_logs / prune_tasks tasks.
  • Commented-out beat schedule entry in superset/config.py, following the exact convention already used for prune_logs and prune_tasks (opt-in, daily at midnight, with a max_rows_per_run kwarg).

The expiry comparison uses naive datetime.now() to stay consistent with how the metastore cache writes expires_on (datetime.now() + timedelta(...)) and how KeyValueEntry.is_expired() already compares it.

Documentation & consistency follow-ups

In addition to the scheduled prune, this PR clarifies and tightens the surrounding contract:

  • DAO docstrings (superset/daos/key_value.py): documented that create_entry intentionally does not purge expired rows. Callers writing a keyed entry with an expires_on must call delete_expired_entries(resource) once up front. Purging is deliberately hoisted out of create_entry so a transaction creating many entries pays the cleanup cost only once (not per insert), and so an expired entry still occupying the same key/uuid doesn't fail the unique constraint. Also documented that upsert_entry overwrites (so needs no prior purge) and that update_entry raises when no entry exists.
  • Operator alignment (superset/key_value/commands/prune.py): the prune query's expiry comparison changed from < to <= so the scheduled prune matches KeyValueEntry.is_expired() and KeyValueDAO.delete_expired_entries() exactly. The three eager-purge call sites (metastore cache add(), distributed-lock acquire, OAuth2 PKCE) were verified to already purge before keyed-TTL creates as required.
  • Threat model (SECURITY.md): added an Out of Scope note that the continued presence of expired key-value / metastore-cache entries not yet purged is not a vulnerability — such entries are excluded from reads once expired, purged opportunistically on write, and removed in bulk by the scheduled prune_key_value task. This is an eventual-cleanup property, not a security boundary.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Not applicable.

TESTING INSTRUCTIONS

Unit tests at tests/unit_tests/key_value/prune_test.py cover: expired rows deleted, non-expired and no-expiry rows retained, empty-store no-op, and the max_rows_per_run cap.

pytest tests/unit_tests/key_value/

To enable the scheduled prune in a deployment, uncomment the prune_key_value block in the CeleryConfig.beat_schedule in superset/config.py.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

🤖 Generated with Claude Code

The metastore-backed key-value store records an `expires_on` timestamp for
entries written with a timeout (for example, the metastore cache backend).
Unlike cache backends that evict on read, the metastore does not remove rows
on its own, so expired entries accumulate in the `key_value` table over time.

Add a `KeyValuePruneCommand` that deletes entries whose expiry has passed,
expose it as a `prune_key_value` Celery task mirroring the existing log/task
prune tasks, and add a commented-out beat schedule entry following the same
convention used for `prune_logs` and `prune_tasks`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@rusackas rusackas requested review from dpgaspar and hainenber June 2, 2026 02:01
@dosubot dosubot Bot added the infra:caching Infra setup and configuration related to caching label Jun 2, 2026
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented Jun 2, 2026

Code Review Agent Run #e30a72

Actionable Suggestions - 0
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/key_value/commands/prune.py - 1
Review Details
  • Files reviewed - 5 · Commit Range: a4245dc..a4245dc
    • superset/config.py
    • superset/key_value/commands/__init__.py
    • superset/key_value/commands/prune.py
    • superset/tasks/scheduler.py
    • tests/unit_tests/key_value/prune_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 33.33333% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.92%. Comparing base (41da35e) to head (591e7ba).
⚠️ Report is 30 commits behind head on master.

Files with missing lines Patch % Lines
superset/key_value/commands/prune.py 33.33% 26 Missing ⚠️
superset/tasks/scheduler.py 33.33% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40663      +/-   ##
==========================================
- Coverage   63.94%   63.92%   -0.03%     
==========================================
  Files        2658     2659       +1     
  Lines      143011   143116     +105     
  Branches    32866    32881      +15     
==========================================
+ Hits        91454    91491      +37     
- Misses      49994    50057      +63     
- Partials     1563     1568       +5     
Flag Coverage Δ
hive 39.74% <33.33%> (-0.02%) ⬇️
mysql 58.36% <33.33%> (-0.04%) ⬇️
postgres 58.43% <33.33%> (-0.04%) ⬇️
presto 41.34% <33.33%> (-0.02%) ⬇️
python 59.92% <33.33%> (-0.04%) ⬇️
sqlite 58.10% <33.33%> (-0.04%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread superset/key_value/commands/prune.py
Comment thread superset/key_value/commands/prune.py
Comment thread superset/key_value/commands/prune.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented Jun 2, 2026

Code Review Agent Run #904f9f

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: a4245dc..c56d699
    • superset/tasks/scheduler.py
    • tests/unit_tests/key_value/prune_test.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@villebro
Copy link
Copy Markdown
Member

villebro commented Jun 2, 2026

@rusackas I think adding a purge job is fine, but the KeyValueDAO already excludes expired entries on get, and exposes delete_expired_entries which are used wherever a create is called (the commands are supposed to handle this). So in practice, the purge job would be redundant. So this is a known design decision. If that design decision needs to be revisited I'm all for reviewing it.

@rusackas rusackas moved this from Needs Review to In Review in Superset Review Help Wanted Jun 2, 2026
@villebro villebro force-pushed the chore/prune-expired-metastore-cache branch from ee03d89 to 2367a44 Compare June 2, 2026 18:28
@rusackas rusackas added the merge-if-green If approved and tests are green, please go ahead and merge it for me label Jun 2, 2026
Capture a single cutoff timestamp before selection and re-apply the
expiry predicate on the batched DELETE so an entry refreshed between
selection and deletion is not pruned.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@rusackas rusackas removed the merge-if-green If approved and tests are green, please go ahead and merge it for me label Jun 2, 2026
@villebro villebro merged commit 8508af3 into master Jun 2, 2026
61 checks passed
@github-project-automation github-project-automation Bot moved this from In Review to Approved and/or Merged in Superset Review Help Wanted Jun 2, 2026
@villebro villebro deleted the chore/prune-expired-metastore-cache branch June 2, 2026 19:36
@bito-code-review
Copy link
Copy Markdown
Contributor

Bito Automatic Review Skipped – PR Already Merged

Bito scheduled an automatic review for this pull request, but the review was skipped because this PR was merged before the review could be run.
No action is needed if you didn't intend to review it. To get a review, you can type /review in a comment and save it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

infra:caching Infra setup and configuration related to caching preset-io size/L

Projects

Status: Approved and/or Merged

Development

Successfully merging this pull request may close these issues.

3 participants