feat(worker): add priority-based consolidation bank scheduling #1813

Merged
nicoloboschi merged 2 commits into main from feat/consolidation-bank-priority on May 28, 2026

Conversation

@nicoloboschi
Collaborator

Summary

Closes #1715

  • Adds HINDSIGHT_API_WORKER_CONSOLIDATION_BANK_PRIORITY env var to control which banks' consolidation tasks are claimed first when a slot opens
  • Prevents large banks (e.g., 486K-node shadow-meetings) from being starved by many small banks cycling through limited global consolidation slots
  • Uses tiered claiming: each priority level is a separate index-friendly query (no JOINs or computed ORDER BY). The number of queries equals the number of distinct priority tiers (typically 2), and there is zero overhead when the variable is unset
  • Wildcard patterns are supported (e.g., shadow-*:10,staging-*:5,*:1), using LIKE ANY / NOT LIKE ALL in PostgreSQL and expanded to OR/AND clauses for Oracle
  • Bank serialization (max 1 concurrent consolidation per bank) is preserved regardless of priority
  • Backwards compatible: when unset, behavior is identical to current ORDER BY created_at
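As an illustration of the tiered, wildcard-based claiming described above, here is a minimal sketch of how per-tier PostgreSQL predicates could be built. It is not the code from this PR: the helper names (`to_like`, `build_tiers`), the `bank_id` column, the `%s` placeholder style, and the rule that a bank matching several patterns belongs to its highest tier are all assumptions made for the example.

```python
from collections import defaultdict


def to_like(pattern: str) -> str:
    """Translate a '*' wildcard pattern into a SQL LIKE pattern."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch in ("%", "_", "\\"):
            out.append("\\" + ch)  # escape LIKE metacharacters (default escape is backslash)
        else:
            out.append(ch)
    return "".join(out)


def build_tiers(priorities: dict) -> list:
    """Return (priority, where_clause, params) tuples, highest priority first."""
    by_level = defaultdict(list)
    for pattern, level in priorities.items():
        by_level[level].append(to_like(pattern))

    tiers = []
    higher = []  # LIKE patterns already covered by higher tiers
    for level in sorted(by_level, reverse=True):
        likes = sorted(by_level[level])
        where = "bank_id LIKE ANY(%s)"  # hypothetical column name
        params = [likes]
        if higher:
            # Assumption: a bank matched by a higher tier is excluded from lower ones.
            where += " AND bank_id NOT LIKE ALL(%s)"
            params.append(list(higher))
        tiers.append((level, where, params))
        higher.extend(likes)
    return tiers


for level, where, params in build_tiers({"shadow-*": 10, "staging-*": 5, "*": 1}):
    print(level, where, params)
```

Each tuple corresponds to one query, so the number of queries equals the number of distinct priority values, and an empty mapping yields no tiers at all.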

Design decision: priority queues vs per-bank slot reservation

The original issue proposed per-bank slot allocation (SLOTS_PER_BANK). We chose priority-based scheduling instead because:

  • Consolidation batches are short-lived (capped at 100 memories, then rescheduled), so the real problem is scheduling order, not slot reservation
  • Priority queues waste no capacity — when a high-priority bank has no pending work, other banks use the slots freely
  • This matches standard distributed task system patterns (Celery, SQS, Sidekiq)

Test plan

  • 8 unit tests for config parsing (TestParseBankPriority)
  • 5 integration tests against real PostgreSQL (TestConsolidationBankPriority):
    • High-priority bank claimed before low-priority despite newer created_at
    • Wildcard pattern matching (shadow-* matches shadow-meetings, shadow-people)
    • Catch-all default for unlisted banks
    • Backwards compat when priority unset
    • Priority respects bank serialization (busy high-priority bank still excluded)
  • All 80 existing worker tests pass with no regressions
  • Lint passes

Add HINDSIGHT_API_WORKER_CONSOLIDATION_BANK_PRIORITY env var to control
which banks' consolidation tasks are claimed first when a slot opens.
This prevents large banks from being starved by many small banks cycling
through limited global consolidation slots.

Format: comma-separated bank-pattern:priority pairs (higher = claimed first).
Patterns support * wildcards; bare * is the catch-all default.
Example: "shadow-*:10,staging-*:5,*:1"
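To make the format concrete, the following is a minimal parsing sketch. The function name `parse_bank_priority`, its return type, and its error handling are assumptions for illustration; the PR's own parser (covered by TestParseBankPriority) may differ.

```python
from typing import Optional


def parse_bank_priority(raw: Optional[str]) -> dict:
    """Parse 'pattern:priority' pairs; an unset or empty value yields {}."""
    if not raw or not raw.strip():
        return {}
    result = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        # Split on the last ':' so the priority is always the final field.
        pattern, sep, level = item.rpartition(":")
        pattern = pattern.strip()
        if not sep or not pattern:
            raise ValueError(f"expected 'pattern:priority', got {item!r}")
        result[pattern] = int(level)  # raises ValueError on a non-integer priority
    return result


print(parse_bank_priority("shadow-*:10,staging-*:5,*:1"))
```

Returning an empty mapping for an unset variable lets callers fall back to the plain created_at ordering without a separate flag.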

Implementation uses tiered claiming — each priority level is a separate
index-friendly query, no JOINs or computed ORDER BY. Bank serialization
(max 1 concurrent consolidation per bank) is preserved.
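The claiming behaviour can be modelled in memory to show how priority and bank serialization interact. This is a simplified sketch, not the worker's SQL path: `Task`, `claim_next`, the integer `created_at`, and the choice to consider banks matching no pattern last are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    bank: str
    created_at: int  # stand-in for a timestamp


def matches(pattern: str, bank: str) -> bool:
    """True if bank matches pattern, where '*' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, bank, flags=re.DOTALL) is not None


def bank_priority(bank: str, priorities: dict):
    """Highest priority among matching patterns, or None if nothing matches."""
    levels = [lvl for pat, lvl in priorities.items() if matches(pat, bank)]
    return max(levels) if levels else None


def claim_next(pending, busy_banks, priorities):
    """Pick the next task: highest tier first, oldest first within a tier."""
    # Bank serialization: never claim for a bank that is already consolidating.
    eligible = [t for t in pending if t.bank not in busy_banks]
    if not priorities:
        # Priority unset: plain created_at ordering.
        return min(eligible, key=lambda t: t.created_at, default=None)
    for level in sorted(set(priorities.values()), reverse=True):
        tier = [t for t in eligible if bank_priority(t.bank, priorities) == level]
        if tier:
            return min(tier, key=lambda t: t.created_at)
    # Assumption: banks matching no pattern are considered after all tiers.
    rest = [t for t in eligible if bank_priority(t.bank, priorities) is None]
    return min(rest, key=lambda t: t.created_at, default=None)


pending = [Task("small-1", 1), Task("shadow-meetings", 5), Task("staging-a", 3)]
prio = {"shadow-*": 10, "staging-*": 5, "*": 1}
print(claim_next(pending, set(), prio))                # shadow-meetings, despite newer created_at
print(claim_next(pending, {"shadow-meetings"}, prio))  # staging-a, since the shadow bank is busy
```

When the high-priority bank is busy or has no pending work, the slot goes to the next tier, so no capacity sits idle.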
@nicoloboschi merged commit cf63779 into main on May 28, 2026
71 of 72 checks passed

Development

Successfully merging this pull request may close these issues.

Add per-bank consolidation slot allocation (HINDSIGHT_API_WORKER_CONSOLIDATION_SLOTS_PER_BANK)