fix: speed up scheduler queue views #728
Maintain fair-queue group counts and resource demand as tasks enter and leave the ready queue, so QueueView creation no longer scans every queued task in scheduler hot paths.

Add regression coverage for queue accounting after discard/commit and for avoiding full queued-task value scans.

Fixes #724

Signed-off-by: Eric W. Tramel <1223539+eric-tramel@users.noreply.github.com>
Review: PR #728 (fix: speed up scheduler queue views)
Summary
Replaces the O(N-queued-tasks) scan inside
Findings
Correctness
Tests
Style / conventions
Performance
Verdict
Approve with minor optional follow-ups. This is a clean, well-targeted performance fix. The accounting is maintained symmetrically with the existing
Greptile Summary
This PR eliminates the O(n) full-task scan in
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/scheduling/queue.py | Core scheduler queue rewritten to maintain three incremental accounting counters; enqueue/discard/commit all update them correctly, and view() now runs in O(groups) instead of O(tasks). No logic errors found. |
| packages/data-designer-engine/tests/engine/dataset_builders/scheduling/test_queue.py | Adds two targeted regression tests: accounting accuracy after mixed discard/commit, and a sentinel-based check that non-candidate resource amounts are never scanned during view(). Existing tests updated with richer assertions. |
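The sentinel-based check mentioned in the table can be illustrated with a small sketch. This is not the test from the PR: `GuardedAmount`, `TinyQueue`, and `view_by_scanning` are hypothetical names, and the toy queue tracks only a single demand total. The idea is that a value which raises when read after being "armed" makes any accidental per-task scan fail loudly.

```python
class GuardedAmount:
    """Hypothetical sentinel: readable until armed, then any read raises."""

    def __init__(self, value: int) -> None:
        self._value = value
        self.armed = False

    def read(self) -> int:
        if self.armed:
            raise AssertionError("resource amount was read after arming")
        return self._value


class TinyQueue:
    """Toy queue (an assumption, not the real FairTaskQueue)."""

    def __init__(self) -> None:
        self._amounts: list[GuardedAmount] = []
        self._demand = 0

    def enqueue(self, amount: GuardedAmount) -> None:
        # The amount is read once, at enqueue time, to update the running total.
        self._amounts.append(amount)
        self._demand += amount.read()

    def view(self) -> dict[str, int]:
        # Uses only maintained accounting; never touches queued amounts.
        return {"queued": len(self._amounts), "demand": self._demand}

    def view_by_scanning(self) -> dict[str, int]:
        # The slow pattern the sentinel is meant to catch: re-reading every amount.
        return {
            "queued": len(self._amounts),
            "demand": sum(a.read() for a in self._amounts),
        }


queue = TinyQueue()
amounts = [GuardedAmount(3), GuardedAmount(4)]
for amount in amounts:
    queue.enqueue(amount)

# Arm the sentinels only after enqueue, so later reads are treated as scans.
for amount in amounts:
    amount.armed = True

summary = queue.view()  # succeeds because no amount is read

try:
    queue.view_by_scanning()
    scan_detected = False
except AssertionError:
    scan_detected = True
```

A real test would presumably leave the first-candidate group heads readable, since the PR description says the view is still built from them; this sketch arms every amount for simplicity.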
Reviews (2): Last reviewed commit: "test: tighten scheduler queue accounting..."
📋 Summary
Fixes scheduler hot-path scaling from Issue #724 by maintaining fair-queue group counts and resource demand incrementally. This keeps FairTaskQueue.view() from rebuilding queue summaries by scanning every queued task during dispatch and diagnostics.
🔗 Related Issue
Fixes #724
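As a rough illustration of incremental accounting, the following is a minimal sketch and not the code in queue.py. `Task`, `IncrementalQueue`, and the shape of the dictionary returned by `view()` are all assumptions made for this example. Counts and demand are updated on enqueue and undone on discard/commit, so building a view touches one entry per group rather than one per queued task.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # Hypothetical task shape; the scheduler's real task type is not shown here.
    task_id: str
    group: str
    resource: str
    amount: int


class IncrementalQueue:
    """Toy ready queue that keeps group counts and resource demand up to date."""

    def __init__(self) -> None:
        # group -> {task_id: Task}; dicts preserve insertion order, so the
        # first value in each inner dict is that group's head.
        self._by_group = {}
        self._group_counts = Counter()
        self._demand = Counter()

    def enqueue(self, task: Task) -> None:
        self._by_group.setdefault(task.group, {})[task.task_id] = task
        self._group_counts[task.group] += 1
        self._demand[task.resource] += task.amount

    def _leave(self, task: Task) -> None:
        # Shared by discard and commit so both undo enqueue symmetrically.
        group_tasks = self._by_group[task.group]
        del group_tasks[task.task_id]
        self._group_counts[task.group] -= 1
        self._demand[task.resource] -= task.amount
        if not group_tasks:
            del self._by_group[task.group]
            del self._group_counts[task.group]
        if self._demand[task.resource] == 0:
            del self._demand[task.resource]

    def discard(self, task: Task) -> None:
        self._leave(task)

    def commit(self, task: Task) -> None:
        self._leave(task)

    def view(self) -> dict:
        # Work is proportional to the number of groups, not queued tasks.
        return {
            "group_counts": dict(self._group_counts),
            "demand": dict(self._demand),
            "heads": {
                group: next(iter(tasks.values())).task_id
                for group, tasks in self._by_group.items()
            },
        }


q = IncrementalQueue()
a = Task("a", "g1", "gpu", 2)
b = Task("b", "g1", "gpu", 1)
c = Task("c", "g2", "cpu", 4)
for task in (a, b, c):
    q.enqueue(task)

q.commit(a)   # leaves g1 with only task "b"
q.discard(c)  # empties g2 and removes its cpu demand
print(q.view())
```

Routing both `discard` and `commit` through one helper is a design choice of this sketch: it keeps the two exit paths from drifting apart, which is the kind of symmetry the regression test on mixed discard/commit is described as checking.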
🔄 Changes
QueueView from maintained accounting plus first-candidate group heads.
🧪 Testing
make check-all-fix
uv run pytest packages/data-designer-engine/tests/engine/dataset_builders/scheduling/test_queue.py - 10 passed
make test-engine - 2,217 passed
Performance Demonstration
Same-machine benchmark loading origin/main and the fixed branch in one run:
origin/main median
FairTaskQueue.view() over 8,192 queued tasks, 100 calls
✅ Checklist