perf!: improve benchmark throughput across submit, dispatch, retry, and failure paths #80
Merged
Rename `tasks_by_tags` → `task_ids_by_tags` and `tasks_by_tag_key_prefix` → `task_ids_by_tag_key_prefix` across store, scheduler, domain, and module layers. Queries now SELECT only `t.id`, skip `populate_tags`, and drop the ORDER BY clause — avoiding full row deserialization and N+1 tag lookups when callers (cancel_by_tag, cancel_by_tag_key_prefix) only need the ID. Extracts shared join-building logic into `build_tag_join_sql`.
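For callers that still need full records after this rename, the follow-up lookup might look like the sketch below. It uses a toy in-memory `Store`; the `TaskRecord` fields, the filter shape, and the `tasks_by_tags` shim are assumptions made for illustration and are not taken from the crate, whose store is SQLite-backed and may use different signatures.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's task record (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
struct TaskRecord {
    id: i64,
    task_type: String,
    tags: HashMap<String, String>,
}

/// Toy in-memory store used only to show the calling pattern.
struct Store {
    tasks: Vec<TaskRecord>,
}

impl Store {
    /// ID-only query: returns the IDs of tasks matching every (key, value)
    /// pair, without cloning full records.
    fn task_ids_by_tags(&self, filter: &[(&str, &str)]) -> Vec<i64> {
        self.tasks
            .iter()
            .filter(|t| {
                filter
                    .iter()
                    .all(|(k, v)| t.tags.get(*k).map(String::as_str) == Some(*v))
            })
            .map(|t| t.id)
            .collect()
    }

    /// Full-record lookup by ID.
    fn task_by_id(&self, id: i64) -> Option<TaskRecord> {
        self.tasks.iter().find(|t| t.id == id).cloned()
    }
}

/// Hypothetical migration shim: rebuilds the old "full records" result
/// from the ID-only query plus one lookup per ID.
fn tasks_by_tags(store: &Store, filter: &[(&str, &str)]) -> Vec<TaskRecord> {
    store
        .task_ids_by_tags(filter)
        .into_iter()
        .filter_map(|id| store.task_by_id(id))
        .collect()
}
```

Callers such as cancel_by_tag that only need IDs would stop at `task_ids_by_tags` and skip the per-ID lookup entirely.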
…g channel Mirror the existing completion coalescing pattern for task failures. Parentless terminal failures are sent through an unbounded channel and drained in batches (by leader election or the run loop), amortizing SQLite WAL sync overhead. Failures with parents still process inline to preserve fail-fast cascade ordering. Also fix has_paused_tasks to start false and let the builder set it only when the persistent store actually contains paused tasks.
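The routing rule described above (parentless failures queued and drained in batches, failures with parents handled inline) can be sketched with the standard library alone. `Failure`, `FailureSink`, and their fields are hypothetical names; the inner `Vec`s stand in for batched SQLite transactions, and the real drain is driven by leader election or the run loop rather than an explicit call.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical failure report; `parent_id` is `Some` for child tasks.
struct Failure {
    task_id: i64,
    parent_id: Option<i64>,
}

/// Toy sink that records what would be written and in how many batches.
struct FailureSink {
    tx: Sender<Failure>,
    rx: Receiver<Failure>,
    /// Task IDs handled inline, one write each.
    inline: Vec<i64>,
    /// Each inner Vec stands in for one batched transaction.
    batches: Vec<Vec<i64>>,
}

impl FailureSink {
    fn new() -> Self {
        // std's `channel()` is unbounded, so `send` never blocks.
        let (tx, rx) = channel();
        Self { tx, rx, inline: Vec::new(), batches: Vec::new() }
    }

    /// Route a terminal failure: children stay inline to keep cascade
    /// ordering, parentless tasks are queued for a later batch.
    fn report(&mut self, failure: Failure) {
        if failure.parent_id.is_some() {
            self.inline.push(failure.task_id);
        } else {
            self.tx.send(failure).expect("receiver is owned by the sink");
        }
    }

    /// Drain everything queued so far into a single batch and return
    /// how many failures it contained.
    fn drain(&mut self) -> usize {
        let batch: Vec<i64> = self.rx.try_iter().map(|f| f.task_id).collect();
        let n = batch.len();
        if n > 0 {
            self.batches.push(batch);
        }
        n
    }
}
```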
Move tag population out of pop_next/peek_next into explicit caller sites so the JOIN is only paid when tags are actually needed. Add inline retry loop for zero-delay retries: instead of requeueing to pending and re-popping through SQLite, re-execute the task directly in the same spawned future via increment_retry().
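As a rough illustration of the inline retry path, the sketch below re-runs a failed attempt in place while the backoff delay is zero and only hands the task back for requeueing once a non-zero delay is due. `Outcome`, `run_with_inline_retry`, and the `backoff` callback are hypothetical; the counter bump stands in for `increment_retry()`, whose real signature is not shown in this PR.

```rust
use std::time::Duration;

/// Hypothetical result of running a task through the retry loop.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The task succeeded.
    Done,
    /// The task must go back to the pending queue and wait this long.
    Requeue(Duration),
    /// Retries are exhausted.
    Dead,
}

/// Runs `attempt` until it succeeds, retries run out, or a non-zero
/// backoff is due. Returns the outcome and the number of retries used.
fn run_with_inline_retry<F>(
    max_retries: u32,
    backoff: impl Fn(u32) -> Duration,
    mut attempt: F,
) -> (Outcome, u32)
where
    F: FnMut(u32) -> Result<(), String>,
{
    let mut retries = 0;
    loop {
        match attempt(retries) {
            Ok(()) => return (Outcome::Done, retries),
            Err(_) if retries >= max_retries => return (Outcome::Dead, retries),
            Err(_) => {
                retries += 1; // stands in for increment_retry()
                let delay = backoff(retries);
                if !delay.is_zero() {
                    // A delayed retry still goes through the queue.
                    return (Outcome::Requeue(delay), retries);
                }
                // Zero delay: loop and re-execute without touching the store.
            }
        }
    }
}
```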
Make tag population opt-in for list queries (history, history_by_type, history_by_key, dead_letter_tasks, failed_tasks) to avoid N+1 tag lookups when callers don't need tags. Add idx_history_type covering index on (task_type, completed_at DESC) to speed up history_stats, history_by_type, and avg_throughput queries. Refactor history benchmarks to populate via store directly instead of spinning up a full scheduler.
Two submit-path optimizations from plan 043:

1. Submit coalescing (Option 1): TaskStore::submit() now uses a leader-election pattern (mirroring completion/failure coalescing). Concurrent callers are batched into a single BEGIN/COMMIT transaction. An uncontended fast path avoids channel overhead entirely so sequential callers see zero regression.
2. Skip requeue UPDATE (Option 2): Added `has_running` atomic flag to TaskStore. When no task has ever been dispatched (common during bulk submit-then-run), the requeue UPDATE in skip_existing() is elided, saving one SQL round-trip per dedup hit.

Benchmark impact vs baseline:
- submit_dedup_hit/1000: -8% (skip-requeue)
- batch_submit/1000: -14% (skip-requeue)
- submit_tasks/1000: neutral (fast path)
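The leader-election idea behind submit coalescing can be sketched with a queue and an atomic flag: every caller enqueues, and whichever caller wins the flag drains the whole queue as one batch. This is a simplified standard-library illustration, not the PR's implementation. `Coalescer` and its fields are hypothetical, each inner `Vec` stands in for one BEGIN/COMMIT transaction, and unlike a real submit path, followers here return without waiting for their item's result.

```rust
use std::mem;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical coalescer: batches queued items under a single leader.
struct Coalescer<T> {
    pending: Mutex<Vec<T>>,
    leader: AtomicBool,
    /// Each inner Vec stands in for one committed transaction.
    committed: Mutex<Vec<Vec<T>>>,
}

impl<T> Coalescer<T> {
    fn new() -> Self {
        Self {
            pending: Mutex::new(Vec::new()),
            leader: AtomicBool::new(false),
            committed: Mutex::new(Vec::new()),
        }
    }

    /// Queue an item without trying to drain.
    fn enqueue(&self, item: T) {
        self.pending.lock().unwrap().push(item);
    }

    /// Queue an item, then drain if no other caller is already leader.
    fn submit(&self, item: T) {
        self.enqueue(item);
        self.drain_if_leader();
    }

    fn drain_if_leader(&self) {
        loop {
            // Try to become leader; on failure the current leader will
            // pick up our item in its next pass.
            if self
                .leader
                .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
                .is_err()
            {
                return;
            }
            let batch = mem::take(&mut *self.pending.lock().unwrap());
            if !batch.is_empty() {
                self.committed.lock().unwrap().push(batch); // one "transaction"
            }
            self.leader.store(false, Ordering::SeqCst);
            // Re-check so items queued while we were leader are not stranded.
            if self.pending.lock().unwrap().is_empty() {
                return;
            }
        }
    }
}
```

With a single caller, each `submit` wins the flag immediately and commits a batch of one, which loosely corresponds to the uncontended case; under contention, items queued while a leader is active are picked up on that leader's next pass.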
…casts Add yield_now() before leader-election drain so more completions accumulate per batch. Gate all event_tx.send() calls behind receiver_count() > 0 to avoid broadcast channel overhead when no subscribers exist. dispatch_and_complete/1000: 169µs → 149µs/task (−17%, +21% throughput)
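The subscriber gate can be illustrated with a toy event bus that checks its receiver count before it even builds the event. `EventBus`, `Event`, and `emit_with` are hypothetical and built on `std::sync::mpsc`; the project's actual channel type is not shown in this PR, so only the shape of the check carries over.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical scheduler event.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Completed(i64),
}

/// Toy fan-out bus: one std channel per subscriber.
struct EventBus {
    subscribers: Vec<Sender<Event>>,
}

impl EventBus {
    fn new() -> Self {
        Self { subscribers: Vec::new() }
    }

    fn subscribe(&mut self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    fn receiver_count(&self) -> usize {
        self.subscribers.len()
    }

    /// Builds and sends the event only when someone is listening.
    /// Returns whether the event was built.
    fn emit_with(&self, build: impl FnOnce() -> Event) -> bool {
        if self.receiver_count() == 0 {
            return false; // no subscribers: skip construction and send
        }
        let event = build();
        for tx in &self.subscribers {
            // A dropped receiver is not an error for the emitter.
            let _ = tx.send(event.clone());
        }
        true
    }
}
```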
- Rename tasks_by_tags → task_ids_by_tags, tasks_by_tag_key_prefix → task_ids_by_tag_key_prefix (now return Vec<i64>) in query-apis and multi-module-apps docs
- Add inline zero-delay retry path to design.md retry flow
- Document new idx_history_type covering index and missing dead_letter history status in persistence-and-recovery.md
- Replace &TaskContext with DomainTaskContext<'_, D> in all code examples across 10 doc files and lib.rs rustdoc
- Update spawn_child → spawn_child_with, .parent() → .child_of(&ctx) in quick-start.md
- Bump stale version strings (0.3/0.4/0.5 → 0.6) in Cargo.toml snippets
- Mark raw_executor as removed in configuration.md Domain builder table
Summary
Reduces per-task overhead across every scheduler hot path — submit, dispatch, retry, completion, and failure — through transaction coalescing, lazy data population, inline zero-delay retries, and query-only-what-you-need optimizations. Also brings all markdown docs and rustdoc up to date with the 0.6 API.
Breaking change
- `tasks_by_tags()` → `task_ids_by_tags()` and `tasks_by_tag_key_prefix()` → `task_ids_by_tag_key_prefix()`: both now return `Vec<i64>` instead of `Vec<TaskRecord>`. Callers that need full records must follow up with `task_by_id()`.

Performance improvements
- Concurrent `submit()` calls are batched into a single SQLite transaction via leader election; uncontended callers take a zero-overhead fast path
- The requeue UPDATE on dedup hits is skipped when no task has been dispatched yet (`has_running` flag)
- `pop_next` / `peek_next` / history list queries no longer JOIN tags by default; callers opt in via `populate_tags()` / `populate_history_tags()`
- New covering index `idx_history_type (task_type, completed_at DESC)` speeds up `history_stats`, `history_by_type`, and `avg_throughput`
- `yield_now()` before leader-election drain lets more completions accumulate per batch
- `event_tx.send()` calls are gated behind `receiver_count() > 0`

Documentation
- Updated markdown docs and `lib.rs` rustdoc for the 0.6 API (`DomainTaskContext`, `spawn_child_with`, `child_of`)
- Bumped stale version strings (`0.3`/`0.4`/`0.5` → `0.6`)