-
Notifications
You must be signed in to change notification settings - Fork 3
Fix flaky concurrency test and reorganize test suite #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This fixes a flaky test that was failing intermittently in CI due to testing the wrong thing. The test was checking exact execution counts, but what matters for concurrency limits is temporal overlap, not how many times tasks execute due to redelivery. Changes: - Rewrote `test_concurrency_blocked_task_executes_exactly_once` to validate no temporal overlaps instead of execution counts - Reorganized all concurrency tests into `tests/concurrency_limits/` with logical grouping: - `test_basic.py` - Core ConcurrencyLimit API (12 tests, 371 LOC) - `test_redelivery.py` - Redelivery interaction (6 tests, 321 LOC) - `test_worker_mechanics.py` - Worker internals (11 tests, 288 LOC) - `test_execution_patterns.py` - Execution patterns (7 tests, 220 LOC) - `test_errors_and_resilience.py` - Error handling (7 tests, 238 LOC) - Removed 2 duplicate tests - All 450 tests passing with 100% coverage - Fixed test now passes 200/200 iterations (previously ~40-50% failure rate) The root issue was that tasks could be redelivered before being ACKed when they took longer than the 50ms redelivery timeout. This is acceptable behavior - what matters is that concurrency semantics are preserved (no overlapping executions per customer_id). Related: https://github.com/chrisguidry/docket/actions/runs/19515494612/job/55866051167 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
|
Claude finished @chrisguidry's task —— View job PR Review: Fix flaky concurrency test and reorganize test suite✅ Test Fix AnalysisThe fix for Previous approach (flaky):
New approach (robust):
This is the right fix - what matters for concurrency limits is preventing simultaneous execution, not counting deliveries. ✅ Test ReorganizationThe reorganization into
👍 Strengths
🔍 Minor Observations
🎯 Recommendations
✅ VerdictThis PR successfully fixes a legitimate test flake by correcting the test logic to validate the actual guarantee (no concurrent executions) rather than an implementation detail (execution count). The test reorganization improves maintainability without losing any coverage. The fix has been validated with 200/200 passes. Approved for merge - excellent work fixing the flake and improving test organization! |
|
📚 Documentation has been built for this PR! You can download the documentation directly here: |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #202 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 43 45 +2
Lines 7219 7237 +18
Branches 369 375 +6
=========================================
+ Hits 7219 7237 +18
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
This fixes a flaky test that was failing intermittently in CI due to testing the wrong thing. We have a zero-tolerance policy for test flakes, so this needed immediate attention.
The Problem
The test
test_concurrency_blocked_task_executes_exactly_oncewas checking exact execution counts, but what actually matters for concurrency limits is temporal overlap detection, not how many times tasks execute due to redelivery.Tasks could be redelivered before being ACKed when they took longer than the 50ms redelivery timeout. This is acceptable behavior - the important guarantee is that concurrency semantics are preserved (no overlapping executions per customer_id).
Related flake: https://github.com/chrisguidry/docket/actions/runs/19515494612/job/55866051167
The Fix
Rewrote the test to validate what actually matters: no temporal overlaps per customer. Instead of asserting execution counts, the test now:
start1 < end2 and start2 < end1Result: 200/200 passes (previously ~40-50% failure rate)
Bonus: Test Reorganization
While fixing this, consolidated all concurrency tests into
tests/concurrency_limits/with logical grouping:test_basic.py- Core ConcurrencyLimit API (12 tests, 371 LOC)test_redelivery.py- Redelivery interaction (6 tests, 321 LOC)test_worker_mechanics.py- Worker internals (11 tests, 288 LOC)test_execution_patterns.py- Execution patterns (7 tests, 220 LOC)test_errors_and_resilience.py- Error handling (7 tests, 238 LOC)Removed 2 duplicate tests in the process. All 450 tests passing with 100% coverage maintained.
🤖 Generated with Claude Code