Fix flaky concurrency test and reorganize test suite #202

chrisguidry · 2025-11-19T23:31:34Z

This fixes a flaky test that was failing intermittently in CI because it asserted on the wrong thing. We have a zero-tolerance policy for test flakes, so this needed immediate attention.

The Problem

The test test_concurrency_blocked_task_executes_exactly_once was checking exact execution counts, but what actually matters for concurrency limits is temporal overlap detection, not how many times tasks execute due to redelivery.

Tasks could be redelivered before being ACKed when they took longer than the 50ms redelivery timeout. This is acceptable behavior - the important guarantee is that concurrency semantics are preserved (no overlapping executions per customer_id).

Related flake: https://github.com/chrisguidry/docket/actions/runs/19515494612/job/55866051167

The Fix

Rewrote the test to validate what actually matters: no temporal overlaps per customer. Instead of asserting execution counts, the test now:

Records start/end timestamps for each execution
Groups by customer_id
Validates that no two executions for the same customer overlap in time, where two executions overlap when start1 < end2 and start2 < end1
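
As a rough illustration of the three steps above (not the test's actual code; `Execution` and `find_overlaps` are hypothetical names chosen for this sketch), the grouping and pairwise comparison might look like this:

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Execution(NamedTuple):
    """One recorded run of a task (hypothetical record shape)."""

    customer_id: int
    start: float
    end: float


def find_overlaps(executions: list[Execution]) -> list[tuple[Execution, Execution]]:
    """Return every pair of same-customer executions that overlap in time."""
    # Step 2: group the recorded executions by customer_id.
    by_customer: dict[int, list[Execution]] = defaultdict(list)
    for execution in executions:
        by_customer[execution.customer_id].append(execution)

    # Step 3: compare each pair within a customer using the interval test.
    overlaps = []
    for runs in by_customer.values():
        for a, b in combinations(runs, 2):
            if a.start < b.end and b.start < a.end:
                overlaps.append((a, b))
    return overlaps


# A redelivered task that ran twice, back to back, is not a violation:
runs = [
    Execution(customer_id=1, start=0.00, end=0.06),
    Execution(customer_id=1, start=0.06, end=0.12),  # second delivery, after the first ended
    Execution(customer_id=2, start=0.01, end=0.05),  # different customer, may run in parallel
]
assert find_overlaps(runs) == []
```

The strict inequalities mean that an execution starting at the exact instant another ends is not counted as an overlap, and an extra execution caused by redelivery only fails the check if it actually ran at the same time as another one for that customer.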

Result: 200/200 passes (previously ~40-50% failure rate)

Bonus: Test Reorganization

While fixing this, consolidated all concurrency tests into tests/concurrency_limits/ with logical grouping:

test_basic.py - Core ConcurrencyLimit API (12 tests, 371 LOC)
test_redelivery.py - Redelivery interaction (6 tests, 321 LOC)
test_worker_mechanics.py - Worker internals (11 tests, 288 LOC)
test_execution_patterns.py - Execution patterns (7 tests, 220 LOC)
test_errors_and_resilience.py - Error handling (7 tests, 238 LOC)

Removed 2 duplicate tests in the process. All 450 tests passing with 100% coverage maintained.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

claude · 2025-11-19T23:31:48Z

Claude finished @chrisguidry's task

PR Review: Fix flaky concurrency test and reorganize test suite

✅ Test Fix Analysis

The fix for test_concurrency_blocked_task_executes_exactly_once correctly addresses the root issue:

Previous approach (flaky):

Checked exact execution counts
Failed when tasks were redelivered due to 50ms timeout
Testing implementation detail rather than actual guarantee

New approach (robust):

Records start/end timestamps for each execution (tests/concurrency_limits/test_redelivery.py:271-274)
Validates temporal non-overlap using proper interval logic: overlap = start1 < end2 and start2 < end1 (test_redelivery.py:302)
Correctly allows redeliveries while ensuring concurrency semantics
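
For readers without the diff open, here is a minimal sketch of how start/end timestamps could be recorded around a task body. It does not use docket at all: an `asyncio.Lock` per customer stands in for the concurrency limit, and `tracked_task`, `run_demo`, `executions`, and `locks` are hypothetical names for illustration only.

```python
import asyncio
import time
from collections import defaultdict

# (customer_id, start, end) tuples, appended once per execution
executions: list[tuple[int, float, float]] = []


async def tracked_task(customer_id: int, lock: asyncio.Lock, work_seconds: float = 0.01) -> None:
    """Simulated task that records when it started and finished."""
    async with lock:  # stand-in for the per-customer concurrency limit
        start = time.monotonic()
        try:
            await asyncio.sleep(work_seconds)  # simulated work
        finally:
            # Record the interval even if the work raises or is cancelled.
            executions.append((customer_id, start, time.monotonic()))


async def run_demo() -> list[tuple[int, float, float]]:
    """Run several tasks concurrently; same-customer tasks share one lock."""
    executions.clear()
    locks: defaultdict[int, asyncio.Lock] = defaultdict(asyncio.Lock)
    await asyncio.gather(*(tracked_task(cid, locks[cid]) for cid in (1, 1, 1, 2, 2)))
    return list(executions)
```

Using `time.monotonic()` rather than wall-clock time keeps the recorded intervals comparable even if the system clock is adjusted mid-run, and recording in a `finally` block means a failed execution still contributes an interval to the overlap check.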

This is the right fix - what matters for concurrency limits is preventing simultaneous execution, not counting deliveries.

✅ Test Reorganization

The reorganization into tests/concurrency_limits/ is well-structured:

test_basic.py (12 tests): Core ConcurrencyLimit API and validation
test_redelivery.py (6 tests): Redelivery timeout interaction
test_worker_mechanics.py (11 tests): Internal worker mechanisms
test_execution_patterns.py (7 tests): Task execution patterns
test_errors_and_resilience.py (7 tests): Error handling and stress testing

👍 Strengths

Proper test isolation: The temporal overlap check correctly validates the actual guarantee
Logical grouping: Tests are now organized by concern rather than scattered
Comprehensive coverage: 43 tests total (27 removed from test_worker.py, 41 async + 2 sync added)
Good documentation: Each module has clear docstrings explaining its focus
No functionality lost: All test scenarios preserved or improved

🔍 Minor Observations

Test duplication check: Found 2 tests were removed as duplicates - good cleanup
Coverage maintained: PR claims 100% coverage is maintained
Naming consistency: All test files follow clear naming patterns

🎯 Recommendations

Consider adding a comment in the test explaining why temporal overlap is the correct validation approach
The 50ms redelivery timeout in the test is quite aggressive - document why this stress-test value was chosen
Good use of execution keys to track individual task instances

✅ Verdict

This PR successfully fixes a legitimate test flake by correcting the test logic to validate the actual guarantee (no concurrent executions) rather than an implementation detail (execution count). The test reorganization improves maintainability without losing any coverage. The fix has been validated with 200/200 passes.

Approved for merge - excellent work fixing the flake and improving test organization!

github-actions · 2025-11-19T23:31:56Z

📚 Documentation has been built for this PR!

You can download the documentation directly here:
https://github.com/chrisguidry/docket/actions/runs/19519839786/artifacts/4621439960

codecov-commenter · 2025-11-19T23:32:53Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (1f62900) to head (bfe720e).

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #202   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           43        45    +2     
  Lines         7219      7237   +18     
  Branches       369       375    +6     
=========================================
+ Hits          7219      7237   +18

Flag	Coverage Δ
python-3.10	`100.00% <100.00%> (ø)`
python-3.11	`98.97% <87.13%> (-0.11%)`	⬇️
python-3.12	`100.00% <100.00%> (ø)`
python-3.13	`100.00% <100.00%> (ø)`
python-3.14	`100.00% <100.00%> (ø)`

Files with missing lines	Coverage Δ
src/docket/execution.py	`100.00% <100.00%> (ø)`
src/docket/worker.py	`100.00% <ø> (ø)`
tests/concurrency_limits/test_basic.py	`100.00% <100.00%> (ø)`
...s/concurrency_limits/test_errors_and_resilience.py	`100.00% <100.00%> (ø)`
...ests/concurrency_limits/test_execution_patterns.py	`100.00% <100.00%> (ø)`
tests/concurrency_limits/test_redelivery.py	`100.00% <100.00%> (ø)`
tests/concurrency_limits/test_worker_mechanics.py	`100.00% <100.00%> (ø)`
tests/test_worker.py	`100.00% <100.00%> (ø)`

chrisguidry and others added 2 commits November 19, 2025 17:27

Fixing some types

3f89aae

chrisguidry merged commit 787dcc9 into main Nov 19, 2025
25 checks passed

chrisguidry deleted the fix-concurrency-blocked-flake branch November 19, 2025 23:56
