Additional test stabilization (#935): Drain Queue Before Verifying Results in AsyncBlockTests #940
Merged
jasonsandlin merged 3 commits into main on Feb 20, 2026
Apply queue drain timing fix to tests that verify async opcodes. These tests were checking opcodes immediately after async completion without ensuring all cleanup work had been recorded to the opcode log, causing intermittent test failures.
Refactor the cdb test script to capture stacks independently and to emit the output log, stacks, and dmp for all abnormal exits (including Ctrl+C).
Apply consistent queue drain pattern to 8 AsyncBlockTests before final queue
verification to eliminate timing races where cleanup work completes asynchronously
after XAsyncGetStatus() returns.
Root Cause:
The async framework's Cleanup operation is initiated by the provider but
completed asynchronously through the task queue. Tests checking queue state
or opcode snapshots immediately after XAsyncGetStatus() could race with the
pending Cleanup work, resulting in intermittent failures (heisenbug-like
behavior with "8 vs 9 opcodes" or "queue not empty" errors).
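The race can be illustrated with a deliberately simplified, single-threaded model. This is only a sketch: `ToyAsync`, its members, and the opcode strings are hypothetical stand-ins, not the framework's real types. It shows completion becoming visible before the queued Cleanup work has been recorded, so a snapshot taken at that moment is one opcode short.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model (NOT the real async framework): completion is
// reported before the deferred Cleanup work has run on the queue.
struct ToyAsync {
    std::vector<std::string> opcodeLog;        // stand-in for the opcode log
    std::deque<std::function<void()>> queue;   // stand-in for the task queue
    bool complete = false;                     // what a status check would see

    void Run() {
        opcodeLog.push_back("Begin");
        opcodeLog.push_back("DoWork");
        complete = true;  // completion becomes visible here...
        // ...but Cleanup is only queued, not yet recorded.
        queue.push_back([this] { opcodeLog.push_back("Cleanup"); });
    }

    // Run every pending task so all deferred work is recorded.
    void Drain() {
        while (!queue.empty()) {
            auto task = std::move(queue.front());
            queue.pop_front();
            task();
        }
    }
};
```

In this model, a snapshot taken right after `Run()` sees two opcodes, while a snapshot taken after `Drain()` sees three, which mirrors the off-by-one opcode counts described above.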
Solution:
All queue verification is now preceded by an explicit drain loop:
- Checks both Completion and Work ports
- 10ms sleep granularity, 2000ms timeout
- Ensures all async cleanup completes before verification
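A minimal sketch of such a drain loop follows, assuming the emptiness check is supplied as a callable. `WaitForDrain` and `FakeQueue` are hypothetical names introduced for illustration; the actual tests would query the real task queue's Work and Completion ports instead. Only the 10ms sleep and 2000ms timeout come from the description above.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for a task queue with two ports.
struct FakeQueue {
    std::atomic<int> pendingWork{0};
    std::atomic<int> pendingCompletion{0};

    // Drained only when BOTH ports have nothing pending.
    bool IsEmpty() const {
        return pendingWork.load() == 0 && pendingCompletion.load() == 0;
    }
};

// Polls isDrained every `interval` until it returns true or `timeout`
// elapses. Returns true if drained, false on timeout.
inline bool WaitForDrain(
    const std::function<bool()>& isDrained,
    std::chrono::milliseconds interval = std::chrono::milliseconds(10),
    std::chrono::milliseconds timeout = std::chrono::milliseconds(2000))
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!isDrained()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // gave up: the caller should fail the test
        }
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

A test would call something like `WaitForDrain([&] { return queue.IsEmpty(); })` and assert on the result before taking its opcode snapshot or checking that the queue is empty. Returning `false` on timeout, rather than blocking forever, keeps a genuinely stuck queue visible as a test failure.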
jasonsandlin approved these changes on Feb 20, 2026
Summary
This PR fixes intermittent test failures in AsyncBlockTests by ensuring the task queue is fully drained before verifying async opcodes. The race condition occurred because the Cleanup opcode is recorded asynchronously, and tests were taking opcode snapshots before that work completed. Fixes the issue introduced in #935 as well as an existing issue in other tests.
Changes
Apply consistent queue drain pattern to 8 test methods that verify async completion state. Comprehensive audit found both opcode verification races (4 tests) and queue empty verification races (4 additional tests).
Root Cause
The async framework's Cleanup operation is initiated by the provider but completed asynchronously through the task queue. Tests that checked queue state or opcode snapshots immediately after XAsyncGetStatus() could race with the pending Cleanup opcode write, resulting in intermittent "8 vs 9 opcodes" or "queue not empty" failures.
Testing
✅ Comprehensive test audit completed:
✅ All 23 AsyncBlockTests pass after fixes
✅ No regressions in other test suites
✅ Validated with extended soak testing (743 full test suite passes under page heap)
✅ Pattern aligns with existing drain waits in VerifyCleanupWaitsForWork and VerifyCleanupWaitsForWorkDistributed
Related Work
This is part of the async test stabilization effort:
Notes
VerifyCleanupWaitsForWork tests
Checklist