Summary
Access violation crash in AsyncBlockTests::VerifyAsyncBlockReuse caused by concurrent
std::vector::push_back operations from overlapping async call lifecycle phases.
Environment
- Platform: Windows x64
- Configuration: Debug build with page heap (gflags /p /enable)
- Test Framework: TAEF (Test Authoring and Execution Framework)
- Detection: 6-hour soak test under Windows CDB debugger
Reproduction
Frequency: Heisenbug - intermittent crash after extended stress testing
Trigger: Rapid async block reuse with shared FactorialCallData context
Test Case
AsyncBlockTests::VerifyAsyncBlockReuse - Tests XAsyncBlock reuse scenario where:
- First async call completes with result 120 (factorial of 5)
- The same XAsyncBlock and FactorialCallData are immediately reused for the second call
- Second call completes with result 720 (factorial of 6)
Race Condition Window
When XAsyncGetStatus(&async, true) returns for the first call, the main thread proceeds
to start the second async operation while the first call's cleanup is still executing in
the completion callback thread.
Thread 1 (Completion):
CompletionCallback
→ AsyncState::Release
→ AsyncState::~AsyncState
→ provider(XAsyncOp::Cleanup)
→ FactorialWorkerSimple(Cleanup)
→ opCodes.push_back(XAsyncOp::Cleanup) ← RACE
Thread 2 (Main/Test):
VerifyAsyncBlockReuse
→ FactorialAsync (second call)
→ XAsyncBegin
→ provider(XAsyncOp::Begin)
→ FactorialWorkerSimple(Begin)
→ opCodes.push_back(XAsyncOp::Begin) ← RACE
Crash Details
Stack Trace
# 17 Id: 3914.22c4 Suspend: 1 Teb: 000000b8`cec33000 Unfrozen
Child-SP RetAddr Call Site
000000b8`cf7ff0b0 00007ffa`a5024208 ucrtbased!_free_dbg+0x2e
000000b8`cf7ff0f0 00007ffa`a5023788 libHttpClient_UnitTest_TAEF!operator delete+0x18
000000b8`cf7ff120 00007ffa`a4f363e9 libHttpClient_UnitTest_TAEF!operator delete+0x18
000000b8`cf7ff150 00007ffa`a4f8e6df libHttpClient_UnitTest_TAEF!std::_Deallocate<16>+0x39
000000b8`cf7ff180 00007ffa`a4f8c868 libHttpClient_UnitTest_TAEF!std::allocator<XAsyncOp>::deallocate+0x8f
000000b8`cf7ff1c0 00007ffa`a4f806c6 libHttpClient_UnitTest_TAEF!std::vector<XAsyncOp>::_Change_array+0xb8
000000b8`cf7ff220 00007ffa`a4f80393 libHttpClient_UnitTest_TAEF!std::vector<XAsyncOp>::_Emplace_reallocate+0x296
000000b8`cf7ff330 00007ffa`a4f8e9ee libHttpClient_UnitTest_TAEF!std::vector<XAsyncOp>::_Emplace_one_at_back+0x83
000000b8`cf7ff380 00007ffa`a4f852a1 libHttpClient_UnitTest_TAEF!std::vector<XAsyncOp>::push_back+0x1e
000000b8`cf7ff3b0 00007ffa`a4f60237 libHttpClient_UnitTest_TAEF!AsyncBlockTests::FactorialWorkerSimple+0x51
000000b8`cf7ff430 00007ffa`a4f60468 libHttpClient_UnitTest_TAEF!AsyncState::~AsyncState+0x67
Debugger Analysis
EXCEPTION_CODE: c0000005 (Access violation)
READ_ADDRESS: 000001c01c088fdc
FAULTING_SOURCE_LINE: minkernel\crts\ucrt\src\appcrt\heap\debug_heap.cpp:1026
FAILURE_BUCKET_ID: INVALID_POINTER_READ_AVRF_c0000005_ucrtbased.dll!_free_dbg
Root Cause
The FactorialCallData structure contains:
std::vector<XAsyncOp> opCodes; // Shared, unsynchronized
Both FactorialWorkerSimple and FactorialWorkerDistributed record opcodes via:
d->opCodes.push_back(opCode); // NOT thread-safe
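Concurrent push_back calls on the same vector are a data race. When one of them triggers
a reallocation, the vector's internal pointers can be observed in an inconsistent state or
the old buffer can be released twice, which corrupts heap state and leads to the crash in
_free_dbg shown in the stack trace.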
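The reallocation step is what makes an unsynchronized push_back so dangerous: once the vector is full, the next push_back allocates a new buffer, moves the elements, and frees the old one. The single-threaded sketch below only illustrates that step. Op is a hypothetical stand-in for XAsyncOp so the snippet compiles on its own, and PushBackReallocatesWhenFull is an illustrative name, not part of the test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for XAsyncOp (assumption, for a self-contained sketch).
enum class Op { Begin, DoWork, GetResult, Cancel, Cleanup };

// Returns true if a push_back on a full vector had to grow the buffer.
bool PushBackReallocatesWhenFull()
{
    std::vector<Op> opCodes;
    opCodes.push_back(Op::Begin);

    // Fill the vector until there is no spare capacity left.
    while (opCodes.size() < opCodes.capacity())
    {
        opCodes.push_back(Op::DoWork);
    }
    assert(opCodes.size() == opCodes.capacity());

    const std::size_t oldCapacity = opCodes.capacity();

    // This call must allocate a new buffer, move the elements, and free the
    // old buffer. A second thread inside push_back at the same time would be
    // working with pointers into memory that is being released.
    opCodes.push_back(Op::Cleanup);

    return opCodes.capacity() > oldCapacity;
}
```

In the failing test, the Cleanup and Begin calls can both reach this point on different threads with no synchronization between them.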
Impact
- Severity: Test flakiness under stress conditions
- Scope: Test code only, no production impact
- Workaround: None (intermittent failure)
Proposed Fix
Replace std::vector<XAsyncOp> with a lock-free, fixed-capacity buffer built from
std::array<XAsyncOp, N> and a std::atomic<size_t> counter. This eliminates:
- Dynamic allocation/reallocation
- Need for mutex synchronization
- Race condition in concurrent append operations
This maintains the test semantics and aligns with the library's philosophy of avoiding synchronization primitives.
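A minimal sketch of what such a buffer could look like, not the final patch. FixedOpLog, Append, Size, At, and AppendFromTwoThreads are hypothetical names, Op is a stand-in for XAsyncOp, and the capacity of 256 is an arbitrary assumption. Each writer reserves a unique slot with fetch_add and then writes only to that slot, so appends never touch shared pointers or the heap.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for XAsyncOp (assumption, for a self-contained sketch).
enum class Op { Begin, DoWork, GetResult, Cancel, Cleanup };

template <typename T, std::size_t N>
class FixedOpLog
{
public:
    // Reserves a unique index atomically, then writes to that slot.
    // Returns false (and drops the value) once the buffer is full.
    bool Append(T value)
    {
        const std::size_t index = m_count.fetch_add(1, std::memory_order_relaxed);
        if (index >= N)
        {
            return false;
        }
        m_items[index] = value;
        return true;
    }

    // Number of stored entries, clamped to the capacity.
    std::size_t Size() const
    {
        const std::size_t n = m_count.load(std::memory_order_acquire);
        return n < N ? n : N;
    }

    // Only safe once all writers have finished and the reader has
    // synchronized with them (for example by joining the threads).
    const T& At(std::size_t i) const { return m_items[i]; }

private:
    std::array<T, N> m_items{};
    std::atomic<std::size_t> m_count{0};
};

// Two threads append concurrently, mimicking the overlapping Cleanup and
// Begin calls. Returns the number of recorded entries.
std::size_t AppendFromTwoThreads()
{
    FixedOpLog<Op, 256> log;
    std::thread completion([&log] { for (int i = 0; i < 100; ++i) log.Append(Op::Cleanup); });
    std::thread test([&log] { for (int i = 0; i < 100; ++i) log.Append(Op::Begin); });
    completion.join();
    test.join();
    return log.Size();
}
```

Two caveats apply to this sketch. The relative order of entries written by different threads is not guaranteed, so assertions that depend on exact opcode order across calls may need adjusting. Also, the test should read the entries only after the completion callback has finished, since the counter reserves a slot before the value is written.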