Description
The test test_agent_concurrent_structured_output_raises_exception in tests/strands/agent/test_agent.py is flaky on macOS Python 3.13. It intermittently passes or fails depending on timing conditions.
Observed Behavior
The test expects:
- Thread 1 to acquire the lock and hold it
- Thread 2 (started after 50ms delay) to hit ConcurrencyException
- Result: 1 success, 1 error
Actual behavior (intermittent):
- Both threads complete successfully
- Result: 2 successes, 0 errors
FAILED tests/strands/agent/test_agent.py::test_agent_concurrent_structured_output_raises_exception - AssertionError: Expected 1 success, got 2
assert 2 == 1
Root Cause Analysis
The test uses time.sleep(0.05) (50ms) to delay Thread 2, but SlowMockedModel.stream() uses asyncio.sleep(0.15) (150ms). On faster machines (especially macOS with Python 3.13), the timing can result in:
- Thread 1: start → acquire lock → wait 150ms → complete → release lock (~155ms total)
- Thread 2: starts at 50ms → if it does not reach the lock check until after ~155ms, the lock is already released → acquires lock successfully
This is a race condition where Thread 2 can acquire the lock after Thread 1 releases it, rather than hitting the concurrency exception.
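The timeline above can be written as a small model to make the failing case explicit. This is only a sketch: second_thread_collides and attempt_latency_ms are hypothetical names, and the latency between t2.start() and Thread 2's actual lock attempt is an assumption used for illustration, not a measured value.

```python
def second_thread_collides(start_delay_ms, release_at_ms, attempt_latency_ms):
    """Return True if Thread 2 attempts the lock while Thread 1 still holds it.

    start_delay_ms: delay before Thread 2 is started (50 in the test).
    release_at_ms: time at which Thread 1 releases the lock (~155 above).
    attempt_latency_ms: ASSUMED delay between t2.start() and the moment
        Thread 2 actually reaches the lock check.
    """
    attempt_at_ms = start_delay_ms + attempt_latency_ms
    return attempt_at_ms < release_at_ms


# Thread 2 reaches the lock check quickly: it hits the exception as the test expects.
print(second_thread_collides(50, 155, 5))    # True

# Thread 2 is delayed past the release point: both calls succeed and the test fails.
print(second_thread_collides(50, 155, 110))  # False
```

The test only passes when the first case holds, and nothing in the test enforces it.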
Suggested Fix
Use explicit synchronization instead of relying on timing:
import asyncio
import threading

lock_acquired = threading.Event()

class SlowMockedModelWithSignal(MockedModelProvider):
    async def stream(self, ...):
        lock_acquired.set()  # Signal that lock was acquired
        await asyncio.sleep(0.15)
        async for event in super().stream(...):
            yield event

def test_agent_concurrent_structured_output_raises_exception(...):
    # ... setup ...
    t1.start()
    lock_acquired.wait(timeout=1.0)  # Wait for t1 to actually acquire lock
    t2.start()  # t2 now starts while t1 is still holding the lock
    # ... rest of test ...

Alternatively, increase the sleep duration in SlowMockedModel.stream() significantly (e.g., 500ms) to ensure overlap.
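For reference, here is a self-contained sketch of the same idea that runs without the SDK. Everything in it is a stand-in: ConcurrencyError, GuardedWorker, and run_two_callers are hypothetical names, and the non-blocking threading.Lock only approximates the agent's concurrency guard. It also goes one step further than the suggestion above by replacing the 150ms sleep with a second Event, so the first caller holds the lock until the test explicitly releases it.

```python
import threading


class ConcurrencyError(Exception):
    """Hypothetical stand-in for the SDK's ConcurrencyException."""


class GuardedWorker:
    """Toy object that rejects overlapping calls (assumed to resemble the agent's guard)."""

    def __init__(self):
        self._lock = threading.Lock()

    def run(self, lock_acquired, release):
        # Fail fast instead of blocking if another call is in progress.
        if not self._lock.acquire(blocking=False):
            raise ConcurrencyError("worker is already running")
        try:
            lock_acquired.set()        # Tell the test the lock is now held.
            release.wait(timeout=5.0)  # Hold the lock until the test releases it.
            return "ok"
        finally:
            self._lock.release()


def run_two_callers():
    worker = GuardedWorker()
    lock_acquired = threading.Event()
    release = threading.Event()
    results, errors = [], []
    guard = threading.Lock()  # Protects the two result lists.

    def call():
        try:
            value = worker.run(lock_acquired, release)
            with guard:
                results.append(value)
        except ConcurrencyError as exc:
            with guard:
                errors.append(exc)

    t1 = threading.Thread(target=call)
    t2 = threading.Thread(target=call)
    t1.start()
    # Do not start t2 until t1 is known to hold the lock.
    assert lock_acquired.wait(timeout=1.0), "t1 never acquired the lock"
    t2.start()
    t2.join()      # t2 fails immediately because the lock is held.
    release.set()  # Only now is t1 allowed to finish.
    t1.join()
    return len(results), len(errors)


successes, failures = run_two_callers()
print(successes, failures)
```

Because neither the start of the second caller nor the release of the lock depends on wall-clock timing, the outcome should be 1 success and 1 error regardless of machine speed.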
Environment
- OS: macOS
- Python: 3.13
- Introduced in: PR #1453, "fix: add concurrency protection to prevent parallel invocations from corrupting agent state" (concurrency protection feature)
Related
- PR #1491, "feat(bedrock): add s3Location support for document, image, and video sources": CI failure (unrelated code changes, but this test failed)
- PR #1453, "fix: add concurrency protection to prevent parallel invocations from corrupting agent state" (introduced the concurrency protection and this test)
Labels
bug, flaky-test, good first issue