Skip to content

fix: clear leaked running loop in MCP client background thread#2111

Merged
mkmeral merged 1 commit intostrands-agents:mainfrom
mkmeral:fix/otel-threading-mcp-client
Apr 13, 2026
Merged

fix: clear leaked running loop in MCP client background thread#2111
mkmeral merged 1 commit intostrands-agents:mainfrom
mkmeral:fix/otel-threading-mcp-client

Conversation

@mkmeral
Copy link
Copy Markdown
Contributor

@mkmeral mkmeral commented Apr 10, 2026

Description

When opentelemetry-instrumentation-threading is active — the documented pattern for AgentCore observability — the ThreadingInstrumentor wraps Thread.run() to propagate trace context. Combined with contextvars.copy_context() in MCPClient.start(), this leaks the parent thread's running event loop reference into the background thread. When _background_task then calls run_until_complete(), asyncio sees a "running" loop already set and raises RuntimeError: Cannot run the event loop while another loop is running.

This adds asyncio._set_running_loop(None) at the top of _background_task() to clear leaked loop state before creating the new event loop. This is the same pattern the bedrock-agentcore team shipped in aws/bedrock-agentcore-sdk-python#405.

Confirmed on SDK versions 1.28.0 and 1.35.0.

Related Issues

N/A

Documentation PR

N/A — no documentation changes needed.

Type of Change

Bug fix

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Added a unit test that simulates the OTEL leak by setting a running loop before _background_task executes, verifying the method clears it and completes successfully.

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

When OpenTelemetry's ThreadingInstrumentor is active, Thread.run() is
wrapped to propagate trace context, which leaks the parent thread's
running event loop reference into child threads. This causes
'RuntimeError: Cannot run the event loop while another loop is running'
when _background_task calls run_until_complete().

Add asyncio._set_running_loop(None) at the top of _background_task to
clear any leaked loop state before creating the new event loop. This is
the same pattern used by bedrock-agentcore-sdk-python (PR strands-agents#405).
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@mkmeral mkmeral enabled auto-merge (squash) April 11, 2026 00:02
Comment thread tests/strands/tools/mcp/test_mcp_client_contextvar.py
Comment thread src/strands/tools/mcp/mcp_client.py
@github-actions
Copy link
Copy Markdown

Assessment: Comment

Well-scoped bug fix that addresses a real issue with OpenTelemetry's ThreadingInstrumentor leaking running event loop state into MCP client background threads. The fix is minimal, the approach is validated by the bedrock-agentcore team, and the test effectively simulates the failure scenario.

Review Details
  • Private API usage: asyncio._set_running_loop(None) is a CPython internal — stable and correct for this use case, but worth a brief note for future maintainability.
  • Test import style: import asyncio should be at the module top-level per repository coding standards.

Clean fix with good context in both the PR description and code comments.

@mkmeral mkmeral merged commit bb7f188 into strands-agents:main Apr 13, 2026
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants