-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Fix: Properly await cancelled guardrail tasks during cleanup #1976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a resource leak in guardrail task cleanup by ensuring cancelled tasks are properly awaited during session shutdown. The fix prevents unhandled task exception warnings and follows Python asyncio best practices.
Key changes:
- Converted
_cleanup_guardrail_tasks()from sync to async method to properly await cancelled tasks - Added
asyncio.gather()withreturn_exceptions=Trueto collect exceptions from cancelled tasks - Implemented comprehensive test coverage for the cleanup behavior
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/agents/realtime/session.py | Made _cleanup_guardrail_tasks() async and added proper task awaiting with exception handling |
| tests/realtime/test_guardrail_cleanup.py | Added comprehensive tests verifying proper cleanup of cancelled, failed, and multiple concurrent guardrail tasks |
| src/agents/agent.py | Added validation to ensure tools list contains only valid Tool objects, preventing runtime AttributeErrors |
| tests/test_agent_config.py | Added test coverage for tool validation to prevent invalid tool types from passing initialization |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
src/agents/agent.py
Outdated
|
|
||
| # Validate each tool is a valid Tool type | ||
| # Tool is a Union type, so we need to get the valid types from it | ||
| valid_tool_types = get_args(Tool) + (Handoff,) |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation logic for tool types creates a tuple by concatenating get_args(Tool) with (Handoff,). This approach is fragile because it assumes get_args(Tool) returns a tuple. If Tool is not a Union type or the structure changes, this could fail. Consider using isinstance(get_args(Tool), tuple) to verify the return type, or use a more explicit approach like defining valid_tool_types as a list and extending it.
| valid_tool_types = get_args(Tool) + (Handoff,) | |
| valid_tool_types = list(get_args(Tool)) | |
| valid_tool_types.append(Handoff) |
src/agents/agent.py
Outdated
| for i, tool in enumerate(self.tools): | ||
| if not isinstance(tool, valid_tool_types): | ||
| # Generate a friendly list of valid types for the error message | ||
| type_names = ", ".join(t.__name__ for t in valid_tool_types) |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line may fail if any type in valid_tool_types doesn't have a __name__ attribute. While most types have this attribute, it's safer to handle potential AttributeErrors, especially given the dynamic nature of get_args(). Consider wrapping this in a try-except or using getattr(t, '__name__', str(t)) for robustness.
| type_names = ", ".join(t.__name__ for t in valid_tool_types) | |
| type_names = ", ".join(getattr(t, '__name__', str(t)) for t in valid_tool_types) |
9971b32 to
4a635e9
Compare
Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to the recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to the fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955
4a635e9 to
eb447fb
Compare
Problem: The _cleanup_guardrail_tasks() method in RealtimeSession was only calling task.cancel() on pending guardrail tasks but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential memory leaks from abandoned tasks 3. Improper resource cleanup Evidence: - Test code in tests/realtime/test_session.py:1199 shows the correct pattern: await asyncio.gather(*session._guardrail_tasks, return_exceptions=True) - Similar pattern used in openai_realtime.py:519-523 for WebSocket task cleanup Solution: 1. Made _cleanup_guardrail_tasks() async 2. Added await asyncio.gather() for real asyncio.Task objects to properly collect exceptions (with isinstance check to support mock objects in tests) 3. Updated _cleanup() to await the cleanup method Testing: - Created comprehensive test suite in tests/realtime/test_guardrail_cleanup.py with 3 test cases: 1. Verify cancelled tasks are properly awaited 2. Verify exceptions during cleanup are handled 3. Verify multiple concurrent tasks are cleaned up - All new tests pass - All existing tests pass (838 passed, 3 skipped) - Note: test_issue_889_guardrail_tool_execution has 1 pre-existing failure unrelated to this PR (also fails on main)
eb447fb to
5e6feae
Compare
Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955
…asks Problem: The _cleanup_tasks() method in VoiceStreamResult was only calling task.cancel() on pending tasks but not awaiting them. Additionally, _check_errors() could raise CancelledError when checking cancelled tasks. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks from abandoned tasks 3. CancelledError masking real exceptions Evidence: - Similar to fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed voice STT cleanup (PR openai#1977) - Similar to fixed websocket cleanup (PR openai#1955) - Bug documented in .claude/bug-analysis/03-resource-leaks.md Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated _check_errors() to skip cancelled tasks using task.cancelled() check to avoid CancelledError when calling task.exception() 5. Updated stream() async generator to await _cleanup_tasks() Testing: - Linting passes - No breaking changes to public API - Follows same pattern as PR openai#1976, openai#1977, openai#1955
Simplified the cleanup logic by removing the isinstance check for asyncio.Task. Since _guardrail_tasks only contains real tasks created through asyncio.create_task(), this type check is unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code.
Changed test_exception_during_guardrail_processing to use real asyncio.Task objects instead of Mock objects. This allows the production code to remain simple and clean, without needing isinstance checks to handle test-specific mocks. The test now creates actual tasks using asyncio.create_task() which better reflects real-world usage and naturally works with the cleanup logic that uses asyncio.gather().
Simplified the cleanup logic by removing isinstance checks for asyncio.Task. All tasks are created through asyncio.create_task(), so these type checks are unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code, following the same pattern as PR openai#1976.
Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955
Simplified the cleanup logic by removing isinstance checks for asyncio.Task. All tasks are created through asyncio.create_task(), so these type checks are unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code, following the same pattern as PR openai#1976.
Resolved conflict by applying the same async cleanup pattern to both guardrail and tool call tasks. Both methods now properly cancel and await tasks using asyncio.gather with return_exceptions=True. Added comprehensive test coverage for tool call task cleanup: - Test cancellation and awaiting of tool call tasks - Test exception handling during cleanup - Test that _cleanup() awaits both task types This ensures consistent resource cleanup across all background tasks in realtime sessions.
|
Resolved the merge conflict by extending the same fix to Since PR #1984 introduced async def _cleanup(self) -> None:
await self._cleanup_guardrail_tasks()
await self._cleanup_tool_call_tasks() # Also made asyncAdded 3 tests for tool call cleanup. All 6 tests pass. |
Summary
Fixes a resource leak bug where guardrail tasks were cancelled but not properly awaited during cleanup, which could lead to unhandled task exception warnings and memory leaks.
Problem
The
_cleanup_guardrail_tasks()method inRealtimeSession(src/agents/realtime/session.py:749) was only callingtask.cancel()on pending guardrail tasks but not awaiting them. According to Python's asyncio best practices, cancelled tasks should be awaited to properly collect exceptions.Evidence
Test code shows correct pattern:
tests/realtime/test_session.py:1199already demonstrates the proper cleanup:Similar pattern used elsewhere: The WebSocket task cleanup in
openai_realtime.py:519-523follows the same pattern of cancel + await.Solution
_cleanup_guardrail_tasks()from sync to async methodawait asyncio.gather(*self._guardrail_tasks, return_exceptions=True)after cancelling to properly collect exceptions_cleanup()) to await the cleanup methodTesting
New tests in
tests/realtime/test_guardrail_cleanup.py:Existing tests: All 8 guardrail tests in
test_session.pycontinue to pass.Technical Details
Before:
After:
Impact