Fix: Properly await cancelled guardrail tasks during cleanup #1976

gn00295120 · 2025-10-22T16:36:31Z

Summary

Fixes a resource leak bug where guardrail tasks were cancelled but not properly awaited during cleanup, which could lead to unhandled task exception warnings and memory leaks.

Problem

The _cleanup_guardrail_tasks() method in RealtimeSession (src/agents/realtime/session.py:749) was only calling task.cancel() on pending guardrail tasks but not awaiting them. According to Python's asyncio best practices, cancelled tasks should be awaited to properly collect exceptions.

Evidence

Test code shows correct pattern: tests/realtime/test_session.py:1199 already demonstrates the proper cleanup:
```
await asyncio.gather(*session._guardrail_tasks, return_exceptions=True)
```
Similar pattern used elsewhere: The WebSocket task cleanup in openai_realtime.py:519-523 follows the same pattern of cancel + await.

Solution

Changed _cleanup_guardrail_tasks() from sync to async method
Added await asyncio.gather(*self._guardrail_tasks, return_exceptions=True) after cancelling to properly collect exceptions
Updated the caller (_cleanup()) to await the cleanup method

Testing

New tests in tests/realtime/test_guardrail_cleanup.py:

✅ Test 1: Verify cancelled tasks are properly awaited
✅ Test 2: Verify exceptions during cleanup are handled
✅ Test 3: Verify multiple concurrent tasks are cleaned up

Existing tests: All 8 guardrail tests in test_session.py continue to pass.

Technical Details

Before:

def _cleanup_guardrail_tasks(self) -> None:
    for task in self._guardrail_tasks:
        if not task.done():
            task.cancel()
    self._guardrail_tasks.clear()

After:

async def _cleanup_guardrail_tasks(self) -> None:
    for task in self._guardrail_tasks:
        if not task.done():
            task.cancel()
    
    # Wait for all tasks to complete and collect any exceptions
    if self._guardrail_tasks:
        await asyncio.gather(*self._guardrail_tasks, return_exceptions=True)
    
    self._guardrail_tasks.clear()

Impact

No breaking changes
Prevents resource leaks
Eliminates unhandled task exception warnings
Follows asyncio best practices

Copilot

Pull Request Overview

This PR fixes a resource leak in guardrail task cleanup by ensuring cancelled tasks are properly awaited during session shutdown. The fix prevents unhandled task exception warnings and follows Python asyncio best practices.

Key changes:

Converted _cleanup_guardrail_tasks() from sync to async method to properly await cancelled tasks
Added asyncio.gather() with return_exceptions=True to collect exceptions from cancelled tasks
Implemented comprehensive test coverage for the cleanup behavior

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
src/agents/realtime/session.py	Made `_cleanup_guardrail_tasks()` async and added proper task awaiting with exception handling
tests/realtime/test_guardrail_cleanup.py	Added comprehensive tests verifying proper cleanup of cancelled, failed, and multiple concurrent guardrail tasks
src/agents/agent.py	Added validation to ensure tools list contains only valid Tool objects, preventing runtime AttributeErrors
tests/test_agent_config.py	Added test coverage for tool validation to prevent invalid tool types from passing initialization

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-22T16:37:05Z

src/agents/agent.py


+        # Validate each tool is a valid Tool type
+        # Tool is a Union type, so we need to get the valid types from it
+        valid_tool_types = get_args(Tool) + (Handoff,)


The validation logic for tool types creates a tuple by concatenating get_args(Tool) with (Handoff,). This approach is fragile because it assumes get_args(Tool) returns a tuple. If Tool is not a Union type or the structure changes, this could fail. Consider using isinstance(get_args(Tool), tuple) to verify the return type, or use a more explicit approach like defining valid_tool_types as a list and extending it.

Suggested change

valid_tool_types = get_args(Tool) + (Handoff,)

valid_tool_types = list(get_args(Tool))

valid_tool_types.append(Handoff)

Copilot · 2025-10-22T16:37:06Z

src/agents/agent.py

+        for i, tool in enumerate(self.tools):
+            if not isinstance(tool, valid_tool_types):
+                # Generate a friendly list of valid types for the error message
+                type_names = ", ".join(t.__name__ for t in valid_tool_types)


This line may fail if any type in valid_tool_types doesn't have a __name__ attribute. While most types have this attribute, it's safer to handle potential AttributeErrors, especially given the dynamic nature of get_args(). Consider wrapping this in a try-except or using getattr(t, '__name__', str(t)) for robustness.

Suggested change

type_names = ", ".join(t.__name__ for t in valid_tool_types)

type_names = ", ".join(getattr(t, '__name__', str(t)) for t in valid_tool_types)

Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to the recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to the fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955

Problem: The _cleanup_guardrail_tasks() method in RealtimeSession was only calling task.cancel() on pending guardrail tasks but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential memory leaks from abandoned tasks 3. Improper resource cleanup Evidence: - Test code in tests/realtime/test_session.py:1199 shows the correct pattern: await asyncio.gather(*session._guardrail_tasks, return_exceptions=True) - Similar pattern used in openai_realtime.py:519-523 for WebSocket task cleanup Solution: 1. Made _cleanup_guardrail_tasks() async 2. Added await asyncio.gather() for real asyncio.Task objects to properly collect exceptions (with isinstance check to support mock objects in tests) 3. Updated _cleanup() to await the cleanup method Testing: - Created comprehensive test suite in tests/realtime/test_guardrail_cleanup.py with 3 test cases: 1. Verify cancelled tasks are properly awaited 2. Verify exceptions during cleanup are handled 3. Verify multiple concurrent tasks are cleaned up - All new tests pass - All existing tests pass (838 passed, 3 skipped) - Note: test_issue_889_guardrail_tool_execution has 1 pre-existing failure unrelated to this PR (also fails on main)

Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955

…asks Problem: The _cleanup_tasks() method in VoiceStreamResult was only calling task.cancel() on pending tasks but not awaiting them. Additionally, _check_errors() could raise CancelledError when checking cancelled tasks. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks from abandoned tasks 3. CancelledError masking real exceptions Evidence: - Similar to fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed voice STT cleanup (PR openai#1977) - Similar to fixed websocket cleanup (PR openai#1955) - Bug documented in .claude/bug-analysis/03-resource-leaks.md Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated _check_errors() to skip cancelled tasks using task.cancelled() check to avoid CancelledError when calling task.exception() 5. Updated stream() async generator to await _cleanup_tasks() Testing: - Linting passes - No breaking changes to public API - Follows same pattern as PR openai#1976, openai#1977, openai#1955

src/agents/realtime/session.py

Simplified the cleanup logic by removing the isinstance check for asyncio.Task. Since _guardrail_tasks only contains real tasks created through asyncio.create_task(), this type check is unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code.

Changed test_exception_during_guardrail_processing to use real asyncio.Task objects instead of Mock objects. This allows the production code to remain simple and clean, without needing isinstance checks to handle test-specific mocks. The test now creates actual tasks using asyncio.create_task() which better reflects real-world usage and naturally works with the cleanup logic that uses asyncio.gather().

Simplified the cleanup logic by removing isinstance checks for asyncio.Task. All tasks are created through asyncio.create_task(), so these type checks are unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code, following the same pattern as PR openai#1976.

Problem: The _cleanup_tasks() method in OpenAISTTTranscriptionSession was only calling task.cancel() on pending tasks (listener, process_events, stream_audio, connection) but not awaiting them. This could lead to: 1. Unhandled task exception warnings 2. Potential resource leaks (websocket connections, file descriptors) 3. Improper cleanup of background tasks Evidence: - Similar to recently fixed guardrail tasks cleanup (PR openai#1976) - Similar to fixed websocket task cleanup (PR openai#1955) - asyncio best practices require awaiting cancelled tasks Solution: 1. Made _cleanup_tasks() async 2. Collect all real asyncio.Task objects that need to be awaited 3. Added await asyncio.gather() with return_exceptions=True to properly collect exceptions from cancelled tasks 4. Updated close() method to await _cleanup_tasks() Testing: - All existing voice/STT tests pass (17 passed) - Uses isinstance check to support mock objects in tests - Follows the same pattern as PR openai#1976 and PR openai#1955

Simplified the cleanup logic by removing isinstance checks for asyncio.Task. All tasks are created through asyncio.create_task(), so these type checks are unnecessary. This makes the code cleaner and avoids test-specific workarounds in production code, following the same pattern as PR openai#1976.

Resolved conflict by applying the same async cleanup pattern to both guardrail and tool call tasks. Both methods now properly cancel and await tasks using asyncio.gather with return_exceptions=True. Added comprehensive test coverage for tool call task cleanup: - Test cancellation and awaiting of tool call tasks - Test exception handling during cleanup - Test that _cleanup() awaits both task types This ensures consistent resource cleanup across all background tasks in realtime sessions.

gn00295120 · 2025-10-24T02:04:12Z

Resolved the merge conflict by extending the same fix to _cleanup_tool_call_tasks().

Since PR #1984 introduced _cleanup_tool_call_tasks() with the same bug pattern (cancel without await), I applied the async cleanup approach to both methods for consistency:

async def _cleanup(self) -> None:
    await self._cleanup_guardrail_tasks()
    await self._cleanup_tool_call_tasks()  # Also made async

Added 3 tests for tool call cleanup. All 6 tests pass.

Copilot AI review requested due to automatic review settings October 22, 2025 16:36

Copilot AI reviewed Oct 22, 2025

View reviewed changes

gn00295120 force-pushed the fix-guardrail-task-cleanup branch 2 times, most recently from 9971b32 to 4a635e9 Compare October 22, 2025 16:43

gn00295120 force-pushed the fix-guardrail-task-cleanup branch from 4a635e9 to eb447fb Compare October 22, 2025 16:51

gn00295120 force-pushed the fix-guardrail-task-cleanup branch from eb447fb to 5e6feae Compare October 22, 2025 16:53

gn00295120 mentioned this pull request Oct 22, 2025

Fix: Properly await cancelled tasks in voice STT cleanup #1977

Open

gn00295120 mentioned this pull request Oct 22, 2025

Fix: Properly await cancelled tasks in voice result streaming cleanup #1978

Draft

openai deleted a comment from ProCityHub Oct 22, 2025

seratch added the feature:realtime label Oct 23, 2025

seratch reviewed Oct 23, 2025

View reviewed changes

src/agents/realtime/session.py Outdated Show resolved Hide resolved

src/agents/realtime/session.py Show resolved Hide resolved

Lucas Wang and others added 3 commits October 23, 2025 13:06

Merge branch 'main' into fix-guardrail-task-cleanup

3077b3a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix: Properly await cancelled guardrail tasks during cleanup #1976

Fix: Properly await cancelled guardrail tasks during cleanup #1976

gn00295120 commented Oct 22, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Oct 22, 2025

Uh oh!

Copilot AI Oct 22, 2025

Uh oh!

Uh oh!

Uh oh!

gn00295120 commented Oct 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	valid_tool_types = get_args(Tool) + (Handoff,)
	valid_tool_types = list(get_args(Tool))
	valid_tool_types.append(Handoff)

	type_names = ", ".join(t.__name__ for t in valid_tool_types)
	type_names = ", ".join(getattr(t, '__name__', str(t)) for t in valid_tool_types)

Fix: Properly await cancelled guardrail tasks during cleanup #1976

Are you sure you want to change the base?

Fix: Properly await cancelled guardrail tasks during cleanup #1976

Conversation

gn00295120 commented Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Evidence

Solution

Testing

Technical Details

Impact

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gn00295120 commented Oct 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

gn00295120 commented Oct 22, 2025 •

edited

Loading