Fix streaming trace end before guardrails complete #1921
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
```python
if streamed_result._input_guardrails_task:
    await streamed_result._input_guardrails_task
if current_span:
    current_span.finish(reset_current=True)
```
Awaiting guardrail task can skip span/trace cleanup on errors
The new await streamed_result._input_guardrails_task happens before current_span.finish() and streamed_result.trace.finish(), but it is not wrapped in any error handling. If the guardrail task raises or is cancelled (for example when a streamed run is cancelled immediately or a guardrail itself errors), the await propagates the exception out of the finally block and the subsequent span/trace cleanup never executes. That leaves spans open and the trace unfinished—the exact state this change intends to prevent. Consider awaiting the guardrail task inside a try/except or contextlib.suppress so that cleanup always runs even when the task fails or is cancelled.
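A minimal sketch of that pattern, as the body of the existing `finally` block (the cleanup calls mirror the snippet above; treat exact signatures as assumptions):

```python
import asyncio
import contextlib

# Ensure span/trace cleanup runs even if the guardrail task raised or was
# cancelled while the stream was being torn down.
if streamed_result._input_guardrails_task:
    with contextlib.suppress(Exception, asyncio.CancelledError):
        await streamed_result._input_guardrails_task
if current_span:
    current_span.finish(reset_current=True)
streamed_result.trace.finish()
```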
```python
        streamed_result.is_complete = True
    finally:
        if streamed_result._input_guardrails_task:
```
I think a better place to put this might be where the output guardrails are awaited (await both the input and output rails with gather or something if they haven't finished by the time the output ones run), but this preserves the existing functionality. Would like input if someone with more context has any.
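For illustration, that alternative might look roughly like this (hypothetical; `_output_guardrails_task` is an assumed attribute name, not confirmed from the source):

```python
import asyncio

# Hypothetical: at the point where output guardrails are awaited, also wait
# for any still-running input guardrails.
input_results, output_results = await asyncio.gather(
    streamed_result._input_guardrails_task,
    streamed_result._output_guardrails_task,
)
streamed_result.input_guardrail_results = input_results
```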
Currently, input guardrails run in parallel, and there’s no requirement for them to finish before executing subsequent agent code or its output guardrails. We might later add an option to customize this behavior so that subsequent logic waits until the input guardrails complete, but that’s not the case yet.
For now, simply awaiting the input guardrails in the finally clause should be fine.
Lint etc. are failing on your second commit. Can you fix that issue too?
src/agents/run.py (Outdated)
```python
    streamed_result.input_guardrail_results = await streamed_result._input_guardrails_task
except Exception:
    # Exceptions will be checked in the stream_events loop
    output_guardrail_results = []
```
I don't think this output-guardrail-related local variable is necessary here. Instead, can you add debug-level logging for seeing what's going on during local development?
Updated it to be just an await with a debug log on exception. I think the tripwire exception might be suppressed, so I added a raise.

Regarding the comment above: that makes sense to me; I definitely don't want to wait until the guardrail completes before starting to get a response. The guardrail should complete at some point before the trace ends, and it should be able to raise an error before the span and trace complete if it's tripped.
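The change at this point was roughly the following shape (a sketch, assuming the SDK's module-level `logger`; the `raise` was removed later in the thread):

```python
try:
    streamed_result.input_guardrail_results = await streamed_result._input_guardrails_task
except Exception as e:
    # Exceptions are also surfaced via the stream_events loop; log here for
    # local debugging and re-raise so a tripwire is not swallowed.
    logger.debug(f"Error in input guardrails task: {e}")
    raise
```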
> Updated it to be just an await with a debug log on exception. I think the tripwire exception might be suppressed by this however.

Yeah, after leaving the comment above, I was wondering how things would go in the scenario where the tripwire fires there. The suggested change here should be better than the current behavior in terms of tracing consistency, but I see the need to handle the tripwire pattern too.

Would it be possible to ask you to do some manual tests to see how it actually works with this change? The tripwire should show up in that case in any tracing provider. If you have more time to spend on this, adding unit tests verifying the pattern would be greatly appreciated.
I'm definitely open to taking some time to ensure it works as intended. I'm a little limited on time today, so I may need to follow up tomorrow or Monday if that's alright.

I already have a repro file I can use to test the before and after behavior, so it shouldn't be hard. I'll take a look at the unit tests and see if I can add a few for the relevant scenarios.
Thank you so much!
I managed to confirm the branch in its current state addresses the inconsistency in trace completion and the trace structure/code behavior is consistent in the following scenarios:
- Input guardrail completes first (prior to changes - happy path)
- Input guardrail completes first (after changes - happy path)
- Input guardrail completes after main request (after changes - scenario PR intends to fix)
I also added some unit tests for various delay scenarios and exception behavior:
- On main: `test_parent_span_and_trace_finish_after_slow_input_guardrail` is the only test that fails
- On this branch: `test_parent_span_and_trace_finish_after_slow_input_guardrail` passes

The raise statement in the exception handler turned out not to be necessary to propagate the tripwire error, and it could short-circuit completion, so I took it out and left the debug statement. I think this should be good to go.
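A sketch of the kind of test added (the `trace_recorder` fixture and agent wiring are assumptions; the real tests presumably use the repo's fake model and testing trace processor):

```python
import asyncio

import pytest

from agents import Agent, GuardrailFunctionOutput, Runner, input_guardrail


@input_guardrail
async def slow_guardrail(context, agent, user_input):
    # Deliberately outlive the streamed response so the race is exercised.
    await asyncio.sleep(0.3)
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)


@pytest.mark.asyncio
async def test_parent_span_and_trace_finish_after_slow_input_guardrail(trace_recorder):
    # trace_recorder: assumed fixture recording span/trace lifecycle events.
    agent = Agent(name="slow_guardrail_agent", input_guardrails=[slow_guardrail])
    result = Runner.run_streamed(agent, input="hello")
    async for _ in result.stream_events():
        pass
    # Even though the guardrail finished after the model stream, every span
    # and the trace itself should be closed by the time streaming ends.
    assert trace_recorder.all_spans_finished()
    assert trace_recorder.trace_finished()
```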
Overall, this looks great to me. Left one comment. Also, can you resolve the lint error? You can run `make lint`, `make mypy`, and `make tests` before pushing commits.
```diff
@@ -0,0 +1,228 @@
+import asyncio
```
We still support Python 3.9, so this CI build fails: https://github.com/openai/openai-agents-python/actions/runs/18673545373/job/53240062121?pr=1921

Can you add this line at the top of this file?

```python
from __future__ import annotations
```
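For context, the usual way this bites on 3.9 is PEP 604 union syntax in annotations (whether that is the exact failure here is an assumption); deferring annotation evaluation avoids it:

```python
from __future__ import annotations  # PEP 563: annotations are not evaluated at import

# Without the future import, a PEP 604 union annotation like this raises
# TypeError at import time on Python 3.9 (hypothetical function):
def wait_for_guardrails(timeout: float | None = None) -> list[str] | None:
    ...
```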
Added!
Thank you!
This addresses an issue that came up while investigating a bug report impacting a wandb/weave user using our agents SDK `TracingProcessor`.

Fixes a race condition where the current span and trace of a streamed run could be finished before the input guardrails task completed, leaving the trace inconsistent.

It may be that this only occurs when there is no next step, since tripwires are awaited there, but I'm not sure.
Photos to illustrate the issue:

Before Fix:


After Fix:
