@RVV-karma

Summary

Adds a run_in_parallel boolean parameter to InputGuardrail that lets developers choose between parallel execution (the default, for lower latency) and blocking execution (for lower cost and to prevent race conditions).

Motivation:

Currently, input guardrails run in parallel with the agent's execution. While this provides low latency, it can lead to:

  • Unnecessary token consumption when a guardrail triggers after the agent has already started processing
  • Tools executing before guardrails can halt execution (race condition)
  • Security concerns when validation needs to complete before any LLM interaction

Changes:

  • Added run_in_parallel: bool = True field to InputGuardrail dataclass (maintains backward compatibility)
  • Updated @input_guardrail decorator to accept run_in_parallel parameter
  • Implemented conditional execution logic in AgentRunner:
    • Non-streaming mode: Sequential guardrails run first (blocking), then parallel guardrails run concurrently with the agent via asyncio.gather (see the sketch after this list)
    • Streaming mode: Sequential guardrails run first, then parallel guardrails run as a background task
  • Updated documentation in docs/guardrails.md with "Execution modes" section explaining use cases
  • Updated example in examples/agent_patterns/input_guardrails.py
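
The non-streaming flow described above follows a split-then-gather pattern; here is a minimal sketch of that idea (run_guardrails and run_agent_turn are illustrative stand-ins for the SDK's internal helpers, not real API):

import asyncio

async def run_with_guardrails(guardrails, run_guardrails, run_agent_turn):
    # Split guardrails by the new run_in_parallel flag.
    sequential = [g for g in guardrails if not g.run_in_parallel]
    parallel = [g for g in guardrails if g.run_in_parallel]

    # Blocking guardrails must finish (and may raise) before any model call.
    if sequential:
        await run_guardrails(sequential)

    # Parallel guardrails run alongside the agent's first turn.
    guardrail_results, turn_result = await asyncio.gather(
        run_guardrails(parallel),
        run_agent_turn(),
    )
    return guardrail_results, turn_result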

Execution Modes:

  1. Parallel execution (run_in_parallel=True, default): Guardrail runs concurrently with the agent for minimal latency
  2. Blocking execution (run_in_parallel=False): Guardrail runs and completes before the agent starts, preventing any token consumption or tool execution if the guardrail triggers (see the usage sketch below)
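
A minimal usage sketch, assuming the decorator form added in this PR (the homework check itself is only illustrative):

from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered,
    Runner,
    input_guardrail,
)

@input_guardrail(run_in_parallel=False)  # blocking: completes before the agent starts
async def homework_check(context, agent, user_input):
    # A real guardrail might call a fast/cheap model here instead.
    flagged = "math homework" in str(user_input).lower()
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=flagged)

agent = Agent(
    name="Support agent",
    instructions="Help customers with their requests.",
    input_guardrails=[homework_check],
)

try:
    result = Runner.run_sync(agent, "Help me with my math homework")
except InputGuardrailTripwireTriggered:
    print("Blocked before the agent ran, so no tokens were spent on the main model.")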

Benefits:

  • Blocking guardrails prevent wasted tokens and tool calls when a check trips before the agent starts
  • Parallel guardrails keep the existing low-latency behavior
  • The run_in_parallel=True default keeps the change fully backward compatible

Test plan

Tests Added:

  • test_input_guardrail_run_in_parallel_default - Verifies the default is True (see the sketch after this list)
  • test_input_guardrail_run_in_parallel_false - Verifies can be set to False
  • test_input_guardrail_decorator_with_run_in_parallel - Tests decorator parameter
  • test_input_guardrail_decorator_with_name_and_run_in_parallel - Tests both name and run_in_parallel parameters
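
As a sketch of what the first two tests could look like (the field itself comes from this PR; the exact test bodies in tests/test_guardrails.py may differ):

from agents import GuardrailFunctionOutput, InputGuardrail

def _noop_guardrail(context, agent, user_input):
    return GuardrailFunctionOutput(output_info=None, tripwire_triggered=False)

def test_input_guardrail_run_in_parallel_default():
    guardrail = InputGuardrail(guardrail_function=_noop_guardrail)
    assert guardrail.run_in_parallel is True  # parallel stays the default

def test_input_guardrail_run_in_parallel_false():
    guardrail = InputGuardrail(guardrail_function=_noop_guardrail, run_in_parallel=False)
    assert guardrail.run_in_parallel is False  # opt in to blocking execution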

Tests Run:

  • ✅ All guardrail tests pass (12/12)
  • ✅ All agent runner tests pass (62/63, 1 pre-existing Windows SQLite file locking issue unrelated to changes)
  • ✅ Code formatted with python -m ruff format
  • ✅ Code linted with python -m ruff check --fix
  • ✅ Type checked with python -m mypy . (pre-existing optional dependency errors only)

Verification:

  • Existing guardrail behavior unchanged (parallel by default)
  • Both streaming and non-streaming modes handle blocking guardrails correctly
  • Sequential guardrails raise an exception before the agent starts (no token consumption)
  • Parallel guardrails maintain existing behavior

Related issues

Closes #1985
Fixes #889
Fixes #991


Checklist:

  • New tests added
  • Documentation updated
  • make format / ruff format run
  • make lint / ruff check run
  • Full test suite passes
  • Commit message follows imperative mood

@seratch added the enhancement (New feature or request) and feature:core labels on Oct 23, 2025
@seratch (Member) commented Oct 23, 2025

This design looks flexible and clean enough. @rm-openai what do you think?


Copilot AI left a comment

Pull Request Overview

This PR adds a run_in_parallel parameter to input guardrails, allowing developers to choose between parallel execution (default, for better latency) and blocking execution (for better cost efficiency and to prevent race conditions). The change is backward compatible with the default set to True.

Key Changes:

  • Added run_in_parallel boolean field to InputGuardrail dataclass and @input_guardrail decorator
  • Implemented conditional execution logic in AgentRunner to run sequential guardrails before the agent starts and parallel guardrails concurrently
  • Updated documentation and examples to explain the new execution modes

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/agents/guardrail.py: Adds run_in_parallel field to InputGuardrail class and parameter to decorator (see the sketch below)
  • src/agents/run.py: Implements sequential vs parallel guardrail execution logic in both streaming and non-streaming modes
  • tests/test_guardrails.py: Adds tests for the new run_in_parallel parameter with default and explicit values
  • docs/guardrails.md: Documents the new execution modes for input guardrails
  • examples/agent_patterns/input_guardrails.py: Updates example documentation to explain the new feature
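
For the guardrail.py row above, a rough sketch of how the decorator could thread the new flag through to the dataclass (an illustration of the described change, not the PR's exact code, which also carries typing overloads and sync/async handling):

from __future__ import annotations

from typing import Any, Callable

from agents import InputGuardrail

def input_guardrail(
    func: Callable[..., Any] | None = None,
    *,
    name: str | None = None,
    run_in_parallel: bool = True,
):
    def decorator(f: Callable[..., Any]) -> InputGuardrail:
        # Forward the new flag so InputGuardrail knows how it should be scheduled.
        return InputGuardrail(
            guardrail_function=f,
            name=name,
            run_in_parallel=run_in_parallel,
        )

    # Support both @input_guardrail and @input_guardrail(run_in_parallel=False).
    return decorator(func) if func is not None else decorator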


Comment on lines +100 to +103
run_in_parallel: bool = True
"""Whether the guardrail runs concurrently with the agent (True, default) or before
the agent starts (False).
"""

Copilot AI Oct 23, 2025

The class-level docstring for InputGuardrail states 'Input guardrails are checks that run in parallel to the agent's execution' but this is no longer accurate since guardrails can now run sequentially. The docstring should be updated to reflect that guardrails can run either in parallel or sequentially based on the run_in_parallel parameter.


@RVV-karma (Author)

Updated in 1ad513a

# Guardrails

- Guardrails run _in parallel_ to your agents, enabling you to do checks and validations of user input. For example, imagine you have an agent that uses a very smart (and hence slow/expensive) model to help with customer requests. You wouldn't want malicious users to ask the model to help them with their math homework. So, you can run a guardrail with a fast/cheap model. If the guardrail detects malicious usage, it can immediately raise an error, which stops the expensive model from running and saves you time/money.
+ Guardrails enable you to do checks and validations of user input and agent output. For example, imagine you have an agent that uses a very smart (and hence slow/expensive) model to help with customer requests. You wouldn't want malicious users to ask the model to help them with their math homework. So, you can run a guardrail with a fast/cheap model. If the guardrail detects malicious usage, it can immediately raise an error and prevent the expensive model from running, saving you time and money.

Copilot AI Oct 23, 2025

The updated documentation now states guardrails 'can immediately raise an error and prevent the expensive model from running' but this is only accurate for blocking guardrails (run_in_parallel=False). For parallel guardrails (the default), the expensive model may have already started. Consider clarifying this distinction or adding a note that this benefit applies specifically to blocking guardrails.

Suggested change
- Guardrails enable you to do checks and validations of user input and agent output. For example, imagine you have an agent that uses a very smart (and hence slow/expensive) model to help with customer requests. You wouldn't want malicious users to ask the model to help them with their math homework. So, you can run a guardrail with a fast/cheap model. If the guardrail detects malicious usage, it can immediately raise an error and prevent the expensive model from running, saving you time and money.
+ Guardrails enable you to do checks and validations of user input and agent output. For example, imagine you have an agent that uses a very smart (and hence slow/expensive) model to help with customer requests. You wouldn't want malicious users to ask the model to help them with their math homework. So, you can run a guardrail with a fast/cheap model. If the guardrail detects malicious usage, it can immediately raise an error and prevent the expensive model from running, saving you time and money (**when using blocking guardrails; for parallel guardrails, the expensive model may have already started running before the guardrail completes—see "Execution modes" below for details**).


@RVV-karma (Author)

Updated in 1ad513a

@RVV-karma force-pushed the guardrail-execution branch from 5b0b203 to 3df0d58 on October 23, 2025 at 13:06
@chatgpt-codex-connector

💡 Codex Review

if current_turn == 1:
    # Separate guardrails based on execution mode
    all_input_guardrails = starting_agent.input_guardrails + (
        run_config.input_guardrails or []
    )
    sequential_guardrails = [
        g for g in all_input_guardrails if not g.run_in_parallel
    ]
    parallel_guardrails = [g for g in all_input_guardrails if g.run_in_parallel]
    # Run sequential guardrails first (will raise exception if tripwire triggered)
    if sequential_guardrails:
        await cls._run_input_guardrails_with_queue(
            starting_agent,
            sequential_guardrails,
            ItemHelpers.input_to_new_input_list(prepared_input),
            context_wrapper,
            streamed_result,
            current_span,
        )
    # Run parallel guardrails in background
    streamed_result._input_guardrails_task = asyncio.create_task(
        cls._run_input_guardrails_with_queue(
            starting_agent,
            parallel_guardrails,
            ItemHelpers.input_to_new_input_list(prepared_input),
            context_wrapper,
            streamed_result,
            current_span,
        )
    )
try:
    turn_result = await cls._run_single_turn_streamed(
        streamed_result,
P1: Abort streaming turn when blocking guardrail trips

In the streaming path, blocking input guardrails are awaited via _run_input_guardrails_with_queue(...) and then _run_single_turn_streamed is called unconditionally. Because _run_input_guardrails_with_queue never raises, a guardrail with run_in_parallel=False can set tripwire_triggered=True yet the agent still executes the first turn before the consumer ever sees the exception. That defeats the intended latency vs. safety trade‑off—tools and tokens are consumed even when a blocking guardrail fails. Check the guardrail results and raise InputGuardrailTripwireTriggered before starting the turn.
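
A minimal sketch of the check Codex is suggesting (guardrail_results stands in for however the blocking guardrails' results are surfaced, since, per the review, _run_input_guardrails_with_queue itself never raises):

from agents import InputGuardrailTripwireTriggered

# After awaiting the blocking guardrails, inspect their results and bail out
# before _run_single_turn_streamed is ever called.
for result in guardrail_results:
    if result.output.tripwire_triggered:
        raise InputGuardrailTripwireTriggered(result)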


@RVV-karma (Author)

@codex review

@RVV-karma (Author)

I'm done from my side. Please review it and let me know if any changes are needed. Thanks!

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Swish!
