Closed as not planned
Description
Describe the bug
When using prompt_with_handoff_instructions with the Azure OpenAI endpoint, the prompt is flagged by Azure’s content filtering policy with a jailbreak detection error.
The error message returned is:
Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'code': 'content_filter', 'innererror': {'code': 'ResponsibleAIPolicyViolation', 'content_filter_result': {'jailbreak': {'detected': True, 'filtered': True}}}}}
However, if I bypass prompt_with_handoff_instructions and pass a custom prompt directly, everything works as expected.
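To confirm which filter category rejected the request, the error payload quoted above can be inspected programmatically. The following is a minimal sketch, assuming the payload has exactly the shape shown in the error message; triggered_filters is a hypothetical helper, not part of the Agents SDK or the OpenAI client. Because a client library may expose either the outer dict or only the inner error object, the helper accepts both.

```python
from typing import Any


def triggered_filters(error_body: dict[str, Any]) -> list[str]:
    """Return the content-filter categories marked as filtered (hypothetical helper)."""
    # Accept either the full payload or just the inner "error" object.
    error = error_body.get("error", error_body)
    inner = error.get("innererror", {})
    results = inner.get("content_filter_result", {})
    return sorted(
        name
        for name, verdict in results.items()
        if isinstance(verdict, dict) and verdict.get("filtered")
    )


# Relevant part of the payload from the error message above.
body = {
    "error": {
        "code": "content_filter",
        "innererror": {
            "code": "ResponsibleAIPolicyViolation",
            "content_filter_result": {
                "jailbreak": {"detected": True, "filtered": True}
            },
        },
    }
}

print(triggered_filters(body))  # ['jailbreak']
```

For this report, the only category returned is jailbreak, which matches the description above.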
Debug information
- Agents SDK version: v0.0.16
- Python version: 3.11
- Model: gpt-4o
- Endpoint: Azure OpenAI
Repro steps
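from agents import Agent, ModelSettings, OpenAIChatCompletionsModel, RunConfig, Runner
from agents.extensions.handoff_prompt import prompt_with_handoff_instructions

# AgentContext, orchestrator_prompt_default, input_messages, and context are
# application-specific and defined elsewhere; this runs inside an async method.
agent = Agent[AgentContext](
    name="Orchestrator",
    # The helper injects the SDK's handoff system context into the prompt.
    instructions=prompt_with_handoff_instructions(orchestrator_prompt_default),
    # Chat Completions model backed by the Azure OpenAI client.
    model=OpenAIChatCompletionsModel(
        model=self.model_name, openai_client=self.openai_client
    ),
)
result = await Runner.run(
    agent,
    input=input_messages,
    context=context,
    run_config=RunConfig(model_settings=ModelSettings(temperature=0.2)),
)
print(result.final_output)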
This throws a 400 error with a content_filter violation because the moderation system flags the system context injected by prompt_with_handoff_instructions.
When using a plain prompt (e.g., just "Hello, what can you do?"), the request succeeds.
I also tried the same prompt in Azure AI Studio, where it worked as expected.
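The workaround mentioned earlier (bypassing the helper and passing a custom prompt) can be sketched as follows. This assumes the helper does nothing more than prepend a fixed preamble to the prompt. CUSTOM_HANDOFF_PREFIX and with_custom_handoff_prefix are hypothetical names, the preamble wording is illustrative only, and it has not been verified against Azure's content filter.

```python
# Hypothetical, shorter handoff preamble. This is NOT the SDK's recommended
# text, and it has not been verified against Azure's content filter.
CUSTOM_HANDOFF_PREFIX = (
    "You work alongside other specialized agents. "
    "When a request is better handled by another agent, "
    "transfer it using the available transfer tool."
)


def with_custom_handoff_prefix(prompt: str, prefix: str = CUSTOM_HANDOFF_PREFIX) -> str:
    """Prepend a caller-controlled handoff preamble to an agent prompt."""
    return f"{prefix}\n\n{prompt}"


# Example: build the instructions string for an orchestrator agent.
instructions = with_custom_handoff_prefix(
    "You are the orchestrator. Route each request."
)
print(instructions)
```

The resulting string can be passed as the agent's instructions in place of the helper's output, which keeps the preamble wording fully under the caller's control.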
Expected behavior
prompt_with_handoff_instructions should not trigger moderation filters, especially with default or recommended content. Ideally, the system prompt it generates should be safe to use even on stricter platforms like Azure OpenAI.