Problem Statement
Feature: Bedrock Converse – support GuardrailConverseContent blocks + escape hatch
Background
Bedrock's Converse API lets callers wrap only the text that should be moderated in a guardrailConverseContent block:

```json
{ "type": "guardrailConverseContent", "guardContent": { "text": "…" } }
```

This is critical for low-latency chat UIs because the guardrail then ignores prior turns.
Docs: https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use-converse-api.html#evaluate-specific-content
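As a rough sketch of the underlying request shape (the model ID, guardrail ID, and history contents below are placeholders, and the exact content-block schema should be checked against the Converse API reference), a request that guards only the latest turn looks something like this:

```python
# Conversation history that should NOT be evaluated by the guardrail.
history = [
    {"role": "user", "content": [{"text": "Earlier question"}]},
    {"role": "assistant", "content": [{"text": "Earlier answer"}]},
]

latest = "Only this gets moderated"

# Only the final user turn is wrapped in a guardContent block, so the
# guardrail evaluates just that text instead of the whole conversation.
messages = history + [
    {
        "role": "user",
        "content": [
            {"guardContent": {"text": {"text": latest}}},
        ],
    }
]

request = {
    "modelId": "us.amazon.nova-pro-v1:0",
    "messages": messages,
    "guardrailConfig": {
        "guardrailIdentifier": "gr-abc123",
        "guardrailVersion": "DRAFT",
    },
}
# With boto3 this payload would be sent via:
#   boto3.client("bedrock-runtime").converse(**request)
```

The point is that selective guarding is purely a matter of which content blocks the wrapper emits; no extra API is needed on the Bedrock side.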
Current behaviour: all conversation history is sent, and there is no way to wrap only the latest message in a GuardrailConverseContentBlock.

```python
from strands import Agent
from strands.models import BedrockModel

bedrock_model = BedrockModel(
    model_id="us.amazon.nova-pro-v1:0",
    guardrail_id="gr-abc123",
    guardrail_version="DRAFT",
)
agent = Agent(model=bedrock_model)
response = agent("User message")
```

- Strands serialises every message as a plain text block.
- Result: the guardrail evaluates the entire conversation, adding cost and latency.
Proposed Solution
Expected / requested behaviour:
Add a guard_last_turn_only flag to BedrockModel that wraps only the latest user message in a guardrailConverseContent block, leaving conversation history unguarded.
```python
bedrock_model = BedrockModel(
    model_id="us.amazon.nova-pro-v1:0",
    guardrail_id="gr-abc123",
    guardrail_version="DRAFT",
    guard_last_turn_only=True,  # <-- NEW FLAG
)
agent = Agent(model=bedrock_model)
response = agent("User message")
```

Advanced control when needed
Add a raw_blocks kwarg that lets advanced users pass a list of fully-formed Bedrock content-block dicts. Strands would relay them unchanged, enabling any combination of ContentBlock types, including custom guardrail wrapping, without waiting for wrapper updates.
```python
bedrock_model = BedrockModel(
    model_id="us.amazon.nova-pro-v1:0",
    guardrail_id="gr-abc123",
    guardrail_version="DRAFT",
)
agent = Agent(model=bedrock_model)
response = agent.run(
    raw_blocks=[
        {"type": "text", "text": "System prompt"},
        {"type": "text", "text": "Un-guarded history"},
        {"type": "guardrailConverseContent",
         "guardContent": {"text": "Only this gets moderated"}},
    ]
)
```

Use Case
I'm building a low-latency, multi-turn chat product on Bedrock with Strands, but I can't use Bedrock's selective-guarding feature because the wrapper always sends the whole conversation. A simple guard_last_turn_only flag (plus an escape-hatch raw_blocks param) would let me moderate just the latest user input, or selectively guard inputs, cutting latency, cost, and false-positive risk.
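To illustrate how little the flag would need to do, here is a hedged sketch of a hypothetical helper (the name _wrap_last_turn and the exact block schema are assumptions, not Strands internals) that rewrites only the final user message's text blocks into guardContent blocks before the request is formatted:

```python
def _wrap_last_turn(messages):
    """Return a copy of `messages` in which the text blocks of the final
    user message are wrapped as guardContent blocks, so only that turn
    is evaluated by the guardrail.

    Hypothetical helper: the real BedrockModel internals may differ.
    """
    wrapped = [dict(m) for m in messages]
    # Walk backwards to find the most recent user turn.
    for i in range(len(wrapped) - 1, -1, -1):
        if wrapped[i]["role"] == "user":
            wrapped[i] = {
                "role": "user",
                "content": [
                    {"guardContent": {"text": {"text": block["text"]}}}
                    if "text" in block
                    else block
                    for block in wrapped[i]["content"]
                ],
            }
            break
    return wrapped


messages = [
    {"role": "user", "content": [{"text": "old turn"}]},
    {"role": "assistant", "content": [{"text": "reply"}]},
    {"role": "user", "content": [{"text": "new turn"}]},
]
guarded = _wrap_last_turn(messages)
```

Earlier turns pass through untouched, so the guardrail's cost scales with the size of the latest input rather than the whole transcript.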
Alternative Solutions
No response
Additional Context
The same issue has already been fixed in LiteLLM and LangChain: BerriAI/litellm#12676, langchain-ai/langchain-aws#540