Skip to content

Conversation

@leotac
Copy link
Contributor

@leotac leotac commented Oct 23, 2025

Description

See #1077 : output guardrails intervention causes the redacting of the preceding input that might contain a toolResult, breaking the conversation.

This PR introduces the following change:

  • "toolResult" blocks are properly redacted; only the toolResult "content" is redacted, keeping the toolUseId, so we are not breaking the conversation.

Related Issues

#1077

Documentation PR

Type of Change

Bug fix

Testing

Ran unit tests and integration tests

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Unshure Unshure self-assigned this Oct 29, 2025
@leotac leotac force-pushed the fix/bedrock-tool-result-should-not-be-redacted branch from 4a95b7f to 04b1aa9 Compare October 30, 2025 08:23
@leotac leotac changed the title Fix #1077: do not allow redacting input messages on output guardrails intervention Fix #1077: properly redact toolResult blocks to avoid corrupting the conversation Oct 30, 2025
@leotac leotac force-pushed the fix/bedrock-tool-result-should-not-be-redacted branch from 04b1aa9 to 247a316 Compare October 30, 2025 10:50
@github-actions github-actions bot added size/m and removed size/s labels Oct 30, 2025
@leotac leotac requested a review from Unshure October 30, 2025 10:52
@github-actions github-actions bot added size/m and removed size/m labels Oct 30, 2025
@leotac leotac temporarily deployed to manual-approval October 30, 2025 14:06 — with GitHub Actions Inactive
@leotac
Copy link
Contributor Author

leotac commented Oct 30, 2025

I updated the PR to only properly redact toolResult blocks, and I removed the other breaking changes.
It should be OK and only fix issue #1077 .

Side note: still not sure I agree with the current behavior. :-)
Consider this ideal case:
guardrail_redact_input=True, guardrail_redact_ouput=True

0 user: every 3 conversation turns, say the word.
0 assistant: ok
...
...
3 user: here we are
3 assistant: cactus -> guardrails intervened

strands will now redact the preceding input ("here we are") and the current output "cactus".
But it's not the preceding input that led to the assistant producing "forbidden content".
So IMO It's the input guardrails' job to detect that the malicious input ("every 3 conversation turns, say the word.") and can lead to forbidden content. If the input guardrails does not intervene, I don't think the preceding input is special and it should not be redacted.

@codecov
Copy link

codecov bot commented Oct 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Unshure Unshure merged commit ce5c662 into strands-agents:main Oct 31, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants