Skip to content

feat(03-integration): alice.io WonderFence guardrails#223

Merged
manoj-selvakumar5 merged 5 commits intostrands-agents:mainfrom
lior-k:alice-io-wonderfence
Mar 17, 2026
Merged

feat(03-integration): alice.io WonderFence guardrails#223
manoj-selvakumar5 merged 5 commits intostrands-agents:mainfrom
lior-k:alice-io-wonderfence

Conversation

@lior-k
Copy link
Copy Markdown
Contributor

@lior-k lior-k commented Feb 8, 2026

Summary

This PR adds a new third-party guardrails integration example demonstrating how to use Alice WonderFence with Strands agents for real-time AI safety protection.

What's New

  • Alice WonderFence Integration: Complete working example of integrating WonderFence guardrails with Strands agents
  • Hook-Based Implementation: Uses Strands lifecycle hooks to intercept and evaluate content at four key points:
    • on_before_model_call: Evaluates user prompts before reaching the model
    • on_after_model_call: Evaluates model responses before returning to users
    • on_before_tool_call: Evaluates tool input parameters for safety
    • on_after_tool_call: Evaluates tool execution results

Features

  • Adaptive Protection: Real-time detection and blocking of harmful prompts and outputs
  • Flexible Actions: BLOCK, MASK, or ALLOW content based on configured policies
  • Multimodal Support: Works with text, images, and other content types
  • Multilingual: Supports 20+ languages
  • Customizable Policies: Configure detection rules through the WonderFence UI

Example Test Cases

The integration includes demonstrations of:

  • Prompt injection attacks (system prompt override, impersonation)
  • Hate speech detection
  • PII masking (email addresses, phone numbers)
  • Abusive or harmful content filtering

Files Added

  • 03-integrations/third-party-guardrails/04-alice-wonderfence/README.md - Documentation and setup instructions
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/guardrail.py - WonderFence hook implementation
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/main.py - Demo application with test cases
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/requirements.txt - Python dependencies

lior-k and others added 2 commits February 8, 2026 17:46
Adds third-party guardrails example integrating Alice WonderFence with Strands
agents for real-time AI safety protection. The integration uses Strands hooks
to evaluate prompts, responses, tool inputs, and tool outputs.

Features:
- Hook-based implementation (BeforeModelCall, AfterModelCall, BeforeTool, AfterTool)
- Support for BLOCK, MASK, and ALLOW actions
- Multimodal and multilingual detection (20+ languages)
- Example demonstrations of prompt injection, hate speech, and PII detection
- Customizable policies via WonderFence UI

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 10, 2026

Latest scan for commit: 8327abd | Updated: 2026-03-17 01:43:21 UTC

✅ Security Scan Report (PR Files Only)

Scanned Files

  • 03-integrations/third-party-guardrails/04-alice-wonderfence/README.md
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/guardrail.py
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/main.py
  • 03-integrations/third-party-guardrails/04-alice-wonderfence/requirements.txt

Security Scan Results

Critical High Medium Low Info
0 0 0 0 0

Threshold: High

No security issues detected in your changes. Great job!

This scan only covers files changed in this PR.

Comment thread 03-integrations/third-party-guardrails/04-alice-wonderfence/guardrail.py Outdated
Comment thread 03-integrations/third-party-guardrails/04-alice-wonderfence/guardrail.py Outdated
Comment thread 03-integrations/third-party-guardrails/04-alice-wonderfence/guardrail.py Outdated
Comment thread 03-integrations/third-party-guardrails/04-alice-wonderfence/main.py Outdated
Comment thread 03-integrations/third-party-guardrails/04-alice-wonderfence/main.py Outdated
@manoj-selvakumar5
Copy link
Copy Markdown
Collaborator

Hi @lior-k - Thank you for the PR. the hook architecture is great, especially covering all 4 lifecycle hooks and supporting MASK actions. I tested the sample end-to-end and left a few comments on the PR. The main blocker is the argument order in the SDK calls (context should come before content). The other two are smaller fixes. Please let me know if you have questions!

lior-k and others added 2 commits March 4, 2026 13:43
- Swap (content, context) to (context, content) for evaluate_prompt_sync
  and evaluate_response_sync calls to match SDK signature
- Move Agent creation inside the test loop to avoid broken alternating
  user/assistant message pattern after WonderFenceViolationException

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SDK's evaluate_prompt_sync and evaluate_response_sync take
(content, context), not (context, content). The original call order
was correct.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lior-k
Copy link
Copy Markdown
Contributor Author

lior-k commented Mar 4, 2026

hi @manoj-selvakumar5. thanks for the review 🙏 The parameters order is actually correct, as confirmed on the PyPI library page: https://pypi.org/project/wonderfence-sdk/ however we added keyword arguments to function calls to make it clearer.

  • We fixed all the issues at hand

@lior-k
Copy link
Copy Markdown
Contributor Author

lior-k commented Mar 12, 2026

Thanks for the headsup @manoj-selvakumar5 🙌
We've updated the function calls based on the latest changes in the WonderFence SDK

@manoj-selvakumar5 manoj-selvakumar5 merged commit 4cc9f30 into strands-agents:main Mar 17, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants