Enhance prompt injection detection by adding word boundaries for know… by prashant4654 · Pull Request #63 · 10xHub/Agentflow

prashant4654 · 2026-03-20T12:08:43Z

This pull request makes a targeted improvement to the regular expression used for detecting known jailbreak personas in the agentflow/utils/validators.py file. The change adds word boundaries to the regex pattern, which helps avoid false positives when matching persona names like "APOPHIS", "STAN", and "DUDE".

Regex improvements for persona detection:

Updated the regex pattern for known jailbreak personas to include word boundaries, ensuring only standalone words are matched and reducing false positives. (agentflow/utils/validators.py)

…n jailbreak personas

Copilot

Pull request overview

This PR refines the prompt-injection validator’s “known jailbreak persona” detection regex in agentflow/utils/validators.py by adding word boundaries, reducing false positives from substring matches (e.g., matching STAN inside STANford).

Changes:

Updated the known-persona regex to use \b word boundaries and a non-capturing group for APOPHIS|STAN|DUDE.

…intainability

…nses method

codecov · 2026-03-20T12:38:58Z

Codecov Report

❌ Patch coverage is 64.28571% with 10 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
agentflow/utils/converter.py	59.09%	8 Missing and 1 partial ⚠️
agentflow/graph/agent_internal/providers.py	50.00%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Iamsdt

Move that formatting logic inside convert message function

…onverter module

Enhance prompt injection detection by adding word boundaries for know…

5249a2f

…n jailbreak personas

Copilot AI review requested due to automatic review settings March 20, 2026 12:08

Copilot started reviewing on behalf of prashant4654 March 20, 2026 12:09 View session

Copilot AI reviewed Mar 20, 2026

View reviewed changes

Comment thread agentflow/utils/validators.py Outdated

Comment thread agentflow/utils/validators.py Outdated

prashant4654 added 4 commits March 20, 2026 17:42

Refactor regex for known jailbreak personas to improve clarity and ma…

6582bef

…intainability

Add tests for detecting jailbreak personas and prevent false positives

133d92f

Add noqa directive to suppress linting warning for _call_openai_respo…

9201b3b

…nses method

Add system prompt interpolation method and clean up import statements

8c965e1

Iamsdt requested changes Mar 20, 2026

View reviewed changes

Comment thread agentflow/graph/agent_internal/execution.py

Refactor system prompt interpolation to a dedicated function in the c…

70b8e8b

…onverter module

Iamsdt approved these changes Mar 20, 2026

View reviewed changes

Iamsdt merged commit ffc4d95 into 10xHub:main Mar 20, 2026
1 of 2 checks passed

This was referenced Mar 20, 2026

🐛 Bug Report: Cannot Use Custom State Variables in Agent Prompt #62

Closed

Regex in STAN falsely detects the word “Understand” as prompt injection #61

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhance prompt injection detection by adding word boundaries for know…#63

Enhance prompt injection detection by adding word boundaries for know…#63
Iamsdt merged 6 commits into10xHub:mainfrom
prashant4654:main

prashant4654 commented Mar 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 20, 2026 •

edited

Loading

Uh oh!

Iamsdt left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

prashant4654 commented Mar 20, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Iamsdt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Mar 20, 2026 •

edited

Loading