Skip to content

Enhance prompt injection detection by adding word boundaries for know…#63

Merged
Iamsdt merged 6 commits into10xHub:mainfrom
prashant4654:main
Mar 20, 2026
Merged

Enhance prompt injection detection by adding word boundaries for know…#63
Iamsdt merged 6 commits into10xHub:mainfrom
prashant4654:main

Conversation

@prashant4654
Copy link
Copy Markdown
Contributor

This pull request makes a targeted improvement to the regular expression used for detecting known jailbreak personas in the agentflow/utils/validators.py file. The change adds word boundaries to the regex pattern, which helps avoid false positives when matching persona names like "APOPHIS", "STAN", and "DUDE".

Regex improvements for persona detection:

  • Updated the regex pattern for known jailbreak personas to include word boundaries, ensuring only standalone words are matched and reducing false positives. (agentflow/utils/validators.py)

Copilot AI review requested due to automatic review settings March 20, 2026 12:08
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refines the prompt-injection validator’s “known jailbreak persona” detection regex in agentflow/utils/validators.py by adding word boundaries, reducing false positives from substring matches (e.g., matching STAN inside STANford).

Changes:

  • Updated the known-persona regex to use \b word boundaries and a non-capturing group for APOPHIS|STAN|DUDE.

Comment thread agentflow/utils/validators.py Outdated
Comment thread agentflow/utils/validators.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 64.28571% with 10 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
agentflow/utils/converter.py 59.09% 8 Missing and 1 partial ⚠️
agentflow/graph/agent_internal/providers.py 50.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Collaborator

@Iamsdt Iamsdt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move that formatting logic inside convert message function

Comment thread agentflow/graph/agent_internal/execution.py
@Iamsdt Iamsdt merged commit ffc4d95 into 10xHub:main Mar 20, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants