Skip to content

Fix detector bypass regressions in CI/CD and markdown image exfiltration rules#19

Closed
cdayAI wants to merge 1 commit into
mainfrom
codex/fix-vulnerabilities-in-detection-rules
Closed

Fix detector bypass regressions in CI/CD and markdown image exfiltration rules#19
cdayAI wants to merge 1 commit into
mainfrom
codex/fix-vulnerabilities-in-detection-rules

Conversation

@cdayAI
Copy link
Copy Markdown
Owner

@cdayAI cdayAI commented May 26, 2026

Motivation

  • A recent change narrowed data-exfiltration parameter detection and added a broad negation lookahead that allowed crafted CI/CD @agent prompts and markdown-image URLs to bypass detection.

Description

  • Restored markdown-image exfiltration coverage to include key= in both legacy and v14 data-exfiltration regexes in python-sdk/agent_shield/detector.py so payloads like ?key=SECRET... are detected.
  • Adjusted the CI/CD @agent exfiltration regex in python-sdk/agent_shield/detector.py so an early benign-looking negation no longer suppresses a later explicit exfiltration instruction while preserving the narrow exception for the common benign phrase leak any sensitive data.
  • Added two regression tests to python-sdk/tests/test_detector.py that assert detection for the negation-prefix CI/CD bypass and a ?key= markdown image exfil payload.

Testing

  • Ran the v14 category tests with python -m pytest tests/test_detector.py -q in python-sdk and all tests passed (53 passed).
  • Existing detector unit tests that exercise related categories were executed as part of the same test run and succeeded.

Codex Task

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant