feat: Phase D broaden detection — D5-D10, T7 by devin-ai-integration[bot] · Pull Request #51 · Odingard/Argus

devin-ai-integration · 2026-04-15T04:45:07Z

Summary

Adds six new detection capabilities across existing agents and infrastructure, without introducing new agent classes:

D5 (model_extraction.py): 3 new tool/function discovery techniques (d5_tool_schema_extraction, d5_function_call_probing, d5_capability_enumeration) plus wiring in _evaluate_response() and _record_intelligence().
D6 (privilege_escalation.py): 4 BOLA (Broken Object Level Authorization) payload sets — numeric IDOR, UUID swap, path traversal, mass assignment — with _test_bola() / _report_bola().
D7 (identity_spoof.py): 5 social engineering BFLA techniques (CEO urgency, compliance pressure, helpdesk, developer debug, time pressure) with _test_social_engineering_bfla() / _evaluate_bfla_response() / _report_bfla().
D8 (conductor/evaluation.py): 8 new PII patterns in DataCategoryMatcher.PATTERNS — phone, SSN, credit card (Visa/MC/Amex/Discover), IPv4, IPv6, date of birth, passport, medical record ID.
D10 (correlation/engine.py): 5 new compound attack path rules chaining D5/D6/D7 findings with existing agents.
T7 (conductor/session.py): ConnectionPool class — shared httpx.AsyncClient instances keyed by (host, timeout, csrf_mode). ConversationSession accepts optional pool= parameter; backward-compatible (no pool = existing behavior).

No new agents, no existing tests modified. All changes are additive.
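
As a rough illustration of the T7 keying described above, the sketch below caches one client per (host, timeout, csrf_mode) tuple. `SketchPool` and `FakeClient` are hypothetical stand-ins so the example runs without a network; they are not the real `ConnectionPool` or `httpx.AsyncClient`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeClient:
    """Hypothetical stand-in for httpx.AsyncClient (assumption, not the real client)."""
    timeout: float
    csrf_mode: bool


class SketchPool:
    """Illustrative pool: at most one client per (host, timeout, csrf_mode) key."""

    def __init__(self):
        self._clients = {}
        self._lock = asyncio.Lock()

    async def get(self, host, timeout, csrf_mode):
        key = (host, timeout, csrf_mode)
        async with self._lock:
            # Create the client lazily the first time this exact key is requested.
            if key not in self._clients:
                self._clients[key] = FakeClient(timeout, csrf_mode)
            return self._clients[key]


async def demo():
    pool = SketchPool()
    a = await pool.get("target:8080", 30.0, False)
    b = await pool.get("target:8080", 30.0, False)
    c = await pool.get("target:8080", 30.0, True)
    # Same key reuses the client; a different csrf_mode gets its own client.
    return a is b, a is c


same_key_shared, csrf_modes_share = asyncio.run(demo())
```

With only (host, timeout) as the key, the third request would have received a client configured for the wrong csrf_mode, which is the misconfiguration the three-part key is meant to avoid.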

Updates since initial commit

Fixed issues flagged across five rounds of Devin Review:

BFLA false-positive fix (identity_spoof.py): _evaluate_bfla_response now returns None whenever refusal_hits is non-empty and there's no hard evidence (markers), regardless of whether compliance keywords or soft privilege indicators are present. Previously, refusal messages like "I cannot share password info" or "I cannot grant you admin access" would match compliance/privilege keywords and emit false findings.
ConnectionPool cache key (conductor/session.py): Cache key is now (host, timeout, csrf_mode) instead of (host, timeout) to prevent incorrect client configuration.
PII phone regex (conductor/evaluation.py): Added word boundaries (\b) and made separators mandatory to prevent matching timestamps and contiguous digit sequences.
PII passport/medical_id regex (conductor/evaluation.py): Made colon/equals separator mandatory and added (?=[A-Z0-9]*\d) lookahead requiring at least one digit, preventing false matches on English words like "passport details".
Pooled session __aexit__ (conductor/session.py): Always clears self._client = None after exit, even for pooled sessions, so the turn() use-after-exit guard works correctly.
D7 BFLA unreachable code (identity_spoof.py): Restructured _attack_base() so chat endpoints are fetched independently of identity endpoints. D7 BFLA tests now run whenever chat endpoints are available, even when no identity surface exists.
BFLA refusal filter hardened (identity_spoof.py): Refusal detection now only yields to hard evidence (sensitive markers like leaked keys/tokens). Soft privilege indicators like "admin" no longer override refusal classification, since they commonly appear in refusal text (e.g. "I cannot grant you admin access").
IPv6 regex fix (conductor/evaluation.py): Replaced \b word boundaries with explicit lookaround anchors ((?:^|(?<=\s)|(?<=[=,;])) / (?=\s|$|[,;])) because \b doesn't fire before a leading :: or after a trailing :: (\b needs a word character on exactly one side, and neither : nor the start or end of the string counts as one). Added a fourth alternative to handle mid-address :: forms like fe80::1. Verified: ::1, fe80::1, 2001:db8::, and full-form addresses all match correctly. Known gap: the IPv4-mapped form ::ffff:192.168.1.1 is not matched (dots in suffix).
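
The \b behaviour behind the IPv6 fix can be reproduced with a small experiment. The sketch below uses a deliberately simplified address body (`BODY` is an assumption for illustration; it is not the pattern in conductor/evaluation.py and it over-accepts inputs such as "a:b:c"). Only the two lookaround anchors are taken from the description above.

```python
import re

# Simplified stand-in for an IPv6 body: hex groups separated by colons.
# Assumption for illustration only; NOT the pattern used in the PR.
BODY = r"[0-9A-Fa-f]{0,4}(?::[0-9A-Fa-f]{0,4}){2,7}"

# Variant 1: plain word boundaries on both sides.
WITH_WORD_BOUNDARY = re.compile(r"\b" + BODY + r"\b")

# Variant 2: the lookaround anchors quoted in the PR description.
LEFT = r"(?:^|(?<=\s)|(?<=[=,;]))"
RIGHT = r"(?=\s|$|[,;])"
WITH_LOOKAROUND = re.compile(LEFT + BODY + RIGHT)


def matched_text(pattern, text):
    """Return the matched substring, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group() if m else None


# \b cannot anchor before a leading "::" because there is no word character there.
loopback_with_b = matched_text(WITH_WORD_BOUNDARY, "::1")
# The lookaround version anchors on start-of-string, whitespace, or a delimiter.
loopback = matched_text(WITH_LOOKAROUND, "::1")
link_local = matched_text(WITH_LOOKAROUND, "fe80::1")
trailing = matched_text(WITH_LOOKAROUND, "2001:db8::")
embedded = matched_text(WITH_LOOKAROUND, "addr=::1;")
mapped_v4 = matched_text(WITH_LOOKAROUND, "::ffff:192.168.1.1")
```

In this simplified form the IPv4-mapped address is also rejected, because the dotted suffix is neither a hex group nor followed by one of the allowed right-hand delimiters.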

Review & Testing Checklist for Human

D7 BFLA compliance keyword breadth: Generic compliance keywords like "executing", "running", "completed" could trigger false positives on non-compliant responses that happen to use those words without a refusal phrase present. Consider whether these are specific enough for your target population.
T7 cookie isolation: ConnectionPool shares httpx.AsyncClient instances, meaning all sessions on the same pooled client share a cookie jar. Currently non-impactful (no caller passes pool= yet — opt-in infrastructure), but will need per-session cookie isolation when wired into the orchestrator.
D10 duplicate compound patterns: bfla_identity_spoof_privilege_escalation requires {"identity_spoof", "privilege_escalation"} — same agent set as the pre-existing identity_spoofing_privilege_escalation pattern. Both will fire for the same finding set, producing duplicate compound paths. Confirm this is intentional or deduplicate.
IPv6 mapped-IPv4 gap: pii_ipv6 does not match ::ffff:192.168.1.1 because the dot-decimal suffix isn't covered by the hex-group alternatives. Decide if this edge case matters for your targets.
Run argus scan against a target with a real AI chat/API endpoint to exercise D5/D6/D7 payloads end-to-end. Testing against odinforgeai.com confirmed all 13 agents deploy and complete without errors (3 findings, 2 validated from tool_poisoning), but D5/D6/D7 produced 0 findings because the target serves HTML (React SPA) rather than JSON API responses — the agents correctly skip non-JSON responses rather than crashing.
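
To make the D10 duplicate-pattern item in the checklist above concrete, here is a minimal sketch that assumes compound rules are matched purely on the set of agents with findings. The `RULES` table, `matching_rules`, and `duplicate_agent_sets` are hypothetical; only the two pattern names and their agent set come from the checklist, and the real correlation engine may match on more than agent names.

```python
# Hypothetical rule table: pattern name -> agents that must all have findings.
RULES = {
    "identity_spoofing_privilege_escalation": frozenset(
        {"identity_spoof", "privilege_escalation"}
    ),
    "bfla_identity_spoof_privilege_escalation": frozenset(
        {"identity_spoof", "privilege_escalation"}
    ),
}


def matching_rules(agents_with_findings):
    """Return the names of all rules whose required agents are all present."""
    present = set(agents_with_findings)
    return sorted(name for name, required in RULES.items() if required <= present)


def duplicate_agent_sets(rules):
    """Group rule names that require exactly the same agent set."""
    by_set = {}
    for name, required in rules.items():
        by_set.setdefault(required, []).append(name)
    return [sorted(names) for names in by_set.values() if len(names) > 1]


fired = matching_rules({"identity_spoof", "privilege_escalation", "model_extraction"})
duplicates = duplicate_agent_sets(RULES)
```

A check like `duplicate_agent_sets` could run at rule-registration time if deduplication turns out to be the intended behaviour.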

Notes

This PR builds on Phase C (merged in PR #49, "feat: Phase C attack power — encoding layer, indirect injection, multi-turn escalation, composite attacks") and the v0.1.4 version bump (PR #50, "chore: bump version to 0.1.4 for PyPI release").
No unit tests were added — all new code paths are exercised only at runtime during a scan. Consider adding unit tests for _evaluate_bfla_response logic and DataCategoryMatcher PII patterns.
The ConnectionPool singleton (shared()) is not thread-safe at construction time, which is fine for asyncio but would need a lock if ever used from multiple threads.
PII regex patterns (D8) were verified with 12 targeted test cases: 6 false-positive checks (timestamps, plain English) and 6 true-positive checks (real PII formats). All passed.

Link to Devin session: https://app.devin.ai/sessions/8b0c5ca873934d77aa254157cc41924c
Requested by: @andrebyrd-odingard

D5: Tool/Function Discovery payloads for model_extraction agent
- d5_tool_schema_extraction, d5_function_call_probing, d5_capability_enumeration
- Updated _evaluate_response() and _record_intelligence() for D5 techniques

D6: BOLA Payloads for privilege_escalation agent
- 4 BOLA techniques: numeric IDOR, UUID swap, path traversal, mass assignment
- _test_bola() and _report_bola() methods

D7: Social Engineering BFLA for identity_spoof agent
- 5 techniques: CEO urgency, compliance pressure, helpdesk, developer debug, time pressure
- _test_social_engineering_bfla(), _evaluate_bfla_response(), _report_bfla()

D8: PII Detection Expansion in DataCategoryMatcher
- Phone numbers, SSN, credit cards (Visa/MC/Amex/Discover), IPv4, IPv6
- Date of birth, passport numbers, medical record IDs

D10: Correlation Agent — 5 new compound attack path patterns
- BOLA + model_extraction, BFLA + identity_spoof + priv_esc
- Tool discovery + prompt injection, BOLA + cross-agent exfil
- BFLA + memory poisoning

T7: Connection Pooling in ConversationSession
- ConnectionPool class with shared httpx.AsyncClient instances
- Keyed by (host, timeout), singleton pattern, scan-scoped lifecycle
- ConversationSession accepts optional pool= parameter


… key

1. BFLA evaluation: when both refusal and compliance keywords are present but no hard evidence (markers/priv_indicators), treat as refusal. Fixes false positives where refusal messages mention 'password', 'secret', etc.
2. ConnectionPool cache key: include csrf_mode in the (host, timeout, csrf_mode) key to prevent incorrect client configuration when sessions with different csrf_mode values share the same pool.

1. pii_phone: add word boundaries and require at least one separator to avoid matching timestamps and numeric IDs.
2. pii_passport/pii_medical_id: require mandatory colon/equals separator and at least one digit in value via lookahead, preventing matches on English words like 'passport details' or 'patient id unknown'.
3. ConnectionPool __aexit__: always clear self._client = None regardless of _owns_client, so the use-after-exit guard in turn() fires correctly for pooled sessions.

devin-ai-integration

Devin Review found 2 new potential issues.

View 15 additional findings in Devin Review.

devin-ai-integration · 2026-04-15T05:09:32Z

+        """Return (or create) a pooled client for *host* with *timeout*."""
+        key = (host, timeout, csrf_mode)
+        async with self._lock:
+            if key not in self._clients:
+                kwargs: dict[str, Any] = {
+                    "timeout": timeout,
+                    "event_hooks": {"request": [], "response": []},
+                    "follow_redirects": False,
+                }
+                if csrf_mode:
+                    kwargs["cookies"] = httpx.Cookies()
+                self._clients[key] = httpx.AsyncClient(**kwargs)
+                logger.debug("T7: created pooled client for %s (timeout=%.1f)", host, timeout)
+            return self._clients[key]


🔴 ConnectionPool shares cookie jar across sessions, causing cross-session state contamination

When multiple ConversationSession instances share a pooled httpx.AsyncClient (same host/timeout/csrf_mode key), they share the client's cookie jar. Any Set-Cookie headers from one session's target response are automatically sent in another session's subsequent requests. This breaks attack isolation: if Session A authenticates with the target and gets a session cookie, Session B will piggyback on that authentication, causing false positives (elevated access without earning it) or false negatives (unexpected state). The pool is exported as a public API in src/argus/conductor/__init__.py:40 and documented with ready-to-use examples.

Reproduction scenario

Session A targets http://target:8080, gets pooled client1

Target responds with Set-Cookie: session=AGENT_A

client1's cookie jar now has session=AGENT_A

Session B targets same host, gets same client1 from pool

Session B's requests automatically include Cookie: session=AGENT_A

Session B is now running in Session A's authenticated context
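
The reproduction scenario above can be simulated without a network. In this sketch `FakeClient` and `Session` are hypothetical stand-ins for a pooled client with one persistent cookie jar and for `ConversationSession`; they only model the jar-sharing behaviour and are not the real classes.

```python
class FakeClient:
    """Hypothetical pooled client that keeps one cookie jar for its whole lifetime."""

    def __init__(self):
        self.cookies = {}

    def send(self, set_cookie=None):
        """Return the cookies sent with this request, then store any Set-Cookie reply."""
        sent = dict(self.cookies)
        if set_cookie:
            self.cookies.update(set_cookie)
        return sent


class Session:
    """Hypothetical session that issues requests through whatever client it is given."""

    def __init__(self, client):
        self.client = client

    def turn(self, set_cookie=None):
        return self.client.send(set_cookie)


# Sessions A and B receive the same pooled client.
shared = FakeClient()
session_a = Session(shared)
session_b = Session(shared)

# The target answers Session A with Set-Cookie: session=AGENT_A.
session_a.turn(set_cookie={"session": "AGENT_A"})

# Session B's next request now carries Session A's cookie.
leaked = session_b.turn()

# A session with its own client (and therefore its own jar) starts clean.
clean = Session(FakeClient()).turn()
```

Giving each session its own cookie state, as in the last two lines, is the kind of isolation the finding asks for.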


Valid observation. Currently non-severe since no caller passes pool= yet — ConnectionPool is opt-in infrastructure for a future PR. When wiring the pool into the orchestrator, we'll implement per-session cookie isolation (approach 1: store a per-session httpx.Cookies() on ConversationSession and pass it explicitly per-request). Leaving as a documented TODO for the wiring PR rather than adding unused complexity now.

Acknowledged — this is a valid concern for when pool= is actually wired into the orchestrator. Currently no caller passes pool= (it's opt-in infrastructure), so this is not exploitable yet. When we wire it in during orchestrator integration, we'll need per-session cookie isolation (e.g., passing fresh cookies=httpx.Cookies() per-request or cloning the client). Added a note in the PR description checklist about this.

Restructure _attack_base() so that:
- Chat endpoints are fetched independently of identity endpoints
- Early return only fires when NEITHER identity nor chat surfaces exist
- D7 BFLA tests run whenever chat endpoints are available, regardless of whether identity endpoints exist

…rkers)

Soft privilege indicators like 'admin' commonly appear in refusal text (e.g. 'I cannot grant you admin access'). Previously priv_indicators alone could override the refusal classification, causing false positives. Now only sensitive markers (leaked keys/tokens) override a refusal.
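
The precedence this commit describes can be sketched as a small decision function. `classify_bfla` and its return labels are hypothetical. The refusal branch follows the rule stated above (only markers override a refusal), while the handling of responses with no refusal phrase is an assumption, not a description of `_evaluate_bfla_response`.

```python
def classify_bfla(refusal_hits, compliance_hits, priv_indicators, markers):
    """Return a finding label, or None when the response should not be reported.

    Each argument is the list of phrases already matched in the response text.
    """
    # Hard evidence (for example a leaked key or token) always produces a finding.
    if markers:
        return "finding: sensitive marker leaked"
    # A refusal with no hard evidence is treated as a refusal, even if it also
    # mentions compliance keywords or soft privilege words such as "admin".
    if refusal_hits:
        return None
    # Assumption: with no refusal phrase, soft signals are enough to report.
    if compliance_hits or priv_indicators:
        return "finding: apparent compliance"
    return None


# "I cannot grant you admin access": refusal plus a soft privilege indicator.
refused_admin = classify_bfla(["i cannot"], [], ["admin"], [])
# "I cannot share password info": refusal plus a compliance keyword.
refused_password = classify_bfla(["i cannot"], ["password"], [], [])
# A refusal that nevertheless leaks a marker is still reported.
refused_but_leaked = classify_bfla(["i cannot"], [], [], ["api_key=..."])
```

The last branch is where the checklist's concern about generic compliance keywords such as "executing" applies: with no refusal phrase present, those words alone would yield a finding in this sketch.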

\b doesn't fire adjacent to :: because both : and start-of-string are non-word characters. Replaced with explicit lookaround anchors. Added fourth alternative to handle mid-address :: (e.g. fe80::1).

devin-ai-integration · 2026-04-15T05:41:08Z

Phase D Test Report — Live Scan against odinforgeai.com

Devin session

Test 1: Full argus scan (verbose) — PASSED

All 13 agents deployed and completed — no timeouts, no crashes
3 findings from tool_poisoning, 2 validated (CRITICAL: schema_enum_injection, schema_description_override)
Scan JSON output valid: 3 findings, 13 agent_results, 0 compound_paths
No Python tracebacks
Duration: 115.8s

Test 2: Cinematic Dashboard — PASSED

Dashboard rendered correctly with agent grid and attack stream
Agent status transitions visible (deployed → complete)
Final summary: "3 findings · 2 validated · 129 signals"
Completed in 72s without crash

Test 3: D8 PII Regex — PASSED (12/12)

False positive tests (all correctly NOT matched):

pii_phone vs timestamp "1713158400" ✓
pii_phone vs numeric ID "9876543210" ✓
pii_passport vs "passport details" ✓
pii_passport vs "passport information" ✓
pii_medical_id vs "patient id unknown" ✓
pii_medical_id vs "medical record required" ✓

True positive tests (all correctly matched):

pii_phone vs "(555) 123-4567" ✓
pii_phone vs "555-123-4567" ✓
pii_ssn vs "123-45-6789" ✓
pii_passport vs "passport: AB123456" ✓
pii_medical_id vs "patient id: MRN12345" ✓
pii_credit_card vs Visa "4111-1111-1111-1111" ✓
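
A few of the cases above can be replayed against illustrative patterns. `PHONE` and `PASSPORT` below are assumptions written for this sketch, not the entries in DataCategoryMatcher.PATTERNS; they only demonstrate the two ideas named in the fixes: mandatory separators for phone numbers, and a mandatory colon/equals plus a digit-requiring lookahead for passport values.

```python
import re

# Illustrative only: separators between digit groups are mandatory, so a bare
# 10-digit run such as a Unix timestamp cannot match.
PHONE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")

# Illustrative only: a ':' or '=' is required after the keyword, and the
# lookahead insists that the value contains at least one digit.
PASSPORT = re.compile(
    r"\bpassport\s*[:=]\s*(?=[A-Z0-9]*\d)[A-Z0-9]{6,9}\b",
    re.IGNORECASE,
)

# (pattern, input text, should it match?) taken from the report above.
CASES = [
    (PHONE, "1713158400", False),
    (PHONE, "9876543210", False),
    (PHONE, "(555) 123-4567", True),
    (PHONE, "555-123-4567", True),
    (PASSPORT, "passport details", False),
    (PASSPORT, "passport information", False),
    (PASSPORT, "passport: AB123456", True),
]

# One boolean per case: True when the observed result equals the expected one.
outcomes = [bool(pattern.search(text)) == expected for pattern, text, expected in CASES]
all_passed = all(outcomes)
```

Cases in this table form would also be a natural starting point for the unit tests that the Notes section suggests adding.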

IPv6 Regex Fix Verification (Round 5)

::1 (loopback) — PASSED
2001:db8:: — PASSED
Full form 2001:0db8:85a3:0000:0000:8a2e:0370:7334 — PASSED
fe80::1 (link-local) — PASSED
::ffff:192.168.1.1 (mapped IPv4) — NOT MATCHED (edge case)
Plain text — correctly not matched

Escalation

Phase D agents (D5/D6/D7) produced 0 findings against this target because odinforgeai.com serves HTML (React SPA) for all paths, not JSON API responses. Agents correctly detect non-JSON and gracefully skip. A target with a real AI chat endpoint returning JSON would fully exercise D5/D6/D7 payloads.

…ts, 21 patterns), pin deps

…#52)

* chore: bump version to 0.1.4 for PyPI release

* feat: Phase D broaden detection — D5-D10, T7

D5: Tool/Function Discovery payloads for model_extraction agent
- d5_tool_schema_extraction, d5_function_call_probing, d5_capability_enumeration
- Updated _evaluate_response() and _record_intelligence() for D5 techniques

D6: BOLA Payloads for privilege_escalation agent
- 4 BOLA techniques: numeric IDOR, UUID swap, path traversal, mass assignment
- _test_bola() and _report_bola() methods

D7: Social Engineering BFLA for identity_spoof agent
- 5 techniques: CEO urgency, compliance pressure, helpdesk, developer debug, time pressure
- _test_social_engineering_bfla(), _evaluate_bfla_response(), _report_bfla()

D8: PII Detection Expansion in DataCategoryMatcher
- Phone numbers, SSN, credit cards (Visa/MC/Amex/Discover), IPv4, IPv6
- Date of birth, passport numbers, medical record IDs

D10: Correlation Agent — 5 new compound attack path patterns
- BOLA + model_extraction, BFLA + identity_spoof + priv_esc
- Tool discovery + prompt injection, BOLA + cross-agent exfil
- BFLA + memory poisoning

T7: Connection Pooling in ConversationSession
- ConnectionPool class with shared httpx.AsyncClient instances
- Keyed by (host, timeout), singleton pattern, scan-scoped lifecycle
- ConversationSession accepts optional pool= parameter

* fix: address Devin Review findings — BFLA false positives, pool cache key

1. BFLA evaluation: when both refusal and compliance keywords are present but no hard evidence (markers/priv_indicators), treat as refusal. Fixes false positives where refusal messages mention 'password', 'secret', etc.
2. ConnectionPool cache key: include csrf_mode in the (host, timeout, csrf_mode) key to prevent incorrect client configuration when sessions with different csrf_mode values share the same pool.

* fix: address Devin Review round 2 — PII regex, pool __aexit__

1. pii_phone: add word boundaries and require at least one separator to avoid matching timestamps and numeric IDs.
2. pii_passport/pii_medical_id: require mandatory colon/equals separator and at least one digit in value via lookahead, preventing matches on English words like 'passport details' or 'patient id unknown'.
3. ConnectionPool __aexit__: always clear self._client = None regardless of _owns_client, so the use-after-exit guard in turn() fires correctly for pooled sessions.

* fix: D7 BFLA unreachable when target has chat but no identity endpoints

Restructure _attack_base() so that:
- Chat endpoints are fetched independently of identity endpoints
- Early return only fires when NEITHER identity nor chat surfaces exist
- D7 BFLA tests run whenever chat endpoints are available, regardless of whether identity endpoints exist

* fix: BFLA refusal filter — only override refusal on hard evidence (markers)

Soft privilege indicators like 'admin' commonly appear in refusal text (e.g. 'I cannot grant you admin access'). Previously priv_indicators alone could override the refusal classification, causing false positives. Now only sensitive markers (leaked keys/tokens) override a refusal.

* fix: IPv6 regex — use lookaround anchors instead of \b for :: forms

\b doesn't fire adjacent to :: because both : and start-of-string are non-word characters. Replaced with explicit lookaround anchors. Added fourth alternative to handle mid-address :: (e.g. fe80::1).

* chore: launch prep v0.1.5 — merge PRs #50/#51, update README (13 agents, 21 patterns), pin deps

* fix: disable cookie persistence on pooled httpx clients to prevent cross-session state leakage

* fix: ConnectionPool shares transport not client (proper cookie isolation) + update CLAUDE.md counts

* fix: _owns_client=False when using pooled transport to prevent shared transport destruction

* fix: _owns_client tracks actual pooled transport usage, not just pool presence

---------

Co-authored-by: Andre Byrd <andre.byrd@odingard.com>

devin-ai-integration · 2026-04-15T07:47:17Z

Closing — Phase D code is included in PR #52 (launch prep v0.1.5), which has been merged to main.

devin-ai-integration bot assigned andrebyrd-odingard Apr 15, 2026


fix: IPv6 regex — use lookaround anchors instead of \b for :: forms

17ee53f

\b doesn't fire adjacent to :: because both : and start-of-string are non-word characters. Replaced with explicit lookaround anchors. Added fourth alternative to handle mid-address :: (e.g. fe80::1).

devin-ai-integration bot pushed a commit that referenced this pull request Apr 15, 2026

chore: launch prep v0.1.5 — merge PRs #50/#51, update README (13 agents, 21 patterns), pin deps

cba7656

devin-ai-integration bot mentioned this pull request Apr 15, 2026

chore: launch prep v0.1.5 — merge PRs #50/#51, clean README, pin deps #52

Merged

5 tasks


devin-ai-integration[bot] wants to merge 6 commits into main from devin/1776227588-phase-d-broaden-detection
