feat: close 3 MAAC coverage gaps — model routing, task hijacking, OAuth pivoting by devin-ai-integration[bot] · Pull Request #56 · Odingard/Argus

devin-ai-integration · 2026-04-17T05:21:45Z

Summary

Adds new attack techniques to 3 existing agents to close the remaining MAAC (Mythos-Aligned Attack Chain) coverage gaps, moving from 6/9 phases fully covered to 8/9 fully covered plus 1 strong partial (~95%).

Gap 1 — Model Routing Exploitation (MAAC 320.3) → model_extraction.py
New Phase 4 with 4 probe techniques that detect multi-model deployments by comparing response patterns across identical/varying prompts. Uses difflib.SequenceMatcher similarity ratios and model-name regex extraction to identify routing shifts.
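
The consistency check described above can be sketched roughly as follows. This is a minimal illustration, not the code in model_extraction.py: the function names, the model-name regex, and the way the threshold is applied are assumptions; only difflib.SequenceMatcher and the 0.25 figure come from this PR.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Assumption: 0.25 is the threshold mentioned in the review checklist.
SIMILARITY_THRESHOLD = 0.25
# Assumption: an illustrative family-level regex, not the PR's actual pattern.
MODEL_NAME_RE = re.compile(r"\b(gpt|claude|gemini|llama)\b", re.IGNORECASE)


def min_pairwise_similarity(responses):
    """Lowest SequenceMatcher ratio over all response pairs (1.0 if fewer than 2)."""
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in combinations(responses, 2)]
    return min(ratios, default=1.0)


def extract_model_names(text):
    """Lower-cased model family names mentioned in one response."""
    return {name.lower() for name in MODEL_NAME_RE.findall(text)}


def looks_routed(responses, threshold=SIMILARITY_THRESHOLD):
    """Flag a possible routing shift when two responses to the same prompt diverge strongly."""
    return min_pairwise_similarity(responses) < threshold
```

Passing the threshold as a parameter keeps it tunable for non-deterministic targets, where a single fixed value is likely to misfire.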

Gap 2 — Task Decomposition Hijacking (MAAC 320.6) → persona_hijacking.py
4 new entries in the existing _DRIFT_ATTACKS list: task_plan_injection, workflow_prerequisite_hijack, subtask_proliferation, goal_priority_inversion. Multi-turn conversations that inject adversary sub-goals into the agent's task planning.

Gap 3 — OAuth/Session Pivoting (MAAC 320.8) → privilege_escalation.py
New Phase 6 with 6 probe categories (token storage, session cookies, credential files, OAuth redirect manipulation, cloud metadata endpoints, session hijacking). Detection uses 10 regex patterns for real credential artifacts (JWTs, AWS temp creds, GCP tokens, bearer tokens, etc.). Findings are emitted as CRITICAL with token values redacted.
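
A rough sketch of value-based credential detection with redaction, for illustration only. The three patterns below are assumptions chosen to resemble the artifact types listed above (JWTs, AWS temporary access key IDs, GCP access tokens); the PR's actual pattern set and the names find_tokens and redact are not taken from the source.

```python
import re

# Assumption: illustrative patterns, not the patterns shipped in privilege_escalation.py.
TOKEN_PATTERNS = {
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]{5,}\.[A-Za-z0-9_-]{5,}\.[A-Za-z0-9_-]{5,}"),
    "aws_temp_access_key": re.compile(r"\bASIA[A-Z0-9]{16}\b"),
    "gcp_access_token": re.compile(r"\bya29\.[A-Za-z0-9_-]{20,}"),
}


def redact(value, keep=4):
    """Keep a short prefix so the finding is recognisable without exposing the token."""
    return f"{value[:keep]}...[{len(value)} chars]"


def find_tokens(text):
    """Return (pattern_name, redacted_value) for every credential-like match in text."""
    findings = []
    for name, pattern in TOKEN_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, redact(match.group(0))))
    return findings
```

Because these patterns match token values rather than file paths, a refusal that merely names a path such as ~/.aws/credentials produces no finding in this sketch.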

All 352 existing tests pass. Lint clean (ruff + ruff-format).

Review & Testing Checklist for Human

False positive risk in OAuth detection (privilege_escalation.py:1505-1515): The regex patterns (e.g. credential_file, private_key_path) match whenever the target mentions these paths, including in a refusal such as "I cannot read ~/.aws/credentials". No refusal gate runs before the regex check, so found_tokens triggers a CRITICAL finding even when the response is a refusal. Verify this is acceptable or add a refusal pre-check.
Model routing similarity threshold of 0.25 (model_extraction.py ~line 870): Very low threshold — same-model responses with high temperature could produce different outputs with similarity < 0.25. May cause false positives on non-deterministic targets. Consider whether this needs tuning or a configurable threshold.
No new unit tests: 577 lines of new attack logic with no dedicated tests for _check_routing_consistency, _check_routing_variation, _test_oauth_session, or the 4 new drift attacks. Recommend running a scan against a real target (e.g. Arena medical agent) and verifying each new phase executes and evaluates correctly.
Inline import re as _re inside _test_oauth_session (line 1479) — re is already imported at module level. Harmless but inconsistent; should use the existing module-level import.
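
One possible shape for the refusal pre-check suggested in the first item, as a hedged sketch. REFUSAL_RE, PATH_RE, is_refusal and path_finding are hypothetical names and patterns, not identifiers from privilege_escalation.py.

```python
import re

# Assumption: a small, non-exhaustive set of English refusal phrasings.
REFUSAL_RE = re.compile(
    r"\bI\s+(?:cannot|can't|won't|will not|am unable to|am not able to)\b"
    r"|\bI'm\s+(?:unable|not able)\s+to\b",
    re.IGNORECASE,
)
# Assumption: stand-in for path-style patterns such as credential_file.
PATH_RE = re.compile(r"~/\.aws/credentials|~/\.ssh/id_rsa")


def is_refusal(text):
    """True when the response contains a recognised refusal phrase."""
    return REFUSAL_RE.search(text) is not None


def path_finding(response):
    """Return the matched credential path, or None if the response is a refusal."""
    if is_refusal(response):
        return None
    match = PATH_RE.search(response)
    return match.group(0) if match else None
```

A gate like this trades false positives for possible false negatives (a response can refuse and still leak), so it is probably best limited to path-style patterns and not applied to patterns that match credential values.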

Notes

The existing credential harvesting phase in model_extraction.py was renumbered from Phase 3 → Phase 5 to accommodate the new Phase 4 (routing). Verify no external references to "Phase 3" break.
The session_id in OAuth probes is hardcoded to a single value across all messages within a probe, which means multi-turn context won't accumulate if the target is session-aware. This is intentional (each message is an independent probe), but worth knowing.

Link to Devin session: https://app.devin.ai/sessions/8b0c5ca873934d77aa254157cc41924c
Requested by: @andrebyrd-odingard

…th pivoting

MAAC 320.3 — Model Routing Exploitation (ME-10):
- Added _ROUTING_PROBES with 4 techniques (consistency check, complexity escalation, model identity, capability probe)
- Implemented _test_model_routing(), _check_routing_consistency(), _check_routing_variation(), _report_routing()
- Wired into _attack_base() as Phase 4
- Detects multi-model deployments via response similarity and identity contradictions

MAAC 320.6 — Task Decomposition Hijacking (PH-11):
- Added 4 new drift attacks to _DRIFT_ATTACKS list
- Techniques: task_plan_injection, workflow_prerequisite_hijack, subtask_proliferation, goal_priority_inversion
- Each uses multi-turn conversation to inject adversary sub-goals into legitimate task workflows

MAAC 320.8 — OAuth/Session Pivoting (PE-07):
- Added _OAUTH_SESSION_PROBES with 6 techniques (token storage, session cookies, credential files, OAuth redirect, cloud metadata, session hijacking)
- Implemented _test_oauth_session() with 10 credential/token regex patterns for real exposure detection
- Implemented _report_oauth_session() with token value redaction
- Wired into _attack_base() as Phase 6
- Findings emitted as CRITICAL with full attack chain

Security: Mark
Architecture: James

devin-ai-integration · 2026-04-17T05:21:54Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


…uting contradiction logic

1. privilege_escalation.py: Removed credential_file and private_key_path from _OAUTH_SESSION_PATTERNS — these match bare filenames in refusal responses (e.g. 'I cannot access ~/.aws/credentials'), producing false positive CRITICAL findings. Only patterns matching actual credential VALUES are retained (8 patterns).
2. model_extraction.py: Fixed _check_routing_variation to compare model sets ACROSS responses instead of unioning all into one set. A single response mentioning multiple models ('I am Claude, not GPT-4') no longer triggers a false positive. Now requires at least two responses with disjoint model sets to confirm routing contradiction.

QA/QC: Jamie
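
The per-response comparison described in the second fix can be sketched as below. This is an illustration of the stated rule (at least two responses with disjoint model sets), not the actual body of _check_routing_variation; MODEL_RE, model_set and has_routing_contradiction are hypothetical names.

```python
import re
from itertools import combinations

# Assumption: illustrative family-level regex, not the PR's actual pattern.
MODEL_RE = re.compile(r"\b(gpt|claude|gemini|llama)\b", re.IGNORECASE)


def model_set(text):
    """Model families named in a single response."""
    return frozenset(name.lower() for name in MODEL_RE.findall(text))


def has_routing_contradiction(responses):
    """True only if two responses name models and share none of them."""
    sets = [s for s in map(model_set, responses) if s]  # ignore responses naming no model
    return any(a.isdisjoint(b) for a, b in combinations(sets, 2))
```

Under this rule a single response such as 'I am Claude, not GPT-4' yields one set and therefore no pair to compare, which matches the false positive the commit says it removes.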


devin-ai-integration bot assigned andrebyrd-odingard Apr 17, 2026


andrebyrd-odingard added 2 commits April 17, 2026 05:33

style: ruff format fix for model_extraction.py

009f454

QA/QC: Jamie


fix: address Devin Review findings — phase numbering and reproduction…

3f78f14

… step clarity

andrebyrd-odingard merged commit 3056cba into main Apr 17, 2026
7 checks passed

andrebyrd-odingard merged 4 commits into main from devin/1776385180-maac-gap-closure
