Phase
Phase 0 — Foundations | Track 0.2 — Test Infrastructure | Priority: P1
Summary
Create tests/adversarial/ for prompt injection, fuzzing, and abuse scenario tests.
What
- Create tests/adversarial/__init__.py
- Create tests/adversarial/conftest.py with:
  - injection_payloads: fixture loading prompt injection patterns from YAML/JSON
  - mock_llm_with_injection: simulates an LLM that returns injected content
  - attack_scenario: parameterized fixture for multi-step attack chains
- Create tests/adversarial/payloads/ directory with:
  - prompt_injection.yaml: 50+ injection patterns
  - indirect_injection.yaml: web content injection patterns
  - resource_exhaustion.yaml: DoS attack patterns
- Integrate with hypothesis for property-based/fuzz testing
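The helpers behind injection_payloads and mock_llm_with_injection could be sketched roughly as below. This is a minimal illustration, not the project's implementation: the Payload field names, the load_payloads helper, the MockLLMWithInjection class, and its complete method are all assumptions made for the example. It reads JSON so that it runs on the standard library alone; a YAML payload file would be parsed with a YAML library (for example PyYAML's yaml.safe_load) instead of json.loads.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Payload:
    """One adversarial input. Field names are assumed for illustration."""
    id: str
    category: str
    text: str


def load_payloads(path: Path) -> list[Payload]:
    """Load payloads from a JSON file holding a list of objects."""
    entries = json.loads(path.read_text(encoding="utf-8"))
    return [Payload(**entry) for entry in entries]


class MockLLMWithInjection:
    """Hypothetical stand-in for an LLM client whose reply carries a payload."""

    def __init__(self, payload: Payload) -> None:
        self.payload = payload

    def complete(self, prompt: str) -> str:
        # Simulate a model that repeats attacker-controlled text in its output.
        return f"Here is the summary you asked for. {self.payload.text}"
```

In conftest.py these helpers would typically be exposed through functions decorated with @pytest.fixture, so that a test can request injection_payloads or mock_llm_with_injection by name.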
Why
Adversarial testing is how you prove guardrails actually work. A curated payload library and testing framework means we can run red-team tests on every change.
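As one illustration of a check cheap enough to run on every change, a property-based test with hypothesis might look like the sketch below. The wrap_untrusted guardrail and its delimiters are hypothetical and exist only to give the test something to verify; the real guardrails under test would take their place.

```python
from hypothesis import given, strategies as st

# Hypothetical delimiters marking untrusted content inside a prompt.
OPEN, CLOSE = "<untrusted>", "</untrusted>"


def wrap_untrusted(text: str) -> str:
    """Toy guardrail: escape '<' so content cannot close the wrapper early."""
    escaped = text.replace("<", "&lt;")
    return f"{OPEN}{escaped}{CLOSE}"


@given(prefix=st.text(), suffix=st.text())
def test_wrapper_cannot_be_escaped(prefix: str, suffix: str) -> None:
    # Bias every generated input toward the attack: embed a closing delimiter.
    hostile = prefix + CLOSE + suffix
    wrapped = wrap_untrusted(hostile)
    # Property: exactly one opening and one closing delimiter survive.
    assert wrapped.count(OPEN) == 1
    assert wrapped.count(CLOSE) == 1
```

Rather than listing inputs by hand, the test states an invariant and lets hypothesis search for a counterexample, which complements the curated payload files.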
Acceptance Criteria
- tests/adversarial/ directory exists with framework
- hypothesis added to dev dependencies
References
docs/plans/2026-03-29-security-ai-guardrails-performance-design.md
Blocks
All Phase 4.3 and 4.4 issues (adversarial and fuzzing tests)