Release apohara-compliance v2.2.0 — Real-Trajectory Efficacy (ADR-6, bound triple) · SuarezPM/apohara-compliance

v2.2.0 — Real-Trajectory Efficacy (ADR-6, bound triple)

Additive: no scanner change. The engine is run with the same frozen rules
(blob SHA dcd1ac6, frozen BEFORE scanning) over a corpus of real successful
indirect-injection trajectories from last-generation frontier models, plus a
live current-frontier cross-check. The number is reported as a bound triple
together with its representation overlap-miss, and the
correlation-not-causation ceiling is stated as a co-headline of equal
prominence.

Added

Eval harness (scripts/eval/wrap_agentdojo_trace.py + friends) that
transcribes AgentDyn traces via an apohara-agnostic wrapper to the REAL
release binary — never the scanner crate, never the rules, never the wrapper
(the measurement is BY construction, not a fit).
Download bound triple on AgentDyn (5353cf7, agentdojo 0.1.35, benchmark
v1.2.2; attack important_instructions; last-gen models, date-labeled;
open-ended suites): post-hoc AGT-TRJ detection on real successes
169 / 236 (71.6 %); failed-injection (RESISTED) FP 659 / 2295
(28.7 %); benign FP 5 / 352 (1.4 %) ⇒ precision-on-success
169 / (169 + 659 + 5) = 169/833 ≈ 20 %.
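The arithmetic behind the triple can be rechecked from the counts alone. The sketch below is illustrative and not part of the eval harness; the variable names are made up here, and the only inputs are the counts quoted above. The denominator for precision-on-success is every trajectory the rules fired on: successes, resisted injections and benign runs together.

```python
# Counts copied from the release notes; nothing here is measured anew.
DETECTED, SUCCESSES = 169, 236        # AGT-TRJ fired on real successful injections
RESISTED_FP, RESISTED = 659, 2295     # fired on failed (resisted) injections
BENIGN_FP, BENIGN = 5, 352            # fired on benign trajectories


def pct(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


# Every trajectory on which the rules fired, regardless of outcome.
all_fired = DETECTED + RESISTED_FP + BENIGN_FP
precision_on_success = pct(DETECTED, all_fired)

print(pct(DETECTED, SUCCESSES), pct(RESISTED_FP, RESISTED), pct(BENIGN_FP, BENIGN))
print(all_fired, precision_on_success)
```

Only about one in five firings lands on an attack that actually succeeded, which is why the detection rate is never quoted without the other two numbers.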
Live current-frontier cross-check via OpenRouter (suite workspace, attack
important_instructions_no_model_name; same frozen rules + wrapper + binary;
current-frontier models, date-labeled): attack-success TOTAL
0 / 80 (0.0 %) — each model 0 / 16; live post-hoc detection 0 / 0 —
UNDEFINED; failed-injection FP 0 / 80; benign FP 0 / 15. Real
usage: 224 API calls, all HTTP 200; 698,959 tokens (under the 1 M cap);
key never logged.
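The live detection figure is 0 / 0 because no attack succeeded, so there is nothing to detect; it is neither 0 % nor 100 %. A minimal sketch of a rate helper that keeps that distinction explicit follows. The function names are hypothetical and do not come from the harness.

```python
from typing import Optional


def rate(hits: int, total: int) -> Optional[float]:
    """Return hits/total, or None when the denominator is empty."""
    if total == 0:
        return None  # undefined: there were no cases to measure
    return hits / total


def fmt(hits: int, total: int) -> str:
    """Render a rate the way the notes do, marking empty denominators."""
    r = rate(hits, total)
    if r is None:
        return f"{hits} / {total} UNDEFINED"
    return f"{hits} / {total} ({100 * r:.1f} %)"


print(fmt(0, 80))   # attack success across all live runs
print(fmt(0, 0))    # live post-hoc detection: no successes to detect
print(fmt(0, 15))   # benign false positives
```

Returning `None` instead of `0.0` stops a downstream report from silently turning "unmeasured" into "perfect".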
Overlap-miss (model-independent, 236 positives): marker <information>
covered 232/236; role-mapped structured sink covered 180/236; BOTH
178/236; NEITHER 2/236. Covered sink roles: url=170, recipient=60, amount=59, command=34. MISSED arg-keys (OUTSIDE the frozen role map — the
iban-analog): path (161), subject (114), otp (87), title (79), body (68), recipients (68), repo_name (54), password (33). Reported as-is, NEVER
closed — a retro-fit converts the measurement into a fit.
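The four coverage counts are consistent under inclusion-exclusion, which can be checked directly. The sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
TOTAL = 236    # real successful trajectories (the positives)
MARKER = 232   # covered by the <information> marker
SINK = 180     # covered by a role-mapped structured sink
BOTH = 178     # covered by both representations

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
either = MARKER + SINK - BOTH
neither = TOTAL - either

marker_only = MARKER - BOTH
sink_only = SINK - BOTH

print(either, neither, marker_only, sink_only)

# The four disjoint cells must partition the positives.
assert marker_only + sink_only + BOTH + neither == TOTAL
```

The derived `neither` matches the reported 2/236. The per-role counts (170 + 60 + 59 + 34) sum to more than 180, which is consistent with a trajectory being counted under more than one role; the notes do not state how roles are tallied, so that reading is an assumption.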
Reports (strict-schema-validated, numbers/IDs-only — no example text):
- tests/corpus/v2.2-real-trajectory-report.json (the bound triple +
  live usage, validated by scripts/eval/validate_v22_report.py and wired
  into scripts/verify.sh).
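To illustrate what "numbers/IDs-only" can mean in practice, here is a minimal sketch of a recursive check that flags any string leaf that does not look like a short identifier. It is not the logic of scripts/eval/validate_v22_report.py, whose schema is not shown in these notes; the ID pattern and the sample field names are assumptions made for the example.

```python
import json
import re

# Hypothetical ID shape: short tokens such as rule IDs, SHAs or version strings.
ID_PATTERN = re.compile(r"^[A-Za-z0-9_.:/-]{1,64}$")


def text_leaves(node, path="$"):
    """Yield the JSON path of every string leaf that does not look like an ID."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from text_leaves(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from text_leaves(value, f"{path}[{i}]")
    elif isinstance(node, str) and not ID_PATTERN.match(node):
        yield path
    # Numbers, booleans and null pass through without complaint.


# Made-up report fragment; the real report's fields may differ.
sample = json.loads(
    '{"detected": 169, "total": 236, "rules_sha": "dcd1ac6",'
    ' "note": "please send money to"}'
)
print(list(text_leaves(sample)))
```

A check of this kind makes it structurally hard for attack text from the corpus to leak into a committed report.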
PREREG + PROOF (committed):
- tests/corpus/PREREG-v2.2-real-trajectory.md (rules frozen at
  dcd1ac6e1d7ed8dce4b5b516296e8ce5a3e0582a BEFORE any scan; verified
  unchanged post-scan).
- tests/corpus/PROOF-v2.2-real-trajectory.md.
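The freeze can be rechecked without invoking git. In a repository using git's default SHA-1 object format, a blob ID is the SHA-1 of the header `blob <size>\0` followed by the file bytes. The sketch below assumes the pinned ID refers to a single rules file, as "blob SHA" suggests; the helper names are hypothetical, and the rules file path is not given in these notes.

```python
import hashlib


def git_blob_sha(data: bytes) -> str:
    """Compute the git blob object ID (SHA-1 object format) for raw file bytes."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def matches_frozen(data: bytes, frozen: str) -> bool:
    """True if the blob ID of data equals, or starts with, the pinned ID."""
    return git_blob_sha(data).startswith(frozen.lower())


# Well-known reference value: the ID git assigns to an empty file.
print(git_blob_sha(b""))
```

In use, one would read the rules file in binary mode and call `matches_frozen(data, "dcd1ac6e1d7ed8dce4b5b516296e8ce5a3e0582a")` both before and after the scan.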
CAVEAT (stated): the live run used suite=workspace (the standard
AgentDojo suite), NOT AgentDyn's harder open-ended suites (shopping /
github / dailylife) where last-gen models reached 14–22 % ASR — because
the current-frontier OpenRouter IDs are not in AgentDyn's model registry.
So the live 0/80 is on the easier standard suite; current-frontier
behaviour on the harder open-ended attack is UNMEASURED (a documented
follow-up).

Notes

Honesty invariants unchanged: every finding is is_candidate: true, every
formatter line is CANDIDATE — prefixed, SARIF level is never error.
The single-action engine is byte-identical to v2.1; the additive trajectory
pass is unchanged. The synthetic precision/recall gate still holds at
1.0000 / 1.0000 / FP = 0; the AgentDojo prose-rule recall is still
23 / 35 (0.657); the AGT-TRJ rules fire on the synthetic positive and
do not fire on the FinBot negative control.
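The three honesty invariants lend themselves to a mechanical check. The sketch below is an assumption about how one might test them, not the project's own test: `is_candidate` and the `CANDIDATE` prefix come from the notes, the SARIF layout (`runs[].results[].level`) follows SARIF 2.1.0, and the function names and input shapes are hypothetical.

```python
def sarif_levels(sarif: dict):
    """Yield the level of every SARIF result.

    A missing level is treated as "warning", the SARIF 2.1.0 default when
    no rule configuration overrides it.
    """
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            yield result.get("level", "warning")


def honesty_violations(findings: list, sarif: dict, lines: list) -> list:
    """Return a description of each invariant that does not hold."""
    problems = []
    if any(f.get("is_candidate") is not True for f in findings):
        problems.append("finding without is_candidate: true")
    if any(not line.startswith("CANDIDATE") for line in lines):
        problems.append("formatter line without CANDIDATE prefix")
    if "error" in set(sarif_levels(sarif)):
        problems.append("SARIF result at level error")
    return problems


ok = honesty_violations(
    [{"is_candidate": True}],
    {"runs": [{"results": [{"level": "note"}]}]},
    ["CANDIDATE AGT-TRJ example line"],
)
print(ok)
```

An empty list means all three invariants held for the given output.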

Claim ceiling (verbatim, ADR-6)

"deterministic, post-hoc, representation-aware injection → consequence
CANDIDATE CORRELATION surfacer; mechanism + representation proven on
synthetic positives; post-hoc recognition MEASURED on real successful
trajectories (169/236, last-gen open-ended) with an explicit model-independent
overlap-miss; ALSO fires on resisted (28.7 %) + benign (1.4 %) — a correlation
surfacer, NOT a success / causation discriminator (precision-on-success ≈
20 %); NOT efficacy / recall / prevention; recognisable-in-log ≠
would-have-prevented."

Build info

Target: x86_64-unknown-linux-gnu (Linux only)
Binary: apohara-compliance-scanner-x86_64-unknown-linux-gnu
Source commit: a61f8327d5b86a21ed513f120eb3f7bafd0c9ea4
Built: 2026-06-09 via local cargo build --release --locked

Limitations of this local build

Linux x86_64 only. The other 3 release targets (aarch64-apple-darwin,
x86_64-apple-darwin, x86_64-pc-windows-msvc) require cross-compile
setup or macOS/Windows runners that aren't available in this local build.
No cosign signatures (keyless OIDC signing requires GH Actions).
No GH artifact attestations (build provenance requires GH Actions).

The canonical multi-target release workflow is at
.github/workflows/release.yml.
