v0.8.3 — Stainless SDK provenance classifier + corpus per-category coverage (HarnessAudit-Bench taxonomy)
Tuesday cut on top of v0.8.2. Minor bump — one new public classifier (classify_sdk_lineage), one corpus schema extension (per-category counts). No breaking changes.
Honest framing (called out before the code)
The 2026-05-19 Product Improvements doc proposed both items as automatic runtime surfaces. The architecture didn't match — agent-airlock's @Airlock decorator wraps a Python tool function; it doesn't intercept outbound HTTP (decorator-in-process model). And HarnessAudit-Bench's artifacts aren't public yet, so building a scorer against them would overclaim. Honest reframes shipped:
| Doc proposal | What v0.8.3 ships |
|---|---|
| Runtime probe for Stainless markers in MCP server headers | Pure-function classify_sdk_lineage operators call from their own audit hooks |
| harness-audit subcommand scoring against HarnessAudit-Bench | Corpus schema extension adopting the paper's two-category taxonomy (resource_access / info_transfer); operators load benchmark artifacts when published via existing airlock corpus-bench |
Both reframes preserve operator intent without misrepresenting the architecture or overclaiming benchmark compatibility.
ADD-1 — classify_sdk_lineage (Stainless visibility classifier)
from agent_airlock import SDKLineage, classify_sdk_lineage
match = classify_sdk_lineage(
    user_agent=response.headers.get("User-Agent"),
    response_body_head=response.text[:4096],
)
if match.lineage == SDKLineage.STAINLESS:
    audit_event["sdk_lineage"] = "stainless"
    audit_event["sdk_lineage_match_source"] = match.match_source

- Default UA markers: stainless, stainless-sdk, stainless-node, stainless-python
- Default body banners: auto-generated by Stainless, Generated by Stainless, @stainless-generated, stainless-codegen
- Operator overrides: extra_ua_patterns, extra_body_markers
- Scan window: first 4 KB of response body
- Factory: policy_presets.stainless_provenance_probe_defaults() → default_action="tag_only" (visibility, NOT enforcement)
- Anchor: Anthropic acquires Stainless (2026-05-13), hosted SDK generator winding down
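
To make the matching rules above concrete, here is a minimal, self-contained sketch of how a pure-function classifier of this kind could work. It is not agent-airlock's implementation: sketch_classify, SketchMatch, the string lineage values, and the case-insensitive substring matching are all assumptions for illustration. Only the marker lists, the two override parameter names, and the 4 KB scan window come from the notes above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Marker lists copied from the release notes; the matching strategy is assumed.
DEFAULT_UA_MARKERS = ("stainless", "stainless-sdk", "stainless-node", "stainless-python")
DEFAULT_BODY_MARKERS = (
    "auto-generated by Stainless",
    "Generated by Stainless",
    "@stainless-generated",
    "stainless-codegen",
)
SCAN_WINDOW = 4096  # "first 4 KB", approximated here as 4096 characters


@dataclass(frozen=True)
class SketchMatch:
    """Hypothetical result type; the real return type may differ."""
    lineage: str                 # "stainless" or "unknown"
    match_source: Optional[str]  # "user_agent", "body", or None


def sketch_classify(
    user_agent: Optional[str],
    response_body_head: Optional[str],
    extra_ua_patterns: Sequence[str] = (),
    extra_body_markers: Sequence[str] = (),
) -> SketchMatch:
    """Pure function: no I/O, and the same inputs always give the same result."""
    ua = (user_agent or "").lower()
    for marker in (*DEFAULT_UA_MARKERS, *extra_ua_patterns):
        # Skip empty patterns so an empty override cannot match everything.
        if marker and marker.lower() in ua:
            return SketchMatch("stainless", "user_agent")

    # Only the head of the body is scanned; later content is ignored.
    body = (response_body_head or "")[:SCAN_WINDOW].lower()
    for marker in (*DEFAULT_BODY_MARKERS, *extra_body_markers):
        if marker and marker.lower() in body:
            return SketchMatch("stainless", "body")

    return SketchMatch("unknown", None)
```

Because the result is only a tag, a caller would record it on its own audit event rather than block on it, which is consistent with the tag_only default listed above.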
ADD-2 — Corpus per-category coverage (CategoryCount)
The v0.8.2 corpus regression now supports an optional violation_category field per entry, and MetisInspiredCorpusBlockRateDecision exposes category_counts: tuple[CategoryCount, ...]. Corpus schema bumped from v1 → v2 (additive only; legacy v1 corpora load unchanged).
Default taxonomy adopted from HarnessAudit-Bench (arXiv:2605.14271): resource_access and info_transfer.
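
"Additive only" means a v2 reader can treat the new field as optional, so a v1 entry that lacks it still loads. The sketch below shows that idea under stated assumptions: it assumes corpus entries are JSON objects, and the id and payload keys, the CorpusEntry type, and load_entries are hypothetical names. Only violation_category and the two category values come from the notes.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CorpusEntry:
    """Hypothetical entry shape; the real corpus schema may differ."""
    entry_id: str
    payload: str
    violation_category: Optional[str] = None  # absent in v1 entries


def load_entries(raw: str) -> list[CorpusEntry]:
    """Load v1 or v2 entries; a missing violation_category becomes None."""
    return [
        CorpusEntry(
            entry_id=obj["id"],
            payload=obj["payload"],
            violation_category=obj.get("violation_category"),
        )
        for obj in json.loads(raw)
    ]


v1_corpus = '[{"id": "a", "payload": "read /etc/passwd"}]'
v2_corpus = '[{"id": "b", "payload": "post secrets", "violation_category": "info_transfer"}]'
print(load_entries(v1_corpus)[0].violation_category)
print(load_entries(v2_corpus)[0].violation_category)
```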
airlock corpus-bench reports:
- Text: adds by_category: resource_access=14/15 info_transfer=3/3 line
- JSON: adds category_counts array
- Markdown: adds "Per-category coverage" table with coverage %
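
The per-category figures in these reports can be derived from a simple tally. The following is a sketch, not the shipped code: CategoryCountSketch, count_by_category, and render_text_line are hypothetical names, and it assumes that 14/15 means 14 blocked entries out of 15 in that category and that uncategorized entries are left out of the breakdown.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class CategoryCountSketch:
    """Hypothetical stand-in for CategoryCount; field names are assumed."""
    category: str
    blocked: int
    total: int

    @property
    def coverage_pct(self) -> float:
        return 100.0 * self.blocked / self.total if self.total else 0.0


def count_by_category(
    results: Iterable[Tuple[Optional[str], bool]],
) -> Tuple[CategoryCountSketch, ...]:
    """Tally (category, was_blocked) pairs, keeping first-seen category order."""
    tallies: dict[str, list[int]] = {}
    for category, was_blocked in results:
        if category is None:
            continue  # assumed: entries without a category are not broken down
        tally = tallies.setdefault(category, [0, 0])
        tally[0] += int(was_blocked)
        tally[1] += 1
    return tuple(CategoryCountSketch(c, b, n) for c, (b, n) in tallies.items())


def render_text_line(counts: Iterable[CategoryCountSketch]) -> str:
    """Render the text-report line in the by_category format shown above."""
    return "by_category: " + " ".join(
        f"{c.category}={c.blocked}/{c.total}" for c in counts
    )


results = (
    [("resource_access", True)] * 14
    + [("resource_access", False)]
    + [("info_transfer", True)] * 3
    + [(None, True)]  # a legacy entry with no category
)
counts = count_by_category(results)
print(render_text_line(counts))
```

With this input the rendered line matches the example in the text report, and the first category's coverage works out to 14/15, roughly 93.3%.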
Stats
- 2,466 tests · 83.00% coverage (gate 82%)
- CI 7/7 green (lint, security, GitGuardian, test 3.10/3.11/3.12/3.13)
- Surface additions in __all__: 6 new symbols + 1 new factory function
Install
pip install agent-airlock==0.8.3